[ovs-dev] [PATCH] netdev-dpdk: add sflow support for vhost-user ports

Neil McKee neil.mckee at inmon.com
Thu Jun 16 13:52:36 UTC 2016


Reusing an ifIndex from time to time is no problem from the sFlow standard
point of view either.   In fact we used a hash approach in the host-sflow
project to assign index numbers to containers  (by open-hashing the
container UUID into a range of possible index numbers):
https://github.com/sflow/host-sflow/blob/master/src/Linux/hsflowd.c#L373


------
Neil McKee
InMon Corp.
http://www.inmon.com

On Thu, Jun 16, 2016 at 4:16 AM, Lal, PrzemyslawX <przemyslawx.lal at intel.com
> wrote:

> Hi Mark,
>
> RFC 2863 standard describes in "Interface Numbering" chapter (
> https://tools.ietf.org/html/rfc2863#page-10) that if there's possibility
> to reuse ifindex after reinitialization then it should be reused. And in
> our case it's definitely possible. I also agree with you that limitation to
> 1024 vhostuser interfaces is a major problem here and I should avoid such
> arbitrary values. I think the solution might be to use hashmap or OVS list
> implementation or maybe even dynamic allocation of new vhost ifindex
> entries.
>
> I have also analyzed current vhost-cuse initialization and adding support
> for ifindex assignment to this type of port should be a trivial task and it
> could be easily introduced in V2 of this patch.
>
> Please share your thoughts.
>
> Thanks,
> Przemek
>
> -----Original Message-----
> From: Kavanagh, Mark B
> Sent: Wednesday, May 11, 2016 11:56 AM
> To: Lal, PrzemyslawX <przemyslawx.lal at intel.com>; dev at openvswitch.org
> Subject: RE: [PATCH] netdev-dpdk: add sflow support for vhost-user ports
>
> >
> >Hi Mark,
> >
> >Replies inline prefixed with [PL].
> >
> >Thanks,
> >Przemek
> >
> >-----Original Message-----
> >From: Kavanagh, Mark B
> >Sent: Thursday, May 5, 2016 2:19 PM
> >To: Lal, PrzemyslawX <przemyslawx.lal at intel.com>; dev at openvswitch.org
> >Subject: RE: [PATCH] netdev-dpdk: add sflow support for vhost-user
> >ports
> >
> >Hi Przemek,
> >
> >Some additional comments/queries inline.
> >
> >Thanks again,
> >Mark
> >
> >>
> >>This patch adds sFlow support for DPDK vHost-user interfaces by
> >>assigning ifindex value. Values of ifindexes for vHost-user interfaces
> >>start with 2000 to avoid overlapping with kernel datapath interfaces.
> >>
> >>Patch also fixes issue with 'dpdk0' interface being ignored by sFlow
> >>agent, because of ifindex 0. Ifindexes values for physical DPDK
> >>interfaces start with 1000 to avoid overlapping with kernel datapath
> interfaces.
> >>
> >>Signed-off-by: Przemyslaw Lal <przemyslawx.lal at intel.com>
> >>---
> >> lib/netdev-dpdk.c | 70
> >>++++++++++++++++++++++++++++++++++++++++++++++++++++++-
> >> 1 file changed, 69 insertions(+), 1 deletion(-)
> >>
> >>diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index
> >>208c5f5..1b60c5a 100644
> >>--- a/lib/netdev-dpdk.c
> >>+++ b/lib/netdev-dpdk.c
> >>@@ -115,6 +115,18 @@ static char *vhost_sock_dir = NULL;   /* Location
> of vhost-user sockets
> >>*/
> >>  */
> >> #define VHOST_ENQ_RETRY_USECS 100
> >>
> >>+/* For DPDK ETH interfaces use ifindex values starting with 1000
> >>+ * to avoid overlapping with kernel-space interfaces.
> >>+ * Also starting with 0 would cause sFlow to ignore 'dpdk0' interface.
> >>+ */
> >>+#define DPDK_PORT_ID_TO_IFINDEX(port_id) ((port_id) + 1000)
> >>+
> >>+/* For DPDK vhost-user interfaces use ifindexes starting with 2000.
> >>+ */
> >>+#define VHOST_ID_TO_IFINDEX(port_id) ((port_id) + 2000)
> >>+
> >>+#define VHOST_IDS_MAX_LEN 1024
> >>+
> >> static const struct rte_eth_conf port_conf = {
> >>     .rxmode = {
> >>         .mq_mode = ETH_MQ_RX_RSS,
> >>@@ -149,6 +161,7 @@ enum dpdk_dev_type {  static int rte_eal_init_ret
> >>= ENODEV;
> >>
> >> static struct ovs_mutex dpdk_mutex = OVS_MUTEX_INITIALIZER;
> >>+static struct ovs_mutex vhost_mutex = OVS_MUTEX_INITIALIZER;
> >>
> >> /* Quality of Service */
> >>
> >>@@ -790,6 +803,50 @@ netdev_dpdk_vhost_cuse_construct(struct netdev
> *netdev)
> >>     return err;
> >> }
> >>
> >>+/* Counter for VHOST interfaces as DPDK library doesn't provide
> >>+mechanism
> >>+ * similar to rte_eth_dev_count for vhost-user sockets.
> >>+ */
> >>+static int vhost_counter OVS_GUARDED_BY(vhost_mutex) = 0;
> >>+
> >>+/* Array storing vhost_ids, so their ifindex can be reused after
> >>+socket
> >>+ * recreation.
> >>+ */
> >>+static char vhost_ids[VHOST_IDS_MAX_LEN][PATH_MAX]
> >>+OVS_GUARDED_BY(vhost_mutex);
> >>+
> >>+/* Simple lookup in vhost_ids array.
> >>+ * On success returns id of vhost_id stored in the array,
> >>+ * otherwise returns -1.
> >>+ */
> >>+static int
> >>+netdev_dpdk_lookup_vhost_id(struct netdev_dpdk *dev)
> >>+OVS_REQUIRES(vhost_mutex) {
> >>+    for (int i = 0; i < vhost_counter; i++) {
> >>+        if (strncmp(vhost_ids[i], dev->vhost_id, strlen(dev->vhost_id))
> == 0) {
> >>+            return i;
> >>+        }
> >>+    }
> >>+    return -1;
> >>+}
> >>+
> >>+/* Inserts vhost_id at the first free position in the vhost_ids array.
> >>+ */
> >>+static void
> >>+netdev_dpdk_push_vhost_id(struct netdev_dpdk *dev) {
> >>+    ovs_mutex_lock(&vhost_mutex);
> >>+    if (netdev_dpdk_lookup_vhost_id(dev) < 0) {
> >>+        if (vhost_counter < VHOST_IDS_MAX_LEN) {
> >>+            strncpy(vhost_ids[vhost_counter++], dev->vhost_id,
> >>+                    strlen(dev->vhost_id));
> >>+        } else {
> >>+            VLOG_WARN("Could not assign ifindex to \"%s\" port. "
> >>+                      "List of vhost IDs list is full.",
> >>+                      dev->vhost_id);
> >>+        }
> >>+    }
> >>+    ovs_mutex_unlock(&vhost_mutex);
> >>+}
> >>+
> >> static int
> >> netdev_dpdk_vhost_user_construct(struct netdev *netdev)  { @@ -825,6
> >>+882,8 @@ netdev_dpdk_vhost_user_construct(struct netdev *netdev)
> >>         err = vhost_construct_helper(netdev);
> >>     }
> >>
> >>+    netdev_dpdk_push_vhost_id(dev);
> >>+
> >>     ovs_mutex_unlock(&dpdk_mutex);
> >>     return err;
> >> }
> >>@@ -1773,9 +1832,18 @@ netdev_dpdk_get_ifindex(const struct netdev
> >>*netdev)  {
> >>     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> >>     int ifindex;
> >>+    int ret;
> >>
> >>     ovs_mutex_lock(&dev->mutex);
> >>-    ifindex = dev->port_id;
> >>+    if (dev->type == DPDK_DEV_ETH) {
> >>+        ifindex = DPDK_PORT_ID_TO_IFINDEX(dev->port_id);
> >>+    } else {
> >
> >
> >This 'else' statement is executed in both the case of vhost user
> >devices and also vhost cuse devices.
> >In the latter case, do the port numbers returned by VHOST_ID_IFINDEX
> >make sense? (i.e. can sFlow 'talk' to vhost-cuse ports? If not, then this
> code has a bug).
> >
> >[PL] At this point for vhost cuse there's always '-1' returned from
> >lookup function, so in the nested if statement you'll always hit 'else'
> >and 0 will be returned as ifindex for vhost cuse devices.
>
> Agreed. But then all vhost-cuse devices share index of 0 - this is surely
> problematic?
>
> >Anyway setting ifindexes for vhost cuse devices should be a trivial
> >task and if vhost cuse is supported by sflow I can handle it - but then
> >it will require changing patch subject and message.
>
> Have you tested this patch against vhost-cuse? I'd be interested as to how
> it behaves.
>
> >
> >>+        if ((ret = netdev_dpdk_lookup_vhost_id(dev)) >= 0) {
> >>+            ifindex = VHOST_ID_TO_IFINDEX(ret);
> >>+        } else {
> >>+            ifindex = 0;
> >
> >I don't think that 0 is an appropriate value to return here, since it
> >overlaps with a kernel interface's index.
> >
> >[PL] IMHO it's ok to use 0 here - it will be assigned to ifindex only
> >when lookup for vhost device failed.
>
> Have you seen Ben's comment on this?
> http://openvswitch.org/pipermail/dev/2016-May/070626.html
>
> >
> >>+        }
> >>+    }
> >>     ovs_mutex_unlock(&dev->mutex);
> >>
> >>     return ifindex;
> >>--
> >>2.1.0
> >
> >[MK] There's a gap in this code - when the vhost-user device is
> >destroyed, its entry should be removed from 'vhost_ids'.
> >
> >[PL] Actually the whole idea behind "caching" these vhost_ids is to
> >have them stored to survive destroy and recreation of vhost devices,
> >for instance when VM is hard-rebooted, to keep the same ifindex and not
> >assign a new one, which could be confusing (for example one could have
> >vhu1 port with ifindex 2001 visible in sFlow stats, but then
> >hard-reboots this VM and vhu1 gets ifindex 2002 - it's the same vhost
> >user socket, the same VM, but somehow it gets new ifindex and gets new
> entry in sflow). But if you think I can be improve this, don't hesitate to
> share your ideas.
>
> I completely understand what your intention is here. There is still an
> inherent issue with the design though, in that once vhost ports have been
> added to 'vhost_ids', they are never removed.
>
> Consider the following scenario:
> - multiple vhost ports have been added to the bridge, such that
> vhost_counter = VHOST_IDS_MAX_LEN (I'm not sure what the use case for
> adding 1024 ports to a bridge would be, but I digress)
> - one or more of the previously-added ports are no longer needed, so we
> bring the VMs that they are attached to down
> - we want to add another vhost port, described by a vhost_id that is not
> already present in 'vhost_ids', and connect it to a VM => Now, however, we
> cannot add the additional port, due to the fact that vhost_counter was
> never decremented, such that it is still equal to VHOST_IDS_MAX_LEN: the
> bounds check in netdev_dpdk_push_vhost_id fails, and the user is warned
> that their port cannot be added.
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
>



More information about the dev mailing list