[ovs-dev] [PATCH] netdev-dpdk: add sflow support for vhost-user ports
Lal, PrzemyslawX
przemyslawx.lal at intel.com
Tue May 10 12:21:03 UTC 2016
Hi Mark,
Replies inline prefixed with [PL].
Thanks,
Przemek
-----Original Message-----
From: Kavanagh, Mark B
Sent: Thursday, May 5, 2016 2:19 PM
To: Lal, PrzemyslawX <przemyslawx.lal at intel.com>; dev at openvswitch.org
Subject: RE: [PATCH] netdev-dpdk: add sflow support for vhost-user ports
Hi Przemek,
Some additional comments/queries inline.
Thanks again,
Mark
>
>This patch adds sFlow support for DPDK vHost-user interfaces by
>assigning ifindex value. Values of ifindexes for vHost-user interfaces
>start with 2000 to avoid overlapping with kernel datapath interfaces.
>
>Patch also fixes issue with 'dpdk0' interface being ignored by sFlow
>agent, because of ifindex 0. Ifindexes values for physical DPDK
>interfaces start with 1000 to avoid overlapping with kernel datapath interfaces.
>
>Signed-off-by: Przemyslaw Lal <przemyslawx.lal at intel.com>
>---
> lib/netdev-dpdk.c | 70
>++++++++++++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 69 insertions(+), 1 deletion(-)
>
>diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index
>208c5f5..1b60c5a 100644
>--- a/lib/netdev-dpdk.c
>+++ b/lib/netdev-dpdk.c
>@@ -115,6 +115,18 @@ static char *vhost_sock_dir = NULL; /* Location of vhost-user sockets
>*/
> */
> #define VHOST_ENQ_RETRY_USECS 100
>
>+/* For DPDK ETH interfaces use ifindex values starting with 1000
>+ * to avoid overlapping with kernel-space interfaces.
>+ * Also starting with 0 would cause sFlow to ignore 'dpdk0' interface.
>+ */
>+#define DPDK_PORT_ID_TO_IFINDEX(port_id) ((port_id) + 1000)
>+
>+/* For DPDK vhost-user interfaces use ifindexes starting with 2000.
>+ */
>+#define VHOST_ID_TO_IFINDEX(port_id) ((port_id) + 2000)
>+
>+#define VHOST_IDS_MAX_LEN 1024
>+
> static const struct rte_eth_conf port_conf = {
> .rxmode = {
> .mq_mode = ETH_MQ_RX_RSS,
>@@ -149,6 +161,7 @@ enum dpdk_dev_type { static int rte_eal_init_ret =
>ENODEV;
>
> static struct ovs_mutex dpdk_mutex = OVS_MUTEX_INITIALIZER;
>+static struct ovs_mutex vhost_mutex = OVS_MUTEX_INITIALIZER;
>
> /* Quality of Service */
>
>@@ -790,6 +803,50 @@ netdev_dpdk_vhost_cuse_construct(struct netdev *netdev)
> return err;
> }
>
>+/* Counter for VHOST interfaces as DPDK library doesn't provide
>+mechanism
>+ * similar to rte_eth_dev_count for vhost-user sockets.
>+ */
>+static int vhost_counter OVS_GUARDED_BY(vhost_mutex) = 0;
>+
>+/* Array storing vhost_ids, so their ifindex can be reused after
>+socket
>+ * recreation.
>+ */
>+static char vhost_ids[VHOST_IDS_MAX_LEN][PATH_MAX]
>+OVS_GUARDED_BY(vhost_mutex);
>+
>+/* Simple lookup in vhost_ids array.
>+ * On success returns id of vhost_id stored in the array,
>+ * otherwise returns -1.
>+ */
>+static int
>+netdev_dpdk_lookup_vhost_id(struct netdev_dpdk *dev)
>+OVS_REQUIRES(vhost_mutex) {
>+ for (int i = 0; i < vhost_counter; i++) {
>+ if (strncmp(vhost_ids[i], dev->vhost_id, strlen(dev->vhost_id)) == 0) {
>+ return i;
>+ }
>+ }
>+ return -1;
>+}
>+
>+/* Inserts vhost_id at the first free position in the vhost_ids array.
>+ */
>+static void
>+netdev_dpdk_push_vhost_id(struct netdev_dpdk *dev) {
>+ ovs_mutex_lock(&vhost_mutex);
>+ if (netdev_dpdk_lookup_vhost_id(dev) < 0) {
>+ if (vhost_counter < VHOST_IDS_MAX_LEN) {
>+ strncpy(vhost_ids[vhost_counter++], dev->vhost_id,
>+ strlen(dev->vhost_id));
>+ } else {
>+ VLOG_WARN("Could not assign ifindex to \"%s\" port. "
>+ "List of vhost IDs list is full.",
>+ dev->vhost_id);
>+ }
>+ }
>+ ovs_mutex_unlock(&vhost_mutex);
>+}
>+
> static int
> netdev_dpdk_vhost_user_construct(struct netdev *netdev) { @@ -825,6
>+882,8 @@ netdev_dpdk_vhost_user_construct(struct netdev *netdev)
> err = vhost_construct_helper(netdev);
> }
>
>+ netdev_dpdk_push_vhost_id(dev);
>+
> ovs_mutex_unlock(&dpdk_mutex);
> return err;
> }
>@@ -1773,9 +1832,18 @@ netdev_dpdk_get_ifindex(const struct netdev
>*netdev) {
> struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> int ifindex;
>+ int ret;
>
> ovs_mutex_lock(&dev->mutex);
>- ifindex = dev->port_id;
>+ if (dev->type == DPDK_DEV_ETH) {
>+ ifindex = DPDK_PORT_ID_TO_IFINDEX(dev->port_id);
>+ } else {
This 'else' statement is executed in both the case of vhost user devices and also vhost cuse devices.
In the latter case, do the port numbers returned by VHOST_ID_IFINDEX make sense? (i.e. can sFlow 'talk' to vhost-cuse ports? If not, then this code has a bug).
[PL] At this point for vhost cuse there's always '-1' returned from lookup function, so in the nested if statement you'll always hit 'else' and 0 will be returned as ifindex for vhost cuse devices.
Anyway setting ifindexes for vhost cuse devices should be a trivial task and if vhost cuse is supported by sflow I can handle it - but then it will require changing patch subject and message.
>+ if ((ret = netdev_dpdk_lookup_vhost_id(dev)) >= 0) {
>+ ifindex = VHOST_ID_TO_IFINDEX(ret);
>+ } else {
>+ ifindex = 0;
I don't think that 0 is an appropriate value to return here, since it overlaps with a kernel interface's index.
[PL] IMHO it's ok to use 0 here - it will be assigned to ifindex only when lookup for vhost device failed.
>+ }
>+ }
> ovs_mutex_unlock(&dev->mutex);
>
> return ifindex;
>--
>2.1.0
[MK] There's a gap in this code - when the vhost-user device is destroyed, its entry should be removed from 'vhost_ids'.
[PL] Actually the whole idea behind "caching" these vhost_ids is to have them stored to survive destroy and recreation of vhost devices, for instance when VM is hard-rebooted, to keep the same ifindex and not assign a new one, which could be confusing (for example one could have vhu1 port with ifindex 2001 visible in sFlow stats, but then hard-reboots this VM and vhu1 gets ifindex 2002 - it's the same vhost user socket, the same VM, but somehow it gets new ifindex and gets new entry in sflow). But if you think I can be improve this, don't hesitate to share your ideas.
More information about the dev
mailing list