[ovs-dev] [PATCH] netdev: Dynamic per-port Flow API.

Ophir Munk ophirmu at mellanox.com
Tue Nov 19 17:20:00 UTC 2019


Hi Ilya,
Thanks for the patch set which adds the dpif hint inside the netdev struct. It is really helpful.
Our goal is to have flow_put() calls on vxlan devices, using the existing dpdk flow API.
We added logic inside netdev_dpdk_flow_api_supported() to accept vxlan devices and indeed 
we see in the log: 
"netdev_offload|INFO|vxlan0: Assigned flow API 'dpdk_flow_api'".
However, there is no flow_put() on the vxlan device. 
We see our own printouts in dpif-netdev such as:
"dpif_netdev(dp_netdev_flow_5)|ERR|vxlan_sys_4789: calling netdev flow_put()" 
but inside netdev_flow_put the flow_api poiner is NULL.
Any ideas why?

Regards,
Ophir

> -----Original Message-----
> From: Ilya Maximets <i.maximets at ovn.org>
> Sent: Friday, November 15, 2019 10:26 PM
> To: Ophir Munk <ophirmu at mellanox.com>; Ilya Maximets
> <i.maximets at ovn.org>; ovs-dev at openvswitch.org
> Cc: Ameer Mahagneh <ameerm at mellanox.com>; Roni Bar Yanai
> <roniba at mellanox.com>; Eveline Raine <eveliner at mellanox.com>; Oz
> Shlomo <ozsh at mellanox.com>; Eli Britstein <elibr at mellanox.com>
> Subject: Re: [PATCH] netdev: Dynamic per-port Flow API.
> 
> On 10.11.2019 21:17, Ophir Munk wrote:
> > Hi Ilya,
> > Thanks you for taking care of this.
> > Assigning the correct vport flow API is a critical feature for offloading.
> >
> > It seems hard to address all different vport configuration scenarios (kernel
> only, userspace only or both) by just relying on the individual netdev and
> without knowing the dpif on top.
> 
> Yes you're right.  The only module that knows the real mapping of 'netdev's
> to 'dpif's is 'ofproto' and we need to get this information somehow.
> 
> > Maybe can move the flow API assignment to the point where dp is actually
> adding the port netdev and give a hint about the dp.
> 
> Since we already have a hint from the upper layers about the 'dpif_class' I'm
> suggesting to replace it with 'dpif_type'.
> In pair with storing this hint inside the netdev we could have a flexible
> system without exposing internals to higher layers or trying to use higher
> layer from the library code.
> 
> Please, take a look at the new version of patches:
> 
> https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.
> openvswitch.org%2Fpipermail%2Fovs-dev%2F2019-
> November%2F364735.html&data=02%7C01%7Cophirmu%40mellanox.c
> om%7Caba8a2cdff2342aa536b08d76a09fc61%7Ca652971c7d2e4d9ba6a4d149
> 256f461b%7C0%7C0%7C637094463434958586&sdata=taEV4B7BA83%2Fu
> faHoW7Y7F824SumWEnEXhVUpUON5eA%3D&reserved=0
> 
> > It is also confusing that userspace vport names are "sys_vxlan_4789" and
> not "usr_vxlan_4789" for example.
> 
> Yes, this is a bit confusing, but we, probably, could not just change it.  At least
> that we'll have to rewrite a big part of system tests that relies on the port
> names in the flow dumps.
> We will have to duplicate them for kernel and userspace.
> Some external management/monitoring tools could be affected too because
> they will have to treat kernel and userspace tunneling differently.
> 
> Best regards, Ilya Maximets.
> 
> > What do you think?
> >
> > Regards,
> > Ophir
> >
> >> -----Original Message-----
> >> From: Ilya Maximets <i.maximets at ovn.org>
> >> Sent: Friday, November 8, 2019 5:57 PM
> >> To: Ophir Munk <ophirmu at mellanox.com>; Ilya Maximets
> >> <i.maximets at ovn.org>; ovs-dev at openvswitch.org
> >> Cc: Ameer Mahagneh <ameerm at mellanox.com>; Roni Bar Yanai
> >> <roniba at mellanox.com>; Eveline Raine <eveliner at mellanox.com>
> >> Subject: Re: [PATCH] netdev: Dynamic per-port Flow API.
> >>
> >> On 06.11.2019 18:11, Ophir Munk wrote:
> >>> Hi Ilya,
> >>> A few months ago we discussed the missing functionality of vports
> >> offloading under user space bridges.
> >>> Commit [1] was added to explicitly avoid installing userspace vport
> >>> flows (to
> >> avoid confusion with the vport kernel).
> >>> When reverting commit [1] - we are left with this missing functionality.
> >>> Applying your suggested patch netdev_vport_has_system_port() API)
> >> does not seem to work.
> >>> It always fails when it tries to look for the interface name (say
> >>> "vxlan10") in
> >> the system list where vxlan interfaces are renamed (say
> "vxlan_sys_4789").
> >>
> >> Hi Ophir,
> >>
> >> No-one ever tried to run this code, so I'm not surprised.
> >> I could take a look at this issue on next week.
> >>
> >> Best regards, Ilya Maximets.
> >>
> >>>
> >>> You wrote:
> >>>> On master, after applying dynamic flow API patch-set, we'll be able
> >>>> to use patch that I suggested in previous mail to properly handle
> >>>> this situation and enable vport offloading.
> >>>
> >>> It would be appreciated if we can resume the efforts in fixing this
> >>> missing
> >> functionality.
> >>> Please advise on the next steps.
> >>>
> >>> [1]
> >>> Commit 0da667e345 ("dpif-netdev: Forbid vport offloading attempts.")
> >>>
> >>>>
> >>>> -----Original Message-----
> >>>> From: Ilya Maximets <i.maximets at samsung.com>
> >>>> Sent: Monday, May 13, 2019 3:33 PM
> >>>> To: Ophir Munk <ophirmu at mellanox.com>; ovs-
> dev at openvswitch.org
> >>>> Cc: Ian Stokes <ian.stokes at intel.com>; Flavio Leitner
> >>>> <fbl at sysclose.org>; Kevin Traynor <ktraynor at redhat.com>; Roni Bar
> >>>> Yanai <roniba at mellanox.com>; Finn Christensen <fc at napatech.com>;
> >> Ben
> >>>> Pfaff <blp at ovn.org>; Simon Horman
> <simon.horman at netronome.com>;
> >> Olga
> >>>> Shern <olgas at mellanox.com>; Asaf Penso <asafp at mellanox.com>; Oz
> >>>> Shlomo <ozsh at mellanox.com>; Majd Dibbiny <majd at mellanox.com>
> >>>> Subject: Re: [PATCH] netdev: Dynamic per-port Flow API.
> >>>>
> >>>> On 13.05.2019 14:41, Ophir Munk wrote:
> >>>>> Hi Ilya,
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Ilya Maximets <i.maximets at samsung.com>
> >>>>>> Sent: Monday, May 13, 2019 12:22 PM
> >>>>>> To: Ophir Munk <ophirmu at mellanox.com>; ovs-
> >> dev at openvswitch.org
> >>>>>> Cc: Ian Stokes <ian.stokes at intel.com>; Flavio Leitner
> >>>>>> <fbl at sysclose.org>; Kevin Traynor <ktraynor at redhat.com>; Roni
> Bar
> >>>>>> Yanai <roniba at mellanox.com>; Finn Christensen
> <fc at napatech.com>;
> >>>> Ben
> >>>>>> Pfaff <blp at ovn.org>; Simon Horman
> >> <simon.horman at netronome.com>;
> >>>> Olga
> >>>>>> Shern <olgas at mellanox.com>; Asaf Penso <asafp at mellanox.com>;
> Oz
> >>>>>> Shlomo <ozsh at mellanox.com>
> >>>>>> Subject: Re: [PATCH] netdev: Dynamic per-port Flow API.
> >>>>>>
> >>>>>> On 09.05.2019 1:59, Ophir Munk wrote:
> >>>>>>> Hi Ilya,
> >>>>>>>
> >>>>> ...
> >>>>>>> I am recreating this scenario in my setup.
> >>>>>>
> >>>>>> I see. Yes, you're right. And I think that this case could be
> >>>>>> reproduced on current master without any patches. So, it's a bug
> >>>>>> that we
> >>>> need to fix.
> >>>>>> Otherwise userspace datapath will try to offload its flows to the
> >>>>>> unrelated system interfaces. For now we could just forbid
> >>>>>> offloading to vports in dpif- netdev. I'll prepare the patch.
> >>>>>> This fix also should be
> >>>> backported.
> >>>>>>
> >>>>>
> >>>>> We need a way to enable offloading of vports in dpif-netdev. So if
> >>>>> you can address It with your new patch - that would be appreciated.
> >>>>
> >>>> I'm suggesting disabling because it'll be easy to backport to older
> >> branches.
> >>>> On master, after applying dynamic flow API patch-set, we'll be able
> >>>> to use patch that I suggested in previous mail to properly handle
> >>>> this situation and enable vport offloading.
> >>>>
> >>>
> >>>>>
> >>>>>>>
> >>>>>>> This patch series is important but one of its main goals is to
> >>>>>>> enable different
> >>>>>> offloads for different vports under Kernel/userspace.
> >>>>>>> Can you please advise how this goal can be achieved?
> >>>>>>
> >>>>>> It looks like there is no way to distinguish system and userspace
> >>>>>> vports by looking only on netdev. We should check with dpif type.
> >>>>>
> >>>>> Agreed. But then looking at your sample patch below you are basing
> >>>>> your decision on the existence of system port (and not on dpif type).
> >>>>
> >>>> Right now 'ifindex' is used for checking, so this is equally racy.
> >>>>
> >>>>>   I think it is risky because
> >>>>> you are dependent on the order of operations: Creation of the
> >>>>> system port
> >>>> versus checking for system port existence.
> >>>>> Which was first (under different scenarios)?
> >>>>
> >>>> Actually, port creation and checking will always happen in the same
> >>>> thread context, so creation and checking will be serialized. Kernel
> >>>> should guarantee the port existence. No races here.
> >>>>
> >>>>>
> >>>>> What do you say about the following patch:
> >>>>>
> >>>>> diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c int
> >>>>> netdev_ports_insert(struct netdev *netdev, const struct dpif_class
> >>>>> *dpif_class,
> >>>>>                       struct dpif_port *dpif_port) @@ -547,8
> >>>>> +555,9 @@ netdev_ports_insert(struct netdev *netdev, const struct
> >>>>> dpif_class *dpif_class,
> >>>>>                   netdev_ports_hash(dpif_port->port_no, dpif_class));
> >>>>>       hmap_insert(&ifindex_to_port, &data->ifindex_node, ifindex);
> >>>>>       ovs_mutex_unlock(&netdev_hmap_mutex);
> >>>>>
> >>>>> -    netdev_init_flow_api(netdev);
> >>>>> +    netdev_init_flow_api(netdev, dpif_class->type); /* Pass class
> >>>>> + type "netdev" or "syste" */
> >>>>>
> >>>>>       return 0;
> >>>>> }
> >>>>>
> >>>>> It's not a full solution. It is just a hint how to pass the dpif
> >>>>> type ("netdev" or
> >>>> "system")  when initializing the flow api.
> >>>>> We can use the dpif type when initializing vport offload.
> >>>>
> >>>> This is, actually, the first thing I tried. However, this requires
> >>>> exposing the internals of 'dpif-provider', which is bad from the
> >>>> point of
> >> code architecture.
> >>>>
> >>>> For the below patch code, I missed actual port requesting from the
> >> datapath.
> >>>> Following incremental needed:
> >>>>
> >>>> ---
> >>>> diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c index
> >>>> 82943c102..1c88a91ed 100644
> >>>> --- a/lib/netdev-vport.c
> >>>> +++ b/lib/netdev-vport.c
> >>>> @@ -124,13 +124,28 @@ netdev_vport_class_get_dpif_port(const
> struct
> >>>> netdev_class *class)  bool  netdev_vport_has_system_port(const
> >>>> struct netdev *netdev)  {
> >>>> -    bool found;
> >>>> +    bool found = false;
> >>>> +    const char *name;
> >>>> +    const char *type = "system";
> >>>>       struct sset names = SSET_INITIALIZER(&names);
> >>>>
> >>>>       ovs_assert(is_vport_class(netdev_get_class(netdev)));
> >>>>
> >>>> -    dp_enumerate_names("system", &names);
> >>>> -    found = sset_contains(&names, netdev_get_name(netdev));
> >>>> +    dp_enumerate_names(type, &names);
> >>>> +    SSET_FOR_EACH (name, &names) {
> >>>> +        struct dpif *dpifp = NULL;
> >>>> +
> >>>> +        if (dpif_open(name, type, &dpifp)) {
> >>>> +            continue;
> >>>> +        }
> >>>> +
> >>>> +        found = dpif_port_exists(dpifp, netdev_get_name(netdev));
> >>>> +
> >>>> +        dpif_close(dpifp);
> >>>> +        if (found) {
> >>>> +            break;
> >>>> +        }
> >>>> +    }
> >>>>       sset_destroy(&names);
> >>>>
> >>>>       return found;
> >>>> ---
> >>>>
> >>>>>
> >>>>>> Something like this (not tested):
> >>>>>>
> >>>>>> diff --git a/lib/netdev-offload-dpdk.c
> >>>>>> b/lib/netdev-offload-dpdk.c index 01e900461..b7b0616ec 100644
> >>>>>> --- a/lib/netdev-offload-dpdk.c
> >>>>>> +++ b/lib/netdev-offload-dpdk.c
> >>>>>> @@ -22,6 +22,7 @@
> >>>>>>   #include "dpif-netdev.h"
> >>>>>>   #include "netdev-offload-provider.h"
> >>>>>>   #include "netdev-provider.h"
> >>>>>> +#include "netdev-vport.h"
> >>>>>>   #include "openvswitch/match.h"
> >>>>>>   #include "openvswitch/vlog.h"
> >>>>>>   #include "packets.h"
> >>>>>> @@ -752,6 +753,13 @@ netdev_offload_dpdk_flow_del(struct
> netdev
> >>>>>> *netdev, const ovs_u128 *ufid,  static int
> >>>>>> netdev_offload_dpdk_init_flow_api(struct netdev *netdev)  {
> >>>>>> +    if (netdev_vport_is_vport_class(netdev->netdev_class)
> >>>>>> +        && netdev_vport_has_system_port(netdev)) {
> >>>>>> +        VLOG_DBG("%s: vport has backing system interface.
> Skipping.",
> >>>>>> +                 netdev_get_name(netdev));
> >>>>>> +        return EOPNOTSUPP;
> >>>>>> +    }
> >>>>>> +
> >>>>>>       return netdev_dpdk_flow_api_supported(netdev) ? 0 :
> >>>>>> EOPNOTSUPP; }
> >>>>>>
> >>>>>> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
> >>>>>> index
> >>>>>> 498aae369..e4d121ed7 100644
> >>>>>> --- a/lib/netdev-offload-tc.c
> >>>>>> +++ b/lib/netdev-offload-tc.c
> >>>>>> @@ -31,6 +31,7 @@
> >>>>>>   #include "netdev-linux.h"
> >>>>>>   #include "netdev-offload-provider.h"
> >>>>>>   #include "netdev-provider.h"
> >>>>>> +#include "netdev-vport.h"
> >>>>>>   #include "netlink.h"
> >>>>>>   #include "netlink-socket.h"
> >>>>>>   #include "odp-netlink.h"
> >>>>>> @@ -1560,6 +1561,13 @@ netdev_tc_init_flow_api(struct netdev
> >>>> *netdev)
> >>>>>>       int ifindex;
> >>>>>>       int error;
> >>>>>>
> >>>>>> +    if (netdev_vport_is_vport_class(netdev->netdev_class)
> >>>>>> +        && !netdev_vport_has_system_port(netdev)) {
> >>>>>> +        VLOG_DBG("%s: vport has no backing system interface.
> >> Skipping.",
> >>>>>> +                 netdev_get_name(netdev));
> >>>>>> +        return EOPNOTSUPP;
> >>>>>> +    }
> >>>>>> +
> >>>>>>       ifindex = netdev_get_ifindex(netdev);
> >>>>>>       if (ifindex < 0) {
> >>>>>>           VLOG_INFO("init: failed to get ifindex for %s: %s",
> >>>>>> diff --git a/lib/netdev- vport.c b/lib/netdev-vport.c index
> >>>>>> 92a256af1..82943c102 100644
> >>>>>> --- a/lib/netdev-vport.c
> >>>>>> +++ b/lib/netdev-vport.c
> >>>>>> @@ -44,6 +44,7 @@
> >>>>>>   #include "simap.h"
> >>>>>>   #include "smap.h"
> >>>>>>   #include "socket-util.h"
> >>>>>> +#include "sset.h"
> >>>>>>   #include "unaligned.h"
> >>>>>>   #include "unixctl.h"
> >>>>>>   #include "openvswitch/vlog.h"
> >>>>>> @@ -120,6 +121,21 @@ netdev_vport_class_get_dpif_port(const
> >> struct
> >>>>>> netdev_class *class)
> >>>>>>       return is_vport_class(class) ?
> >>>>>> vport_class_cast(class)->dpif_port : NULL;  }
> >>>>>>
> >>>>>> +bool
> >>>>>> +netdev_vport_has_system_port(const struct netdev *netdev) {
> >>>>>> +    bool found;
> >>>>>> +    struct sset names = SSET_INITIALIZER(&names);
> >>>>>> +
> >>>>>> +    ovs_assert(is_vport_class(netdev_get_class(netdev)));
> >>>>>> +
> >>>>>> +    dp_enumerate_names("system", &names);
> >>>>>> +    found = sset_contains(&names, netdev_get_name(netdev));
> >>>>>> +    sset_destroy(&names);
> >>>>>> +
> >>>>>> +    return found;
> >>>>>> +}
> >>>>>> +
> >>>>>>   const char *
> >>>>>>   netdev_vport_get_dpif_port(const struct netdev *netdev,
> >>>>>>                              char namebuf[], size_t bufsize) diff
> >>>>>> --git a/lib/netdev-vport.h b/lib/netdev-vport.h index
> >>>>>> 9d756a265..4be1b92ec 100644
> >>>>>> --- a/lib/netdev-vport.h
> >>>>>> +++ b/lib/netdev-vport.h
> >>>>>> @@ -40,6 +40,7 @@ void netdev_vport_inc_tx(const struct netdev
> *,
> >>>>>>                            const struct dpif_flow_stats *);
> >>>>>>
> >>>>>>   bool netdev_vport_is_vport_class(const struct netdev_class *);
> >>>>>> +bool netdev_vport_has_system_port(const struct netdev *);
> >>>>>>   const char *netdev_vport_class_get_dpif_port(const struct
> >>>>>> netdev_class *);
> >>>>>>
> >>>>>>   #ifndef _WIN32
> >>>>>> ---
> >>>>>>
> >>>>>> I'll format this as a proper patch and send along with patch for
> >>>>>> enabling offloading for ports without correct ifindex.
> >>>>>>
> >>>>>> Best regards, Ilya Maximets.
> >>>>>
> >>>>> Best regards,
> >>>>> Ophir
> >>>>>


More information about the dev mailing list