[ovs-dev] [PATCH] netdev: Dynamic per-port Flow API.

Ophir Munk ophirmu at mellanox.com
Sun Nov 10 20:17:06 UTC 2019


Hi Ilya,
Thanks you for taking care of this.
Assigning the correct vport flow API is a critical feature for offloading.

It seems hard to address all different vport configuration scenarios (kernel only, userspace only or both) by just relying on the individual netdev and without knowing the dpif on top.
Maybe can move the flow API assignment to the point where dp is actually adding the port netdev and give a hint about the dp.
It is also confusing that userspace vport names are "sys_vxlan_4789" and not "usr_vxlan_4789" for example.
What do you think?

Regards,
Ophir

> -----Original Message-----
> From: Ilya Maximets <i.maximets at ovn.org>
> Sent: Friday, November 8, 2019 5:57 PM
> To: Ophir Munk <ophirmu at mellanox.com>; Ilya Maximets
> <i.maximets at ovn.org>; ovs-dev at openvswitch.org
> Cc: Ameer Mahagneh <ameerm at mellanox.com>; Roni Bar Yanai
> <roniba at mellanox.com>; Eveline Raine <eveliner at mellanox.com>
> Subject: Re: [PATCH] netdev: Dynamic per-port Flow API.
> 
> On 06.11.2019 18:11, Ophir Munk wrote:
> > Hi Ilya,
> > A few months ago we discussed the missing functionality of vports
> offloading under user space bridges.
> > Commit [1] was added to explicitly avoid installing userspace vport flows (to
> avoid confusion with the vport kernel).
> > When reverting commit [1] - we are left with this missing functionality.
> > Applying your suggested patch netdev_vport_has_system_port() API)
> does not seem to work.
> > It always fails when it tries to look for the interface name (say "vxlan10") in
> the system list where vxlan interfaces are renamed (say "vxlan_sys_4789").
> 
> Hi Ophir,
> 
> No-one ever tried to run this code, so I'm not surprised.
> I could take a look at this issue on next week.
> 
> Best regards, Ilya Maximets.
> 
> >
> > You wrote:
> >> On master, after applying dynamic flow API patch-set, we'll be able
> >> to use patch that I suggested in previous mail to properly handle
> >> this situation and enable vport offloading.
> >
> > It would be appreciated if we can resume the efforts in fixing this missing
> functionality.
> > Please advise on the next steps.
> >
> > [1]
> > Commit 0da667e345 ("dpif-netdev: Forbid vport offloading attempts.")
> >
> >>
> >> -----Original Message-----
> >> From: Ilya Maximets <i.maximets at samsung.com>
> >> Sent: Monday, May 13, 2019 3:33 PM
> >> To: Ophir Munk <ophirmu at mellanox.com>; ovs-dev at openvswitch.org
> >> Cc: Ian Stokes <ian.stokes at intel.com>; Flavio Leitner
> >> <fbl at sysclose.org>; Kevin Traynor <ktraynor at redhat.com>; Roni Bar
> >> Yanai <roniba at mellanox.com>; Finn Christensen <fc at napatech.com>;
> Ben
> >> Pfaff <blp at ovn.org>; Simon Horman <simon.horman at netronome.com>;
> Olga
> >> Shern <olgas at mellanox.com>; Asaf Penso <asafp at mellanox.com>; Oz
> >> Shlomo <ozsh at mellanox.com>; Majd Dibbiny <majd at mellanox.com>
> >> Subject: Re: [PATCH] netdev: Dynamic per-port Flow API.
> >>
> >> On 13.05.2019 14:41, Ophir Munk wrote:
> >>> Hi Ilya,
> >>>
> >>>> -----Original Message-----
> >>>> From: Ilya Maximets <i.maximets at samsung.com>
> >>>> Sent: Monday, May 13, 2019 12:22 PM
> >>>> To: Ophir Munk <ophirmu at mellanox.com>; ovs-
> dev at openvswitch.org
> >>>> Cc: Ian Stokes <ian.stokes at intel.com>; Flavio Leitner
> >>>> <fbl at sysclose.org>; Kevin Traynor <ktraynor at redhat.com>; Roni Bar
> >>>> Yanai <roniba at mellanox.com>; Finn Christensen <fc at napatech.com>;
> >> Ben
> >>>> Pfaff <blp at ovn.org>; Simon Horman
> <simon.horman at netronome.com>;
> >> Olga
> >>>> Shern <olgas at mellanox.com>; Asaf Penso <asafp at mellanox.com>; Oz
> >>>> Shlomo <ozsh at mellanox.com>
> >>>> Subject: Re: [PATCH] netdev: Dynamic per-port Flow API.
> >>>>
> >>>> On 09.05.2019 1:59, Ophir Munk wrote:
> >>>>> Hi Ilya,
> >>>>>
> >>> ...
> >>>>> I am recreating this scenario in my setup.
> >>>>
> >>>> I see. Yes, you're right. And I think that this case could be
> >>>> reproduced on current master without any patches. So, it's a bug
> >>>> that we
> >> need to fix.
> >>>> Otherwise userspace datapath will try to offload its flows to the
> >>>> unrelated system interfaces. For now we could just forbid
> >>>> offloading to vports in dpif- netdev. I'll prepare the patch. This
> >>>> fix also should be
> >> backported.
> >>>>
> >>>
> >>> We need a way to enable offloading of vports in dpif-netdev. So if
> >>> you can address It with your new patch - that would be appreciated.
> >>
> >> I'm suggesting disabling because it'll be easy to backport to older
> branches.
> >> On master, after applying dynamic flow API patch-set, we'll be able
> >> to use patch that I suggested in previous mail to properly handle
> >> this situation and enable vport offloading.
> >>
> >
> >>>
> >>>>>
> >>>>> This patch series is important but one of its main goals is to
> >>>>> enable different
> >>>> offloads for different vports under Kernel/userspace.
> >>>>> Can you please advise how this goal can be achieved?
> >>>>
> >>>> It looks like there is no way to distinguish system and userspace
> >>>> vports by looking only on netdev. We should check with dpif type.
> >>>
> >>> Agreed. But then looking at your sample patch below you are basing
> >>> your decision on the existence of system port (and not on dpif type).
> >>
> >> Right now 'ifindex' is used for checking, so this is equally racy.
> >>
> >>>   I think it is risky because
> >>> you are dependent on the order of operations: Creation of the system
> >>> port
> >> versus checking for system port existence.
> >>> Which was first (under different scenarios)?
> >>
> >> Actually, port creation and checking will always happen in the same
> >> thread context, so creation and checking will be serialized. Kernel
> >> should guarantee the port existence. No races here.
> >>
> >>>
> >>> What do you say about the following patch:
> >>>
> >>> diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c int
> >>> netdev_ports_insert(struct netdev *netdev, const struct dpif_class
> >>> *dpif_class,
> >>>                       struct dpif_port *dpif_port) @@ -547,8 +555,9
> >>> @@ netdev_ports_insert(struct netdev *netdev, const struct
> >>> dpif_class *dpif_class,
> >>>                   netdev_ports_hash(dpif_port->port_no, dpif_class));
> >>>       hmap_insert(&ifindex_to_port, &data->ifindex_node, ifindex);
> >>>       ovs_mutex_unlock(&netdev_hmap_mutex);
> >>>
> >>> -    netdev_init_flow_api(netdev);
> >>> +    netdev_init_flow_api(netdev, dpif_class->type); /* Pass class
> >>> + type "netdev" or "syste" */
> >>>
> >>>       return 0;
> >>> }
> >>>
> >>> It's not a full solution. It is just a hint how to pass the dpif
> >>> type ("netdev" or
> >> "system")  when initializing the flow api.
> >>> We can use the dpif type when initializing vport offload.
> >>
> >> This is, actually, the first thing I tried. However, this requires
> >> exposing the internals of 'dpif-provider', which is bad from the point of
> code architecture.
> >>
> >> For the below patch code, I missed actual port requesting from the
> datapath.
> >> Following incremental needed:
> >>
> >> ---
> >> diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c index
> >> 82943c102..1c88a91ed 100644
> >> --- a/lib/netdev-vport.c
> >> +++ b/lib/netdev-vport.c
> >> @@ -124,13 +124,28 @@ netdev_vport_class_get_dpif_port(const struct
> >> netdev_class *class)  bool  netdev_vport_has_system_port(const struct
> >> netdev *netdev)  {
> >> -    bool found;
> >> +    bool found = false;
> >> +    const char *name;
> >> +    const char *type = "system";
> >>       struct sset names = SSET_INITIALIZER(&names);
> >>
> >>       ovs_assert(is_vport_class(netdev_get_class(netdev)));
> >>
> >> -    dp_enumerate_names("system", &names);
> >> -    found = sset_contains(&names, netdev_get_name(netdev));
> >> +    dp_enumerate_names(type, &names);
> >> +    SSET_FOR_EACH (name, &names) {
> >> +        struct dpif *dpifp = NULL;
> >> +
> >> +        if (dpif_open(name, type, &dpifp)) {
> >> +            continue;
> >> +        }
> >> +
> >> +        found = dpif_port_exists(dpifp, netdev_get_name(netdev));
> >> +
> >> +        dpif_close(dpifp);
> >> +        if (found) {
> >> +            break;
> >> +        }
> >> +    }
> >>       sset_destroy(&names);
> >>
> >>       return found;
> >> ---
> >>
> >>>
> >>>> Something like this (not tested):
> >>>>
> >>>> diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
> >>>> index 01e900461..b7b0616ec 100644
> >>>> --- a/lib/netdev-offload-dpdk.c
> >>>> +++ b/lib/netdev-offload-dpdk.c
> >>>> @@ -22,6 +22,7 @@
> >>>>   #include "dpif-netdev.h"
> >>>>   #include "netdev-offload-provider.h"
> >>>>   #include "netdev-provider.h"
> >>>> +#include "netdev-vport.h"
> >>>>   #include "openvswitch/match.h"
> >>>>   #include "openvswitch/vlog.h"
> >>>>   #include "packets.h"
> >>>> @@ -752,6 +753,13 @@ netdev_offload_dpdk_flow_del(struct netdev
> >>>> *netdev, const ovs_u128 *ufid,  static int
> >>>> netdev_offload_dpdk_init_flow_api(struct netdev *netdev)  {
> >>>> +    if (netdev_vport_is_vport_class(netdev->netdev_class)
> >>>> +        && netdev_vport_has_system_port(netdev)) {
> >>>> +        VLOG_DBG("%s: vport has backing system interface. Skipping.",
> >>>> +                 netdev_get_name(netdev));
> >>>> +        return EOPNOTSUPP;
> >>>> +    }
> >>>> +
> >>>>       return netdev_dpdk_flow_api_supported(netdev) ? 0 :
> >>>> EOPNOTSUPP; }
> >>>>
> >>>> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
> >>>> index
> >>>> 498aae369..e4d121ed7 100644
> >>>> --- a/lib/netdev-offload-tc.c
> >>>> +++ b/lib/netdev-offload-tc.c
> >>>> @@ -31,6 +31,7 @@
> >>>>   #include "netdev-linux.h"
> >>>>   #include "netdev-offload-provider.h"
> >>>>   #include "netdev-provider.h"
> >>>> +#include "netdev-vport.h"
> >>>>   #include "netlink.h"
> >>>>   #include "netlink-socket.h"
> >>>>   #include "odp-netlink.h"
> >>>> @@ -1560,6 +1561,13 @@ netdev_tc_init_flow_api(struct netdev
> >> *netdev)
> >>>>       int ifindex;
> >>>>       int error;
> >>>>
> >>>> +    if (netdev_vport_is_vport_class(netdev->netdev_class)
> >>>> +        && !netdev_vport_has_system_port(netdev)) {
> >>>> +        VLOG_DBG("%s: vport has no backing system interface.
> Skipping.",
> >>>> +                 netdev_get_name(netdev));
> >>>> +        return EOPNOTSUPP;
> >>>> +    }
> >>>> +
> >>>>       ifindex = netdev_get_ifindex(netdev);
> >>>>       if (ifindex < 0) {
> >>>>           VLOG_INFO("init: failed to get ifindex for %s: %s", diff
> >>>> --git a/lib/netdev- vport.c b/lib/netdev-vport.c index
> >>>> 92a256af1..82943c102 100644
> >>>> --- a/lib/netdev-vport.c
> >>>> +++ b/lib/netdev-vport.c
> >>>> @@ -44,6 +44,7 @@
> >>>>   #include "simap.h"
> >>>>   #include "smap.h"
> >>>>   #include "socket-util.h"
> >>>> +#include "sset.h"
> >>>>   #include "unaligned.h"
> >>>>   #include "unixctl.h"
> >>>>   #include "openvswitch/vlog.h"
> >>>> @@ -120,6 +121,21 @@ netdev_vport_class_get_dpif_port(const
> struct
> >>>> netdev_class *class)
> >>>>       return is_vport_class(class) ?
> >>>> vport_class_cast(class)->dpif_port : NULL;  }
> >>>>
> >>>> +bool
> >>>> +netdev_vport_has_system_port(const struct netdev *netdev) {
> >>>> +    bool found;
> >>>> +    struct sset names = SSET_INITIALIZER(&names);
> >>>> +
> >>>> +    ovs_assert(is_vport_class(netdev_get_class(netdev)));
> >>>> +
> >>>> +    dp_enumerate_names("system", &names);
> >>>> +    found = sset_contains(&names, netdev_get_name(netdev));
> >>>> +    sset_destroy(&names);
> >>>> +
> >>>> +    return found;
> >>>> +}
> >>>> +
> >>>>   const char *
> >>>>   netdev_vport_get_dpif_port(const struct netdev *netdev,
> >>>>                              char namebuf[], size_t bufsize) diff
> >>>> --git a/lib/netdev-vport.h b/lib/netdev-vport.h index
> >>>> 9d756a265..4be1b92ec 100644
> >>>> --- a/lib/netdev-vport.h
> >>>> +++ b/lib/netdev-vport.h
> >>>> @@ -40,6 +40,7 @@ void netdev_vport_inc_tx(const struct netdev *,
> >>>>                            const struct dpif_flow_stats *);
> >>>>
> >>>>   bool netdev_vport_is_vport_class(const struct netdev_class *);
> >>>> +bool netdev_vport_has_system_port(const struct netdev *);
> >>>>   const char *netdev_vport_class_get_dpif_port(const struct
> >>>> netdev_class *);
> >>>>
> >>>>   #ifndef _WIN32
> >>>> ---
> >>>>
> >>>> I'll format this as a proper patch and send along with patch for
> >>>> enabling offloading for ports without correct ifindex.
> >>>>
> >>>> Best regards, Ilya Maximets.
> >>>
> >>> Best regards,
> >>> Ophir
> >>>


More information about the dev mailing list