[ovs-dev] [PATCH] netdev: Dynamic per-port Flow API.

Ilya Maximets i.maximets at ovn.org
Wed Nov 20 17:00:11 UTC 2019


On 20.11.2019 17:52, Ophir Munk wrote:
> Hi Ilya,
> I think the problem is that the newly created netdev called "vxlan_sys_4789" never goes through a netdev_init_flow_api(netdev); call.

'vxlan_sys_4789' is not a netdev. It's a dpif_port.
There is a corresponding 'netdev', but only 'ofproto' knows which one.

> Where do you suggest adding this call?
> 
> Regards,
> Ophir
> 
>> -----Original Message-----
>> From: Ilya Maximets <i.maximets at ovn.org>
>> Sent: Tuesday, November 19, 2019 8:30 PM
>> To: Ophir Munk <ophirmu at mellanox.com>; Ilya Maximets
>> <i.maximets at ovn.org>; ovs-dev at openvswitch.org
>> Cc: Ameer Mahagneh <ameerm at mellanox.com>; Roni Bar Yanai
>> <roniba at mellanox.com>; Eveline Raine <eveliner at mellanox.com>; Oz
>> Shlomo <ozsh at mellanox.com>; Eli Britstein <elibr at mellanox.com>
>> Subject: Re: [PATCH] netdev: Dynamic per-port Flow API.
>>
>> On 19.11.2019 18:20, Ophir Munk wrote:
>>> Hi Ilya,
>>> Thanks for the patch set which adds the dpif hint inside the netdev struct.
>> It is really helpful.
>>> Our goal is to have flow_put() calls on vxlan devices, using the existing dpdk
>> flow API.
>>> We added logic inside netdev_dpdk_flow_api_supported() to accept vxlan
>>> devices and indeed we see in the log:
>>> "netdev_offload|INFO|vxlan0: Assigned flow API 'dpdk_flow_api'".
>>> However, there is no flow_put() on the vxlan device.
>>> We see our own printouts in dpif-netdev such as:
>>> "dpif_netdev(dp_netdev_flow_5)|ERR|vxlan_sys_4789: calling netdev
>> flow_put()"
>>> but inside netdev_flow_put the flow_api poiner is NULL.
>>> Any ideas why?
>>
>> How many tunneling ports you have?
>> The case here is that dpif creates only one dpif_port for all the tunnels with
>> similar config.  So, only one of these netdevs will have flow api initialized and
>> you need to be sure that you're using the right one.  You may find it by the
>> datapath 'port_no' with netdev_ports_get().
>> Just guessing, but this is the one possible reason.
>>
>>>
>>>> -----Original Message-----
>>>> From: Ilya Maximets <i.maximets at ovn.org>
>>>> Sent: Friday, November 15, 2019 10:26 PM
>>>> To: Ophir Munk <ophirmu at mellanox.com>; Ilya Maximets
>>>> <i.maximets at ovn.org>; ovs-dev at openvswitch.org
>>>> Cc: Ameer Mahagneh <ameerm at mellanox.com>; Roni Bar Yanai
>>>> <roniba at mellanox.com>; Eveline Raine <eveliner at mellanox.com>; Oz
>>>> Shlomo <ozsh at mellanox.com>; Eli Britstein <elibr at mellanox.com>
>>>> Subject: Re: [PATCH] netdev: Dynamic per-port Flow API.
>>>>
>>>> On 10.11.2019 21:17, Ophir Munk wrote:
>>>>> Hi Ilya,
>>>>> Thanks you for taking care of this.
>>>>> Assigning the correct vport flow API is a critical feature for offloading.
>>>>>
>>>>> It seems hard to address all different vport configuration scenarios
>>>>> (kernel
>>>> only, userspace only or both) by just relying on the individual
>>>> netdev and without knowing the dpif on top.
>>>>
>>>> Yes you're right.  The only module that knows the real mapping of
>>>> 'netdev's to 'dpif's is 'ofproto' and we need to get this information
>> somehow.
>>>>
>>>>> Maybe can move the flow API assignment to the point where dp is
>>>>> actually
>>>> adding the port netdev and give a hint about the dp.
>>>>
>>>> Since we already have a hint from the upper layers about the
>>>> 'dpif_class' I'm suggesting to replace it with 'dpif_type'.
>>>> In pair with storing this hint inside the netdev we could have a
>>>> flexible system without exposing internals to higher layers or trying
>>>> to use higher layer from the library code.
>>>>
>>>> Please, take a look at the new version of patches:
>>>>
>>>>
>> https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.
>>>> openvswitch.org%2Fpipermail%2Fovs-dev%2F2019-
>>>>
>> November%2F364735.html&data=02%7C01%7Cophirmu%40mellanox.c
>>>>
>> om%7Caba8a2cdff2342aa536b08d76a09fc61%7Ca652971c7d2e4d9ba6a4d149
>>>>
>> 256f461b%7C0%7C0%7C637094463434958586&sdata=taEV4B7BA83%2Fu
>>>> faHoW7Y7F824SumWEnEXhVUpUON5eA%3D&reserved=0
>>>>
>>>>> It is also confusing that userspace vport names are "sys_vxlan_4789"
>>>>> and
>>>> not "usr_vxlan_4789" for example.
>>>>
>>>> Yes, this is a bit confusing, but we, probably, could not just change
>>>> it.  At least that we'll have to rewrite a big part of system tests
>>>> that relies on the port names in the flow dumps.
>>>> We will have to duplicate them for kernel and userspace.
>>>> Some external management/monitoring tools could be affected too
>>>> because they will have to treat kernel and userspace tunneling
>> differently.
>>>>
>>>> Best regards, Ilya Maximets.
>>>>
>>>>> What do you think?
>>>>>
>>>>> Regards,
>>>>> Ophir
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Ilya Maximets <i.maximets at ovn.org>
>>>>>> Sent: Friday, November 8, 2019 5:57 PM
>>>>>> To: Ophir Munk <ophirmu at mellanox.com>; Ilya Maximets
>>>>>> <i.maximets at ovn.org>; ovs-dev at openvswitch.org
>>>>>> Cc: Ameer Mahagneh <ameerm at mellanox.com>; Roni Bar Yanai
>>>>>> <roniba at mellanox.com>; Eveline Raine <eveliner at mellanox.com>
>>>>>> Subject: Re: [PATCH] netdev: Dynamic per-port Flow API.
>>>>>>
>>>>>> On 06.11.2019 18:11, Ophir Munk wrote:
>>>>>>> Hi Ilya,
>>>>>>> A few months ago we discussed the missing functionality of vports
>>>>>> offloading under user space bridges.
>>>>>>> Commit [1] was added to explicitly avoid installing userspace
>>>>>>> vport flows (to
>>>>>> avoid confusion with the vport kernel).
>>>>>>> When reverting commit [1] - we are left with this missing
>> functionality.
>>>>>>> Applying your suggested patch netdev_vport_has_system_port()
>> API)
>>>>>> does not seem to work.
>>>>>>> It always fails when it tries to look for the interface name (say
>>>>>>> "vxlan10") in
>>>>>> the system list where vxlan interfaces are renamed (say
>>>> "vxlan_sys_4789").
>>>>>>
>>>>>> Hi Ophir,
>>>>>>
>>>>>> No-one ever tried to run this code, so I'm not surprised.
>>>>>> I could take a look at this issue on next week.
>>>>>>
>>>>>> Best regards, Ilya Maximets.
>>>>>>
>>>>>>>
>>>>>>> You wrote:
>>>>>>>> On master, after applying dynamic flow API patch-set, we'll be
>>>>>>>> able to use patch that I suggested in previous mail to properly
>>>>>>>> handle this situation and enable vport offloading.
>>>>>>>
>>>>>>> It would be appreciated if we can resume the efforts in fixing
>>>>>>> this missing
>>>>>> functionality.
>>>>>>> Please advise on the next steps.
>>>>>>>
>>>>>>> [1]
>>>>>>> Commit 0da667e345 ("dpif-netdev: Forbid vport offloading
>>>>>>> attempts.")
>>>>>>>
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Ilya Maximets <i.maximets at samsung.com>
>>>>>>>> Sent: Monday, May 13, 2019 3:33 PM
>>>>>>>> To: Ophir Munk <ophirmu at mellanox.com>; ovs-
>>>> dev at openvswitch.org
>>>>>>>> Cc: Ian Stokes <ian.stokes at intel.com>; Flavio Leitner
>>>>>>>> <fbl at sysclose.org>; Kevin Traynor <ktraynor at redhat.com>; Roni
>> Bar
>>>>>>>> Yanai <roniba at mellanox.com>; Finn Christensen
>> <fc at napatech.com>;
>>>>>> Ben
>>>>>>>> Pfaff <blp at ovn.org>; Simon Horman
>>>> <simon.horman at netronome.com>;
>>>>>> Olga
>>>>>>>> Shern <olgas at mellanox.com>; Asaf Penso <asafp at mellanox.com>;
>> Oz
>>>>>>>> Shlomo <ozsh at mellanox.com>; Majd Dibbiny
>> <majd at mellanox.com>
>>>>>>>> Subject: Re: [PATCH] netdev: Dynamic per-port Flow API.
>>>>>>>>
>>>>>>>> On 13.05.2019 14:41, Ophir Munk wrote:
>>>>>>>>> Hi Ilya,
>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Ilya Maximets <i.maximets at samsung.com>
>>>>>>>>>> Sent: Monday, May 13, 2019 12:22 PM
>>>>>>>>>> To: Ophir Munk <ophirmu at mellanox.com>; ovs-
>>>>>> dev at openvswitch.org
>>>>>>>>>> Cc: Ian Stokes <ian.stokes at intel.com>; Flavio Leitner
>>>>>>>>>> <fbl at sysclose.org>; Kevin Traynor <ktraynor at redhat.com>; Roni
>>>> Bar
>>>>>>>>>> Yanai <roniba at mellanox.com>; Finn Christensen
>>>> <fc at napatech.com>;
>>>>>>>> Ben
>>>>>>>>>> Pfaff <blp at ovn.org>; Simon Horman
>>>>>> <simon.horman at netronome.com>;
>>>>>>>> Olga
>>>>>>>>>> Shern <olgas at mellanox.com>; Asaf Penso
>> <asafp at mellanox.com>;
>>>> Oz
>>>>>>>>>> Shlomo <ozsh at mellanox.com>
>>>>>>>>>> Subject: Re: [PATCH] netdev: Dynamic per-port Flow API.
>>>>>>>>>>
>>>>>>>>>> On 09.05.2019 1:59, Ophir Munk wrote:
>>>>>>>>>>> Hi Ilya,
>>>>>>>>>>>
>>>>>>>>> ...
>>>>>>>>>>> I am recreating this scenario in my setup.
>>>>>>>>>>
>>>>>>>>>> I see. Yes, you're right. And I think that this case could be
>>>>>>>>>> reproduced on current master without any patches. So, it's a
>>>>>>>>>> bug that we
>>>>>>>> need to fix.
>>>>>>>>>> Otherwise userspace datapath will try to offload its flows to
>>>>>>>>>> the unrelated system interfaces. For now we could just forbid
>>>>>>>>>> offloading to vports in dpif- netdev. I'll prepare the patch.
>>>>>>>>>> This fix also should be
>>>>>>>> backported.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> We need a way to enable offloading of vports in dpif-netdev. So
>>>>>>>>> if you can address It with your new patch - that would be
>> appreciated.
>>>>>>>>
>>>>>>>> I'm suggesting disabling because it'll be easy to backport to
>>>>>>>> older
>>>>>> branches.
>>>>>>>> On master, after applying dynamic flow API patch-set, we'll be
>>>>>>>> able to use patch that I suggested in previous mail to properly
>>>>>>>> handle this situation and enable vport offloading.
>>>>>>>>
>>>>>>>
>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> This patch series is important but one of its main goals is to
>>>>>>>>>>> enable different
>>>>>>>>>> offloads for different vports under Kernel/userspace.
>>>>>>>>>>> Can you please advise how this goal can be achieved?
>>>>>>>>>>
>>>>>>>>>> It looks like there is no way to distinguish system and
>>>>>>>>>> userspace vports by looking only on netdev. We should check
>> with dpif type.
>>>>>>>>>
>>>>>>>>> Agreed. But then looking at your sample patch below you are
>>>>>>>>> basing your decision on the existence of system port (and not on
>> dpif type).
>>>>>>>>
>>>>>>>> Right now 'ifindex' is used for checking, so this is equally racy.
>>>>>>>>
>>>>>>>>>   I think it is risky because
>>>>>>>>> you are dependent on the order of operations: Creation of the
>>>>>>>>> system port
>>>>>>>> versus checking for system port existence.
>>>>>>>>> Which was first (under different scenarios)?
>>>>>>>>
>>>>>>>> Actually, port creation and checking will always happen in the
>>>>>>>> same thread context, so creation and checking will be serialized.
>>>>>>>> Kernel should guarantee the port existence. No races here.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> What do you say about the following patch:
>>>>>>>>>
>>>>>>>>> diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c int
>>>>>>>>> netdev_ports_insert(struct netdev *netdev, const struct
>>>>>>>>> dpif_class *dpif_class,
>>>>>>>>>                       struct dpif_port *dpif_port) @@ -547,8
>>>>>>>>> +555,9 @@ netdev_ports_insert(struct netdev *netdev, const
>>>>>>>>> +struct
>>>>>>>>> dpif_class *dpif_class,
>>>>>>>>>                   netdev_ports_hash(dpif_port->port_no, dpif_class));
>>>>>>>>>       hmap_insert(&ifindex_to_port, &data->ifindex_node, ifindex);
>>>>>>>>>       ovs_mutex_unlock(&netdev_hmap_mutex);
>>>>>>>>>
>>>>>>>>> -    netdev_init_flow_api(netdev);
>>>>>>>>> +    netdev_init_flow_api(netdev, dpif_class->type); /* Pass
>>>>>>>>> + class type "netdev" or "syste" */
>>>>>>>>>
>>>>>>>>>       return 0;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> It's not a full solution. It is just a hint how to pass the dpif
>>>>>>>>> type ("netdev" or
>>>>>>>> "system")  when initializing the flow api.
>>>>>>>>> We can use the dpif type when initializing vport offload.
>>>>>>>>
>>>>>>>> This is, actually, the first thing I tried. However, this
>>>>>>>> requires exposing the internals of 'dpif-provider', which is bad
>>>>>>>> from the point of
>>>>>> code architecture.
>>>>>>>>
>>>>>>>> For the below patch code, I missed actual port requesting from
>>>>>>>> the
>>>>>> datapath.
>>>>>>>> Following incremental needed:
>>>>>>>>
>>>>>>>> ---
>>>>>>>> diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c index
>>>>>>>> 82943c102..1c88a91ed 100644
>>>>>>>> --- a/lib/netdev-vport.c
>>>>>>>> +++ b/lib/netdev-vport.c
>>>>>>>> @@ -124,13 +124,28 @@ netdev_vport_class_get_dpif_port(const
>>>> struct
>>>>>>>> netdev_class *class)  bool  netdev_vport_has_system_port(const
>>>>>>>> struct netdev *netdev)  {
>>>>>>>> -    bool found;
>>>>>>>> +    bool found = false;
>>>>>>>> +    const char *name;
>>>>>>>> +    const char *type = "system";
>>>>>>>>       struct sset names = SSET_INITIALIZER(&names);
>>>>>>>>
>>>>>>>>       ovs_assert(is_vport_class(netdev_get_class(netdev)));
>>>>>>>>
>>>>>>>> -    dp_enumerate_names("system", &names);
>>>>>>>> -    found = sset_contains(&names, netdev_get_name(netdev));
>>>>>>>> +    dp_enumerate_names(type, &names);
>>>>>>>> +    SSET_FOR_EACH (name, &names) {
>>>>>>>> +        struct dpif *dpifp = NULL;
>>>>>>>> +
>>>>>>>> +        if (dpif_open(name, type, &dpifp)) {
>>>>>>>> +            continue;
>>>>>>>> +        }
>>>>>>>> +
>>>>>>>> +        found = dpif_port_exists(dpifp,
>>>>>>>> + netdev_get_name(netdev));
>>>>>>>> +
>>>>>>>> +        dpif_close(dpifp);
>>>>>>>> +        if (found) {
>>>>>>>> +            break;
>>>>>>>> +        }
>>>>>>>> +    }
>>>>>>>>       sset_destroy(&names);
>>>>>>>>
>>>>>>>>       return found;
>>>>>>>> ---
>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Something like this (not tested):
>>>>>>>>>>
>>>>>>>>>> diff --git a/lib/netdev-offload-dpdk.c
>>>>>>>>>> b/lib/netdev-offload-dpdk.c index 01e900461..b7b0616ec 100644
>>>>>>>>>> --- a/lib/netdev-offload-dpdk.c
>>>>>>>>>> +++ b/lib/netdev-offload-dpdk.c
>>>>>>>>>> @@ -22,6 +22,7 @@
>>>>>>>>>>   #include "dpif-netdev.h"
>>>>>>>>>>   #include "netdev-offload-provider.h"
>>>>>>>>>>   #include "netdev-provider.h"
>>>>>>>>>> +#include "netdev-vport.h"
>>>>>>>>>>   #include "openvswitch/match.h"
>>>>>>>>>>   #include "openvswitch/vlog.h"
>>>>>>>>>>   #include "packets.h"
>>>>>>>>>> @@ -752,6 +753,13 @@ netdev_offload_dpdk_flow_del(struct
>>>> netdev
>>>>>>>>>> *netdev, const ovs_u128 *ufid,  static int
>>>>>>>>>> netdev_offload_dpdk_init_flow_api(struct netdev *netdev)  {
>>>>>>>>>> +    if (netdev_vport_is_vport_class(netdev->netdev_class)
>>>>>>>>>> +        && netdev_vport_has_system_port(netdev)) {
>>>>>>>>>> +        VLOG_DBG("%s: vport has backing system interface.
>>>> Skipping.",
>>>>>>>>>> +                 netdev_get_name(netdev));
>>>>>>>>>> +        return EOPNOTSUPP;
>>>>>>>>>> +    }
>>>>>>>>>> +
>>>>>>>>>>       return netdev_dpdk_flow_api_supported(netdev) ? 0 :
>>>>>>>>>> EOPNOTSUPP; }
>>>>>>>>>>
>>>>>>>>>> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
>>>>>>>>>> index
>>>>>>>>>> 498aae369..e4d121ed7 100644
>>>>>>>>>> --- a/lib/netdev-offload-tc.c
>>>>>>>>>> +++ b/lib/netdev-offload-tc.c
>>>>>>>>>> @@ -31,6 +31,7 @@
>>>>>>>>>>   #include "netdev-linux.h"
>>>>>>>>>>   #include "netdev-offload-provider.h"
>>>>>>>>>>   #include "netdev-provider.h"
>>>>>>>>>> +#include "netdev-vport.h"
>>>>>>>>>>   #include "netlink.h"
>>>>>>>>>>   #include "netlink-socket.h"
>>>>>>>>>>   #include "odp-netlink.h"
>>>>>>>>>> @@ -1560,6 +1561,13 @@ netdev_tc_init_flow_api(struct netdev
>>>>>>>> *netdev)
>>>>>>>>>>       int ifindex;
>>>>>>>>>>       int error;
>>>>>>>>>>
>>>>>>>>>> +    if (netdev_vport_is_vport_class(netdev->netdev_class)
>>>>>>>>>> +        && !netdev_vport_has_system_port(netdev)) {
>>>>>>>>>> +        VLOG_DBG("%s: vport has no backing system interface.
>>>>>> Skipping.",
>>>>>>>>>> +                 netdev_get_name(netdev));
>>>>>>>>>> +        return EOPNOTSUPP;
>>>>>>>>>> +    }
>>>>>>>>>> +
>>>>>>>>>>       ifindex = netdev_get_ifindex(netdev);
>>>>>>>>>>       if (ifindex < 0) {
>>>>>>>>>>           VLOG_INFO("init: failed to get ifindex for %s: %s",
>>>>>>>>>> diff --git a/lib/netdev- vport.c b/lib/netdev-vport.c index
>>>>>>>>>> 92a256af1..82943c102 100644
>>>>>>>>>> --- a/lib/netdev-vport.c
>>>>>>>>>> +++ b/lib/netdev-vport.c
>>>>>>>>>> @@ -44,6 +44,7 @@
>>>>>>>>>>   #include "simap.h"
>>>>>>>>>>   #include "smap.h"
>>>>>>>>>>   #include "socket-util.h"
>>>>>>>>>> +#include "sset.h"
>>>>>>>>>>   #include "unaligned.h"
>>>>>>>>>>   #include "unixctl.h"
>>>>>>>>>>   #include "openvswitch/vlog.h"
>>>>>>>>>> @@ -120,6 +121,21 @@ netdev_vport_class_get_dpif_port(const
>>>>>> struct
>>>>>>>>>> netdev_class *class)
>>>>>>>>>>       return is_vport_class(class) ?
>>>>>>>>>> vport_class_cast(class)->dpif_port : NULL;  }
>>>>>>>>>>
>>>>>>>>>> +bool
>>>>>>>>>> +netdev_vport_has_system_port(const struct netdev *netdev) {
>>>>>>>>>> +    bool found;
>>>>>>>>>> +    struct sset names = SSET_INITIALIZER(&names);
>>>>>>>>>> +
>>>>>>>>>> +    ovs_assert(is_vport_class(netdev_get_class(netdev)));
>>>>>>>>>> +
>>>>>>>>>> +    dp_enumerate_names("system", &names);
>>>>>>>>>> +    found = sset_contains(&names, netdev_get_name(netdev));
>>>>>>>>>> +    sset_destroy(&names);
>>>>>>>>>> +
>>>>>>>>>> +    return found;
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>>   const char *
>>>>>>>>>>   netdev_vport_get_dpif_port(const struct netdev *netdev,
>>>>>>>>>>                              char namebuf[], size_t bufsize)
>>>>>>>>>> diff --git a/lib/netdev-vport.h b/lib/netdev-vport.h index
>>>>>>>>>> 9d756a265..4be1b92ec 100644
>>>>>>>>>> --- a/lib/netdev-vport.h
>>>>>>>>>> +++ b/lib/netdev-vport.h
>>>>>>>>>> @@ -40,6 +40,7 @@ void netdev_vport_inc_tx(const struct
>> netdev
>>>> *,
>>>>>>>>>>                            const struct dpif_flow_stats *);
>>>>>>>>>>
>>>>>>>>>>   bool netdev_vport_is_vport_class(const struct netdev_class
>>>>>>>>>> *);
>>>>>>>>>> +bool netdev_vport_has_system_port(const struct netdev *);
>>>>>>>>>>   const char *netdev_vport_class_get_dpif_port(const struct
>>>>>>>>>> netdev_class *);
>>>>>>>>>>
>>>>>>>>>>   #ifndef _WIN32
>>>>>>>>>> ---
>>>>>>>>>>
>>>>>>>>>> I'll format this as a proper patch and send along with patch
>>>>>>>>>> for enabling offloading for ports without correct ifindex.
>>>>>>>>>>
>>>>>>>>>> Best regards, Ilya Maximets.
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Ophir
>>>>>>>>>


More information about the dev mailing list