[ovs-dev] [PATCH ovs V2 00/21] Introducing HW offload support for openvswitch

Roi Dayan roid at mellanox.com
Mon Jan 16 12:56:51 UTC 2017



On 16/01/2017 11:04, Paul Blakey wrote:
>
>
> On 16/01/2017 11:03, Paul Blakey wrote:
>>
>> On 13/01/2017 02:42, Chandran, Sugesh wrote:
>>> Hi Paul,
>>>
>>> Thank you for releasing the patch set. I have few high level comments
>>> on the patch set as below.
>>>
>>>  My first concern is not providing the device (for eg: A FPGA)
>>> specific initialization support. This has to be done based on the user
>>> requirement/use case. Consider the P4 based OVS switch which wanted to
>>> define a FPGA behavior/profile at the time of OVS initialization. This
>>> profile defines set of features/number of flow fields to be enabled on
>>> the hardware for the specific application. I feel there must be some
>>> option for control-path to init the entire hardware module.
>>> Some of the FPGA can offer dynamic feature initialization based on
>>> application need. This is something like loading different firmware on
>>> the fly based on what application profile is. Each firmware varies in
>>> performance and scalability figures hence its worth to define the
>>> profile on the need basis.  Do you think its possible to offer this
>>> support with current implementation model?

Hi Sugesh,

What about netdev_initialize?
We didn't need it currently but can be added later when needed.

>>>
>>> According to my understanding, OVS try to install the flows in
>>> hardware first if it is enabled. The flows are fallback to software
>>> only when hw insert is failed. I would like to have flexibility of
>>> defining the flow installation behavior from the control path than
>>> just based on netdev capacity.  Consider a multi-tenant deployment
>>> with limited hardware offload support (Say hardware supports only
>>> limited number of flows), OVS should install only priority flows in
>>> hardware to offer better performance. The current implementation
>>> operates in first come first serve model, which may not be optimal.

This can be added later as a feature. here, as you understood, we added 
the basic flow which is if the user enabled hw-offload then try to 
offload and fallback to software.

>>>
>>> From an orchestration point of view, how system would know the port
>>> can do hardware offload/not. More specifically how system could know
>>> the capabilities of hardware offload. In many cases this information
>>> is sufficient enough to identify a flow is hardware ready/not, which
>>> avoid the cost of hardware interaction at the time of flow insert.
>>> May be port can be added as software port, which will populate the
>>> OVSDB with hardware capability information. The orchestration can use
>>> the exposed hardware capability matrix to determine whether or not to
>>> enable the hardware offload on that port.

We definitely thought about it but didn't do it yet. currently we just 
let the user enable or disable the hw-offload which by default is disabled.

>>>
>>> Just to confirm, the provided APIs should be implemented for virtual
>>> ports(virtio-vhost ports) as well. Is that the case?

you mean netdev providers? then yes. can be NULLs though.

>>>
>>> I assume the flow between hardware ports to software only ports always
>>> programmed in the software. The packets from hardware port are arrived
>>> as normal packet in OVS-DP.  Is that correct?

can you give an example?
do you mean like veth? then if the TC policy is skip_sw then yes.
we can't add HW offload rule for veth so we'll fallback to OVS datapath.

>>>
>>> The hardware flow API are not defined for the dpdk netdev. It may fail
>>> for OVS-DPDK compilation.

Right. we'll add the missing api (NULLs currently) to make sure we pass 
compilation. we'll make sure to test ovs-dpdk compilation for next 
submission.

Thanks,
Roi

>>>
>>>
>>> Regards
>>> _Sugesh
>>>
>>>
>>>> -----Original Message-----
>>>> From: ovs-dev-bounces at openvswitch.org [mailto:ovs-dev-
>>>> bounces at openvswitch.org] On Behalf Of Joe Stringer
>>>> Sent: Tuesday, January 10, 2017 6:09 PM
>>>> To: Paul Blakey <paulb at mellanox.com>
>>>> Cc: ovs dev <dev at openvswitch.org>; Shahar Klein
>>>> <shahark at mellanox.com>; Mark Bloch <markb at mellanox.com>; Simon
>>>> Horman <simon.horman at netronome.com>; Hadar Hen Zion
>>>> <hadarh at mellanox.com>; Rony Efraim <ronye at mellanox.com>; Jiri Pirko
>>>> <jiri at mellanox.com>; Eric Garver <e at erig.me>; Marcelo Ricardo Leitner
>>>> <mleitner at redhat.com>; Or Gerlitz <ogerlitz at mellanox.com>; Andy
>>>> Gospodarek <andy at greyhouse.net>
>>>> Subject: Re: [ovs-dev] [PATCH ovs V2 00/21] Introducing HW offload
>>>> support
>>>> for openvswitch
>>>>
>>>> On 10 January 2017 at 10:07, Joe Stringer <joe at ovn.org> wrote:
>>>>> On 8 January 2017 at 07:18, Paul Blakey <paulb at mellanox.com> wrote:
>>>>>>
>>>>>> On 05/01/2017 02:11, Joe Stringer wrote:
>>>>>>
>>>>>>> On 25 December 2016 at 03:39, Paul Blakey <paulb at mellanox.com>
>>>> wrote:
>>>>>>>>
>>>>>>>> This patch series introduces rule offload functionality to
>>>>>>>> dpif-netlink via netdev ports new flow offloading API. The user can
>>>>>>>> specify whether to enable rule offloading or not via OVS
>>>>>>>> configuration. Netdev providers are able to implement netdev flow
>>>> offload API in order to offload rules.
>>>>>>>>
>>>>>>>> This patch series also implements one offload scheme for
>>>>>>>> netdev-linux, using TC flower classifier, which was chosen because
>>>>>>>> its sort of natrual to state OVS DP rules for this classifier.
>>>>>>>> However, the code can be extended to support other classifiers such
>>>>>>>> as U32, eBPF, etc which support offload as well.
>>>>>>>>
>>>>>>>> The use-case we are currently addressing is the newly sriov
>>>>>>>> switchdev mode in the linux kernel which was introduced in version
>>>>>>>> 4.8 [1][2]. this series was tested against sriov vfs vports
>>>>>>>> representors of the Mellanox 100G ConnectX-4 series exposed by the
>>>>>>>> mlx5 kernel driver.
>>>>>>>>
>>>>>>>> changes from V1
>>>>>>>>      - Added generic netdev flow offloads API.
>>>>>>>>      - Implemented relevant flow API in netdev-linux (and
>>>>>>>> netdev-vport).
>>>>>>>>      - Added a other_config hw-offload option to enable offloading
>>>>>>>> (defaults to false).
>>>>>>>>      - Fixed coding style to conform with OVS.
>>>>>>>>      - Policy removed for now. (Will be discussed how best
>>>>>>>> implemented later).
>>>>>>>
>>>>>>> Hi Paul,
>>>>>>>
>>>>>>> Happy holidays to you too ;)
>>>>>>
>>>>>> Hi :)
>>>>>> Thanks for your thorough review, We are working on resubmitting the
>>>>>> next patch set and I'll try and reply as much as I can right now, but
>>>>>> this is a big review, so I'll split this to several replies.
>>>>>>>
>>>>>>>
>>>>>>> A few high level comments:
>>>>>>> * Overall the patchset is looking in better state than previously.
>>>>>>> * That said, on platforms other than your specific test environment
>>>>>>> OVS fails to build.
>>>>>>> * netdev-vport seems to have acquired platform-specific code
>>>>>>
>>>>>> Thanks, We've seen your links for travis-ci and app veyor, we'll
>>>>>> create an account and fix it accordingly.
>>>>>>>
>>>>>>> * I don't think that ovs-dpctl should read the database. One of the
>>>>>>> nice things about ovs-dpctl is that it is currently standalone, ie
>>>>>>> doesn't require anything else to be running to be able to query the
>>>>>>> state of the kernel datapath. What are you trying to achieve with
>>>>>>> this? Perhaps you can extend the ovs-appctl dpctl/... commands
>>>>>>> instead to provide the functionality you need?
>>>>>>
>>>>>> We have two config options that isn't saved with the kernel datapath
>>>>>> (and dumped back on dpif open):
>>>>>> if offload is enabled to know if we even need to dump from netdevs or
>>>>>> not for dump-flows cmd (I guess we could just dump the netdev anyway)
>>>>>> and for add-flow, and tc flower flag skip_hw/skip_sw that is also
>>>>>> needed for add-flow cmd, so we can know what flag we be used while
>>>>>> inserting to flower if offloading is enabled and possible.
>>>>>
>>>>> I discussed this with a few people offline and it seems that the most
>>>>> reasonable way to handle it for dump is just to dump from netdevs
>>>>> anyway. You can fetch the list of relevant ports from OVS datapath in
>>>>> kernel, then only dump those ports.
>>>>>
>>>>> For adding and deleting flows, using ovs-dpctl is not really
>>>>> supported, it's primarily for debug. ovs-vswitchd is going to override
>>>>> that behaviour anyway - if you add a flow, it may delete it; if you
>>>>> delete a flow, it may re-add it. I suggest that for debugging purposes
>>>>> the default behaviour is to install into OVS, but if you want to put
>>>>> it into TC sw/hw, provide an additional flag on the commandline.
>>>>>
>>>>> If you want ovs-vswitchd to decide how to install the flow, you can
>>>>> always use "ovs-appctl dpctl/add-flow ..." and the command will go via
>>>>> ovs-vswitchd, which will make the offloading decision based on the
>>>>> same logic as it would when handling regular upcalls.
>>>>>
>>>>>>> * It would be good to see at least some basic system tests ala
>>>>>>> tests/system-traffic.at or tests/system-ovn.at. Maybe there could be
>>>>>>> tests/system-tc.at that uses SKIP_HW to show that flows can be
>>>>>>> installed and that they work.
>>>>>>
>>>>>> Sure.
>>>>>>>
>>>>>>> * I think it would be great to get some ability to see what's
>>>>>>> happening with flow offloads; eg "ovs-appctl dpctl/show-hw-offloads"
>>>>>>> that could give you some idea of how many flows are being offloaded
>>>>>>
>>>>>> You mean a number/stats or actually dump only offloaded flows?
>>>>>
>>>>> At the time, I was thinking just stats.
>>>>>
>>>>> However I think it would also be useful to have some flag for
>>>>> ovs-dpctl that chooses which flows to fetch.
>>>>>
>>>>> Something like:
>>>>> ovs-dpctl dump-flows: Dump all flows from both OVS and TC (flower
>>>>> only).
>>>>> ovs-dpctl dump-flows --only-flower
>>>>> ovs-dpctl dump-flows --only-ovs
>>>>>
>>>>> Each flow when printed should also indicate in some way where it came
>>>> from.
>>>>
>>>> Could be something like ovs-dpctl dump-flows --type=flower instead,
>>>> to be a
>>>> bit more generic.
>>>> _______________________________________________
>>>> dev mailing list
>>>> dev at openvswitch.org
>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
>> +cc Roi that was removed from cc somehow, he can answer some of this
>> questions for you.


More information about the dev mailing list