[ovs-dev] [PATCH V6 00/13] Netdev vxlan-decap offload

Eli Britstein elibr at nvidia.com
Wed Jun 23 15:48:51 UTC 2021


On 6/23/2021 6:18 PM, Ferriter, Cian wrote:
> External email: Use caution opening links or attachments
>
>
>> -----Original Message-----
>> From: dev <ovs-dev-bounces at openvswitch.org> On Behalf Of Ferriter, Cian
>> Sent: Wednesday 23 June 2021 13:38
>> To: Ilya Maximets <i.maximets at ovn.org>; Eli Britstein <elibr at nvidia.com>; dev at openvswitch.org
>> Cc: Ivan.Malov at oktetlabs.ru; Ameer Mahagneh <ameerm at nvidia.com>; Majd Dibbiny <majd at nvidia.com>
>> Subject: Re: [ovs-dev] [PATCH V6 00/13] Netdev vxlan-decap offload
>>
>> Hi all,
>>
>> As part of rebasing our AVX512 DPIF on this patchset, I tested this patchset with partial HWOL and I'm
>> seeing strange behaviour.
>>
>> I'll report back more detailed findings soon, just wanted to mention this here as soon as I found the
>> issue.
>>
>> Thanks,
>> Cian
>>
> More details on the issue I'm seeing:
> I'm using Ilya's branch from Github:
> https://github.com/igsilya/ovs/tree/tmp-vxlan-decap
>
> ~/ovs_scripts# $OVS_DIR/utilities/ovs-vsctl list Open_vSwitch
> dpdk_version        : "DPDK 20.11.1"
> other_config        : {dpdk-hugepage-dir="/mnt/huge", dpdk-init="true", dpdk-lcore-mask="0x1", dpdk-socket-mem="2048,0", emc-insert-inv-prob="0", hw-offload="true", pmd-cpu-mask="0x2"}
>
> ~/ovs_scripts# $OVS_DIR/utilities/ovs-vsctl show
> 31584ce5-09c1-44b3-ab27-1a0308d63fff
>      Bridge br0
>          datapath_type: netdev
>          Port br0
>              Interface br0
>                  type: internal
>          Port phy0
>              Interface phy0
>                  type: dpdk
>                  options: {dpdk-devargs="5e:00.0"}
>
> ~/ovs_scripts# $OVS_DIR/utilities/ovs-ofctl dump-flows br0
>   cookie=0x0, duration=29.466s, table=0, n_packets=0, n_bytes=0, in_port=phy0 actions=IN_PORT
>
> I'm expecting the flow to be partially offloaded, but I get a segfault when using the above branch. More info on the segfault below:
>
> Thread 13 "pmd-c01/id:8" received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x7f9f72734700 (LWP 19327)]
> 0x000056163bf0d825 in set_error (error=0x0, type=RTE_FLOW_ERROR_TYPE_ATTR) at lib/netdev-dpdk.h:84
> (gdb) bt
> #0  0x000056163bf0d825 in set_error (error=0x0, type=RTE_FLOW_ERROR_TYPE_ATTR) at lib/netdev-dpdk.h:84
Yes, it is caused by passing NULL instead of valid struct rte_error, by 
Ilya's comments. I will fix it in v7.
> #1  0x000056163bf0d8d3 in netdev_dpdk_rte_flow_get_restore_info (netdev=0x1bfc65c80, p=0x19033af00, info=0x7f9f72729a20, error=0x0) at lib/netdev-dpdk.h:119
> #2  0x000056163bf14da3 in netdev_offload_dpdk_hw_miss_packet_recover (netdev=0x1bfc65c80, packet=0x19033af00) at lib/netdev-offload-dpdk.c:2133
> #3  0x000056163bde3662 in netdev_hw_miss_packet_recover (netdev=0x1bfc65c80, packet=0x19033af00) at lib/netdev-offload.c:265
> #4  0x000056163bda19a9 in dp_netdev_hw_flow (pmd=0x7f9f72735010, port_no=2, packet=0x19033af00, flow=0x7f9f72729b98) at lib/dpif-netdev.c:7087
> #5  0x000056163bda1c5c in dfc_processing (pmd=0x7f9f72735010, packets_=0x7f9f727310d0, keys=0x7f9f7272c480, missed_keys=0x7f9f7272c370, batches=0x7f9f72729f60, n_batches=0x7f9f72730f70, flow_map=0x7f9f72729c50, n_flows=0x7f9f72730f78, index_map=0x7f9f72729c30 "", md_is_valid=false, port_no=2) at lib/dpif-netdev.c:7168
> #6  0x000056163bda2f3e in dp_netdev_input__ (pmd=0x7f9f72735010, packets=0x7f9f727310d0, md_is_valid=false, port_no=2) at lib/dpif-netdev.c:7475
> #7  0x000056163bda3105 in dp_netdev_input (pmd=0x7f9f72735010, packets=0x7f9f727310d0, port_no=2) at lib/dpif-netdev.c:7519
> #8  0x000056163bd9ab04 in dp_netdev_process_rxq_port (pmd=0x7f9f72735010, rxq=0x56163fb3f610, port_no=2) at lib/dpif-netdev.c:4774
> #9  0x000056163bd9ee17 in pmd_thread_main (f_=0x7f9f72735010) at lib/dpif-netdev.c:6063
> #10 0x000056163be71c88 in ovsthread_wrapper (aux_=0x56163fb3fe70) at lib/ovs-thread.c:383
> #11 0x00007f9f884cf6db in start_thread (arg=0x7f9f72734700) at pthread_create.c:463
> #12 0x00007f9f862bb71f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
>
> In netdev_offload_dpdk_hw_miss_packet_recover() calls netdev_dpdk_rte_flow_get_restore_info() with a NULL for the struct rte_flow_error *error argument:
>
>      if (netdev_dpdk_rte_flow_get_restore_info(netdev, packet,
>                                                &rte_restore_info, NULL)) {
>          /* This function is called for every packet, and in most cases there
>           * will be no restore info from the HW, thus error is expected.
>           */
>          return 0;
>      }
>
> There are 2 "netdev_dpdk_rte_flow_get_restore_info()" functions. One in lib/netdev-dpdk.h and one in lib/netdev-dpdk.c.
>
> I don't have the experimental API enabled, so I'm using the function rom lib/netdev-dpdk.h.
>
>>> -----Original Message-----
>>> From: dev <ovs-dev-bounces at openvswitch.org> On Behalf Of Ilya Maximets
>>> Sent: Tuesday 22 June 2021 16:55
>>> To: Eli Britstein <elibr at nvidia.com>; dev at openvswitch.org; Ilya Maximets <i.maximets at ovn.org>
>>> Cc: Ivan.Malov at oktetlabs.ru; Ameer Mahagneh <ameerm at nvidia.com>; Majd Dibbiny <majd at nvidia.com>
>>> Subject: Re: [ovs-dev] [PATCH V6 00/13] Netdev vxlan-decap offload
>>>
>>> On 4/4/21 11:54 AM, Eli Britstein wrote:
>>>> VXLAN decap in OVS-DPDK configuration consists of two flows:
>>>> F1: in_port(ens1f0),eth(),ipv4(),udp(), actions:tnl_pop(vxlan_sys_4789)
>>>> F2: tunnel(),in_port(vxlan_sys_4789),eth(),ipv4(), actions:ens1f0_0
>>>>
>>>> F1 is a classification flow. It has outer headers matches and it
>>>> classifies the packet as a VXLAN packet, and using tnl_pop action the
>>>> packet continues processing in F2.
>>>> F2 is a flow that has matches on tunnel metadata as well as on the inner
>>>> packet headers (as any other flow).
>>>>
>>>> In order to fully offload VXLAN decap path, both F1 and F2 should be
>>>> offloaded. As there are more than one flow in HW, it is possible that
>>>> F1 is done by HW but F2 is not. Packet is received by SW, and should be
>>>> processed starting from F2 as F1 was already done by HW.
>>>> Rte_flows are applicable only on physical port IDs. Keeping the original
>>>> physical in port on which the packet is received on enables applying
>>>> vport flows (e.g. F2) on that physical port.
>>>>
>>>> This patch-set makes use of [1] introduced in DPDK 20.11, that adds API
>>>> for tunnel offloads.
>>>>
>>>> Note that MLX5 PMD has a bug that the tnl_pop private actions must be
>>>> first. In OVS it is not.
>>>> Fixing this issue is scheduled for 21.05 (and stable 20.11.2).
>>>> Meanwhile, tests were done with a workaround for it [2].
>>>>
>>>> v2-v1:
>>>> - Tracking original in_port, and applying vport on that physical port instead of all PFs.
>>>> v3-v2:
>>>> - Traversing ports using a new API instead of flow_dump.
>>>> - Refactor packet state recover logic, with bug fix for error pop_header.
>>>> - One ref count for netdev in non-tunnel case.
>>>> - Rename variables, comments, rebase.
>>>> v4-v3:
>>>> - Extract orig_in_port from physdev for flow modify.
>>>> - Miss handling fixes.
>>>> v5-v4:
>>>> - Drop refactor offload rule creation commit.
>>>> - Comment about setting in_port in restore.
>>>> - Refactor vports flow offload commit.
>>>> v6-v5:
>>>> - Fixed duplicate netdev ref bug.
>>>>
>>>> Travis:
>>>> v1: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Ftravis-ci.org%2Fgithub%2Felibritstein%2FOVS%2Fbuilds%2F756418552&data=04%7C01%7Celibr%40nvidia.com%7Ca74658db54f34dda383208d9365a2a66%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637600583912503391%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000&sdata=kU0ZeAX7jjJVnqdfmCSjZPG3kC9nZ9iotokpSkBnFCI%3D&reserved=0
>>>> v2: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Ftravis-ci.org%2Fgithub%2Felibritstein%2FOVS%2Fbuilds%2F758382963&data=04%7C01%7Celibr%40nvidia.com%7Ca74658db54f34dda383208d9365a2a66%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637600583912503391%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000&sdata=kJg%2FQpcTKsc8YYOa6IINCc0s8zYCdIvOp38r4VMNVWU%3D&reserved=0
>>>> v3: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Ftravis-ci.org%2Fgithub%2Felibritstein%2FOVS%2Fbuilds%2F761089087&data=04%7C01%7Celibr%40nvidia.com%7Ca74658db54f34dda383208d9365a2a66%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637600583912503391%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000&sdata=aez%2FMC%2BNEyWXqhV%2BHMYvgdwsKoAp1aOAJRzAy8bmISQ%3D&reserved=0
>>>> v4: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Ftravis-ci.org%2Fgithub%2Felibritstein%2FOVS%2Fbuilds%2F763146966&data=04%7C01%7Celibr%40nvidia.com%7Ca74658db54f34dda383208d9365a2a66%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637600583912503391%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000&sdata=rCG1WrZ5SDqnKvH6tLWP9wbHgBFNJvqQUI5TLhV%2BD6w%3D&reserved=0
>>>> v5: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Ftravis-ci.org%2Fgithub%2Felibritstein%2FOVS%2Fbuilds%2F765271879&data=04%7C01%7Celibr%40nvidia.com%7Ca74658db54f34dda383208d9365a2a66%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637600583912503391%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000&sdata=l8NnWqOmLxdgpvBzMxp9tze4U4uf4uewv4KUnJEz9Nk%3D&reserved=0
>>>> v6: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Ftravis-ci.org%2Fgithub%2Felibritstein%2FOVS%2Fbuilds%2F765816800&data=04%7C01%7Celibr%40nvidia.com%7Ca74658db54f34dda383208d9365a2a66%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637600583912503391%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000&sdata=paN242Nu36lzVrNpsKr2cyxXh8W7CJwv4h3kFshgKHs%3D&reserved=0
>>>>
>>>> GitHub Actions:
>>>> v1: https://github.com/elibritstein/OVS/actions/runs/515334647
>>>> v2: https://github.com/elibritstein/OVS/actions/runs/554986007
>>>> v3: https://github.com/elibritstein/OVS/actions/runs/613226225
>>>> v4: https://github.com/elibritstein/OVS/actions/runs/658415274
>>>> v5: https://github.com/elibritstein/OVS/actions/runs/704357369
>>>> v6: https://github.com/elibritstein/OVS/actions/runs/716304028
>>>>
>>>> [1] https://mails.dpdk.org/archives/dev/2020-October/187314.html
>>>> [2] https://github.com/elibritstein/dpdk-stable/pull/1
>>>>
>>>>
>>>> Eli Britstein (10):
>>>>    netdev-offload: Add HW miss packet state recover API
>>>>    netdev-dpdk: Introduce DPDK tunnel APIs
>>>>    netdev-offload: Introduce an API to traverse ports
>>>>    netdev-dpdk: Add flow_api support for netdev vxlan vports
>>>>    netdev-offload-dpdk: Implement HW miss packet recover for vport
>>>>    dpif-netdev: Add HW miss packet state recover logic
>>>>    netdev-offload-dpdk: Change log rate limits
>>>>    netdev-offload-dpdk: Support tunnel pop action
>>>>    netdev-offload-dpdk: Support vports flows offload
>>>>    netdev-dpdk-offload: Add vxlan pattern matching function
>>>>
>>>> Ilya Maximets (2):
>>>>    netdev-offload: Allow offloading to netdev without ifindex.
>>>>    netdev-offload: Disallow offloading to unrelated tunneling vports.
>>>>
>>>> Sriharsha Basavapatna (1):
>>>>    dpif-netdev: Provide orig_in_port in metadata for tunneled packets
>>>>
>>>>   Documentation/howto/dpdk.rst  |   1 +
>>>>   NEWS                          |   2 +
>>>>   lib/dpif-netdev.c             |  97 +++--
>>>>   lib/netdev-dpdk.c             | 118 ++++++
>>>>   lib/netdev-dpdk.h             | 106 ++++-
>>>>   lib/netdev-offload-dpdk.c     | 704 +++++++++++++++++++++++++++++++---
>>>>   lib/netdev-offload-provider.h |   5 +
>>>>   lib/netdev-offload-tc.c       |   8 +
>>>>   lib/netdev-offload.c          |  47 ++-
>>>>   lib/netdev-offload.h          |  10 +
>>>>   lib/packets.h                 |   8 +-
>>>>   11 files changed, 1022 insertions(+), 84 deletions(-)
>>>>
>>> Hi.  I reviewed the whole series and it looks mostly OK to me.
>>> I made a several changes along the way and below you may find a diff
>>> of what I changed.  In short:
>>>
>>>   - Some style fixes: style of function prototypes and a way how long
>>>     function calls wrapped.  Added names to uint32_t arguments to
>>>     function prototypes.
>>>
>>>   - Clarified hw_miss_packet_recover() API.  It needs a note that it
>>>     takes ownership of a packet on error.  Fixed 'return -1' which
>>>     doesn't comply with the API definition (should return positive
>>>     errno).
>>>
>>>   - Moved NEWS updates to correct place.
>>>
>>>   - Added a packet drop counter.
>>>
>>>   - Refactored changes in dfc_processing().  Basically, restored it
>>>     to look mostly like it was before, because LIKELY/UNLIKELY markers
>>>     make no sense for non-offloading cases where they both should be
>>>     'unlikely'.  Also, old way of writing this part allows following
>>>     optimization:
>>>
>>>
>> https://patchwork.ozlabs.org/project/openvswitch/patch/c0b99b4b9be6c290a6aa3cb00049fac4ebfac5d8.161839
>>> 0390.git.bnemeth at redhat.com/
>>>
>>>   - And the only functional change that I made is that I guarded
>>>     actual offloading support with 'ifdef DPDK_EXPERIMENTAL_API',
>>>     because they doesn't make sense to me if packet restoration
>>>     API is not available.  I also added this information to the
>>>     NEWS file.  I think that we should not try to offload TUNNEL_POP
>>>     if we can't restore the tunnel metadata.  And if we can't have
>>>     the first flow in HW, there is no point adding the second one.
>>>
>>> Complete branch with ready-to-push patches available here:
>>>    https://github.com/igsilya/ovs/tree/tmp-vxlan-decap
>>>
>>> Diff below is a diff with this v6 patch-set.  Please, take a look/test.
>>> I'll wait for Ack on changes below or comments, if I broke something,
>>> before pushing this to the main repo.
>>>
>>> Best regards, Ilya Maximets.
>>>
>>> The diff:
>>>
>>> diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
>>> index 4918d80f3..36314c06a 100644
>>> --- a/Documentation/howto/dpdk.rst
>>> +++ b/Documentation/howto/dpdk.rst
>>> @@ -398,7 +398,7 @@ Supported actions for hardware offload are:
>>>   - VLAN Push/Pop (push_vlan/pop_vlan).
>>>   - Modification of IPv6 (set_field:<ADDR>->ipv6_src/ipv6_dst/mod_nw_ttl).
>>>   - Clone/output (tnl_push and output) for encapsulating over a tunnel.
>>> -- Tunnel pop, for changing from a physical port to a vport.
>>> +- Tunnel pop, for packets received on physical ports.
>>>
>>>   Further Reading
>>>   ---------------
>>> diff --git a/NEWS b/NEWS
>>> index e49f2955d..10b3ab053 100644
>>> --- a/NEWS
>>> +++ b/NEWS
>>> @@ -17,6 +17,10 @@ Post-v2.15.0
>>>        * OVS validated with DPDK 20.11.1. It is recommended to use this version
>>>          until further releases.
>>>        * New debug appctl command 'dpdk/get-malloc-stats'.
>>> +     * Add hardware offload support for tunnel pop action (experimental).
>>> +       Available only if DPDK experimantal APIs enabled during the build.
>>> +     * Add hardware offload support for VXLAN flows (experimental).
>>> +       Available only if DPDK experimantal APIs enabled during the build.
>>>      - ovsdb-tool:
>>>        * New option '--election-timer' to the 'create-cluster' command to set the
>>>          leader election timer during cluster creation.
>>> @@ -44,8 +48,6 @@ v2.15.0 - 15 Feb 2021
>>>      - DPDK:
>>>        * Removed support for vhost-user dequeue zero-copy.
>>>        * Add support for DPDK 20.11.
>>> -     * Add hardware offload support for tunnel pop action (experimental).
>>> -     * Add hardware offload support for VXLAN flows (experimental).
>>>      - Userspace datapath:
>>>        * Add the 'pmd' option to "ovs-appctl dpctl/dump-flows", which
>>>          restricts a flow dump to a single PMD thread if set.
>>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>>> index 1bd828f82..8766a00ea 100644
>>> --- a/lib/dpif-netdev.c
>>> +++ b/lib/dpif-netdev.c
>>> @@ -114,6 +114,7 @@ COVERAGE_DEFINE(datapath_drop_invalid_port);
>>>   COVERAGE_DEFINE(datapath_drop_invalid_bond);
>>>   COVERAGE_DEFINE(datapath_drop_invalid_tnl_port);
>>>   COVERAGE_DEFINE(datapath_drop_rx_invalid_packet);
>>> +COVERAGE_DEFINE(datapath_drop_hw_miss_recover);
>>>
>>>   /* Protects against changes to 'dp_netdevs'. */
>>>   static struct ovs_mutex dp_netdev_mutex = OVS_MUTEX_INITIALIZER;
>>> @@ -7068,9 +7069,8 @@ smc_lookup_batch(struct dp_netdev_pmd_thread *pmd,
>>>       pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_SMC_HIT, n_smc_hit);
>>>   }
>>>
>>> -static struct tx_port *
>>> -pmd_send_port_cache_lookup(const struct dp_netdev_pmd_thread *pmd,
>>> -                           odp_port_t port_no);
>>> +static struct tx_port * pmd_send_port_cache_lookup(
>>> +    const struct dp_netdev_pmd_thread *pmd, odp_port_t port_no);
>>>
>>>   static inline int
>>>   dp_netdev_hw_flow(const struct dp_netdev_pmd_thread *pmd,
>>> @@ -7081,17 +7081,13 @@ dp_netdev_hw_flow(const struct dp_netdev_pmd_thread *pmd,
>>>       struct tx_port *p;
>>>       uint32_t mark;
>>>
>>> -    if (!netdev_is_flow_api_enabled() || *recirc_depth_get() != 0) {
>>> -        *flow = NULL;
>>> -        return 0;
>>> -    }
>>> -
>>>       /* Restore the packet if HW processing was terminated before completion. */
>>>       p = pmd_send_port_cache_lookup(pmd, port_no);
>>>       if (OVS_LIKELY(p)) {
>>>           int err = netdev_hw_miss_packet_recover(p->port->netdev, packet);
>>>
>>> -        if (err != 0 && err != EOPNOTSUPP) {
>>> +        if (err && err != EOPNOTSUPP) {
>>> +            COVERAGE_INC(datapath_drop_hw_miss_recover);
>>>               return -1;
>>>           }
>>>       }
>>> @@ -7168,25 +7164,28 @@ dfc_processing(struct dp_netdev_pmd_thread *pmd,
>>>               pkt_metadata_init(&packet->md, port_no);
>>>           }
>>>
>>> -        if (OVS_UNLIKELY(dp_netdev_hw_flow(pmd, port_no, packet, &flow))) {
>>> -            /* Packet restoration failed. Its mbuf was freed, do not continue
>>> -             * processing.
>>> -             */
>>> -            continue;
>>> -        } else if (OVS_LIKELY(flow)) {
>>> -            tcp_flags = parse_tcp_flags(packet);
>>> -            if (OVS_LIKELY(batch_enable)) {
>>> -                dp_netdev_queue_batches(packet, flow, tcp_flags, batches,
>>> -                                        n_batches);
>>> -            } else {
>>> -                /* Flow batching should be performed only after fast-path
>>> -                 * processing is also completed for packets with emc miss
>>> -                 * or else it will result in reordering of packets with
>>> -                 * same datapath flows. */
>>> -                packet_enqueue_to_flow_map(packet, flow, tcp_flags, flow_map,
>>> -                                           map_cnt++);
>>> +        if (netdev_is_flow_api_enabled() && *recirc_depth_get() == 0) {
>>> +            if (OVS_UNLIKELY(dp_netdev_hw_flow(pmd, port_no, packet, &flow))) {
>>> +                /* Packet restoration failed and it was dropped, do not
>>> +                 * continue processing.
>>> +                 */
>>> +                continue;
>>> +            }
>>> +            if (OVS_LIKELY(flow)) {
>>> +                tcp_flags = parse_tcp_flags(packet);
>>> +                if (OVS_LIKELY(batch_enable)) {
>>> +                    dp_netdev_queue_batches(packet, flow, tcp_flags, batches,
>>> +                                            n_batches);
>>> +                } else {
>>> +                    /* Flow batching should be performed only after fast-path
>>> +                     * processing is also completed for packets with emc miss
>>> +                     * or else it will result in reordering of packets with
>>> +                     * same datapath flows. */
>>> +                    packet_enqueue_to_flow_map(packet, flow, tcp_flags,
>>> +                                               flow_map, map_cnt++);
>>> +                }
>>> +                continue;
>>>               }
>>> -            continue;
>>>           }
>>>
>>>           miniflow_extract(packet, &key->mf);
>>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>>> index 6e35d0574..bc5485d60 100644
>>> --- a/lib/netdev-dpdk.c
>>> +++ b/lib/netdev-dpdk.c
>>> @@ -5365,11 +5365,11 @@ netdev_dpdk_rte_flow_get_restore_info(struct netdev *netdev,
>>>   }
>>>
>>>   int
>>> -netdev_dpdk_rte_flow_tunnel_action_decap_release
>>> -    (struct netdev *netdev,
>>> -     struct rte_flow_action *actions,
>>> -     uint32_t num_of_actions,
>>> -     struct rte_flow_error *error)
>>> +netdev_dpdk_rte_flow_tunnel_action_decap_release(
>>> +    struct netdev *netdev,
>>> +    struct rte_flow_action *actions,
>>> +    uint32_t num_of_actions,
>>> +    struct rte_flow_error *error)
>>>   {
>>>       struct netdev_dpdk *dev;
>>>       int ret;
>>> diff --git a/lib/netdev-dpdk.h b/lib/netdev-dpdk.h
>>> index 3b9bf8681..7b77ed8e0 100644
>>> --- a/lib/netdev-dpdk.h
>>> +++ b/lib/netdev-dpdk.h
>>> @@ -53,33 +53,28 @@ netdev_dpdk_get_port_id(struct netdev *netdev);
>>>
>>>   #ifdef ALLOW_EXPERIMENTAL_API
>>>
>>> -int
>>> -netdev_dpdk_rte_flow_tunnel_decap_set(struct netdev *,
>>> +int netdev_dpdk_rte_flow_tunnel_decap_set(struct netdev *,
>>> +                                          struct rte_flow_tunnel *,
>>> +                                          struct rte_flow_action **,
>>> +                                          uint32_t *num_of_actions,
>>> +                                          struct rte_flow_error *);
>>> +int netdev_dpdk_rte_flow_tunnel_match(struct netdev *,
>>>                                         struct rte_flow_tunnel *,
>>> -                                      struct rte_flow_action **,
>>> -                                      uint32_t *,
>>> -                                      struct rte_flow_error *);
>>> -int
>>> -netdev_dpdk_rte_flow_tunnel_match(struct netdev *,
>>> -                                  struct rte_flow_tunnel *,
>>> -                                  struct rte_flow_item **,
>>> -                                  uint32_t *,
>>> -                                  struct rte_flow_error *);
>>> -int
>>> -netdev_dpdk_rte_flow_get_restore_info(struct netdev *,
>>> -                                      struct dp_packet *,
>>> -                                      struct rte_flow_restore_info *,
>>> +                                      struct rte_flow_item **,
>>> +                                      uint32_t *num_of_items,
>>>                                         struct rte_flow_error *);
>>> -int
>>> -netdev_dpdk_rte_flow_tunnel_action_decap_release(struct netdev *,
>>> -                                                 struct rte_flow_action *,
>>> -                                                 uint32_t,
>>> -                                                 struct rte_flow_error *);
>>> -int
>>> -netdev_dpdk_rte_flow_tunnel_item_release(struct netdev *,
>>> -                                         struct rte_flow_item *,
>>> -                                         uint32_t,
>>> -                                         struct rte_flow_error *);
>>> +int netdev_dpdk_rte_flow_get_restore_info(struct netdev *,
>>> +                                          struct dp_packet *,
>>> +                                          struct rte_flow_restore_info *,
>>> +                                          struct rte_flow_error *);
>>> +int netdev_dpdk_rte_flow_tunnel_action_decap_release(struct netdev *,
>>> +                                                     struct rte_flow_action *,
>>> +                                                     uint32_t num_of_actions,
>>> +                                                     struct rte_flow_error *);
>>> +int netdev_dpdk_rte_flow_tunnel_item_release(struct netdev *,
>>> +                                             struct rte_flow_item *,
>>> +                                             uint32_t num_of_items,
>>> +                                             struct rte_flow_error *);
>>>
>>>   #else
>>>
>>> @@ -92,14 +87,13 @@ set_error(struct rte_flow_error *error, enum rte_flow_error_type type)
>>>   }
>>>
>>>   static inline int
>>> -netdev_dpdk_rte_flow_tunnel_decap_set(struct netdev *netdev OVS_UNUSED,
>>> -                                      struct rte_flow_tunnel *tunnel,
>>> -                                      struct rte_flow_action **actions,
>>> -                                      uint32_t *num_of_actions OVS_UNUSED,
>>> -                                      struct rte_flow_error *error)
>>> +netdev_dpdk_rte_flow_tunnel_decap_set(
>>> +    struct netdev *netdev OVS_UNUSED,
>>> +    struct rte_flow_tunnel *tunnel OVS_UNUSED,
>>> +    struct rte_flow_action **actions OVS_UNUSED,
>>> +    uint32_t *num_of_actions OVS_UNUSED,
>>> +    struct rte_flow_error *error)
>>>   {
>>> -    (void) tunnel;
>>> -    (void) actions;
>>>       set_error(error, RTE_FLOW_ERROR_TYPE_ACTION);
>>>       return -1;
>>>   }
>>> @@ -116,34 +110,34 @@ netdev_dpdk_rte_flow_tunnel_match(struct netdev *netdev OVS_UNUSED,
>>>   }
>>>
>>>   static inline int
>>> -netdev_dpdk_rte_flow_get_restore_info(struct netdev *netdev OVS_UNUSED,
>>> -                                      struct dp_packet *p OVS_UNUSED,
>>> -                                      struct rte_flow_restore_info *info,
>>> -                                      struct rte_flow_error *error)
>>> +netdev_dpdk_rte_flow_get_restore_info(
>>> +    struct netdev *netdev OVS_UNUSED,
>>> +    struct dp_packet *p OVS_UNUSED,
>>> +    struct rte_flow_restore_info *info OVS_UNUSED,
>>> +    struct rte_flow_error *error)
>>>   {
>>> -    (void) info;
>>>       set_error(error, RTE_FLOW_ERROR_TYPE_ATTR);
>>>       return -1;
>>>   }
>>>
>>>   static inline int
>>> -netdev_dpdk_rte_flow_tunnel_action_decap_release
>>> -    (struct netdev *netdev OVS_UNUSED,
>>> -     struct rte_flow_action *actions OVS_UNUSED,
>>> -     uint32_t num_of_actions OVS_UNUSED,
>>> -     struct rte_flow_error *error)
>>> +netdev_dpdk_rte_flow_tunnel_action_decap_release(
>>> +    struct netdev *netdev OVS_UNUSED,
>>> +    struct rte_flow_action *actions OVS_UNUSED,
>>> +    uint32_t num_of_actions OVS_UNUSED,
>>> +    struct rte_flow_error *error)
>>>   {
>>>       set_error(error, RTE_FLOW_ERROR_TYPE_NONE);
>>>       return 0;
>>>   }
>>>
>>>   static inline int
>>> -netdev_dpdk_rte_flow_tunnel_item_release(struct netdev *netdev OVS_UNUSED,
>>> -                                         struct rte_flow_item *items,
>>> -                                         uint32_t num_of_items OVS_UNUSED,
>>> -                                         struct rte_flow_error *error)
>>> +netdev_dpdk_rte_flow_tunnel_item_release(
>>> +    struct netdev *netdev OVS_UNUSED,
>>> +    struct rte_flow_item *items OVS_UNUSED,
>>> +    uint32_t num_of_items OVS_UNUSED,
>>> +    struct rte_flow_error *error)
>>>   {
>>> -    (void) items;
>>>       set_error(error, RTE_FLOW_ERROR_TYPE_NONE);
>>>       return 0;
>>>   }
>>> diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
>>> index 52a74a707..363f32f71 100644
>>> --- a/lib/netdev-offload-dpdk.c
>>> +++ b/lib/netdev-offload-dpdk.c
>>> @@ -459,7 +459,7 @@ dump_flow_action(struct ds *s, struct ds *s_extra,
>>>                  act_index >= flow_actions->tnl_pmd_actions_pos &&
>>>                  act_index < flow_actions->tnl_pmd_actions_pos +
>>>                              flow_actions->tnl_pmd_actions_cnt) {
>>> -        /* Opaque PMD tunnel actions is skipped. */
>>> +        /* Opaque PMD tunnel actions are skipped. */
>>>           return;
>>>       } else if (actions->type == RTE_FLOW_ACTION_TYPE_MARK) {
>>>           const struct rte_flow_action_mark *mark = actions->conf;
>>> @@ -792,9 +792,9 @@ free_flow_actions(struct flow_actions *actions)
>>>       for (i = 0; i < actions->cnt; i++) {
>>>           if (actions->tnl_pmd_actions_cnt &&
>>>               i == actions->tnl_pmd_actions_pos) {
>>> -            if (netdev_dpdk_rte_flow_tunnel_action_decap_release
>>> -                    (actions->tnl_netdev, actions->tnl_pmd_actions,
>>> -                     actions->tnl_pmd_actions_cnt, &error)) {
>>> +            if (netdev_dpdk_rte_flow_tunnel_action_decap_release(
>>> +                    actions->tnl_netdev, actions->tnl_pmd_actions,
>>> +                    actions->tnl_pmd_actions_cnt, &error)) {
>>>                   VLOG_DBG_RL(&rl, "%s: "
>>>                               "netdev_dpdk_rte_flow_tunnel_action_decap_release "
>>>                               "failed: %d (%s).",
>>> @@ -1009,7 +1009,7 @@ parse_vxlan_match(struct flow_patterns *patterns,
>>>       return 0;
>>>   }
>>>
>>> -static int
>>> +static int OVS_UNUSED

Note that if experimental is allowed, the OVS_UNUSED attribute is 
misleading.

Also see below.

>>>   parse_flow_tnl_match(struct netdev *tnldev,
>>>                        struct flow_patterns *patterns,
>>>                        odp_port_t orig_in_port,
>>> @@ -1031,7 +1031,7 @@ parse_flow_tnl_match(struct netdev *tnldev,
>>>
>>>   static int
>>>   parse_flow_match(struct netdev *netdev,
>>> -                 odp_port_t orig_in_port,
>>> +                 odp_port_t orig_in_port OVS_UNUSED,
>>>                    struct flow_patterns *patterns,
>>>                    struct match *match)
>>>   {
>>> @@ -1045,10 +1045,12 @@ parse_flow_match(struct netdev *netdev,
>>>       }
>>>
>>>       patterns->physdev = netdev;
>>> +#ifdef ALLOW_EXPERIMENTAL_API /* Packet restoration API required. */

In my opinion those should be removed in netdev-offload-dpdk.c, and keep 
such #ifdef only in netdev-dpdk (with stubs), so later, when dpdk 
removes the experimental attribute, there will be a single place to change.

This applies both to parse_flow_tnl_match and add_tnl_pop_action.

However, this is not critical and I would not hold the merge because of 
this.
>>>       if (netdev_vport_is_vport_class(netdev->netdev_class) &&
>>>           parse_flow_tnl_match(netdev, patterns, orig_in_port, match)) {
>>>           return -1;
>>>       }
>>> +#endif
>>>       memset(&consumed_masks->in_port, 0, sizeof consumed_masks->in_port);
>>>       /* recirc id must be zero. */
>>>       if (match->wc.masks.recirc_id & match->flow.recirc_id) {
>>> @@ -1679,7 +1681,7 @@ add_jump_action(struct flow_actions *actions, uint32_t group)
>>>       add_flow_action(actions, RTE_FLOW_ACTION_TYPE_JUMP, jump);
>>>   }
>>>
>>> -static int
>>> +static int OVS_UNUSED
>>>   add_tnl_pop_action(struct netdev *netdev,
>>>                      struct flow_actions *actions,
>>>                      const struct nlattr *nla)
>>> @@ -1767,10 +1769,12 @@ parse_flow_actions(struct netdev *netdev,
>>>                                       clone_actions_len)) {
>>>                   return -1;
>>>               }
>>> +#ifdef ALLOW_EXPERIMENTAL_API /* Packet restoration API required. */
>>>           } else if (nl_attr_type(nla) == OVS_ACTION_ATTR_TUNNEL_POP) {
>>>               if (add_tnl_pop_action(netdev, actions, nla)) {
>>>                   return -1;
>>>               }
>>> +#endif
>>>           } else {
>>>               VLOG_DBG_RL(&rl, "Unsupported action type %d", nl_attr_type(nla));
>>>               return -1;
>>> @@ -2067,7 +2071,7 @@ get_vxlan_netdev_cb(struct netdev *netdev,
>>>       struct get_vport_netdev_aux *aux = aux_;
>>>
>>>       if (strcmp(netdev_get_type(netdev), "vxlan")) {
>>> -       return false;
>>> +        return false;
>>>       }
>>>
>>>       tnl_cfg = netdev_get_tunnel_config(netdev);
>>> @@ -2121,18 +2125,16 @@ netdev_offload_dpdk_hw_miss_packet_recover(struct netdev *netdev,
>>>       struct rte_flow_restore_info rte_restore_info;
>>>       struct rte_flow_tunnel *rte_tnl;
>>>       struct netdev *vport_netdev;
>>> -    struct rte_flow_error error;
>>>       struct pkt_metadata *md;
>>>       struct flow_tnl *md_tnl;
>>>       odp_port_t vport_odp;
>>>       int ret = 0;
>>>
>>>       if (netdev_dpdk_rte_flow_get_restore_info(netdev, packet,
>>> -                                              &rte_restore_info, &error)) {
>>> +                                              &rte_restore_info, NULL)) {
>>>           /* This function is called for every packet, and in most cases there
>>>            * will be no restore info from the HW, thus error is expected.
>>>            */
>>> -        (void) error;
>>>           return 0;
>>>       }
>>>
>>> @@ -2144,25 +2146,25 @@ netdev_offload_dpdk_hw_miss_packet_recover(struct netdev *netdev,
>>>       vport_netdev = get_vport_netdev(netdev->dpif_type, rte_tnl,
>>>                                       &vport_odp);
>>>       if (!vport_netdev) {
>>> -        VLOG_WARN("Could not find vport netdev");
>>> +        VLOG_WARN_RL(&rl, "Could not find vport netdev");
>>>           return EOPNOTSUPP;
>>>       }
>>>
>>>       md = &packet->md;
>>>       /* For tunnel recovery (RTE_FLOW_RESTORE_INFO_TUNNEL), it is possible
>>> -     * to have the packet to still be encapsulated, or not
>>> -     * (RTE_FLOW_RESTORE_INFO_ENCAPSULATED).
>>> +     * to have the packet to still be encapsulated, or not.  This is reflected
>>> +     * by the RTE_FLOW_RESTORE_INFO_ENCAPSULATED flag.
>>>        * In the case it is on, the packet is still encapsulated, and we do
>>>        * the pop in SW.
>>>        * In the case it is off, the packet is already decapsulated by HW, and
>>> -     * the tunnel info is provided in the tunnel struct. For this case we
>>> +     * the tunnel info is provided in the tunnel struct.  For this case we
>>>        * take it to OVS metadata.
>>>        */
>>>       if (rte_restore_info.flags & RTE_FLOW_RESTORE_INFO_ENCAPSULATED) {
>>>           if (!vport_netdev->netdev_class ||
>>>               !vport_netdev->netdev_class->pop_header) {
>>> -            VLOG_ERR("vport nedtdev=%s with no pop_header method",
>>> -                     netdev_get_name(vport_netdev));
>>> +            VLOG_ERR_RL(&rl, "vport nedtdev=%s with no pop_header method",
>>> +                        netdev_get_name(vport_netdev));
>>>               ret = EOPNOTSUPP;
>>>               goto close_vport_netdev;
>>>           }
>>> @@ -2171,7 +2173,7 @@ netdev_offload_dpdk_hw_miss_packet_recover(struct netdev *netdev,
>>>               /* If there is an error with popping the header, the packet is
>>>                * freed. In this case it should not continue SW processing.
>>>                */
>>> -            ret = -1;
>>> +            ret = EINVAL;
>>>               goto close_vport_netdev;
>>>           }
>>>       } else {
>>> diff --git a/lib/netdev-offload-provider.h b/lib/netdev-offload-provider.h
>>> index f24c7dd19..348ca7081 100644
>>> --- a/lib/netdev-offload-provider.h
>>> +++ b/lib/netdev-offload-provider.h
>>> @@ -89,7 +89,8 @@ struct netdev_flow_api {
>>>
>>>       /* Recover the packet state (contents and data) for continued processing
>>>        * in software.
>>> -     * Return 0 if successful, otherwise returns a positive errno value. */
>>> +     * Return 0 if successful, otherwise returns a positive errno value and
>>> +     * takes ownership of a packet if errno != EOPNOTSUPP. */
>>>       int (*hw_miss_packet_recover)(struct netdev *, struct dp_packet *);
>>>
>>>       /* Initializies the netdev flow api.
>>> ---
>>> _______________________________________________
>>> dev mailing list
>>> dev at openvswitch.org
>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fmailman%2Flistinfo%2Fovs-dev&data=04%7C01%7Celibr%40nvidia.com%7Ca74658db54f34dda383208d9365a2a66%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637600583912503391%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000&sdata=%2B0FlHTBOVLb1A5dT3IddrxmRAWFtdVtHXQ0K9YH6M0M%3D&reserved=0
>> _______________________________________________
>> dev mailing list
>> dev at openvswitch.org
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fmailman%2Flistinfo%2Fovs-dev&data=04%7C01%7Celibr%40nvidia.com%7Ca74658db54f34dda383208d9365a2a66%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637600583912503391%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000&sdata=%2B0FlHTBOVLb1A5dT3IddrxmRAWFtdVtHXQ0K9YH6M0M%3D&reserved=0


More information about the dev mailing list