[ovs-dev] [PATCH V6 00/13] Netdev vxlan-decap offload

Ilya Maximets i.maximets at ovn.org
Wed Jun 23 15:24:33 UTC 2021


On 6/23/21 5:18 PM, Ferriter, Cian wrote:
>> -----Original Message-----
>> From: dev <ovs-dev-bounces at openvswitch.org> On Behalf Of Ferriter, Cian
>> Sent: Wednesday 23 June 2021 13:38
>> To: Ilya Maximets <i.maximets at ovn.org>; Eli Britstein <elibr at nvidia.com>; dev at openvswitch.org
>> Cc: Ivan.Malov at oktetlabs.ru; Ameer Mahagneh <ameerm at nvidia.com>; Majd Dibbiny <majd at nvidia.com>
>> Subject: Re: [ovs-dev] [PATCH V6 00/13] Netdev vxlan-decap offload
>>
>> Hi all,
>>
>> As part of rebasing our AVX512 DPIF on this patchset, I tested this patchset with partial HWOL and I'm
>> seeing strange behaviour.
>>
>> I'll report back more detailed findings soon, just wanted to mention this here as soon as I found the
>> issue.
>>
>> Thanks,
>> Cian
>>
> 
> More details on the issue I'm seeing:
> I'm using Ilya's branch from Github:
> https://github.com/igsilya/ovs/tree/tmp-vxlan-decap
> 
> ~/ovs_scripts# $OVS_DIR/utilities/ovs-vsctl list Open_vSwitch
> dpdk_version        : "DPDK 20.11.1"
> other_config        : {dpdk-hugepage-dir="/mnt/huge", dpdk-init="true", dpdk-lcore-mask="0x1", dpdk-socket-mem="2048,0", emc-insert-inv-prob="0", hw-offload="true", pmd-cpu-mask="0x2"}
> 
> ~/ovs_scripts# $OVS_DIR/utilities/ovs-vsctl show
> 31584ce5-09c1-44b3-ab27-1a0308d63fff
>     Bridge br0
>         datapath_type: netdev
>         Port br0
>             Interface br0
>                 type: internal
>         Port phy0
>             Interface phy0
>                 type: dpdk
>                 options: {dpdk-devargs="5e:00.0"}
> 
> ~/ovs_scripts# $OVS_DIR/utilities/ovs-ofctl dump-flows br0
>  cookie=0x0, duration=29.466s, table=0, n_packets=0, n_bytes=0, in_port=phy0 actions=IN_PORT
> 
> I'm expecting the flow to be partially offloaded, but I get a segfault when using the above branch. More info on the segfault below:
> 
> Thread 13 "pmd-c01/id:8" received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x7f9f72734700 (LWP 19327)]
> 0x000056163bf0d825 in set_error (error=0x0, type=RTE_FLOW_ERROR_TYPE_ATTR) at lib/netdev-dpdk.h:84
> (gdb) bt
> #0  0x000056163bf0d825 in set_error (error=0x0, type=RTE_FLOW_ERROR_TYPE_ATTR) at lib/netdev-dpdk.h:84
> #1  0x000056163bf0d8d3 in netdev_dpdk_rte_flow_get_restore_info (netdev=0x1bfc65c80, p=0x19033af00, info=0x7f9f72729a20, error=0x0) at lib/netdev-dpdk.h:119
> #2  0x000056163bf14da3 in netdev_offload_dpdk_hw_miss_packet_recover (netdev=0x1bfc65c80, packet=0x19033af00) at lib/netdev-offload-dpdk.c:2133
> #3  0x000056163bde3662 in netdev_hw_miss_packet_recover (netdev=0x1bfc65c80, packet=0x19033af00) at lib/netdev-offload.c:265
> #4  0x000056163bda19a9 in dp_netdev_hw_flow (pmd=0x7f9f72735010, port_no=2, packet=0x19033af00, flow=0x7f9f72729b98) at lib/dpif-netdev.c:7087
> #5  0x000056163bda1c5c in dfc_processing (pmd=0x7f9f72735010, packets_=0x7f9f727310d0, keys=0x7f9f7272c480, missed_keys=0x7f9f7272c370, batches=0x7f9f72729f60, n_batches=0x7f9f72730f70, flow_map=0x7f9f72729c50, n_flows=0x7f9f72730f78, index_map=0x7f9f72729c30 "", md_is_valid=false, port_no=2) at lib/dpif-netdev.c:7168
> #6  0x000056163bda2f3e in dp_netdev_input__ (pmd=0x7f9f72735010, packets=0x7f9f727310d0, md_is_valid=false, port_no=2) at lib/dpif-netdev.c:7475
> #7  0x000056163bda3105 in dp_netdev_input (pmd=0x7f9f72735010, packets=0x7f9f727310d0, port_no=2) at lib/dpif-netdev.c:7519
> #8  0x000056163bd9ab04 in dp_netdev_process_rxq_port (pmd=0x7f9f72735010, rxq=0x56163fb3f610, port_no=2) at lib/dpif-netdev.c:4774
> #9  0x000056163bd9ee17 in pmd_thread_main (f_=0x7f9f72735010) at lib/dpif-netdev.c:6063
> #10 0x000056163be71c88 in ovsthread_wrapper (aux_=0x56163fb3fe70) at lib/ovs-thread.c:383
> #11 0x00007f9f884cf6db in start_thread (arg=0x7f9f72734700) at pthread_create.c:463
> #12 0x00007f9f862bb71f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> 
> netdev_offload_dpdk_hw_miss_packet_recover() calls netdev_dpdk_rte_flow_get_restore_info() with NULL for the struct rte_flow_error *error argument:
> 
>     if (netdev_dpdk_rte_flow_get_restore_info(netdev, packet,
>                                               &rte_restore_info, NULL)) {
>         /* This function is called for every packet, and in most cases there
>          * will be no restore info from the HW, thus error is expected.
>          */
>         return 0;
>     }
> 
> There are 2 "netdev_dpdk_rte_flow_get_restore_info()" functions. One in lib/netdev-dpdk.h and one in lib/netdev-dpdk.c. 
> 
> I don't have the experimental API enabled, so I'm using the function from lib/netdev-dpdk.h.

Yes, that's my fault.  I replaced 'error' with NULL, because the actual DPDK
implementation supports that and we're not using this error anyway.
But I missed the fact that the dummy implementation doesn't support NULL as
an argument.  The following change should fix your issue:

diff --git a/lib/netdev-dpdk.h b/lib/netdev-dpdk.h
index 7b77ed8e0..699be3fb4 100644
--- a/lib/netdev-dpdk.h
+++ b/lib/netdev-dpdk.h
@@ -81,6 +81,9 @@ int netdev_dpdk_rte_flow_tunnel_item_release(struct netdev *,
 static inline void
 set_error(struct rte_flow_error *error, enum rte_flow_error_type type)
 {
+    if (!error) {
+        return;
+    }
     error->type = type;
     error->cause = NULL;
     error->message = NULL;
---

Please, try it out.
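
For illustration only, here is a standalone sketch of the same guard that
compiles without the DPDK headers.  The sketch_* types and functions are
stand-ins made up for this example (they are not the real rte_flow
definitions), and the stub's -1 return value is an assumption:

```c
#include <stddef.h>

/* Stand-in types for illustration only; the real definitions live in
 * DPDK's rte_flow.h. */
enum sketch_error_type {
    SKETCH_ERROR_TYPE_NONE,
    SKETCH_ERROR_TYPE_ATTR,
};

struct sketch_flow_error {
    enum sketch_error_type type;
    const void *cause;
    const char *message;
};

/* Fills in '*error' unless the caller passed NULL, mirroring the guard
 * added to set_error() in the diff above. */
static inline void
sketch_set_error(struct sketch_flow_error *error, enum sketch_error_type type)
{
    if (!error) {
        return;
    }
    error->type = type;
    error->cause = NULL;
    error->message = NULL;
}

/* Hypothetical stub in the style of the dummy helpers in
 * lib/netdev-dpdk.h: it always fails and must tolerate a NULL 'error'.
 * The -1 return value is an assumption made for this sketch. */
static inline int
sketch_get_restore_info(struct sketch_flow_error *error)
{
    sketch_set_error(error, SKETCH_ERROR_TYPE_ATTR);
    return -1;
}
```

With the guard in place, a caller that does not care about the error details
can pass NULL, as netdev_offload_dpdk_hw_miss_packet_recover() does, while a
caller that passes a real struct still gets it filled in.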

Best regards, Ilya Maximets.

