[ovs-discuss] segmentation fault when adding a VF in DPDK to a switch

Stokes, Ian ian.stokes at intel.com
Wed Mar 7 13:41:36 UTC 2018


Hi Riccardo,

After some more time looking at the issue, you could do something like the patch below to enable CRC stripping for the interface (note that I haven’t fully validated this).

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index af9843a..28d7d1e 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -700,6 +700,12 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq)

     conf.rxmode.hw_ip_checksum = (dev->hw_ol_features &
                                   NETDEV_RX_CHECKSUM_OFFLOAD) != 0;
+
+    /*
+     * Need to enable hw_strip_crc specifically for SRIOV devices.
+     */
+    conf.rxmode.hw_strip_crc = 1;
+

On my system this at least got past the configuration error when adding the SR-IOV VF port, and I was able to pass traffic through the port in a simple VF-to-phy port setup. As I’ve only done minor validation on this, maybe you could give it a shot and see if it works on your setup.
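
The patch above turns CRC stripping on for every port, while its comment says it is only needed for SR-IOV devices. A rough, self-contained sketch of applying it to VFs only could look like the following. Everything in it is a simplified stand-in invented for illustration: `struct model_rxmode`, `struct model_port` and the `is_vf` field are not the real `struct rte_eth_conf` or `struct netdev_dpdk`, and how OVS would actually detect a VF is left open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the two rxmode bits touched by the patch
 * (hypothetical, not the DPDK definition). */
struct model_rxmode {
    unsigned hw_ip_checksum : 1;
    unsigned hw_strip_crc : 1;
};

/* Hypothetical per-port state; 'is_vf' does not exist in OVS. */
struct model_port {
    bool rx_csum_offload;
    bool is_vf;
};

/* Fill in the Rx mode bits for a port.  CRC stripping is requested only
 * for VFs, since a VF behind a non-DPDK PF driver cannot turn it off. */
static void
model_fill_rxmode(const struct model_port *port, struct model_rxmode *rx)
{
    assert(port != NULL && rx != NULL);

    rx->hw_ip_checksum = port->rx_csum_offload;
    rx->hw_strip_crc = port->is_vf;
}
```

With this shape a physical port keeps whatever the previous default was (flag cleared), and only a port marked as a VF gets the flag set.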

Regarding the VSI queue error I mentioned in previous posts: after some more investigation I found it only occurred when setting up the SR-IOV VFs in DPDK under GDB. I was able to reproduce the same issue with the DPDK l2fwd sample app, so it is not specific to OVS. As long as you are not running OVS under GDB during the SR-IOV setup it should be OK. I’ll need to look at this in a little more detail when I have time, but for the moment it shouldn’t block you.

Hope this helps,

Regards
Ian



From: ovs-discuss-bounces at openvswitch.org [mailto:ovs-discuss-bounces at openvswitch.org] On Behalf Of Stokes, Ian
Sent: Thursday, February 1, 2018 11:52 AM
To: riccardoravaioli at gmail.com
Cc: ovs-discuss at openvswitch.org
Subject: Re: [ovs-discuss] segmentation fault when adding a VF in DPDK to a switch

Hi Riccardo,

Apologies for the delay. Unfortunately with the OVS 2.9 release I haven’t had much time to look at this further.

At the very least I think work needs to be done for dpdk.c and netdev-dpdk.c to enable configuration of VFs specifically (to account for the HW_CRC and VSI queue configurations).

There would also be a task to ensure the work required for enabling a VF on the i40e driver would also cover enabling a VF for the ixgbe driver. In DPDK it’s been the case in the past that driver implementations for different NIC devices can differ.

This could be looked at in the OVS 2.10 development cycle at some point. I can post an update here when there is progress.

Thanks
Ian

From: scaricaposta at gmail.com [mailto:scaricaposta at gmail.com] On Behalf Of Riccardo Ravaioli
Sent: Thursday, January 25, 2018 4:35 PM
To: Stokes, Ian <ian.stokes at intel.com>
Cc: ovs-discuss at openvswitch.org
Subject: Re: [ovs-discuss] segmentation fault when adding a VF in DPDK to a switch

Hi Ian,
Thanks for looking into the issue. Anything new?
Thanks a lot!
Riccardo

On 11 January 2018 at 23:50, Stokes, Ian <ian.stokes at intel.com> wrote:
Hi Riccardo,

Thanks for reporting the issue and providing the steps to reproduce.

I was able to reproduce this with an i40e VF using igb_uio.

In short, it seems there is currently no support for ixgbe and i40e VF devices in OVS with DPDK.

There are two issues at play here. The first is the configuration error when creating and starting the VF in DPDK; the second is the segfault in OVS.

The configuration of the VF fails (for the i40e device at least) because DPDK expects the HW_CRC stripping flag to be enabled in the device configuration for VFs. In your logs you will see an error reporting this. By default this seems to be disabled for VFs in OVS.

This is confirmed by the following check in the DPDK code, in i40evf_dev_configure(), which execution hits:

    /* For non-DPDK PF drivers, VF has no ability to disable HW
     * CRC strip, and is implicitly enabled by the PF.
     */
    if (!conf->rxmode.hw_strip_crc) {
            vf = I40EVF_DEV_PRIVATE_TO_VF(dev->data->dev_private);
            if ((vf->version_major == VIRTCHNL_VERSION_MAJOR) &&
                (vf->version_minor <= VIRTCHNL_VERSION_MINOR)) {
                    /* Peer is running non-DPDK PF driver. */
                    PMD_INIT_LOG(ERR, "VF can't disable HW CRC Strip");
                    return -EINVAL;
            }
    }
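
To make the decision in that excerpt easier to follow, here is a minimal standalone model of it. It is only a sketch: `struct model_vf`, `model_vf_check_crc()` and the `MODEL_VIRTCHNL_*` values are placeholders made up for illustration and are not the real DPDK types or constants.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Placeholder values; the real VIRTCHNL_VERSION_* constants are defined
 * in DPDK and may differ. */
#define MODEL_VIRTCHNL_MAJOR 1
#define MODEL_VIRTCHNL_MINOR 1

/* Hypothetical subset of the VF state: the API version reported by the
 * PF driver. */
struct model_vf {
    unsigned version_major;
    unsigned version_minor;
};

/* Mirrors the check in the excerpt: if CRC stripping was not requested
 * and the PF looks like a non-DPDK (kernel) PF driver, reject the
 * configuration.  Returns 0 when acceptable, -EINVAL otherwise. */
static int
model_vf_check_crc(const struct model_vf *vf, int hw_strip_crc)
{
    assert(vf != NULL);

    if (!hw_strip_crc
        && vf->version_major == MODEL_VIRTCHNL_MAJOR
        && vf->version_minor <= MODEL_VIRTCHNL_MINOR) {
        return -EINVAL;     /* "VF can't disable HW CRC Strip" */
    }
    return 0;
}
```

In this model the only way past the check with a kernel PF driver is to pass a non-zero `hw_strip_crc`, which is why leaving the flag cleared leads to the configuration error.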

Out of interest, I manually enabled HW_CRC in the device configuration in the OVS code for testing purposes. Although this allows the queue configuration to succeed, the VF later fails to start due to an issue with VSI queue mapping when DPDK attempts to start the device. I’ll have to take another look to see exactly what is going wrong here; I suspect more configuration is needed for VFs than for PFs.

The segmentation fault is triggered by the error that occurs in the dpdk_eth_dev_queue_setup() function; this is a separate issue and unrelated to VFs. I have seen failures in this area cause segmentation faults in OVS before, so IMO it’s an area that needs to be looked at again so that DPDK errors are handled properly.
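
As a generic illustration of what handling such an error could mean, the sketch below propagates a failed queue setup to the caller and leaves the device marked as not started, so nothing later acts on a half-configured port. It is not taken from OVS, and the actual faulting code path has not been identified here; `struct model_dev`, `model_queue_setup()` and `model_dev_init()` are hypothetical names used only for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state for the example. */
struct model_dev {
    bool fail_configure;    /* knob: simulate the driver rejecting the config */
    bool started;
};

/* Stand-in for the queue setup step; returns 0 or a negative errno. */
static int
model_queue_setup(const struct model_dev *dev)
{
    return dev->fail_configure ? -EOPNOTSUPP : 0;
}

/* Initialise the device.  On failure the error is returned to the caller
 * and the device is explicitly left in the "not started" state. */
static int
model_dev_init(struct model_dev *dev)
{
    int err;

    assert(dev != NULL);

    err = model_queue_setup(dev);
    if (err) {
        dev->started = false;
        return err;
    }

    dev->started = true;
    return 0;
}
```

The point of the pattern is simply that every caller can check the return value and the `started` flag before touching queues that were never set up.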

I hope this answers your question and I’ll follow up once I have a little more info on how to enable the VF functionality.

Thanks
Ian



From: ovs-discuss-bounces at openvswitch.org [mailto:ovs-discuss-bounces at openvswitch.org] On Behalf Of Riccardo Ravaioli
Sent: Thursday, January 11, 2018 10:27 AM
To: ovs-discuss at openvswitch.org
Subject: Re: [ovs-discuss] segmentation fault when adding a VF in DPDK to a switch

Here are the steps to reproduce the issue:
1. Create one Virtual Function (VF) on a physical interface that supports SR-IOV (in my case it's an Intel i350 interface):
$ echo 1 > /sys/class/net/eth10/device/sriov_numvfs
2. Look up its PCI address, for example with dpdk-devbind.py:
$ dpdk-devbind.py --status-dev net
0000:05:10.3 'I350 Ethernet Controller Virtual Function 1520' if=eth11 drv=igbvf unused=igb_uio,vfio-pci,uio_pci_generic
3. Bind the VF to a DPDK-compatible driver. I'll use vfio-pci, but igb_uio reproduces the issue too:
$ dpdk-devbind.py --bind=vfio-pci 0000:05:10.3
4. Create an OVS bridge and set its datapath type to netdev:
$ ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
5. Add the VF to the bridge as a DPDK interface:
$ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk options:dpdk-devargs=0000:05:10.3
6. Now ovs-vswitchd.log reports that OVS repeatedly crashes (segmentation fault) and restarts itself, in a loop:
2018-01-11T09:28:28.338Z|00139|dpdk|INFO|EAL: PCI device 0000:05:10.3 on NUMA socket 0
2018-01-11T09:28:28.338Z|00140|dpdk|INFO|EAL:   probe driver: 8086:1520 net_e1000_igb_vf
2018-01-11T09:28:28.338Z|00141|dpdk|INFO|EAL:   using IOMMU type 1 (Type 1)
2018-01-11T09:28:28.560Z|00142|dpdk|INFO|PMD: eth_igbvf_dev_init():     VF MAC address not assigned by Host PF
2018-01-11T09:28:28.560Z|00143|dpdk|INFO|PMD: eth_igbvf_dev_init():     Assign randomly generated MAC address c6:13:67:7b:31:6b
2018-01-11T09:28:28.560Z|00144|netdev_dpdk|INFO|Device '0000:05:10.3' attached to DPDK
2018-01-11T09:28:28.563Z|00145|dpif_netdev|INFO|PMD thread on numa_id: 0, core id:  3 created.
2018-01-11T09:28:28.566Z|00146|dpif_netdev|INFO|PMD thread on numa_id: 0, core id:  2 created.
2018-01-11T09:28:28.566Z|00147|dpif_netdev|INFO|There are 2 pmd threads on numa node 0
2018-01-11T09:28:28.646Z|00148|dpdk|INFO|PMD: igbvf_dev_configure(): VF can't disable HW CRC Strip
2018-01-11T09:28:28.646Z|00149|netdev_dpdk|ERR|Interface dpdk-p0 MTU (1500) setup error: Operation not supported
2018-01-11T09:28:28.646Z|00150|netdev_dpdk|ERR|Interface dpdk-p0(rxq:1 txq:1) configure error: Operation not supported
2018-01-11T09:28:29.062Z|00002|daemon_unix(monitor)|ERR|1 crashes: pid 2494 died, killed (Segmentation fault), core dumped, restarting
7. Removing the VF from the bridge stops this behaviour:
$ ovs-vsctl del-port br0 dpdk-p0

The same happens if I restart openvswitch between steps 4 and 5 and let it initialize itself with the list of DPDK devices, instead of hotplugging them at runtime, as described above.
Riccardo


On 11 January 2018 at 01:27, Riccardo Ravaioli <riccardoravaioli at gmail.com> wrote:
Hi,

I was going through the openvswitch+dpdk tutorial and wanted to add a virtual function (VF) to a bridge as a dpdk interface.

I can bind the VF to the vfio-pci driver successfully with dpdk-devbind.py, but as soon as I add the interface to an OVS bridge (in netdev mode), openvswitch crashes with a segmentation fault and continuously tries to restart itself.

I'm running openvswitch 2.8.1 and dpdk 17.11 on Debian jessie with Linux kernel 4.6.

Is this a known problem? Is there a fix?
I have the same issue with VFs bound to igb_uio, whereas with real physical interfaces it works just fine.

Here are the relevant lines from ovs-vswitchd.log:

2018-01-10T15:53:26.949Z|00157|dpdk|INFO|PMD: igbvf_dev_configure(): VF can't disable HW CRC Strip
2018-01-10T15:53:26.949Z|00158|netdev_dpdk|ERR|Interface 0.extra2 MTU (1500) setup error: Operation not supported
2018-01-10T15:53:26.949Z|00159|netdev_dpdk|ERR|Interface 0.extra2(rxq:1 txq:1) configure error: Operation not supported
2018-01-10T15:53:27.333Z|00066|daemon_unix(monitor)|ERR|fork child died before signaling startup (killed (Segmentation fault))
2018-01-10T15:53:27.333Z|00067|daemon_unix(monitor)|WARN|23 crashes: pid 21413 died, killed (Segmentation fault), core dumped, waiting until 10 seconds since last restart
2018-01-10T15:53:33.333Z|00068|daemon_unix(monitor)|ERR|23 crashes: pid 21413 died, killed (Segmentation fault), core dumped, restarting
Thanks!

Riccardo

