[ovs-dev] Openvswitch crash when bringing down the dpdk bond port using "ovs-ofctl mod-port br-prv dpdk1 down"

Ilya Maximets i.maximets at samsung.com
Wed Jul 26 12:28:12 UTC 2017


Hi.

You need to backport at least the following patch:

commit 3b1fb0779b87788968c1a6a9ff295a9883547485
Author: Daniele Di Proietto <diproiettod at vmware.com>
Date:   Tue Nov 15 15:40:49 2016 -0800

    netdev-dpdk: Don't call rte_dev_stop() in update_flags().
    
    Calling rte_eth_dev_stop() while the device is running causes a crash.
    
    We could use rte_eth_dev_set_link_down(), but not every PMD implements
    that, and I found one NIC where that has no effect.
    
    Instead, this commit checks if the device has the NETDEV_UP flag when
    transmitting or receiving (similarly to what we do for vhostuser). I
    didn't notice any performance difference with this check in case the
    device is up.
    
    An alternative would be to remove the device queues from the pmd threads
    tx and receive cache, but that requires reconfiguration and I'd prefer
    to avoid it, because the change can come from OpenFlow.
    
    Signed-off-by: Daniele Di Proietto <diproiettod at vmware.com>
    Acked-by: Ilya Maximets <i.maximets at samsung.com>

This should fix your issue.
In general, I suggest using the stable OVS 2.7 release; there have been many
DPDK-related changes, including stability fixes, since 2.6.
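
For illustration only, the idea described in the commit message (check the
NETDEV_UP flag in the receive path instead of stopping the device) can be
sketched roughly as below. This is a standalone sketch, not the actual
netdev-dpdk code: every name prefixed with sketch_ or SKETCH_ is hypothetical,
and the "driver" is simulated with a simple packet counter so that the example
builds without DPDK.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real OVS or DPDK definitions. */
enum { SKETCH_NETDEV_UP = 1 << 0 };

struct sketch_dev {
    atomic_uint flags;  /* Written by the main thread, read by PMD threads. */
    int pending;        /* Packets waiting in the simulated RX ring. */
};

/* Simulated driver burst receive: hands out up to 'max' pending packets. */
static int
sketch_hw_rx_burst(struct sketch_dev *dev, int max)
{
    int n = dev->pending < max ? dev->pending : max;

    dev->pending -= n;
    return n;
}

/* Main-thread side (e.g. on an OpenFlow port-mod): only flips the flag.
 * The device itself is never stopped, so its rings stay valid. */
static void
sketch_update_flags(struct sketch_dev *dev, bool up)
{
    if (up) {
        atomic_fetch_or(&dev->flags, SKETCH_NETDEV_UP);
    } else {
        atomic_fetch_and(&dev->flags, ~(unsigned int) SKETCH_NETDEV_UP);
    }
}

/* PMD-thread side: skip the driver entirely while the port is marked down.
 * Returns the number of packets received, or 0 if the port is down. */
static int
sketch_rxq_recv(struct sketch_dev *dev, int max)
{
    unsigned int flags = atomic_load_explicit(&dev->flags,
                                              memory_order_relaxed);

    if (!(flags & SKETCH_NETDEV_UP)) {
        return 0;
    }
    return sketch_hw_rx_burst(dev, max);
}
```

With this shape, marking the port down costs the polling thread one flag read
per call, and the main thread never touches the receive ring while a PMD
thread may be using it.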

Best regards, Ilya Maximets.

> Hi
>   We are experiencing an openvswitch crash when bringing down the dpdk bond port using "ovs-ofctl mod-port br-prv dpdk1 down".
> 
> The backtrace from the core is below. Has this type of crash been reported earlier in the openvswitch community?
> 
> (gdb) bt
> #0  ixgbe_rxq_rearm (rxq=0x7fa45061f800) at /home/sdn/new_cloud_sdn_switch_2/cloud-sdn-switch/dpdk/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c:98
> #1  _recv_raw_pkts_vec (split_packet=0x0, nb_pkts=32, rx_pkts=<optimized out>, rxq=0x7fa45061f800)
>     at /home/sdn/new_cloud_sdn_switch_2/cloud-sdn-switch/dpdk/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c:290
> #2  ixgbe_recv_pkts_vec (rx_queue=0x7fa45061f800, rx_pkts=<optimized out>, nb_pkts=<optimized out>)
>     at /home/sdn/new_cloud_sdn_switch_2/cloud-sdn-switch/dpdk/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c:474
> #3  0x000000e5000000e4 in ?? ()
> #4  0x00000046000000e6 in ?? ()
> #5  0x0000006a00000069 in ?? ()
> #6  0x0000006c0000006b in ?? ()
> #7  0x000000ec0000006d in ?? ()
> #8  0x000000ee000000ed in ?? ()
> #9  0x00000001537f5780 in ?? ()
> #10 0x0000000000000000 in ?? ()
> (gdb)
> 
> 
> I have analyzed the core, and it seems to be the result of a device stop and a packet receive on the same port happening at the same time in two threads:
> the OVS main thread (device stop) and a PMD thread (packet receive). More precisely, the main thread is cleaning the packet buffers from the rxq sw_ring to avoid
> a packet buffer leak while, in parallel, the PMD thread is filling packet buffers into the sw_ring/descriptor ring as part of ixgbe_recv_pkts_vec.
> 
> Versions used: openvswitch 2.6.1 with dpdk 16.11.
> 
> The crash is not reproducible every time, but it occurs frequently.
> 
> I am new to the openvswitch community and this is the first time I am posting a query. Let me know if you need anything else from my side.
> 
> Thanks
> Keshav


