[ovs-dev] Openvswitch crash when bringing down the dpdk bond port using "ovs-ofctl mod-port br-prv dpdk1 down"

Ben Pfaff blp at ovn.org
Mon Jul 31 13:05:33 UTC 2017


Ilya, should we apply this patch to branch-2.6?  Are there other patches
that should be backported?

On Wed, Jul 26, 2017 at 03:28:12PM +0300, Ilya Maximets wrote:
> Hi.
> 
> You need to backport at least following patch:
> 
> commit 3b1fb0779b87788968c1a6a9ff295a9883547485
> Author: Daniele Di Proietto <diproiettod at vmware.com>
> Date:   Tue Nov 15 15:40:49 2016 -0800
> 
>     netdev-dpdk: Don't call rte_dev_stop() in update_flags().
>     
>     Calling rte_eth_dev_stop() while the device is running causes a crash.
>     
>     We could use rte_eth_dev_set_link_down(), but not every PMD implements
>     that, and I found one NIC where that has no effect.
>     
>     Instead, this commit checks if the device has the NETDEV_UP flag when
>     transmitting or receiving (similarly to what we do for vhostuser). I
>     didn't notice any performance difference with this check in case the
>     device is up.
>     
>     An alternative would be to remove the device queues from the pmd threads
>     tx and receive cache, but that requires reconfiguration and I'd prefer
>     to avoid it, because the change can come from OpenFlow.
>     
>     Signed-off-by: Daniele Di Proietto <diproiettod at vmware.com>
>     Acked-by: Ilya Maximets <i.maximets at samsung.com>
> 
> This should fix your issue.
> In general, I suggest using the stable OVS 2.7 release; there have been many
> DPDK-related changes, including stability fixes, since 2.6.
> 
> Best regards, Ilya Maximets.
> 
> > Hi
> >   We are experiencing an openvswitch crash when bringing down the dpdk bond port using "ovs-ofctl mod-port br-prv dpdk1 down".
> > 
> > The backtrace of the core is below. Has any issue been reported earlier for this type of crash in the openvswitch community?
> > 
> > (gdb) bt
> > #0  ixgbe_rxq_rearm (rxq=0x7fa45061f800) at /home/sdn/new_cloud_sdn_switch_2/cloud-sdn-switch/dpdk/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c:98
> > #1  _recv_raw_pkts_vec (split_packet=0x0, nb_pkts=32, rx_pkts=<optimized out>, rxq=0x7fa45061f800)
> >     at /home/sdn/new_cloud_sdn_switch_2/cloud-sdn-switch/dpdk/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c:290
> > #2  ixgbe_recv_pkts_vec (rx_queue=0x7fa45061f800, rx_pkts=<optimized out>, nb_pkts=<optimized out>)
> >     at /home/sdn/new_cloud_sdn_switch_2/cloud-sdn-switch/dpdk/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c:474
> > #3  0x000000e5000000e4 in ?? ()
> > #4  0x00000046000000e6 in ?? ()
> > #5  0x0000006a00000069 in ?? ()
> > #6  0x0000006c0000006b in ?? ()
> > #7  0x000000ec0000006d in ?? ()
> > #8  0x000000ee000000ed in ?? ()
> > #9  0x00000001537f5780 in ?? ()
> > #10 0x0000000000000000 in ?? ()
> > (gdb)
> > 
> > 
> > I have analyzed the core, and the crash seems to result from a device stop and a packet receive on the same port happening at the same time in two threads:
> > the OVS main thread (device stop) and a PMD thread (packet receive). More precisely, the main thread is cleaning the packet buffers from the rxq sw_ring to avoid a
> > packet buffer leak while, in parallel, the PMD thread is filling packet buffers into the sw_ring/descriptor ring as part of ixgbe_recv_pkts_vec.
> > 
> > The version used is openvswitch 2.6.1 with dpdk 16.11.
> > 
> > This crash is not reproducible every time, but it occurs frequently.
> > 
> > I am new to the openvswitch community and this is the first time I am posting a query. Let me know if you require anything from my side.
> > 
> > Thanks
> > Keshav
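
For readers following the thread, the approach described in the quoted commit
message (check the NETDEV_UP flag on the receive path instead of stopping the
device from the control path) can be sketched roughly as below. This is only
an illustration under stated assumptions, not the actual OVS code: every
name prefixed with sketch_ or SKETCH_ is hypothetical, the "hardware" queue is
simulated with a counter, and returning EAGAIN for a down port is a choice
made for this sketch. In a real datapath the burst call would be the PMD
receive function, such as rte_eth_rx_burst().

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical flag bit standing in for OVS's NETDEV_UP. */
#define SKETCH_NETDEV_UP 0x1u

/* Hypothetical device. The flag word is written by the main (control)
 * thread and read by the polling thread, so it is atomic. */
struct sketch_dev {
    atomic_uint flags;
    int pending;            /* Packets waiting in the simulated queue. */
};

/* Stand-in for the driver burst call; it only drains the counter. */
static int
sketch_hw_rx_burst(struct sketch_dev *dev, int max)
{
    int n = dev->pending < max ? dev->pending : max;

    dev->pending -= n;
    return n;
}

/* Receive path: never enter the driver while the port is marked down. */
static int
sketch_rxq_recv(struct sketch_dev *dev, int max, int *n_rx)
{
    unsigned int flags = atomic_load_explicit(&dev->flags,
                                              memory_order_relaxed);

    *n_rx = 0;
    if (!(flags & SKETCH_NETDEV_UP)) {
        return EAGAIN;      /* Port is down: report "nothing received". */
    }

    *n_rx = sketch_hw_rx_burst(dev, max);
    return *n_rx ? 0 : EAGAIN;
}

/* Control path: only flips the flag and never stops the device, so it
 * cannot tear down rings that the polling thread is still using. */
static void
sketch_update_flags(struct sketch_dev *dev, int up)
{
    if (up) {
        atomic_fetch_or(&dev->flags, SKETCH_NETDEV_UP);
    } else {
        atomic_fetch_and(&dev->flags, ~SKETCH_NETDEV_UP);
    }
}
```

The point of the design is that "mod-port ... down" becomes a flag change
that the polling thread observes on its next iteration, rather than a device
stop racing with an in-flight receive. A transmit path would need the same
kind of check.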