[ovs-dev] Openvswitch crash when bringing down the dpdk bond port using "ovs-ofctl mod-port br-prv dpdk1 down"
keshav.gupta at ericsson.com
Thu Jul 27 10:39:14 UTC 2017
I think this patch will have the issue below, though I don't know how much impact it will have.
Packets that have not been processed and are still in the dpdk port’s rxq when the port is brought down (using mod-port) will remain there. The next time
somebody brings the port up (using mod-port), those old packets will be received first.
From: Keshav Gupta
Sent: Thursday, July 27, 2017 3:15 PM
To: 'Ilya Maximets'; ovs-dev at openvswitch.org
Subject: RE: Re: [ovs-dev] Openvswitch crash when bringing down the dpdk bond port using "ovs-ofctl mod-port br-prv dpdk1 down"
Thanks Ilya Maximets
It fixes the issue.
From: Ilya Maximets [mailto:i.maximets at samsung.com]
Sent: Wednesday, July 26, 2017 5:58 PM
To: ovs-dev at openvswitch.org; Keshav Gupta
Subject: Re: Re: [ovs-dev] Openvswitch crash when bringing down the dpdk bond port using "ovs-ofctl mod-port br-prv dpdk1 down"
You need to backport at least the following patch:
Author: Daniele Di Proietto <diproiettod at vmware.com>
Date: Tue Nov 15 15:40:49 2016 -0800
netdev-dpdk: Don't call rte_dev_stop() in update_flags().
Calling rte_eth_dev_stop() while the device is running causes a crash.
We could use rte_eth_dev_set_link_down(), but not every PMD implements
that, and I found one NIC where that has no effect.
Instead, this commit checks if the device has the NETDEV_UP flag when
transmitting or receiving (similarly to what we do for vhostuser). I
didn't notice any performance difference with this check in case the
device is up.
An alternative would be to remove the device queues from the pmd threads
tx and receive cache, but that requires reconfiguration and I'd prefer
to avoid it, because the change can come from OpenFlow.
Signed-off-by: Daniele Di Proietto <diproiettod at vmware.com>
Acked-by: Ilya Maximets <i.maximets at samsung.com>
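The approach in the commit message above can be sketched roughly as follows. This is a simplified, self-contained model and not the actual netdev-dpdk code: struct sketch_dev, SKETCH_NETDEV_UP and the sketch_* functions are hypothetical stand-ins, the burst function only simulates a PMD receive call, and returning EAGAIN for a down port is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real OVS or DPDK definitions. */
enum { SKETCH_NETDEV_UP = 1 << 0 };

struct sketch_dev {
    atomic_uint flags;   /* Written by the main thread, read by PMD threads. */
    int pending;         /* Packets waiting in the simulated rx queue. */
};

/* Stand-in for the PMD burst receive function. */
static int
sketch_hw_rx_burst(struct sketch_dev *dev, int max_pkts)
{
    int n = dev->pending < max_pkts ? dev->pending : max_pkts;

    dev->pending -= n;
    return n;
}

/* Receive path: return before touching the queue if the port is down. */
static int
sketch_rxq_recv(struct sketch_dev *dev, int max_pkts, int *n_rx)
{
    unsigned flags;

    assert(dev != NULL && n_rx != NULL);
    *n_rx = 0;
    flags = atomic_load_explicit(&dev->flags, memory_order_relaxed);
    if (!(flags & SKETCH_NETDEV_UP)) {
        return EAGAIN;           /* Port is down: receive nothing. */
    }
    *n_rx = sketch_hw_rx_burst(dev, max_pkts);
    return *n_rx ? 0 : EAGAIN;
}

/* Control path: only flips the flag; the device itself is never stopped. */
static void
sketch_set_up(struct sketch_dev *dev, int up)
{
    if (up) {
        atomic_fetch_or(&dev->flags, (unsigned) SKETCH_NETDEV_UP);
    } else {
        atomic_fetch_and(&dev->flags, ~(unsigned) SKETCH_NETDEV_UP);
    }
}
```

The point of the design is that the control path never frees or resets queue memory while a polling thread may be inside the receive function; the polling thread just sees the flag change on its next call and skips the queue.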
This should fix your issue.
In general, I suggest using stable OVS 2.7; there have been too many DPDK-related changes, including stability fixes, since 2.6.
Best regards, Ilya Maximets.
> We are experiencing an openvswitch crash when bringing down the dpdk bond port using "ovs-ofctl mod-port br-prv dpdk1 down".
> The backtrace of the core is below. Has any issue been reported earlier for this type of crash in the openvswitch community?
> (gdb) bt
> #0 ixgbe_rxq_rearm (rxq=0x7fa45061f800) at
> #1 _recv_raw_pkts_vec (split_packet=0x0, nb_pkts=32, rx_pkts=<optimized out>, rxq=0x7fa45061f800)
> #2 ixgbe_recv_pkts_vec (rx_queue=0x7fa45061f800, rx_pkts=<optimized out>, nb_pkts=<optimized out>)
> #3 0x000000e5000000e4 in ?? ()
> #4 0x00000046000000e6 in ?? ()
> #5 0x0000006a00000069 in ?? ()
> #6 0x0000006c0000006b in ?? ()
> #7 0x000000ec0000006d in ?? ()
> #8 0x000000ee000000ed in ?? ()
> #9 0x00000001537f5780 in ?? ()
> #10 0x0000000000000000 in ?? ()
> I have analyzed the core, and it seems to be the result of a device stop
> and a packet receive on the port happening at the same time in two threads:
> the OVS main thread (device stop) and the PMD thread (packet receive). More precisely, the main thread is cleaning the packet buffers from the rxq sw_ring to avoid a packet buffer leak, while in parallel the PMD thread is filling packet buffers into the sw_ring/descriptor ring as part of ixgbe_recv_pkts_vec.
> The version used is openvswitch (2.6.1) with dpdk (16.11).
> This crash is not reproducible every time, but it occurs frequently.
> I am new to the openvswitch community and this is the first time I am posting a query. Let me know if you require anything from my side.