[ovs-discuss] [bug] a new problem of ovs-dpdk
王华夏
wanghuaxia at jd.com
Wed Mar 16 01:29:55 UTC 2016
I tested with QEMU 2.5 and switched OVS to the 2.5 release version (previously I was not running the release version). Now I get a different problem.
ovs-vswitchd sometimes crashes after the test run (this is reproducible), and it looks like a memory leak.
The stack trace is as follows:
(gdb) bt
#0 0x00007f72bfc2f5d7 in raise () from /lib64/libc.so.6
#1 0x00007f72bfc30cc8 in abort () from /lib64/libc.so.6
#2 0x0000000000636fce in ovs_abort_valist (err_no=<optimized out>, format=<optimized out>, args=args@entry=0x7ffe3fe2b2f8) at lib/util.c:323
#3 0x0000000000637057 in ovs_abort (err_no=err_no@entry=0, format=format@entry=0x6dfad5 "virtual memory exhausted") at lib/util.c:315
#4 0x0000000000637072 in out_of_memory () at lib/util.c:89
#5 0x0000000000651465 in dpdk_rte_mzalloc (sz=sz@entry=3170304) at lib/netdev-dpdk.c:269
#6 0x0000000000651a33 in netdev_dpdk_alloc_txq (netdev=netdev@entry=0x7f72765ec400, n_txqs=n_txqs@entry=1024) at lib/netdev-dpdk.c:565
#7 0x000000000065283c in netdev_dpdk_init (netdev_=netdev_@entry=0x7f72765ec400, port_no=port_no@entry=4294967295, type=type@entry=DPDK_DEV_VHOST) at lib/netdev-dpdk.c:633
#8 0x0000000000652a06 in vhost_construct_helper (netdev_=0x7f72765ec400) at lib/netdev-dpdk.c:668
#9 netdev_dpdk_vhost_user_construct (netdev_=0x7f72765ec400) at lib/netdev-dpdk.c:716
#10 0x00000000005d34d9 in netdev_open (name=<optimized out>, type=0x38962d0 "dpdkvhostuser", netdevp=netdevp@entry=0x7ffe3fe2b510) at lib/netdev.c:382
#11 0x0000000000562aa7 in iface_do_create (port_cfg=<optimized out>, errp=0x7ffe3fe2b500, netdevp=<synthetic pointer>, ofp_portp=0x7ffe3fe2b4fc, iface_cfg=0x38c2700, br=0x283def0) at vswitchd/bridge.c:1762
#12 iface_create (port_cfg=0x38c29c0, iface_cfg=0x38c2700, br=0x283def0) at vswitchd/bridge.c:1816
#13 bridge_add_ports__ (br=br@entry=0x283def0, wanted_ports=wanted_ports@entry=0x283dfd0, with_requested_port=with_requested_port@entry=true) at vswitchd/bridge.c:892
#14 0x0000000000565645 in bridge_add_ports (wanted_ports=0x283dfd0, br=0x283def0) at vswitchd/bridge.c:903
#15 bridge_reconfigure (ovs_cfg=ovs_cfg@entry=0x28415c0) at vswitchd/bridge.c:646
#16 0x0000000000568900 in bridge_run () at vswitchd/bridge.c:2975
#17 0x000000000041045d in main (argc=4, argv=0x7ffe3fe2b9a8) at vswitchd/ovs-vswitchd.c:120
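A quick sanity check on the numbers in frames #5 and #6 of the trace: the failed request is sz=3170304 bytes for n_txqs=1024 queues. The sketch below works out the per-queue size and, assuming the request is served from the 400 MB hugepage pool reserved with --socket-mem 400 and that nothing is ever freed, how many such allocations would fit. Both assumptions are mine and are not proven by the trace; mbuf pools and other users of the same pool would lower the real limit.

```python
# Values taken from the backtrace (frames #5 and #6).
ALLOC_BYTES = 3170304      # sz passed to dpdk_rte_mzalloc
N_TXQS = 1024              # n_txqs passed to netdev_dpdk_alloc_txq

# Assumption: --socket-mem 400 reserves 400 MiB and is the only pool involved.
SOCKET_MEM_BYTES = 400 * 1024 * 1024

# Assumption: each of the 15 VMs created per test cycle triggers one allocation.
VMS_PER_CYCLE = 15

bytes_per_txq = ALLOC_BYTES // N_TXQS
max_unfreed_allocs = SOCKET_MEM_BYTES // ALLOC_BYTES
cycles_until_exhaustion = max_unfreed_allocs / VMS_PER_CYCLE

print(bytes_per_txq, max_unfreed_allocs, round(cycles_until_exhaustion, 1))
# prints: 3096 132 8.8
```

Under these assumptions, each vhost-user port costs about 3 MB for its TX queue array alone, so a leak of that array on every port destroy could exhaust the pool within roughly nine create/destroy cycles at most.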
Wang Huaxia ---JD.COM
From: Chandran, Sugesh<mailto:sugesh.chandran at intel.com>
Sent: March 15, 2016 19:57
To: 王华夏<mailto:wanghuaxia at jd.com>; discuss at openvswitch.org<mailto:discuss at openvswitch.org>
Cc: Traynor, Kevin<mailto:kevin.traynor at intel.com>
Subject: RE: [ovs-discuss] [bug] a new problem of ovs-dpdk
I believe the fix for the issue you mentioned is already available with DPDK 2.2.
https://github.com/openvswitch/ovs/commit/2d9439f045eab772cd1863ccc2efe0d179064ae5
So you can use QEMU 2.4.0 to test.
Regards
_Sugesh
From: 王华夏 [mailto:wanghuaxia at jd.com]
Sent: Monday, March 14, 2016 1:26 AM
To: Chandran, Sugesh <sugesh.chandran at intel.com>; discuss at openvswitch.org
Subject: RE: [ovs-discuss] [bug] a new problem of ovs-dpdk
I did not use multiqueue in this test. The OVS user guide points out: “For versions of QEMU v2.4.0 and later, it is currently not possible to unbind more than one dpdkvhostuser port from the guest kernel driver without causing the ovs-vswitchd process to crash. If this is a requirement for your use case, it is recommended either to use a version of QEMU between v2.2.0 and v2.3.1 (inclusive”
So I tested with QEMU 2.3.
Wang Huaxia ---JD.COM
From: Chandran, Sugesh<mailto:sugesh.chandran at intel.com>
Sent: March 12, 2016 1:12
To: 王华夏<mailto:wanghuaxia at jd.com>; discuss at openvswitch.org<mailto:discuss at openvswitch.org>
Subject: RE: [ovs-discuss] [bug] a new problem of ovs-dpdk
Are you using multiqueue for this test setup?
Can you please try with QEMU version >= 2.4 instead of 2.3?
Regards
_Sugesh
From: discuss [mailto:discuss-bounces at openvswitch.org] On Behalf Of 王华夏
Sent: Friday, March 11, 2016 9:54 AM
To: discuss at openvswitch.org<mailto:discuss at openvswitch.org>
Subject: [ovs-discuss] [bug] a new problem of ovs-dpdk
Hi all.
Recently I have been testing VMs with ovs-dpdk and found a new problem, described below.
My test environment:
Host:
Linux version 3.10.0-229.14.1.el7.x86_64 (builder@kbuilder.dev.centos.org) (gcc version 4.8.3 20140911 (Red Hat 4.8.3-9) (GCC) ) #1 SMP Tue Sep 15 15:05:51 UTC 2015
DPDK: version 2.2
OVS: version 2.5
QEMU: version 2.3.1
Guest: Linux version 3.10.0-229.el7.x86_64 (builder@kbuilder.dev.centos.org) (gcc version 4.8.2 20140120 (Red Hat 4.8.2-16) (GCC) ) #1 SMP Fri Mar 6 11:36:42 UTC 2015
Ovs:
1 S root 61984 1 0 80 0 - 11923 poll_s 14:29 ? 00:00:31 ovsdb-server -v --remote=punix:/usr/local/var/run/openvswitch/db.sock --remote=db:Open_vSwitch,Open_vSwitch,manager_options --private-key=db:Open_vSwitch,SSL,private_key --certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --pidfile --detach
5 S root 61998 1 99 80 0 - 2745735 poll_s 14:29 ? 12:25:52 /usr/local/sbin/ovs-vswitchd --dpdk -c 0x1 -n 4 --socket-mem 400 -- unix:/usr/local/var/run/openvswitch/db.sock --pidfile --detach
OVS port configuration:
    Bridge "br1"
        Port "dpdk0"
            Interface "dpdk0"
                type: dpdk
        Port "br1"
            Interface "br1"
                type: internal
    Bridge "br0"
        Port "br0"
            Interface "br0"
                type: internal
        Port "vxlan-1"
            Interface "vxlan-1"
                type: vxlan
                options: {remote_ip="7.0.0.2"}
        Port "vhost-user-0"
            Interface "vhost-user-0"
                type: dpdkvhostuser
The NUMA, hugepages, and vCPU configuration in the XML:
<memoryBacking>
  <hugepages>
    <page size='2048' unit='KiB'/>
  </hugepages>
</memoryBacking>
<vcpu placement='static'>4</vcpu>
<os>
  <type arch='x86_64' machine='pc-i440fx-2.2'>hvm</type>
  <boot dev='hd'/>
</os>
<features>
  <acpi/>
</features>
<cpu>
  <numa>
    <cell id='0' cpus='0-3' memory='4000000' unit='KiB' memAccess='shared'/>
  </numa>
Test steps:
Step 1: Create a VM attached to port vhost-user-0.
Step 2: Create 15 more VMs in the same way as step 1.
Step 3: Destroy the 15 VMs created in step 2.
Step 4: Repeat steps 2 and 3.
Sometimes the VM created in step 1 then becomes unreachable. Whenever this happens, it happens during step 2.
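The test steps can be sketched as a small driver loop. This is only an illustration of the procedure, not the script that was actually used: the three callbacks (create_vm, destroy_vm, is_reachable) and the VM names are hypothetical placeholders for whatever tooling starts the guests and checks connectivity.

```python
from typing import Callable


def run_cycles(create_vm: Callable[[str], None],
               destroy_vm: Callable[[str], None],
               is_reachable: Callable[[str], bool],
               cycles: int,
               extra_vms: int = 15) -> int:
    """Return the 1-based cycle in which the first VM became unreachable,
    or 0 if it stayed reachable for all cycles."""
    first_vm = "vm-0"                       # hypothetical name
    create_vm(first_vm)                     # step 1
    names = [f"vm-{i}" for i in range(1, extra_vms + 1)]
    for cycle in range(1, cycles + 1):      # step 4: repeat steps 2 and 3
        for name in names:                  # step 2: create the other VMs
            create_vm(name)
            if not is_reachable(first_vm):
                # Stop here and leave everything in place for inspection.
                return cycle
        for name in names:                  # step 3: destroy them again
            destroy_vm(name)
    return 0
```

Checking reachability after every create, rather than once per cycle, narrows down which VM creation coincides with the failure.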
The guest logs are as follows:
localhost kernel: virtio_net virtio0: output.0:id 222 is not a head!
localhost kernel: net eth0: Unexpected TXQ (0) queue failure: -5
localhost kernel: net eth0: Unexpected TXQ (0) queue failure: -5
localhost kernel: net eth0: Unexpected TXQ (0) queue failure: -5
After rebooting the VM, it recovers.
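To spot this failure automatically during long test runs, the two guest kernel messages can be matched with regular expressions. The sketch below is a minimal example built only from the log lines quoted in this message; the function name summarize and the counter keys are my own, and other kernel versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns derived from the guest log lines quoted in this message.
NOT_A_HEAD = re.compile(r"virtio_net (\S+): (\S+):id (\d+) is not a head!")
TXQ_FAILURE = re.compile(
    r"net (\S+): Unexpected TXQ \((\d+)\) queue failure: (-?\d+)")


def summarize(lines):
    """Count occurrences of the two virtio-net failure signatures."""
    counts = Counter()
    for line in lines:
        if NOT_A_HEAD.search(line):
            counts["not_a_head"] += 1
        elif TXQ_FAILURE.search(line):
            counts["txq_failure"] += 1
    return counts


sample = [
    "localhost kernel: virtio_net virtio0: output.0:id 222 is not a head!",
    "localhost kernel: net eth0: Unexpected TXQ (0) queue failure: -5",
    "localhost kernel: net eth0: Unexpected TXQ (0) queue failure: -5",
    "localhost kernel: net eth0: Unexpected TXQ (0) queue failure: -5",
]
print(summarize(sample))
```

The -5 in the TXQ message matches errno 5 (EIO) on Linux, which fits a guest whose TX virtqueue state no longer agrees with what the backend reports.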
This is clearly a network problem. In my opinion it is related to the hugepages, but I am not sure, because the hugepages were already mapped when ovs-dpdk was started with “/usr/local/sbin/ovs-vswitchd --dpdk -c 0x1 -n 4 --socket-mem 400 -- unix:/usr/local/var/run/openvswitch/db.sock --pidfile --detach”. How are the hugepages reclaimed when a VM is destroyed? Has anybody else run into the same problem? I do not know whether this is a bug in ovs-dpdk or in DPDK.
Thanks
Eric wang
Wang Huaxia ---JD.COM