[ovs-discuss] [bug] a new problem of ovs-dpdk

王华夏 wanghuaxia at jd.com
Wed Mar 16 01:29:55 UTC 2016


I tested it with QEMU 2.5 and changed the OVS version to the 2.5 release (previously it was not the release version), and hit another problem.

ovs-vswitchd sometimes crashes after the test has been running for a while (it is reproducible), and it looks like a memory leak.

The stack trace is as follows:

(gdb) bt
#0  0x00007f72bfc2f5d7 in raise () from /lib64/libc.so.6
#1  0x00007f72bfc30cc8 in abort () from /lib64/libc.so.6
#2  0x0000000000636fce in ovs_abort_valist (err_no=<optimized out>, format=<optimized out>, args=args@entry=0x7ffe3fe2b2f8) at lib/util.c:323
#3  0x0000000000637057 in ovs_abort (err_no=err_no@entry=0, format=format@entry=0x6dfad5 "virtual memory exhausted") at lib/util.c:315
#4  0x0000000000637072 in out_of_memory () at lib/util.c:89
#5  0x0000000000651465 in dpdk_rte_mzalloc (sz=sz@entry=3170304) at lib/netdev-dpdk.c:269
#6  0x0000000000651a33 in netdev_dpdk_alloc_txq (netdev=netdev@entry=0x7f72765ec400, n_txqs=n_txqs@entry=1024) at lib/netdev-dpdk.c:565
#7  0x000000000065283c in netdev_dpdk_init (netdev_=netdev_@entry=0x7f72765ec400, port_no=port_no@entry=4294967295, type=type@entry=DPDK_DEV_VHOST) at lib/netdev-dpdk.c:633
#8  0x0000000000652a06 in vhost_construct_helper (netdev_=0x7f72765ec400) at lib/netdev-dpdk.c:668
#9  netdev_dpdk_vhost_user_construct (netdev_=0x7f72765ec400) at lib/netdev-dpdk.c:716
#10 0x00000000005d34d9 in netdev_open (name=<optimized out>, type=0x38962d0 "dpdkvhostuser", netdevp=netdevp@entry=0x7ffe3fe2b510) at lib/netdev.c:382
#11 0x0000000000562aa7 in iface_do_create (port_cfg=<optimized out>, errp=0x7ffe3fe2b500, netdevp=<synthetic pointer>, ofp_portp=0x7ffe3fe2b4fc, iface_cfg=0x38c2700, br=0x283def0) at vswitchd/bridge.c:1762
#12 iface_create (port_cfg=0x38c29c0, iface_cfg=0x38c2700, br=0x283def0) at vswitchd/bridge.c:1816
#13 bridge_add_ports__ (br=br@entry=0x283def0, wanted_ports=wanted_ports@entry=0x283dfd0, with_requested_port=with_requested_port@entry=true) at vswitchd/bridge.c:892
#14 0x0000000000565645 in bridge_add_ports (wanted_ports=0x283dfd0, br=0x283def0) at vswitchd/bridge.c:903
#15 bridge_reconfigure (ovs_cfg=ovs_cfg@entry=0x28415c0) at vswitchd/bridge.c:646
#16 0x0000000000568900 in bridge_run () at vswitchd/bridge.c:2975
#17 0x000000000041045d in main (argc=4, argv=0x7ffe3fe2b9a8) at vswitchd/ovs-vswitchd.c:120
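
If I read frame #5 right, the crash is the per-port tx-queue allocation failing: netdev_dpdk_alloc_txq() asks dpdk_rte_mzalloc() for 3,170,304 bytes (about 3 MB) to hold the 1024 tx queues of the new vhostuser port. As a back-of-the-envelope check, assuming that this ~3 MB is never handed back to the DPDK memory pool when a vhostuser port is later deleted (which is what the repeated create/destroy test further down in this thread would expose), the --socket-mem 400 pool can only absorb on the order of a hundred port creations before rte_zmalloc starts failing:

    # rough upper bound only; mbuf mempools are carved out of the same 400 MB, so it would fail sooner
    echo $(( 400 * 1024 * 1024 / 3170304 ))    # prints 132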


Wang Huaxia ---JD.COM

From: Chandran, Sugesh <sugesh.chandran at intel.com>
Sent: Tuesday, March 15, 2016 19:57
To: 王华夏 <wanghuaxia at jd.com>; discuss at openvswitch.org
Cc: Traynor, Kevin <kevin.traynor at intel.com>
Subject: RE: [ovs-discuss] [bug] a new problem of ovs-dpdk

I believe the fix for the mentioned issue is already present with DPDK 2.2:

https://github.com/openvswitch/ovs/commit/2d9439f045eab772cd1863ccc2efe0d179064ae5

So you can use QEMU 2.4.0 for the test.


Regards
_Sugesh

From: 王华夏 [mailto:wanghuaxia at jd.com]
Sent: Monday, March 14, 2016 1:26 AM
To: Chandran, Sugesh <sugesh.chandran at intel.com>; discuss at openvswitch.org
Subject: RE: [ovs-discuss] [bug] a new problem of ovs-dpdk

I did not use multiqueue in this test. Also, the OVS user guide points out: "For versions of QEMU v2.4.0 and later, it is currently not possible to unbind more than one dpdkvhostuser port from the guest kernel driver without causing the ovs-vswitchd process to crash. If this is a requirement for your use case, it is recommended either to use a version of QEMU between v2.2.0 and v2.3.1 (inclusive)"

So I tested with QEMU 2.3.


Wang Huaxia ---JD.COM

From: Chandran, Sugesh <sugesh.chandran at intel.com>
Sent: Saturday, March 12, 2016 1:12
To: 王华夏 <wanghuaxia at jd.com>; discuss at openvswitch.org
Subject: RE: [ovs-discuss] [bug] a new problem of ovs-dpdk


Are you using multiqueue for this test setup?
Can you please try with QEMU >= 2.4 instead of 2.3?


Regards
_Sugesh

From: discuss [mailto:discuss-bounces at openvswitch.org] On Behalf Of 王华夏
Sent: Friday, March 11, 2016 9:54 AM
To: discuss at openvswitch.org
Subject: [ovs-discuss] [bug] a new problem of ovs-dpdk

Hi all.
These days I have been testing VMs with OVS-DPDK and found a new problem, as follows:

My test environment:
Host:
Linux version 3.10.0-229.14.1.el7.x86_64 (builder at kbuilder.dev.centos.org) (gcc version 4.8.3 20140911 (Red Hat 4.8.3-9) (GCC) ) #1 SMP Tue Sep 15 15:05:51 UTC 2015
DPDK: version 2.2
OVS: version 2.5
QEMU: version 2.3.1

Guest: Linux version 3.10.0-229.el7.x86_64 (builder at kbuilder.dev.centos.org) (gcc version 4.8.2 20140120 (Red Hat 4.8.2-16) (GCC) ) #1 SMP Fri Mar 6 11:36:42 UTC 2015

OVS:

1 S root      61984      1  0  80   0 - 11923 poll_s 14:29 ?        00:00:31 ovsdb-server -v --remote=punix:/usr/local/var/run/openvswitch/db.sock --remote=db:Open_vSwitch,Open_vSwitch,manager_options --private-key=db:Open_vSwitch,SSL,private_key --certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --pidfile --detach
5 S root      61998      1 99  80   0 - 2745735 poll_s 14:29 ?      12:25:52 /usr/local/sbin/ovs-vswitchd --dpdk -c 0x1 -n 4 --socket-mem 400 -- unix:/usr/local/var/run/openvswitch/db.sock --pidfile --detach

OVS port configuration:

    Bridge "br1"
        Port "dpdk0"
            Interface "dpdk0"
                type: dpdk
       Port "br1"
            Interface "br1"
                type: internal
    Bridge "br0"
        Port "br0"
            Interface "br0"
                type: internal
        Port "vxlan-1"
            Interface "vxlan-1"
                type: vxlan
                options: {remote_ip="7.0.0.2"}
        Port "vhost-user-0"
            Interface "vhost-user-0"
                type: dpdkvhostuser
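
For reference, a dpdkvhostuser port like vhost-user-0 above is normally added with ovs-vsctl along these lines (bridge and port names taken from the configuration above):

    ovs-vsctl add-port br0 vhost-user-0 -- set Interface vhost-user-0 type=dpdkvhostuser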


The NUMA, hugepage and vCPU configuration in the libvirt XML:

<memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB'/>
    </hugepages>
  </memoryBacking>
  <vcpu placement='static'>4</vcpu>
  <os>
    <type arch='x86_64' machine='pc-i440fx-2.2'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
  </features>
  <cpu>
    <numa>
      <cell id='0' cpus='0-3' memory='4000000' unit='KiB' memAccess='shared'/>
    </numa>
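
(With a hugepage-backed, shared NUMA cell like the one above, libvirt ends up starting QEMU with vhost-user arguments roughly like the following; this is only a sketch: the memory size, ids and socket path are illustrative, with the socket path assuming OVS's default run directory used elsewhere in this mail:

    -object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on \
    -numa node,memdev=mem -mem-prealloc \
    -chardev socket,id=char0,path=/usr/local/var/run/openvswitch/vhost-user-0 \
    -netdev type=vhost-user,id=net0,chardev=char0 \
    -device virtio-net-pci,netdev=net0
)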


Test steps:
Step 1: create a VM attached to port vhost-user-0.
Step 2: create 15 more VMs in the same way as step 1.
Step 3: destroy the 15 VMs created in step 2.
Step 4: repeat steps 2 and 3 (a rough script form of this loop is sketched below).
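
Roughly, as a script, and assuming the 15 extra guests are libvirt domains named vm1 .. vm15 (the names are purely illustrative):

    for round in $(seq 1 50); do
        for i in $(seq 1 15); do virsh start vm$i; done
        sleep 60        # give the guests time to boot
        for i in $(seq 1 15); do virsh destroy vm$i; done
    done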

Then, sometimes, I find that I can no longer reach the VM created in step 1; every time this happens, it is during step 2.

The guest logs are as follows:

localhost kernel: virtio_net virtio0: output.0:id 222 is not a head!
localhost kernel: net eth0: Unexpected TXQ (0) queue failure: -5
localhost kernel: net eth0: Unexpected TXQ (0) queue failure: -5
localhost kernel: net eth0: Unexpected TXQ (0) queue failure: -5
After rebooting the VM, it recovers.

Obviously this is a network problem. My guess is that it is a hugepage problem, but I am not sure, since the hugepages are in fact mapped when OVS-DPDK is started with "/usr/local/sbin/ovs-vswitchd --dpdk -c 0x1 -n 4 --socket-mem 400 -- unix:/usr/local/var/run/openvswitch/db.sock --pidfile --detach". How are the hugepages reclaimed when a VM is destroyed? Has anybody else hit the same problem? I do not know whether this is a bug in OVS-DPDK or in DPDK.
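
(To check this, I suppose one can watch the host's hugepage accounting while the guests are created and destroyed, e.g., assuming hugetlbfs is mounted at /dev/hugepages, the CentOS 7 default:

    grep Huge /proc/meminfo    # HugePages_Free should return to its previous value once the guests are destroyed
    ls -l /dev/hugepages/      # if QEMU's per-guest backing files live here, they should disappear with the guests
)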

Thanks
Eric Wang

Wang Huaxia ---JD.COM
