[ovs-discuss] ovs-dpdk on ubuntu 16.04LTS bad performance via dpdk port

Shunyi Hong hongtest at hotmail.com
Thu Oct 5 07:00:10 UTC 2017


I have the following setup on Cisco UCS-B: two blades in the same chassis, connected via UCS FI (Fabric Interconnect).


Ubuntu 16.04.3 LTS

UCS-B1 ------------------------------------------------- FI(10G) ------------------------------------ UCS-B2

(20.1.1.1/24)enp8s0 ----- vlan301/native -------------------------- vlan301/native ------ enp10s0 ---- ovs-dpdk ---br0(20.1.1.2/24)


vNIC adapter policy on both blades:

Transmit Queue = 1, Ring Size=2048

Receive Queue = 2, Ring Size=2048

Completion Queue = 2, Interrupts = 4


I followed two ovs-dpdk installation guides, with the same result (the resulting bridge setup on ucs-b2 is sketched below):

(1) https://www.paloaltonetworks.com/documentation/80/virtualization/virtualization/set-up-the-vm-series-firewall-on-kvm/integrate-open-vswitch-with-dpdk

(2) https://help.ubuntu.com/lts/serverguide/DPDK.html
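
In both cases, br0 on ucs-b2 ends up configured roughly as follows (a sketch of the common steps only; exact flags are per the guides above):

# userspace (netdev) datapath bridge with the DPDK-bound vNIC attached as dpdk0
ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
# test address on the bridge's internal port
ip addr add 20.1.1.2/24 dev br0
ip link set br0 up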


I assigned IP 20.1.1.1/24 to enp8s0 on UCS-B1, brought up ovs-dpdk on UCS-B2, and assigned 20.1.1.2/24 to br0. When I ping from ucs-b1 --> ucs-b2, the response time is very slow:

[{UCSB-DPDK-1} syhong at 172.28.185.102:/home/syhong]$ ping 20.1.1.2
PING 20.1.1.2 (20.1.1.2) 56(84) bytes of data.
64 bytes from 20.1.1.2: icmp_seq=1 ttl=64 time=1006 ms
64 bytes from 20.1.1.2: icmp_seq=2 ttl=64 time=1008 ms

If I don't run ovs-dpdk on ucs-b2 and instead configure the IP directly on enp10s0 of ucs-b2, the ping response time is normal (single-digit or sub-millisecond).

With tcpdump on the br0 interface I can see the incoming ICMP request and the ICMP reply going right back out, so the bottleneck looks like it is at the dpdk port. I also tried mirroring packets to another internal port attached to br0 and running tcpdump on that port; same result, pointing at a bottleneck at the dpdk port.
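
The mirror itself was set up roughly like this (a sketch; "mirror0" is just the name used here for the internal capture port):

# internal port on br0 that receives a copy of all traffic on the bridge
ovs-vsctl add-port br0 mirror0 -- set Interface mirror0 type=internal
ovs-vsctl -- --id=@p get Port mirror0 \
          -- --id=@m create Mirror name=m0 select-all=true output-port=@p \
          -- set Bridge br0 mirrors=@m
ip link set mirror0 up
tcpdump -ni mirror0 icmp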

root at ucsb-c1-s8:/home/syhong/dpdk-2.2.0/tools# ./dpdk_nic_bind.py --status

Network devices using DPDK-compatible driver
============================================
0000:0a:00.0 'VIC Ethernet NIC' drv=igb_uio unused=enic

Network devices using kernel driver
===================================
0000:06:00.0 'VIC Ethernet NIC' if=enp6s0 drv=enic unused=igb_uio *Active*
0000:07:00.0 'VIC Ethernet NIC' if=enp7s0 drv=enic unused=igb_uio
0000:08:00.0 'VIC Ethernet NIC' if=enp8s0 drv=enic unused=igb_uio
0000:09:00.0 'VIC Ethernet NIC' if=enp9s0 drv=enic unused=igb_uio

Other network devices
=====================
<none>
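
The DPDK port was bound with the DPDK 2.2 tools, roughly as follows (a sketch; the igb_uio module path depends on the build target):

# load igb_uio and bind the fifth vNIC (0000:0a:00.0 / enp10s0) to it
modprobe uio
insmod x86_64-native-linuxapp-gcc/kmod/igb_uio.ko
./dpdk_nic_bind.py --bind=igb_uio 0000:0a:00.0
./dpdk_nic_bind.py --status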


One interesting thing: while the ping from 20.1.1.1 --> 20.1.1.2 is running, if I also start a ping from 20.1.1.2 --> 20.1.1.1, the response times seen from ucs-b1 get better and better, while the new ping from ucs-b2 gets worse and worse:

[{UCSB-DPDK-1} syhong at 172.28.185.102:/home/syhong]$ ping 20.1.1.2
PING 20.1.1.2 (20.1.1.2) 56(84) bytes of data.
64 bytes from 20.1.1.2: icmp_seq=1 ttl=64 time=1006 ms
64 bytes from 20.1.1.2: icmp_seq=2 ttl=64 time=1008 ms
64 bytes from 20.1.1.2: icmp_seq=3 ttl=64 time=1008 ms
64 bytes from 20.1.1.2: icmp_seq=4 ttl=64 time=1008 ms
64 bytes from 20.1.1.2: icmp_seq=5 ttl=64 time=90.1 ms
64 bytes from 20.1.1.2: icmp_seq=6 ttl=64 time=88.4 ms
...
64 bytes from 20.1.1.2: icmp_seq=43 ttl=64 time=14.3 ms
64 bytes from 20.1.1.2: icmp_seq=44 ttl=64 time=12.3 ms
64 bytes from 20.1.1.2: icmp_seq=45 ttl=64 time=9.92 ms
64 bytes from 20.1.1.2: icmp_seq=46 ttl=64 time=6.92 ms
64 bytes from 20.1.1.2: icmp_seq=47 ttl=64 time=3.87 ms <-- response is getting better
64 bytes from 20.1.1.2: icmp_seq=48 ttl=64 time=0.930 ms
64 bytes from 20.1.1.2: icmp_seq=49 ttl=64 time=0.192 ms

root at ucsb-c1-s8:~# ping 20.1.1.1
PING 20.1.1.1 (20.1.1.1) 56(84) bytes of data.
64 bytes from 20.1.1.1: icmp_seq=1 ttl=64 time=911 ms
64 bytes from 20.1.1.1: icmp_seq=2 ttl=64 time=913 ms
64 bytes from 20.1.1.1: icmp_seq=3 ttl=64 time=915 ms
64 bytes from 20.1.1.1: icmp_seq=4 ttl=64 time=917 ms
64 bytes from 20.1.1.1: icmp_seq=5 ttl=64 time=919 ms
64 bytes from 20.1.1.1: icmp_seq=6 ttl=64 time=921 ms
...
64 bytes from 20.1.1.1: icmp_seq=38 ttl=64 time=985 ms
64 bytes from 20.1.1.1: icmp_seq=39 ttl=64 time=987 ms
64 bytes from 20.1.1.1: icmp_seq=40 ttl=64 time=989 ms
64 bytes from 20.1.1.1: icmp_seq=41 ttl=64 time=992 ms <-- response is getting worse
64 bytes from 20.1.1.1: icmp_seq=42 ttl=64 time=995 ms
64 bytes from 20.1.1.1: icmp_seq=43 ttl=64 time=998 ms
64 bytes from 20.1.1.1: icmp_seq=44 ttl=64 time=999 ms
64 bytes from 20.1.1.1: icmp_seq=45 ttl=64 time=999 ms

Any input on what could be the problem?

Thanks,

SYH


root at ucsb-c1-s8:~#  ovs-vsctl show

c97226e3-3f03-4e8b-82ed-d34952e93396
    Bridge "br0"
        Port "br0"
            Interface "br0"
                type: internal
        Port "dpdk0"
            Interface "dpdk0"
                type: dpdk

root at ucsb-c1-s8:~# ifconfig br0
br0       Link encap:Ethernet  HWaddr 00:25:b5:f1:9c:af
          inet addr:20.1.1.2  Bcast:20.1.1.255  Mask:255.255.255.0
          inet6 addr: fe80::225:b5ff:fef1:9caf/64 Scope:Link
          UP BROADCAST RUNNING PROMISC  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:508 (508.0 B)

root at ucsb-c1-s8:~# ethtool br0
Settings for br0:
        Supported ports: [ ]
        Supported link modes:   Not reported
        Supported pause frame use: No
        Supports auto-negotiation: No
        Advertised link modes:  Not reported
        Advertised pause frame use: No
        Advertised auto-negotiation: No
        Speed: 10Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: off
        MDI-X: Unknown
        Current message level: 0xffffffa1 (-95)
                               drv ifup tx_err tx_queued intr tx_done rx_status pktdata hw wol 0xffff8000
        Link detected: yes


root at ucsb-c1-s8:~# ovs-vsctl get  Open_vSwitch . other_config
{dpdk-init="true", pmd-cpu-mask="3c"}

root at ucsb-c1-s8:~# ps -ae | ps -eLo pid,psr,comm | grep pmd
 2756   2 pmd58
 2756   3 pmd57
 2756   4 pmd56
 2756   5 pmd55
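
(pmd-cpu-mask=3c is binary 111100, i.e. CPUs 2-5, all on NUMA node 0 like the dpdk0 NIC, which matches the four pmd threads above. It was set with:)

ovs-vsctl --timeout 10 set Open_vSwitch . other_config:pmd-cpu-mask=3c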

root at ucsb-c1-s8:~# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                48
On-line CPU(s) list:   0-47
Thread(s) per core:    2
Core(s) per socket:    12
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 63
Model name:            Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Stepping:              2
CPU MHz:               1204.785
CPU max MHz:           3300.0000
CPU min MHz:           1200.0000
BogoMIPS:              4991.65
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              30720K
NUMA node0 CPU(s):     0-11,24-35
NUMA node1 CPU(s):     12-23,36-47
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dtherm ida arat pln pts

/* I also tried 1GB hugepages, same result; see the sketch below */
root at ucsb-c1-s8:~# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-4.4.0-87-generic root=/dev/mapper/ucsb--c1--s8--vg-root ro default_hugepagesz=2M hugepagesz=2M hugepages=2048 iommu=pt intel_iommu=on
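
For the 1GB attempt, the cmdline was changed to something like the following (page count here is illustrative):

default_hugepagesz=1G hugepagesz=1G hugepages=8 iommu=pt intel_iommu=on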

Output of pmd-stats-show:
root at ucsb-c1-s8:~# ovs-appctl dpif-netdev/pmd-stats-show
pmd thread numa_id 0 core_id 3:
        emc hits:0
        megaflow hits:0
        miss:0
        lost:0
pmd thread numa_id 0 core_id 5:
        emc hits:0
        megaflow hits:0
        miss:0
        lost:0
main thread:
        emc hits:15
        megaflow hits:0
        miss:1
        lost:0
        polling cycles:447936 (76.88%)
        processing cycles:134678 (23.12%)
        avg cycles per packet: 36413.38 (582614/16)
        avg processing cycles per packet: 8417.38 (134678/16)
pmd thread numa_id 0 core_id 2:
        emc hits:13
        megaflow hits:2
        miss:2
        lost:0
        polling cycles:34374635314 (99.98%)
        processing cycles:6535436 (0.02%)
        avg cycles per packet: 2022421808.82 (34381170750/17)
        avg processing cycles per packet: 384437.41 (6535436/17)
pmd thread numa_id 0 core_id 4:
        emc hits:0
        megaflow hits:0
        miss:0
        lost:0

root at ucsb-c1-s8:~# ovs-ofctl dump-ports br0
OFPST_PORT reply (xid=0x2): 2 ports
  port LOCAL: rx pkts=151, bytes=14550, drop=0, errs=0, frame=0, over=0, crc=0
           tx pkts=142, bytes=13832, drop=0, errs=0, coll=0
  port  1: rx pkts=142, bytes=14968, drop=0, errs=0, frame=?, over=?, crc=?
           tx pkts=150, bytes=15088, drop=0, errs=0, coll=?

root at ucsb-c1-s8:~# ovs-ofctl dump-ports-desc br0
OFPST_PORT_DESC reply (xid=0x2):
 1(dpdk0): addr:00:25:b5:f1:9c:af
     config:     0
     state:      0
     current:    1GB-HD 1GB-FD
     advertised: COPPER AUTO_PAUSE
     supported:  10MB-HD 10MB-FD 100MB-HD 100MB-FD 1GB-HD 1GB-FD 10GB-FD COPPER AUTO_NEG AUTO_PAUSE AUTO_PAUSE_ASYM
     peer:       10MB-FD 100MB-HD 100MB-FD 10GB-FD COPPER
     speed: 1000 Mbps now, 10000 Mbps max
 LOCAL(br0): addr:00:25:b5:f1:9c:af
     config:     0
     state:      0
     current:    10MB-FD COPPER
     speed: 10 Mbps now, 0 Mbps max

root at ucsb-c1-s8:~# ovs-ofctl dump-flows br0
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=406.134s, table=0, n_packets=293, n_bytes=28986, idle_age=186, priority=0 actions=NORMAL

Here is the output when ovs-vswitchd was started:
/* I changed pmd-cpu-mask later using "ovs-vsctl --timeout 10 set Open_vSwitch . other_config:pmd-cpu-mask=3c" */
root at ucsb-c1-s8:/home/syhong/dpdk-2.2.0/tools# ovs-vswitchd --dpdk -c 0x4 -n 4   --socket-mem  1024,0 -- unix:$DB_SOCK --pidfile --detach
2017-10-04T06:13:16Z|00001|dpdk|INFO|No -vhost_sock_dir provided - defaulting to /usr/local/var/run/openvswitch
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 1 on socket 0
EAL: Detected lcore 2 as core 2 on socket 0
EAL: Detected lcore 3 as core 3 on socket 0
EAL: Detected lcore 4 as core 4 on socket 0
EAL: Detected lcore 5 as core 5 on socket 0
EAL: Detected lcore 6 as core 8 on socket 0
EAL: Detected lcore 7 as core 9 on socket 0
EAL: Detected lcore 8 as core 10 on socket 0
EAL: Detected lcore 9 as core 11 on socket 0
EAL: Detected lcore 10 as core 12 on socket 0
EAL: Detected lcore 11 as core 13 on socket 0
EAL: Detected lcore 12 as core 0 on socket 1
EAL: Detected lcore 13 as core 1 on socket 1
EAL: Detected lcore 14 as core 2 on socket 1
EAL: Detected lcore 15 as core 3 on socket 1
EAL: Detected lcore 16 as core 4 on socket 1
EAL: Detected lcore 17 as core 5 on socket 1
EAL: Detected lcore 18 as core 8 on socket 1
EAL: Detected lcore 19 as core 9 on socket 1
EAL: Detected lcore 20 as core 10 on socket 1
EAL: Detected lcore 21 as core 11 on socket 1
EAL: Detected lcore 22 as core 12 on socket 1
EAL: Detected lcore 23 as core 13 on socket 1
EAL: Detected lcore 24 as core 0 on socket 0
EAL: Detected lcore 25 as core 1 on socket 0
EAL: Detected lcore 26 as core 2 on socket 0
EAL: Detected lcore 27 as core 3 on socket 0
EAL: Detected lcore 28 as core 4 on socket 0
EAL: Detected lcore 29 as core 5 on socket 0
EAL: Detected lcore 30 as core 8 on socket 0
EAL: Detected lcore 31 as core 9 on socket 0
EAL: Detected lcore 32 as core 10 on socket 0
EAL: Detected lcore 33 as core 11 on socket 0
EAL: Detected lcore 34 as core 12 on socket 0
EAL: Detected lcore 35 as core 13 on socket 0
EAL: Detected lcore 36 as core 0 on socket 1
EAL: Detected lcore 37 as core 1 on socket 1
EAL: Detected lcore 38 as core 2 on socket 1
EAL: Detected lcore 39 as core 3 on socket 1
EAL: Detected lcore 40 as core 4 on socket 1
EAL: Detected lcore 41 as core 5 on socket 1
EAL: Detected lcore 42 as core 8 on socket 1
EAL: Detected lcore 43 as core 9 on socket 1
EAL: Detected lcore 44 as core 10 on socket 1
EAL: Detected lcore 45 as core 11 on socket 1
EAL: Detected lcore 46 as core 12 on socket 1
EAL: Detected lcore 47 as core 13 on socket 1
EAL: Support maximum 128 logical core(s) by configuration.
EAL: Detected 48 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
EAL: VFIO modules not all loaded, skip VFIO support...
EAL: Setting up physically contiguous memory...
EAL: Ask a virtual area of 0x2800000 bytes
EAL: Virtual area found at 0x7f703f800000 (size = 0x2800000)
EAL: Ask a virtual area of 0xbc00000 bytes
EAL: Virtual area found at 0x7f7033a00000 (size = 0xbc00000)
EAL: Ask a virtual area of 0x3e00000 bytes
EAL: Virtual area found at 0x7f702fa00000 (size = 0x3e00000)
EAL: Ask a virtual area of 0x200000 bytes
EAL: Virtual area found at 0x7f702f600000 (size = 0x200000)
EAL: Ask a virtual area of 0x200000 bytes
EAL: Virtual area found at 0x7f702f200000 (size = 0x200000)
EAL: Ask a virtual area of 0x6d800000 bytes
EAL: Virtual area found at 0x7f6fc1800000 (size = 0x6d800000)
EAL: Ask a virtual area of 0x200000 bytes
EAL: Virtual area found at 0x7f6fc1400000 (size = 0x200000)
EAL: Ask a virtual area of 0x7fc00000 bytes
EAL: Virtual area found at 0x7f6f41600000 (size = 0x7fc00000)
EAL: Ask a virtual area of 0x200000 bytes
EAL: Virtual area found at 0x7f6f41200000 (size = 0x200000)
EAL: Ask a virtual area of 0x200000 bytes
EAL: Virtual area found at 0x7f6f40e00000 (size = 0x200000)
EAL: Requesting 512 pages of size 2MB from socket 0
EAL: TSC frequency is ~2494224 KHz
EAL: Master lcore 2 is ready (tid=44e6dac0;cpuset=[2])
EAL: PCI device 0000:06:00.0 on NUMA socket 0
EAL:   probe driver: 1137:43 rte_enic_pmd
EAL:   Not managed by a supported kernel driver, skipped
EAL: PCI device 0000:07:00.0 on NUMA socket 0
EAL:   probe driver: 1137:43 rte_enic_pmd
EAL:   Not managed by a supported kernel driver, skipped
EAL: PCI device 0000:08:00.0 on NUMA socket 0
EAL:   probe driver: 1137:43 rte_enic_pmd
EAL:   Not managed by a supported kernel driver, skipped
EAL: PCI device 0000:09:00.0 on NUMA socket 0
EAL:   probe driver: 1137:43 rte_enic_pmd
EAL:   Not managed by a supported kernel driver, skipped
EAL: PCI device 0000:0a:00.0 on NUMA socket 0
EAL:   probe driver: 1137:43 rte_enic_pmd
EAL:   PCI memory mapped at 0x7f7042000000
EAL:   PCI memory mapped at 0x7f7042020000
PMD: rte_enic_pmd:  Initializing ENIC PMD version 1.0.0.6
PMD: rte_enic_pmd: vNIC MAC addr 00:25:b5:f1:9c:af wq/rq 4096/4096 mtu 1500
PMD: rte_enic_pmd: vNIC csum tx/rx yes/yes rss yes intr mode any type min timer 125 usec loopback tag 0x0000
PMD: rte_enic_pmd: vNIC resources avail: wq 32 rq 32 cq 64 intr 128
Zone 0: name:<RG_MP_log_history>, phys:0x1f805fdec0, len:0x2080, virt:0x7f702f3fdec0, socket_id:0, flags:0
Zone 1: name:<MP_log_history>, phys:0x6ef75f00, len:0x28a0c0, virt:0x7f7033575f00, socket_id:0, flags:0
Zone 2: name:<rte_eth_dev_data>, phys:0x1f805cd480, len:0x2f700, virt:0x7f702f3cd480, socket_id:0, flags:0
2017-10-04T06:13:18Z|00002|ovs_numa|INFO|Discovered 24 CPU cores on NUMA node 0
2017-10-04T06:13:18Z|00003|ovs_numa|INFO|Discovered 24 CPU cores on NUMA node 1
2017-10-04T06:13:18Z|00004|ovs_numa|INFO|Discovered 2 NUMA nodes and 48 CPU cores
2017-10-04T06:13:18Z|00005|reconnect|INFO|unix:/usr/local/var/run/openvswitch/db.sock: connecting...
2017-10-04T06:13:18Z|00006|reconnect|INFO|unix:/usr/local/var/run/openvswitch/db.sock: connected
2017-10-04T06:13:18Z|00007|ofproto_dpif|INFO|netdev at ovs-netdev: Datapath supports recirculation
2017-10-04T06:13:18Z|00008|ofproto_dpif|INFO|netdev at ovs-netdev: MPLS label stack length probed as 3
2017-10-04T06:13:18Z|00009|ofproto_dpif|INFO|netdev at ovs-netdev: Datapath supports unique flow ids
2017-10-04T06:13:18Z|00010|ofproto_dpif|INFO|netdev at ovs-netdev: Datapath does not support ct_state
2017-10-04T06:13:18Z|00011|ofproto_dpif|INFO|netdev at ovs-netdev: Datapath does not support ct_zone
2017-10-04T06:13:18Z|00012|ofproto_dpif|INFO|netdev at ovs-netdev: Datapath does not support ct_mark
2017-10-04T06:13:18Z|00013|ofproto_dpif|INFO|netdev at ovs-netdev: Datapath does not support ct_label
2017-10-04T06:13:18Z|00014|bridge|INFO|bridge ovs-br0: added interface ovs-br0 on port 65534
2017-10-04T06:13:18Z|00015|dpif_netlink|ERR|Generic Netlink family 'ovs_datapath' does not exist. The Open vSwitch kernel module is probably not loaded.
2017-10-04T06:13:18Z|00016|bridge|INFO|bridge ovs-br0: using datapath ID 00004ac3a15d1147
2017-10-04T06:13:18Z|00017|connmgr|INFO|ovs-br0: added service controller "punix:/usr/local/var/run/openvswitch/ovs-br0.mgmt"

