[ovs-discuss] [ovs-dpdk] bandwidth issue of vhostuserclient virtio ovs-dpdk

LIU Yulong liuyulong.xa at gmail.com
Thu Nov 29 08:24:03 UTC 2018


Hi,

We recently tested OVS-DPDK and ran into a bandwidth issue. VM-to-VM
bandwidth was nowhere near the physical NIC: about 4.3 Gbps on a 10 Gbps
NIC. For non-DPDK (kernel virtio-net) VMs, the same iperf3 test easily
reaches 9.3 Gbps. Virtio multiqueue is enabled for all guest VMs. In the
DPDK vhostuser guests we noticed that interrupts are concentrated on a
single queue, while in the non-DPDK VMs interrupts hash across all
queues. For the DPDK vhostuser VMs, PMD usage is likewise concentrated
on one queue, on both the server (tx) and the client (rx) side, and this
happens no matter whether we run one PMD or several.
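
For reference, this is roughly how we checked the queue and interrupt
distribution inside the guests (the interface name eth0 is just an
example):

# ethtool -l eth0               (queue counts the guest driver enabled)
# ethtool -L eth0 combined 8    (enable all queues; the guest must turn
                                 multiqueue on explicitly)
# grep virtio /proc/interrupts  (per-queue interrupt distribution)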

Furthermore, my colleague added some systemtap hooks to the Open
vSwitch functions and found something interesting: the function
__netdev_dpdk_vhost_send() sends all packets to a single virtio-net
queue. It seems that some hashing algorithm or queue-selection logic is
not distributing the packets across queues.
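
The systemtap hook was roughly along these lines (a simplified
reconstruction; it needs ovs-vswitchd debuginfo to resolve the function
and its qid argument):

# stap -e 'probe process("/usr/sbin/ovs-vswitchd").function("__netdev_dpdk_vhost_send")
  { printf("qid=%d\n", $qid) }'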

So I'd like to ask the community for help. Maybe I'm missing some
configuration.

Thanks.


Here is the environment and the relevant configuration:
# uname -r
3.10.0-862.11.6.el7.x86_64
# rpm -qa|grep dpdk
dpdk-17.11-11.el7.x86_64
# rpm -qa|grep openvswitch
openvswitch-2.9.0-3.el7.x86_64
# ovs-vsctl list open_vswitch
_uuid               : a6a3d9eb-28a8-4bf0-a8b4-94577b5ffe5e
bridges             : [531e4bea-ce12-402a-8a07-7074c31b978e,
5c1675e2-5408-4c1f-88bc-6d9c9b932d47]
cur_cfg             : 1305
datapath_types      : [netdev, system]
db_version          : "7.15.1"
external_ids        : {hostname="cq01-compute-10e112e5e140",
rundir="/var/run/openvswitch",
system-id="e2cc84fe-a3c8-455f-8c64-260741c141ee"}
iface_types         : [dpdk, dpdkr, dpdkvhostuser, dpdkvhostuserclient,
geneve, gre, internal, lisp, patch, stt, system, tap, vxlan]
manager_options     : [43803994-272b-49cb-accc-ab672d1eefc8]
next_cfg            : 1305
other_config        : {dpdk-init="true", dpdk-lcore-mask="0x1",
dpdk-socket-mem="1024,1024", pmd-cpu-mask="0x100000",
vhost-iommu-support="true"}
ovs_version         : "2.9.0"
ssl                 : []
statistics          : {}
system_type         : centos
system_version      : "7"
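
(The pmd-cpu-mask above is the single-PMD setup. For the multi-PMD runs
mentioned earlier we widened the mask along these lines; the exact value
below is only an example that picks two cores on NUMA node 0:)
# ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x500000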
# lsmod |grep vfio
vfio_pci               41312  2
vfio_iommu_type1       22300  1
vfio                   32695  7 vfio_iommu_type1,vfio_pci
irqbypass              13503  23 kvm,vfio_pci

# ovs-appctl dpif/show
netdev@ovs-netdev: hit:759366335 missed:754283
br-ex:
bond1108 4/6: (tap)
br-ex 65534/3: (tap)
nic-10G-1 5/4: (dpdk: configured_rx_queues=8,
configured_rxq_descriptors=2048, configured_tx_queues=2,
configured_txq_descriptors=2048, mtu=1500, requested_rx_queues=8,
requested_rxq_descriptors=2048, requested_tx_queues=2,
requested_txq_descriptors=2048, rx_csum_offload=true)
nic-10G-2 6/5: (dpdk: configured_rx_queues=8,
configured_rxq_descriptors=2048, configured_tx_queues=2,
configured_txq_descriptors=2048, mtu=1500, requested_rx_queues=8,
requested_rxq_descriptors=2048, requested_tx_queues=2,
requested_txq_descriptors=2048, rx_csum_offload=true)
phy-br-ex 3/none: (patch: peer=int-br-ex)
br-int:
br-int 65534/2: (tap)
int-br-ex 1/none: (patch: peer=phy-br-ex)
vhu76f9a623-9f 2/1: (dpdkvhostuserclient: configured_rx_queues=8,
configured_tx_queues=8, mtu=1500, requested_rx_queues=8,
requested_tx_queues=8)

# ovs-appctl dpctl/show -s
netdev@ovs-netdev:
lookups: hit:759366335 missed:754283 lost:72
flows: 186
port 0: ovs-netdev (tap)
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 aborted:0 carrier:0
collisions:0
RX bytes:0  TX bytes:0
port 1: vhu76f9a623-9f (dpdkvhostuserclient: configured_rx_queues=8,
configured_tx_queues=8, mtu=1500, requested_rx_queues=8,
requested_tx_queues=8)
RX packets:718391758 errors:0 dropped:0 overruns:? frame:?
TX packets:30372410 errors:? dropped:719200 aborted:? carrier:?
collisions:?
RX bytes:1086995317051 (1012.3 GiB)  TX bytes:2024893540 (1.9 GiB)
port 2: br-int (tap)
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:1393992 errors:0 dropped:4 aborted:0 carrier:0
collisions:0
RX bytes:0  TX bytes:2113616736 (2.0 GiB)
port 3: br-ex (tap)
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:6660091 errors:0 dropped:967 aborted:0 carrier:0
collisions:0
RX bytes:0  TX bytes:2451440870 (2.3 GiB)
port 4: nic-10G-1 (dpdk: configured_rx_queues=8,
configured_rxq_descriptors=2048, configured_tx_queues=2,
configured_txq_descriptors=2048, mtu=1500, requested_rx_queues=8,
requested_rxq_descriptors=2048, requested_tx_queues=2,
requested_txq_descriptors=2048, rx_csum_offload=true)
RX packets:36409466 errors:0 dropped:0 overruns:? frame:?
TX packets:718371472 errors:0 dropped:20276 aborted:? carrier:?
collisions:?
RX bytes:2541593983 (2.4 GiB)  TX bytes:1089838136919 (1015.0 GiB)
port 5: nic-10G-2 (dpdk: configured_rx_queues=8,
configured_rxq_descriptors=2048, configured_tx_queues=2,
configured_txq_descriptors=2048, mtu=1500, requested_rx_queues=8,
requested_rxq_descriptors=2048, requested_tx_queues=2,
requested_txq_descriptors=2048, rx_csum_offload=true)
RX packets:5319466 errors:0 dropped:0 overruns:? frame:?
TX packets:0 errors:0 dropped:0 aborted:? carrier:?
collisions:?
RX bytes:344903551 (328.9 MiB)  TX bytes:0
port 6: bond1108 (tap)
RX packets:228 errors:0 dropped:0 overruns:0 frame:0
TX packets:5460 errors:0 dropped:18 aborted:0 carrier:0
collisions:0
RX bytes:21459 (21.0 KiB)  TX bytes:341087 (333.1 KiB)

# ovs-appctl dpif-netdev/pmd-stats-show
pmd thread numa_id 0 core_id 20:
packets received: 760120690
packet recirculations: 0
avg. datapath passes per packet: 1.00
emc hits: 750787577
megaflow hits: 8578758
avg. subtable lookups per megaflow hit: 1.05
miss with success upcall: 754283
miss with failed upcall: 72
avg. packets per output batch: 2.21
idle cycles: 210648140144730 (99.13%)
processing cycles: 1846745927216 (0.87%)
avg cycles per packet: 279554.14 (212494886071946/760120690)
avg processing cycles per packet: 2429.54 (1846745927216/760120690)
main thread:
packets received: 0
packet recirculations: 0
avg. datapath passes per packet: 0.00
emc hits: 0
megaflow hits: 0
avg. subtable lookups per megaflow hit: 0.00
miss with success upcall: 0
miss with failed upcall: 0
avg. packets per output batch: 0.00

# ovs-appctl dpif-netdev/pmd-rxq-show
pmd thread numa_id 0 core_id 20:
isolated : false
port: nic-10G-1        queue-id:  0 pmd usage:  0 %
port: nic-10G-1        queue-id:  1 pmd usage:  0 %
port: nic-10G-1        queue-id:  2 pmd usage:  0 %
port: nic-10G-1        queue-id:  3 pmd usage:  0 %
port: nic-10G-1        queue-id:  4 pmd usage:  0 %
port: nic-10G-1        queue-id:  5 pmd usage:  0 %
port: nic-10G-1        queue-id:  6 pmd usage:  0 %
port: nic-10G-1        queue-id:  7 pmd usage:  0 %
port: nic-10G-2        queue-id:  0 pmd usage:  0 %
port: nic-10G-2        queue-id:  1 pmd usage:  0 %
port: nic-10G-2        queue-id:  2 pmd usage:  0 %
port: nic-10G-2        queue-id:  3 pmd usage:  0 %
port: nic-10G-2        queue-id:  4 pmd usage:  0 %
port: nic-10G-2        queue-id:  5 pmd usage:  0 %
port: nic-10G-2        queue-id:  6 pmd usage:  0 %
port: nic-10G-2        queue-id:  7 pmd usage:  0 %
port: vhu76f9a623-9f  queue-id:  0 pmd usage:  0 %
port: vhu76f9a623-9f  queue-id:  1 pmd usage:  0 %
port: vhu76f9a623-9f  queue-id:  2 pmd usage:  0 %
port: vhu76f9a623-9f  queue-id:  3 pmd usage:  0 %
port: vhu76f9a623-9f  queue-id:  4 pmd usage:  0 %
port: vhu76f9a623-9f  queue-id:  5 pmd usage:  0 %
port: vhu76f9a623-9f  queue-id:  6 pmd usage:  0 %
port: vhu76f9a623-9f  queue-id:  7 pmd usage:  0 %


# virsh dumpxml instance-5c5191ff-c1a2-4429-9a8b-93ddd939583d
...
    <interface type='vhostuser'>
      <mac address='fa:16:3e:77:ab:fb'/>
      <source type='unix' path='/var/lib/vhost_sockets/vhu76f9a623-9f'
mode='server'/>
      <target dev='vhu76f9a623-9f'/>
      <model type='virtio'/>
      <driver name='vhost' queues='8'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03'
function='0x0'/>
    </interface>
...

# ovs-vsctl show
a6a3d9eb-28a8-4bf0-a8b4-94577b5ffe5e
    Manager "ptcp:6640:127.0.0.1"
        is_connected: true
    Bridge br-int
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port int-br-ex
            Interface int-br-ex
                type: patch
                options: {peer=phy-br-ex}
        Port br-int
            Interface br-int
                type: internal
        Port "vhu76f9a623-9f"
            tag: 1
            Interface "vhu76f9a623-9f"
                type: dpdkvhostuserclient
                options: {n_rxq="8",
vhost-server-path="/var/lib/vhost_sockets/vhu76f9a623-9f"}
    Bridge br-ex
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port dpdkbond
            Interface "nic-10G-1"
                type: dpdk
                options: {dpdk-devargs="0000:01:00.0", n_rxq="8", n_txq="8"}
            Interface "nic-10G-2"
                type: dpdk
                options: {dpdk-devargs="0000:05:00.1", n_rxq="8", n_txq="8"}
        Port phy-br-ex
            Interface phy-br-ex
                type: patch
                options: {peer=int-br-ex}
        Port br-ex
            Interface br-ex
                type: internal

# numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38
node 0 size: 130978 MB
node 0 free: 7539 MB
node 1 cpus: 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39
node 1 size: 131072 MB
node 1 free: 6886 MB
node distances:
node   0   1
  0:  10  21
  1:  21  10

# grep HugePages_ /proc/meminfo
HugePages_Total:     232
HugePages_Free:       10
HugePages_Rsvd:        0
HugePages_Surp:        0


# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-3.10.0-862.11.6.el7.x86_64
root=UUID=220ee106-5e00-4809-91a0-641e045a4c21 ro intel_idle.max_cstate=0
crashkernel=auto rhgb quiet default_hugepagesz=1G hugepagesz=1G
hugepages=232 iommu=pt intel_iommu=on


Best regards,
LIU Yulong