[ovs-dev] Phy-VM connectivity issue

Ilya Maximets i.maximets at samsung.com
Mon Mar 11 10:22:24 UTC 2019


On 11.03.2019 12:59, Ilya Maximets wrote:
> On 11.03.2019 9:48, ppnaik wrote:
>> Hi Ilya,
>>
>> Thanks for the help. Providing the option for queues separately worked for the physical interface added to OVS.
>> But now we have a different issue.
>> We are running an mTCP application inside the VM. mTCP uses a different RSS hash.

I suspect that mTCP relies on the HW RSS implementation, i.e. it configures
RSS for the device when calling 'rte_eth_dev_configure' and expects it to work.
However, the virtio driver is not RSS capable and simply ignores the provided
RSS configuration.

>> I feel that, because of this, the packets are not getting delivered to the correct queue of the VM interface.
> 
> What queue is correct for you?
> 
>>
>> Would it be a good solution to not have multiple queues on the host interface but still have a multiqueue interface for the VM?
> 
> Only one queue in the VM will be used in this case, so it's not useful.
> 
>> In this setup, will the RSS be done by OVS in software?
> 
> No, because it's not the responsibility of OVS.
> If you want some kind of packet steering you'll need to implement
> RPS/RFS on top of the DPDK virtio driver, as is done in the Linux kernel.
> 
>> I am not sure if this setup would work.
>>
>> Thanks,
>> Priyanka
>>
>> On 2019-03-07 16:08, Ilya Maximets wrote:
>>> On 07.03.2019 13:29, ppnaik wrote:
>>>> Hi Ilya,
>>>>
>>>> Thanks for your suggestion.
>>>> We tried pinning the PMD threads to cores, but the issue remains that only a single queue is working instead of multiple queues.
>>>> We tried using 'dpdkvhostuserclient' instead of 'dpdkvhostuser', but we are getting the following error and the VM is not able to find the interface.
>>>
>>> Hi.
>>> Thanks for the logs. See comments inline.
>>>
>>>>
>>>> ovs-vswitchd log
>>>>
>>>> 2019-03-07T07:42:40.517Z|00212|dpdk|INFO|VHOST_CONFIG: vhost-user client: socket created, fd: 1091
>>>> 2019-03-07T07:42:40.517Z|00213|netdev_dpdk|INFO|vHost User device 'dpdkvhostuser0' created in 'client' mode, using client socket '/tmp/dpdkvhostclient0'
>>>> 2019-03-07T07:42:40.517Z|00214|dpdk|WARN|VHOST_CONFIG: failed to connect to /tmp/dpdkvhostclient0: No such file or directory
>>>> 2019-03-07T07:42:40.517Z|00215|dpdk|INFO|VHOST_CONFIG: /tmp/dpdkvhostclient0: reconnecting...
>>>> 2019-03-07T07:42:40.517Z|00216|dpif_netdev|WARN|There's no available (non-isolated) pmd thread on numa node 1. Queue 0 on port 'dpdk-p0' will be assigned to the pmd on core 1 (numa node 0). Expect reduced performance.
>>>>
>>>> The output of the following commands now is:
>>>>
>>>>
>>>> ovs-appctl dpif-netdev/pmd-rxq-show
>>>>
>>>> pmd thread numa_id 0 core_id 1:
>>>>   isolated : false
>>>>   port: dpdk-p0           queue-id:  0  pmd usage:  0 %
>>>> pmd thread numa_id 0 core_id 2:
>>>>   isolated : false
>>>>   port: dpdkvhostuser0    queue-id:  0  pmd usage:  0 %
>>>>
>>>>
>>>> ovs-vsctl show:
>>>>
>>>> 356b0bfb-979c-408a-be1e-4567d52ef57f
>>>>     Bridge "br0"
>>>>         Port "dpdk-p0"
>>>>             Interface "dpdk-p0"
>>>>                 type: dpdk
>>>>                 options: {dpdk-devargs="0000:81:00.1,n_rxq=2"}
>>>
>>> You configured 'n_rxq' as part of 'dpdk-devargs'. It needs to be a separate
>>> config option to take effect:
>>>
>>>     ovs-vsctl set Interface dpdk-p0 type=dpdk options:n_rxq=2 options:dpdk-devargs="0000:81:00.1"
>>>
>>>
>>>>         Port "br0"
>>>>             Interface "br0"
>>>>                 type: internal
>>>>         Port "dpdkvhostuser0"
>>>>             Interface "dpdkvhostuser0"
>>>>                 type: dpdkvhostuserclient
>>>>                 options: {vhost-server-path="/tmp/dpdkvhostuser0"}
>>>>
>>>> ovs-vswitchd --version
>>>>      ovs-vswitchd (Open vSwitch) 2.11.90
>>>>      DPDK 18.11.0
>>>>
>>>> qemu-system-x86_64 --version
>>>>       QEMU emulator version 2.5.0 (Debian 1:2.5+dfsg-5ubuntu10.34), Copyright (c) 2003-2008 Fabrice Bellard
>>>
>>> Oh. Unfortunately, vhost-user-client is supported in QEMU only since version 2.7.
>>> You need to switch back to the usual vhost-user ports or update QEMU.
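>>>
>>> For example, a minimal sketch of switching back (reusing the bridge and port
>>> names from your setup):
>>>
>>>     ovs-vsctl del-port br0 dpdkvhostuser0
>>>     ovs-vsctl add-port br0 dpdkvhostuser0 -- set Interface dpdkvhostuser0 type=dpdkvhostuser
>>>
>>> With 'dpdkvhostuser' OVS creates the socket itself, so the QEMU chardev should
>>> point to the OVS-created socket path, as in your original XML.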
>>>
>>>>
>>>> Can you please help us with what we are missing?
>>>>
>>>> Thanks,
>>>> Priyanka
>>>>
>>>>
>>>> On 2019-03-06 18:13, Ilya Maximets wrote:
>>>>> On 06.03.2019 15:18, ppnaik wrote:
>>>>>> Thanks for the response, Ilya.
>>>>>> We were able to get this setup working now.
>>>>>>
>>>>>> However, we could not get it working when we try to give two queues to the VM interface.
>>>>>>
>>>>>> We added the queue option when creating the interface on OVS.
>>>>>
>>>>> There are no 'n_rxq'/'n_txq' options for vhost interfaces. The number of
>>>>> queues is obtained from the virtio device when QEMU connects. You could have
>>>>> this option in ovsdb, but it doesn't affect anything.
>>>>> OTOH, physical ports do have 'n_rxq' configurable.
>>>>>
>>>>>> We also enabled multiqueue in the VM XML for the interface and set the vectors too.
>>>>>>
>>>>>> ethtool inside the VM shows:
>>>>>>
>>>>>> ethtool -l ens3
>>>>>> Channel parameters for ens3:
>>>>>> Pre-set maximums:
>>>>>> RX:        0
>>>>>> TX:        0
>>>>>> Other:        0
>>>>>> Combined:    2
>>>>>> Current hardware settings:
>>>>>> RX:        0
>>>>>> TX:        0
>>>>>> Other:        0
>>>>>> Combined:    2
>>>>>>
>>>>>> However, a DPDK application inside the VM is not able to get packets from both queues. It still works with one queue.
>>>>>
>>>>> I assume that you have only one PMD thread that polls both the physical
>>>>> and the virtual ports. Each PMD thread uses only one Tx queue per port.
>>>>> So, if you want to utilize more queues in the virtual interface, you need
>>>>> to create more PMD threads using pmd-cpu-mask.
>>>>> For example, with pmd-cpu-mask=0x30 there will be 2 PMD threads, one on
>>>>> CPU #4 and another on CPU #5. The thread on core #4 will send packets to,
>>>>> let's say, Tx queue #0, and the thread on core #5 will send packets to
>>>>> Tx queue #1. These packets will appear on Rx queue #0 and Rx queue #1
>>>>> respectively inside the VM.
>>>>> But you need to be sure that both threads have packets to send, i.e. that
>>>>> they both poll some rx queues of other ports. For example, you may configure
>>>>> 2 queues on the physical port and assign them to different PMD threads (this
>>>>> should be done automatically). You may check the distribution of rx queues
>>>>> between threads with 'ovs-appctl dpif-netdev/pmd-rxq-show'.
>>>>> In this case, packets that appear on different rx queues of the hw port will
>>>>> go to different queues of the vhost-user port.
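>>>>>
>>>>> As a rough sketch (assuming CPUs #4 and #5 are free for PMD threads and
>>>>> reusing your physical port name 'dpdk-p0'):
>>>>>
>>>>>     ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x30
>>>>>     ovs-vsctl set Interface dpdk-p0 options:n_rxq=2
>>>>>     ovs-appctl dpif-netdev/pmd-rxq-show
>>>>>
>>>>> The last command should then show the two rx queues of 'dpdk-p0' assigned to
>>>>> different PMD threads.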
>>>>>
>>>>>>
>>>>>> Please help us resolve this issue.
>>>>>>
>>>>>> Thanks,
>>>>>> Priyanka
>>>>>>
>>>>>>
>>>>>> On 2019-03-06 16:48, Ilya Maximets wrote:
>>>>>>> Hi.
>>>>>>>
>>>>>>> At first you need to look at ovs-vswitchd.log and the log of qemu.
>>>>>>> There might be some errors.
>>>>>>>
>>>>>>> Some thoughts inline.
>>>>>>>
>>>>>>> Best regards, Ilya Maximets.
>>>>>>>
>>>>>>>> Hi All,
>>>>>>>>
>>>>>>>> Our setup is as follows:
>>>>>>>>
>>>>>>>> We have two servers which are connected peer to peer over 40G
>>>>>>>> interfaces.
>>>>>>>>
>>>>>>>> On one server we have setup OVS and added the physical 40G interface as
>>>>>>>> a DPDK interface to the ovs bridge.
>>>>>>>>
>>>>>>>> We created another dpdkvhostuser interface for the VM. We added this
>>>>>>>> interface to the VM (by editing the XML). We are able to see this
>>>>>>>> interface inside the VM and have configured an IP on the interface.
>>>>>>>>
>>>>>>>> We want to communicate between the other server and the VM inside this
>>>>>>>> server through the OVS interface created for the VM.
>>>>>>>>
>>>>>>>> The steps we followed (on the server with OVS) are:
>>>>>>>>
>>>>>>>> modprobe uio
>>>>>>>
>>>>>>> IMHO, it's better to use vfio-pci. But it's up to you.
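>>>>>>>
>>>>>>> A sketch of that binding (assuming the IOMMU is enabled in the BIOS and,
>>>>>>> on an Intel box, 'intel_iommu=on iommu=pt' is set on the kernel cmdline):
>>>>>>>
>>>>>>>     modprobe vfio-pci
>>>>>>>     ./dpdk-devbind.py --bind=vfio-pci 0000:81:00.1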
>>>>>>>
>>>>>>>>
>>>>>>>> cd /usr/src/dpdk-18.11/x86_64-native-linuxapp-gcc/kmod/
>>>>>>>>
>>>>>>>> insmod igb_uio.ko
>>>>>>>>
>>>>>>>> cd /usr/src/dpdk-18.11/usertools/
>>>>>>>> ./dpdk-devbind.py --bind=igb_uio 0000:81:00.1
>>>>>>>>
>>>>>>>>   export PATH=$PATH:/usr/local/share/openvswitch/scripts
>>>>>>>>   export DB_SOCK=/usr/local/var/run/openvswitch/db.sock
>>>>>>>>
>>>>>>>>   ovsdb-server --remote=punix:/usr/local/var/run/openvswitch/db.sock
>>>>>>>> --remote=db:Open_vSwitch,Open_vSwitch,manager_options
>>>>>>>> --private-key=db:Open_vSwitch,SSL,private_key
>>>>>>>> --certificate=db:Open_vSwitch,SSL,certificate
>>>>>>>> --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --pidfile --detach
>>>>>>>> --log-file
>>>>>>>>
>>>>>>>>   ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
>>>>>>>>   ovs-ctl --no-ovsdb-server --db-sock="$DB_SOCK" start
>>>>>>>>
>>>>>>>> ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
>>>>>>>> ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk
>>>>>>>> options:dpdk-devargs=0000:81:00.1
>>>>>>>> ovs-vsctl add-port br0 dpdkvhostuser0     -- set Interface
>>>>>>>> dpdkvhostuser0 type=dpdkvhostuser ofport_request=3
>>>>>>>
>>>>>>> Consider using 'dpdkvhostuserclient' a.k.a. 'vhost-user-client' ports instead
>>>>>>> because server mode 'dpdkvhostuser' ports are deprecated in OVS.
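>>>>>>>
>>>>>>> A minimal sketch of such a port (the port name and socket path here are
>>>>>>> just examples):
>>>>>>>
>>>>>>>     ovs-vsctl add-port br0 vhost-client-0 -- set Interface vhost-client-0 type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhost-client-0.sock
>>>>>>>
>>>>>>> Note that in client mode QEMU creates the socket, so the chardev on the
>>>>>>> QEMU side has to be started as a server.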
>>>>>>>
>>>>>>>>
>>>>>>>> ovs-ofctl add-flow br0 in_port=1,action=output:3
>>>>>>>> ovs-ofctl add-flow br0 in_port=3,action=output:1
>>>>>>>>
>>>>>>>> echo 'vm.nr_hugepages=2048' > /etc/sysctl.d/hugepages.conf
>>>>>>>
>>>>>>> Here you're allocating 2048 pages of 2MB. This is not enough
>>>>>>> for your setup. You're trying to allocate 4096 MB for qemu memory
>>>>>>> backing + OVS will need some hugepage memory for the mempools and stuff.
>>>>>>> It'll be in total:
>>>>>>>     4096 + 1024 (default for OVS if you have only 1 NUMA node) MB,
>>>>>>> i.e. you need at least 512 more pages.
>>>>>>>
>>>>>>> Do you need to reload sysctl for changes to be applied?
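>>>>>>>
>>>>>>> A sketch of an allocation that would fit (2560 pages of 2MB = 5120 MB),
>>>>>>> applied without a reboot:
>>>>>>>
>>>>>>>     echo 'vm.nr_hugepages=2560' > /etc/sysctl.d/hugepages.conf
>>>>>>>     sysctl -p /etc/sysctl.d/hugepages.conf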
>>>>>>>
>>>>>>>> grep HugePages_ /proc/meminfo
>>>>>>>
>>>>>>> So, where is the output? If the output is empty, you have no pages allocated.
>>>>>>>
>>>>>>>>
>>>>>>>> edit VM XML to add this interface:
>>>>>>>
>>>>>>> If you're starting a new VM with an updated XML, then I'd suggest using the
>>>>>>> nicer libvirt syntax, i.e. it's better to use sections like "memoryBacking"
>>>>>>> and "interface" instead of manually attaching cmdline arguments.
>>>>>>> See http://docs.openvswitch.org/en/latest/topics/dpdk/vhost-user/#sample-xml
>>>>>>>
>>>>>>>>
>>>>>>>> first line:
>>>>>>>> <domain type='kvm'
>>>>>>>> xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
>>>>>>>>
>>>>>>>> add before </domain> tag:
>>>>>>>>
>>>>>>>> <qemu:commandline>
>>>>>>>>      <qemu:arg value='-chardev'/>
>>>>>>>>      <qemu:arg value='socket,id=char1,path=/usr/local/var/run/openvswitch/dpdkvhostuser0'/>
>>>>>>>>      <qemu:arg value='-netdev'/>
>>>>>>>>      <qemu:arg value='vhost-user,id=mynet1,chardev=char1,vhostforce=on,queues=1'/>
>>>>>>>>      <qemu:arg value='-device'/>
>>>>>>>>      <qemu:arg value='virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet1,mq=on,vectors=4'/>
>>>>>>>>      <qemu:arg value='-m'/>
>>>>>>>>      <qemu:arg value='4096'/>
>>>>>>>>      <qemu:arg value='-object'/>
>>>>>>>>      <qemu:arg value='memory-backend-file,id=mem1,size=4096M,mem-path=/dev/hugepages,share=on'/>
>>>>>>>>      <qemu:arg value='-mem-prealloc'/>
>>>>>>>>      <qemu:arg value='-numa'/>
>>>>>>>>      <qemu:arg value='node,memdev=mem1'/>
>>>>>>>>    </qemu:commandline>
>>>>>>>>
>>>>>>>> Please help us resolve this issue. I assumed ping would work between
>>>>>>>> the other server and the VM, but it is not working in our case. Also,
>>>>>>>> let us know if we are missing some setup step or if there is some
>>>>>>>> misconfiguration. If ping is not expected to work, can you let us know
>>>>>>>> another way to verify the connectivity?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Priyanka
>>>>>>
>>>>>>
>>>>
>>>>
>>
>>

