[ovs-discuss] DPDK OVS on Ubuntu 15.04

Mooney, Sean K sean.k.mooney at intel.com
Thu Aug 20 19:53:31 UTC 2015


Hi, I am glad you got it working.
Thanks also for reminding me about AppArmor.

We had set all profiles to complain mode, but AppArmor was still blocking access to the vhost-user sockets on our system as well.

Running apt-get purge apparmor results in VMs booting correctly on our system as well.

For now we will document this as a workaround and investigate updating the AppArmor profile to allow access to the
/var/run/openvswitch directory.

Just in case you are not aware: to have network connectivity in the VM when booting via OpenStack, it is also necessary to request hugepages.
This can be done by modifying the flavor as follows:
nova flavor-key <FLAVOR> set hw:mem_page_size=large
Without this flavor extra spec, nova will not generate the hugepages element or populate the page element.

On the acceleration question:
br-p6p1 and br-int both use the DPDK-enabled netdev datapath.

Because both bridges are connected by a patch port, OVS is able to
collapse the two bridges into a single datapath. This effectively means that while there are
two logical bridges, with the VM vhost-user port connected to br-int and the DPDK physical port connected
to br-p6p1, it is the equivalent of connecting all ports to a single bridge.

Regards
Sean.

From: Gabe Black [mailto:Gabe.Black at viavisolutions.com]
Sent: Thursday, August 20, 2015 5:20 PM
To: Mooney, Sean K
Cc: John Lange; bugs at openvswitch.org
Subject: RE: DPDK OVS on Ubuntu 15.04

Ok, I think I found the config that had it boot.

Your example xml that you attached had:

<memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0'/>
    </hugepages>
  </memoryBacking>

The page element was the missing piece for me.

Thanks!


From: Gabe Black
Sent: Thursday, August 20, 2015 9:31 AM
To: 'Mooney, Sean K'
Cc: John Lange; bugs at openvswitch.org
Subject: RE: DPDK OVS on Ubuntu 15.04

Hi Sean,

The symlink let me create the vms in openstack... I'm positive that would have taken me forever to figure out.

I did notice that when I create the vms with openstack it does generate the vhostuser section, but the vhost-user port is created under br-int rather than under br-p6p1. I believe br-p6p1 is the bridge to the physical port (p6p1, which is the Intel 82599 NIC) that I hope dpdk will accelerate.

I'm just now getting my hands dirty with openstack, so forgive my ignorance if that is just fine for it to be created under that bridge, as it will still use the dpdk acceleration.

Anyway, after I did make the numa modifications, now when I try and start the VM, I get the error "Unable to find any usable hugetlbfs mount for 0 KiB"

I thought maybe I needed to increase the hugetable allocations for ovs-dpdk so I changed /etc/defaults/ovs-dpdk items to these (they were 2048):

OVS_SOCKET_MEM=16384,16384
OVS_NUM_HUGEPAGES=16384

I then restarted ovs-dpdk service (service ovs-dpdk restart) as well as the libvirtd service in case it had to notice the changes as well (saw a note in the qemu.conf that indicated it might determine hugetable size at daemon start).
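A rough sanity check on values like these, as a sketch only. DPDK's --socket-mem takes megabytes per NUMA socket, and the example XML elsewhere in this thread uses 2048 KiB pages. The sketch assumes that OVS_NUM_HUGEPAGES is the total number of 2 MiB pages allocated on the host, which is an assumption about the plugin and is not confirmed in this thread. The helper name hugepage_budget_mib is hypothetical. The arithmetic only shows how much hugepage memory would remain for guests and does not by itself explain the hugetlbfs mount error.

```python
def hugepage_budget_mib(num_hugepages, socket_mem, page_size_kib=2048):
    """Return (total, reserved_for_ovs, left_for_guests) in MiB.

    num_hugepages: number of hugepages allocated (assumed meaning of OVS_NUM_HUGEPAGES)
    socket_mem:    comma-separated MB per NUMA socket, as passed to --socket-mem
    page_size_kib: hugepage size in KiB (2048 assumed, matching the XML in this thread)
    """
    total = num_hugepages * page_size_kib // 1024
    reserved = sum(int(part) for part in socket_mem.split(","))
    return total, reserved, total - reserved


# Values quoted in this message.
total, reserved, left = hugepage_budget_mib(16384, "16384,16384")
print("total=%d MiB, reserved for OVS=%d MiB, left for guests=%d MiB"
      % (total, reserved, left))
```

Under those assumptions the two settings claim the same amount of memory, so it may be worth leaving headroom between them for the VMs.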

However, I still get the same error.  I thought maybe it is a vm config error since it seems weird that it is trying to find a mount for "0" KiB.  I believe these are the relevant VM xml sections:


<memory unit='KiB'>1048576</memory>
  <currentMemory unit='KiB'>1048576</currentMemory>
  <memoryBacking>
    <hugepages/>
  </memoryBacking>
....
<cpu mode='host-model'>
    <model fallback='allow'>SandyBridge</model>
    <topology sockets='1' cores='4' threads='1'/>
    <numa>
      <cell id='0' cpus='0-3' memory='1048576' unit='KiB' memAccess='shared'/>
    </numa>
  </cpu>

Now that I can launch VMs with openstack, I may just try and modify that xml to add the hugetable backing (and maybe the vhost-user that is connected to br-p6p1) and see if that works.

Anyway, I really appreciate all the help.
Gabe

From: Mooney, Sean K [mailto:sean.k.mooney at intel.com]
Sent: Thursday, August 20, 2015 8:25 AM
To: Gabe Black
Cc: John Lange; bugs at openvswitch.org; Mooney, Sean K
Subject: RE: DPDK OVS on Ubuntu 15.04

The nosharepages element unfortunately has a different meaning.
http://www.redhat.com/archives/libvir-list/2013-April/msg01263.html

The <nosharepages> element sets the mem-merge=on|off option of -machine on the qemu command line,

whereas setting the memAccess attribute on the numa cell results in the hugepage file being mmap'd as shared instead of private.

<cpu mode="host-model">
<model fallback="allow"/>
<topology threads="1" cores="1" sockets="8"/>
<numa>
<cell id="0" unit="KiB" memAccess="shared" memory="4190208" cpus="0-7"/>
</numa>
</cpu>


The following Libvirt xml fragment will create a vhost-user port

<interface type="vhostuser">
<mac address="fa:16:3e:cd:81:4a"/>
<source type="unix" mode="client" path="/var/run/openvswitch/vhuc37437e9-0d"/>
<model type="virtio"/>
<alias name="net0"/>
<address type="pci" bus="0x00" function="0x0" slot="0x03" domain="0x0000"/>
</interface>


The important line is
<source type="unix" mode="client" path="/var/run/openvswitch/vhuc37437e9-0d"/>

type selects the connection type, e.g. a unix socket.
mode sets who will create the socket. Client mode means the socket will be created by OVS as part of the add-port.
In server mode the socket is created by the qemu process; however, this is not compatible with dpdk/ovs at this time.

And finally, path is the path to the socket.

The name of the port created with add-port will be used as the name of the socket.

If you are booting with openstack, nova will generate the correct Libvirt xml.
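If you are writing the xml by hand, the requirements above can be checked mechanically. The following is a minimal sketch using Python's standard xml.etree module. The function name check_vhost_user_domain and the sample domain are made up for illustration, and the checks cover only the points discussed in this thread (hugepage backing, memAccess="shared" on the numa cells, and a unix socket source in client mode), not everything libvirt validates.

```python
import xml.etree.ElementTree as ET


def check_vhost_user_domain(xml_text):
    """Return a list of problems found in a libvirt domain XML string."""
    root = ET.fromstring(xml_text)
    problems = []

    # Guest memory must be backed by hugepages.
    if root.find("./memoryBacking/hugepages") is None:
        problems.append("no <memoryBacking><hugepages/> element")

    # Every numa cell must map its memory as shared.
    cells = root.findall("./cpu/numa/cell")
    if not cells:
        problems.append("no <numa> cell defined")
    for cell in cells:
        if cell.get("memAccess") != "shared":
            problems.append("numa cell %s is not memAccess='shared'" % cell.get("id"))

    # vhost-user interfaces must connect as a client to a unix socket.
    for iface in root.findall("./devices/interface[@type='vhostuser']"):
        source = iface.find("source")
        if (source is None or source.get("type") != "unix"
                or source.get("mode") != "client"):
            problems.append("vhostuser interface is not a unix socket in client mode")

    return problems


# Hypothetical, trimmed-down domain used only to exercise the checks.
DOMAIN = """
<domain type='kvm'>
  <memoryBacking><hugepages/></memoryBacking>
  <cpu mode='host-model'>
    <numa>
      <cell id='0' cpus='0-3' memory='1048576' unit='KiB'/>
    </numa>
  </cpu>
  <devices>
    <interface type='vhostuser'>
      <source type='unix' mode='client' path='/var/run/openvswitch/vhuc37437e9-0d'/>
      <model type='virtio'/>
    </interface>
  </devices>
</domain>
"""

for problem in check_vhost_user_domain(DOMAIN):
    print(problem)
```

The sample deliberately leaves memAccess off the numa cell, so the only problem reported is the missing shared mapping.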

Regards
Sean.






From: Gabe Black [mailto:Gabe.Black at viavisolutions.com]
Sent: Wednesday, August 19, 2015 10:18 PM
To: Mooney, Sean K
Cc: John Lange; bugs at openvswitch.org
Subject: RE: DPDK OVS on Ubuntu 15.04

Hi Sean,

Thank you for the detailed reply.  I think I finally just discovered the permission denied issue, and I think it is actually related to apparmor.  I'm not sure if the configuration in /etc/apparmor.d/usr.sbin.libvirtd is complete for Ubuntu installations.

I didn't know what might be missing in that config, so for the sake of progress, I simply disabled apparmor altogether (simply stopping the service wasn't enough, had to disable the service and reboot).  I hope that information is useful as you guys transition to testing on Ubuntu.

I saw in the documentation that the memory backing needed to be shared, and the libvirt documentation seemed to indicate that hugetable backing is shared by default http://libvirt.org/formatdomain.html#elementsMemoryBacking
...  so all I specified was (i.e. I omitted the <nosharepages> element):

<memoryBacking>

    <hugepages/>

</memoryBacking>

Wouldn't omitting the <nosharepages> be sufficient?

I noticed that you didn't include any sort of xml for connecting to the vhost-user-1 socket.  Isn't that required?

Thanks again,
Gabe

From: Mooney, Sean K [mailto:sean.k.mooney at intel.com]
Sent: Wednesday, August 19, 2015 2:05 PM
To: Gabe Black
Cc: John Lange; bugs at openvswitch.org; Mooney, Sean K
Subject: RE: DPDK OVS on Ubuntu 15.04


Hello Gabriel,



Thanks for trying our code :)

Sorry for the long email but I will try to answer all your questions.



We used to support deploying on Ubuntu 12.04; however, for the last 9 months we have not had capacity to
test our deployment code on Ubuntu. It is good timing, however, as we are currently working on updating our deployment
code to work on Ubuntu.



I have just opened a bug to track this here: https://bugs.launchpad.net/networking-ovs-dpdk/+bug/1486697

Just as a side note, we are currently updating our documentation to include a short getting started guide for
Fedora 21. When we close the above bug we will update the getting started guide with information regarding deployment
on Ubuntu also. The current Fedora guide can be found here https://review.openstack.org/#/c/214156/2 however it
will not solve your current issues.



Currently we are investigating a number of issues relating to Ubuntu deployment.

We are aware of the su command limitation on Ubuntu and will be updating our code to make it compatible.

On Ubuntu 12.04 there previously was both a "libvirt-qemu" user and group

On 14.04 I believe the "libvirt-qemu" group has changed to "kvm"

On Ubuntu 15.04 you should have a new enough Libvirt and qemu installed by default.

On 14.04 the kilo branch of the Ubuntu cloud archive will provide the correct qemu and Libvirt versions.



In general I recommend Libvirt 1.2.13+ and qemu 2.2+



If you are manually booting with vhost-user via Libvirt, in addition to specifying the memory backing element

<memoryBacking>

    <hugepages/>

  </memoryBacking>

You also need to set the hugepages to be mapped as shared.

This can be done via the numa element in the Libvirt xml

e.g.

  <numa>

      <cell id='0' cpus='0-3' memory='512000' unit='KiB' memAccess='shared'/>

  </numa>



When booting VMs with OpenStack, this limitation was worked around in our qemu wrapper script for kilo and has been fixed upstream in the liberty nova tree.



The following message can be safely ignored.

2015-08-18T22:20:28Z|00011|dpif_netlink|ERR|Generic Netlink family 'ovs_datapath' does not exist. The Open vSwitch kernel module is probably not loaded.

As part of the plugin we also unload the kernel module, as it is not required when using dpdk.



This error message is issued because our devstack plugin does not currently create the bridges and set their datapath to netdev as a single atomic command.

As a result the bridge is initially created with the kernel datapath then updated to the netdev datapath resulting in the above warning.



On to your actual question: I think the permission denied issue is related to the su issues mentioned previously.



When starting ovs-vswitchd, instead of changing

"sudo su -g $qemu_group ..." to "sg $qemu_group ..."

can you change it as follows:

"sudo su -g $qemu_group -c ..." to "sudo -g $qemu_group ..."



The command our service file uses to start ovs is as follows



screen -dms ovs-vswitchd sudo su -g $qemu_group -c "umask 002; ${OVS_INSTALL_DIR}/sbin/ovs-vswitchd --dpdk -c $OVS_CORE_MASK -n $OVS_MEM_CHANNELS  --proc-type primary  --huge-dir $OVS_HUGEPAGE_MOUNT --socket-mem $OVS_SOCKET_MEM $pciAddressWhitelist -- unix:$OVS_DB_SOCKET 2>&1 | tee ${OVS_LOG_DIR}/ovs-vswitchd.log"



That is rather long, but the important parts are as follows.



umask 002 changes the default file creation mode of the ovs-vswitchd process so that files it creates are writable by both the owning user and the group.



su -g $qemu_group makes the ovs-vswitchd process run with the same primary group as the qemu process.



And obviously the  sudo causes the ovs-vswitchd process to run as the root user.



sudo supports setting the group with the -g flag directly, so su is not needed.



The result of combining these commands is that all vhost-user sockets created by ovs will be owned by the root user and the Libvirt-qemu user's group (previously "libvirt-qemu", now "kvm").
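To illustrate why the umask matters: on Linux, a process needs write permission on a unix socket to connect to it, so the group write bit has to survive socket creation. The following sketch binds a throwaway socket in a temporary directory under two different umasks and reports the resulting mode. The helper name socket_mode_under_umask is hypothetical, the path is not the real OVS socket, and the behaviour shown is what I would expect on Linux.

```python
import os
import socket
import stat
import tempfile


def socket_mode_under_umask(mask):
    """Bind a unix socket under the given umask and return its permission bits."""
    old = os.umask(mask)
    try:
        with tempfile.TemporaryDirectory() as tmp:
            # Throwaway path for the demo, not the real vhost-user socket.
            path = os.path.join(tmp, "vhost-user-1")
            sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
            try:
                sock.bind(path)
                return stat.S_IMODE(os.stat(path).st_mode)
            finally:
                sock.close()
    finally:
        # Always restore the caller's umask.
        os.umask(old)


for mask in (0o022, 0o002):
    mode = socket_mode_under_umask(mask)
    print("umask %03o -> socket mode %03o, group writable: %s"
          % (mask, mode, bool(mode & stat.S_IWGRP)))
```

With the common default umask of 022 the group write bit is stripped, which is why the service file sets 002 before starting ovs-vswitchd.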



Finally to boot a vm successfully via openstack you will need to create a symlink between /usr/var/run/openvswitch and /var/run/openvswitch.

The base path for the vhost-user socket is a constant in our mechanism driver as it must be the same on all compute nodes regardless of the operating system.



On a side note, if anyone knows the correct configure options to specify that binaries should be installed in /usr/bin and all unix sockets (ovsdb, ovs-vswitchd and vhost-user) should be created in /var/run/openvswitch, please let me know.



Currently we use ./configure --with-dpdk=${OVS_DPDK_DIR}/${RTE_TARGET} --prefix=/usr

However on fedora unix sockets are created in /var/run/openvswitch and on Ubuntu they are created in /usr/var/run/openvswitch

I am not sure if the information above will resolve your issues but as I said at the start of the email we are currently working to update our scripts to enable ubuntu support.



Regards

Sean.







-----Original Message-----

From: Gabe Black [mailto:Gabe.Black at viavisolutions.com]

Sent: Wednesday, August 19, 2015 12:33 AM

To: Mooney, Sean K

Cc: John Lange; bugs at openvswitch.org

Subject: DPDK OVS on Ubuntu 15.04



Hi Sean,



I have carried on where my colleague John Lange left off.  I have installed Ubuntu 15.04 which has kilo support but instead have used the devstack from https://github.com/openstack-dev/devstack to install everything.



For now, I am trying to get a single host set up to try and keep it as simple as possible.



I have used the sample configuration file you suggested for an all in one node provided here https://github.com/stackforge/networking-ovs-dpdk/blob/master/doc/source/_downloads/local.conf_example



Changes that I have made in order to get things to work on Ubuntu, are the following:



====================================================



networking-ovs-dpdk/devstack/ovs-dpdk/ovs-dpdk-init:454



+    qemu_group=`id -ng $qemu_group`



Also, on lines 463 and 471 I changed the command running in the screen from "sudo su -g $qemu_group ..." to "sg $qemu_group ..."



====================================================



The reason for these changes is that on Ubuntu the "-g" option does not exist on the su command, so /etc/init.d/ovs-dpdk fails to start.  Also the name "libvirt-qemu" is not the group name, but rather the user.
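For reference, the lookup that `id -ng` performs (resolving a user name to the name of its primary group) can be sketched with Python's standard pwd and grp modules. The function name primary_group_name is made up for illustration, and the "libvirt-qemu" user may not exist on every host.

```python
import grp
import pwd


def primary_group_name(user):
    """Return the name of the user's primary group, like `id -ng USER`."""
    gid = pwd.getpwnam(user).pw_gid
    return grp.getgrgid(gid).gr_name


if __name__ == "__main__":
    # "libvirt-qemu" is the user name mentioned in this thread.
    try:
        print(primary_group_name("libvirt-qemu"))
    except KeyError:
        print("no such user or group on this host")
```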



It seems that with those changes devstack.sh is able to complete.



--------------------------------



There were a few messages that made me wonder if all was well; for example the ovs-vswitchd.log file had the following:

...

2015-08-18T22:20:28Z|00011|dpif_netlink|ERR|Generic Netlink family 'ovs_datapath' does not exist. The Open vSwitch kernel module is probably not loaded.

...



Also, creating the vhost user port resulted in the following: (note, the ovs-vsctl add-port... command to create the vhost-user did not complain) ...

VHOST_CONFIG: socket created, fd:52

VHOST_CONFIG: bind to /usr/var/run/openvswitch/vhost-user-1

2015-08-18T22:33:01Z|00077|dpdk|INFO|Socket /usr/var/run/openvswitch/vhost-user-1 created for vhost-user port vhost-user-1

2015-08-18T22:33:01Z|00078|dpif_netdev|ERR|Cannot create pmd threads due to out of unpinned cores on numa node

2015-08-18T22:33:01Z|00079|bridge|INFO|bridge br-p6p1: added interface vhost-user-1 on port 3

...



I never attempted to pin any cores, but it made me wonder if something might be incomplete... (I'm running on a dual quad core (with hyperthreading)).



=====================================================



Anyway, I added the following sections to my VM's libvirt qemu xml (/etc/libvirt/qemu/ubuntutrusty.xml) file to try and have it take the vhost-user interface, as well as ensure the VM is backed by hugetable memory.  These are the sections I added:



Hugetable-backed memory:



<memoryBacking>

    <hugepages/>

  </memoryBacking>



Vhost-user interface (I tried using the <interface type="vhostuser">... but virsh edit would complain about it not validating against domain.rng... I thought it would be supported... running 1.2.12 )

<qemu:commandline>

    <qemu:arg value='-chardev'/>

    <qemu:arg value='socket,id=char1,path=/usr/var/run/openvswitch/vhost-user-1'/>

    <qemu:arg value='-netdev'/>

    <qemu:arg value='type=vhost-user,id=mynet1,chardev=char1,vhostforce'/>

    <qemu:arg value='-device'/>

    <qemu:arg value='virtio-net-pci,mac=54:53:00:6a:b3:00,netdev=mynet1'/>

  </qemu:commandline>



Ok, so this finally brings me to my query for assistance:  When I try and launch the VM (via virt-manager), it complains about two things:



1) "process exited while connecting to monitor: /usr/bin/kvm-spice: line 36: /tmp/qemu.orig: Permission denied"

- I tried chmod 777 that file just to allow anyone to write/access that file, and it is getting the command written to it, but that error message still shows... Don't know if that is important..



2) qemu-system-x86_64: -chardev socket,id=char1,path=/usr/var/run/openvswitch/vhost-user-1: Failed to connect to socket: Permission Denied.



I've tried messing with group/user config variables in /etc/libvirt/qemu.conf and /etc/libvirt/libvirtd.conf but most of the time I then get devstack's n-cpu unable to connect to libvirtd (similar permission error).



I feel I am close, and would appreciate any advice you might have.



Very best,

Gabriel Black





