[ovs-dev] [PATCH 1/2] doc: Refactor DPDK install documentation

Bodireddy, Bhanuprakash bhanuprakash.bodireddy at intel.com
Wed Jun 1 16:15:12 UTC 2016


Thanks, Flavio, for reviewing the install guide in detail. My comments are inline.

>-----Original Message-----
>From: Flavio Leitner [mailto:fbl at sysclose.org]
>Sent: Tuesday, May 31, 2016 9:44 PM
>To: Bodireddy, Bhanuprakash <bhanuprakash.bodireddy at intel.com>
>Cc: dev at openvswitch.org; Traynor, Kevin <kevin.traynor at intel.com>
>Subject: Re: [ovs-dev] [PATCH 1/2] doc: Refactor DPDK install documentation
>
>
>Hi,
>
>Thanks for doing this.
>I have some comments inline.
>fbl
>
>
>On Thu, May 26, 2016 at 01:46:42PM +0100, Bhanuprakash Bodireddy wrote:
>> Refactor the INSTALL.DPDK into two documents named INSTALL.DPDK and
>> INSTALL.DPDK-ADVANCED. While the INSTALL.DPDK document shall facilitate the
>> novice user in setting up the OVS DPDK and running it out of the box, the
>> ADVANCED document is targeted at expert users looking for optimum
>> performance running the dpdk datapath.
>>
>> This commit updates INSTALL.DPDK.md document.
>>
>> Signed-off-by: Bhanuprakash Bodireddy
><bhanuprakash.bodireddy at intel.com>
>> ---
>>  INSTALL.DPDK.md | 1299 ++++++++++++++++++-------------------------------------
>>  1 file changed, 429 insertions(+), 870 deletions(-)
>>
>> diff --git a/INSTALL.DPDK.md b/INSTALL.DPDK.md
>> index 68735cc..561631f 100644
>> --- a/INSTALL.DPDK.md
>> +++ b/INSTALL.DPDK.md
>> @@ -1,1020 +1,579 @@
>> -Using Open vSwitch with DPDK
>> -============================
>> +OVS DPDK INSTALL GUIDE
>> +================================
>>
>> -	`./testpmd -c 0x3 -n 4 --socket-mem 512 -- --burst=64 -i --txqflags=0xf00 --disable-hw-vlan --forward-mode=io --auto-start`
>> +     Note: For IVSHMEM, Set `export DPDK_TARGET=x86_64-ivshmem-linuxapp-gcc`
>>
>> -	See below information on dpdkvhostcuse and dpdkvhostuser ports.
>> -	See [DPDK Docs] for more information on `testpmd`.
>> +### 2.3 Install OVS
>
>It seems to me that this section could be better.  We have a good INSTALL.md
>file covering all options, additional details and also have pointers to more
>specifics like how to do in Fedora or Debian.
>
>For instance, Fedora spec file in branch master allows you to build with
>DPDK support with a simple command line:
>
>   $ make rpm-fedora RPMBUILD_OPT="--with dpdk"
>
>Nothing wrong documenting a generic recipe, but I missed the other
>options.  Perhaps something like:
>
>2.3 Install OVS
>   OVS can be installed using different methods. The only requirement to
>install with DPDK support enabled is to pass an extra argument to
>./configure.  You can find additional information in INSTALL.md or more
>specific instructions for a distribution in the other INSTALL.*.md files
>available in the repository.  This document focuses on a generic recipe
>that should work for most cases....

Good point. I will rework this section with your comments in mind, add hyperlinks to
INSTALL.md, and redirect users doing distribution-specific builds to the respective pages.

>
>I am sure it can be reworded in a better way, but it shows my point.
>
>
>> +  OVS can be downloaded in compressed format from the OVS release page (or)
>> +  cloned from git repository if user intends to develop and contribute
>> +  patches upstream.
>>
>> +  - [Download OVS] tarball and extract the file, for example into /usr/src,
>> +     and set OVS_DIR
>>
>> +     ```
>> +     wget -O ovs.tar https://github.com/openvswitch/ovs/tarball/master
>> +     mkdir -p /usr/src/ovs
>> +     tar -xvf ovs.tar -C /usr/src/ovs --strip-components=1
>> +     export OVS_DIR=/usr/src/ovs
>> +     ```
>>
>> -DPDK Rings :
>> -------------
>> +  - Clone the Git repository for OVS, for example into /usr/src
>>
>> -Following the steps above to create a bridge, you can now add dpdk rings
>> -as a port to the vswitch.  OVS will expect the DPDK ring device name to
>> -start with dpdkr and end with a portid.
>> +     ```
>> +     cd /usr/src/
>> +     git clone https://github.com/openvswitch/ovs.git
>> +     export OVS_DIR=/usr/src/ovs
>> +     ```
>>
>> -`ovs-vsctl add-port br0 dpdkr0 -- set Interface dpdkr0 type=dpdkr`
>> +  - Install OVS dependencies
>>
>> -DPDK rings client test application
>> +     GNU make, GCC 4.x (or) Clang 3.4  (Mandatory)
>> +     libssl, libcap-ng, Python 2.7  (Optional)
>> +     More information can be found at [Build Requirements]
>>
>> -Included in the test directory is a sample DPDK application for testing
>> -the rings.  This is from the base dpdk directory and modified to work
>> -with the ring naming used within ovs.
>> +  - Configure, Install OVS
>>
>> -location tests/ovs_client
>> +     ```
>> +     cd $OVS_DIR
>> +     ./boot.sh
>> +     ./configure --with-dpdk=$DPDK_BUILD
>> +     make install
>> +     ```
>>
>> -To run the client :
>> +     Note: Passing DPDK_BUILD can be skipped if DPDK library is installed in
>> +     standard locations i.e `./configure --with-dpdk` should suffice.
>>
>> -```
>> -cd /usr/src/ovs/tests/
>> -ovsclient -c 1 -n 4 --proc-type=secondary -- -n "port id you gave dpdkr"
>> -```
>> +## <a name="ovssetup"></a> 3. Setup OVS with DPDK datapath
>>
>> -In the case of the dpdkr example above the "port id you gave dpdkr" is 0.
>> +### 3.1 Setup Hugepages
>>
>> -It is essential to have --proc-type=secondary
>> +  Allocate and mount 2M Huge pages:
>>
>> -The application simply receives an mbuf on the receive queue of the
>> -ethernet ring and then places that same mbuf on the transmit ring of
>> -the ethernet ring.  It is a trivial loopback application.
>> +  - For persistent allocation of huge pages, write to hugepages.conf file
>> +    in /etc/sysctl.d
>>
>> -DPDK rings in VM (IVSHMEM shared memory communications)
>> --------------------------------------------------------
>> +    `echo 'vm.nr_hugepages=2048' > /etc/sysctl.d/hugepages.conf`
>>
>> -In addition to executing the client in the host, you can execute it within
>> -a guest VM. To do so you will need a patched qemu.  You can download the
>> -patch and getting started guide at :
>> +  - For run-time allocation of huge pages
>>
>> -https://01.org/packet-processing/downloads
>> +    `sysctl -w vm.nr_hugepages=N` where N = No. of 2M huge pages allocated
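For readers wondering how to pick N, the arithmetic is trivial; a small illustrative sketch (the 4 GB target is an assumption of mine, chosen to match the 2048-page persistent example above):

```shell
# Worked example: number of 2M hugepages needed for a 4 GB hugepage pool.
TARGET_MB=4096   # desired hugepage memory in MB (illustrative)
PAGE_MB=2        # size of one 2M hugepage
N=$((TARGET_MB / PAGE_MB))
echo "$N"        # 2048, the value used with vm.nr_hugepages above
```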
>>
>> -A general rule of thumb for better performance is that the client
>> -application should not be assigned the same dpdk core mask "-c" as
>> -the vswitchd.
>> +  - To verify hugepage configuration
>>
>> -DPDK vhost:
>> ------------
>> +    `grep HugePages_ /proc/meminfo`
>>
>> -DPDK 16.04 supports two types of vhost:
>> +  - Mount hugepages
>
>I'd say something like 'Mount hugepages if not already mounted by default',
>otherwise it can be double mounted and that would hide the libvirt dir.

Agree. I will add this.
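Something along these lines, perhaps (a rough sketch, not final wording for the guide; the mounts file is parameterized only so the check can be exercised without root):

```shell
# Mount hugetlbfs only when no hugetlbfs mount already exists, so an
# existing (e.g. libvirt-managed) mount is not shadowed by a second one.
MOUNTS_FILE=${MOUNTS_FILE:-/proc/mounts}

hugetlbfs_mounted() {
    grep -qs hugetlbfs "$MOUNTS_FILE"
}

if ! hugetlbfs_mounted; then
    # Requires root; printed here as the action that would be taken.
    echo "mount -t hugetlbfs none /dev/hugepages"
fi
```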


>
>
>>
>> -1. vhost-user
>> -2. vhost-cuse
>> +    `mount -t hugetlbfs none /dev/hugepages`
>>
>> -Whatever type of vhost is enabled in the DPDK build specified, is the type
>> -that will be enabled in OVS. By default, vhost-user is enabled in DPDK.
>> -Therefore, unless vhost-cuse has been enabled in DPDK, vhost-user ports
>> -will be enabled in OVS.
>> -   ```
>> +     SSL support
>>
>> -   If one wishes to use multiple queues for an interface in the guest, the
>> -   driver in the guest operating system must be configured to do so. It is
>> -   recommended that the number of queues configured be equal to '$q'.
>> +     ```
>> +     ovsdb-server --remote=punix:/usr/local/var/run/openvswitch/db.sock \
>> +         --remote=db:Open_vSwitch,Open_vSwitch,manager_options \
>> +         --private-key=db:Open_vSwitch,SSL,private_key \
>> +         --certificate=db:Open_vSwitch,SSL,certificate \
>> +         --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --pidfile --detach
>> +     ```
>>
>> -   For example, this can be done for the Linux kernel virtio-net driver with:
>> +  3. Initialize DB (One time step)
>>
>> -   ```
>> -   ethtool -L <DEV> combined <$q>
>> -   ```
>> +     ```
>> +     ovs-vsctl --no-wait init
>> +     ```
>>
>> -   A note on the command above:
>> +  4. Start vswitchd
>>
>
>This section can be simplified by just listing the main options and
>pointing to ovs-vswitchd.conf.db(5) for descriptions.

In fact, I had the same impression when I moved the changes to the new install guide. I will rework and simplify this section.

>
>
>> -   `-L`: Changes the numbers of channels of the specified network device
>> +     DPDK configuration arguments can be passed to vswitchd via the
>> +     Open_vSwitch other_config column. The recognized configuration options
>> +     are listed below.
>> +     Defaults will be provided for all values not explicitly set.
>>
>> -   `combined`: Changes the number of multi-purpose channels.
>> +     * dpdk-init
>> +     Specifies whether OVS should initialize and support DPDK ports. This is
>> +     a boolean, and defaults to false.
>>
>> -DPDK vhost-cuse:
>> -----------------
>> +     * dpdk-lcore-mask
>> +     Specifies the CPU cores on which dpdk lcore threads should be spawned.
>> +     The DPDK lcore threads are used for DPDK library tasks, such as
>> +     library internal message processing, logging, etc. Value should be in
>> +     the form of a hex string (so '0x123') similar to the 'taskset' mask
>> +     input.
>> +     If not specified, the value will be determined by choosing the lowest
>> +     CPU core from initial cpu affinity list. Otherwise, the value will be
>> +     passed directly to the DPDK library.
>> +     For performance reasons, it is best to set this to a single core on
>> +     the system, rather than allow lcore threads to float.
>>
>> -The following sections describe the use of vhost-cuse 'dpdkvhostcuse'
>> -ports with OVS.
>> +     * dpdk-alloc-mem
>> +     This sets the total memory to preallocate from hugepages regardless of
>> +     processor socket. It is recommended to use dpdk-socket-mem instead.
>>
>> -DPDK vhost-cuse Prerequisites:
>> --------------------------
>> +     * dpdk-socket-mem
>> +     Comma separated list of memory to pre-allocate from hugepages on
>> +     specific sockets.
>>
>> -1. DPDK 16.04 with vhost support enabled as documented in the "Building
>> -   and Installing" section.
>> -   As an additional step, you must enable vhost-cuse in DPDK by setting the
>> -   following additional flag in `config/common_base`:
>> +     * dpdk-hugepage-dir
>> +     Directory where hugetlbfs is mounted
>>
>> -   `CONFIG_RTE_LIBRTE_VHOST_USER=n`
>> +     * dpdk-extra
>> +     Extra arguments to provide to DPDK EAL, as previously specified on the
>> +     command line. Do not pass '--no-huge' to the system in this way. Support
>> +     for running the system without hugepages is nonexistent.
>>
>> -   Following this, rebuild DPDK as per the instructions in the "Building and
>> -   Installing" section. Finally, rebuild OVS as per step 3 in the "Building
>> -   and Installing" section - OVS will detect that DPDK has vhost-cuse libraries
>> -   compiled and in turn will enable support for it in the switch and disable
>> -   vhost-user support.
>> +     * cuse-dev-name
>> +     Option to set the vhost_cuse character device name.
>>
>> -2. Insert the Cuse module:
>> +     * vhost-sock-dir
>> +     Option to set the path to the vhost_user unix socket files.
>>
>> -     `modprobe cuse`
>> +     NOTE: Changing any of these options requires restarting the
>> +     ovs-vswitchd application.
>>
>> -3. Build and insert the `eventfd_link` module:
>> +     Open vSwitch can be started as normal. DPDK will be initialized as long
>> +     as the dpdk-init option has been set to 'true'.
>>
>>       ```
>> -     cd $DPDK_DIR/lib/librte_vhost/eventfd_link/
>> -     make
>> -     insmod $DPDK_DIR/lib/librte_vhost/eventfd_link.ko
>> +     export DB_SOCK=/usr/local/var/run/openvswitch/db.sock
>> +     ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
>> +     ovs-vswitchd unix:$DB_SOCK --pidfile --detach
>>       ```
>>
>> -4. QEMU version v2.1.0+
>> -
>> -   vhost-cuse will work with QEMU v2.1.0 and above, however it is
>> -   recommended to use v2.2.0 if providing your VM with memory greater than
>> -   1GB due to potential issues with memory mapping larger areas.
>> -   Note: QEMU v1.6.2 will also work, with slightly different command line
>> -   parameters, which are specified later in this document.
>> -
>> -Adding DPDK vhost-cuse ports to the Switch:
>> ---------------------------------------
>> -
>> -Following the steps above to create a bridge, you can now add DPDK
>> -vhost-cuse as a port to the vswitch. Unlike DPDK ring ports, DPDK
>> -vhost-cuse ports can have arbitrary names.
>> -
>> -  -  For vhost-cuse, the name of the port type is `dpdkvhostcuse`
>> +     If allocated more than one GB hugepage (as for IVSHMEM), set amount and
>> +     use NUMA node 0 memory. For details on using ivshmem with DPDK, refer
>> +     to [OVS Testcases].
>>
>>       ```
>> -     ovs-vsctl add-port br0 vhost-cuse-1 -- set Interface vhost-cuse-1
>> -     type=dpdkvhostcuse
>> +     ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="1024,0"
>> +     ovs-vswitchd unix:$DB_SOCK --pidfile --detach
>>       ```
>>
>> -     When attaching vhost-cuse ports to QEMU, the name provided during the
>> -     add-port operation must match the ifname parameter on the QEMU command
>> -     line. More instructions on this can be found in the next section.
>> -
>> -DPDK vhost-cuse VM configuration:
>> ----------------------------------
>> -
>> -   vhost-cuse ports use a Linux* character device to communicate with QEMU.
>> -   By default it is set to `/dev/vhost-net`. It is possible to reuse this
>> -   standard device for DPDK vhost, which makes setup a little simpler but it
>> -   is better practice to specify an alternative character device in order to
>> -   avoid any conflicts if kernel vhost is to be used in parallel.
>> +     To better scale the workloads across cores, multiple pmd threads can be
>> +     created and pinned to CPU cores by explicitly specifying pmd-cpu-mask.
>> +     eg: To spawn 2 pmd threads and pin them to cores 1, 2
>>
>> -1. This step is only needed if using an alternative character device.
>> +     ```
>> +     ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=6
>> +     ```
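For what it's worth, the mask value can be derived mechanically from the core list; an illustrative sketch (the core ids are taken from the cores 1, 2 example above):

```shell
# Build a pmd-cpu-mask hex string from a list of core ids: each core id
# sets bit <id> in the mask, so cores 1 and 2 give binary 110 = 0x6.
CORES="1 2"
mask=0
for core in $CORES; do
    mask=$((mask | (1 << core)))
done
PMD_CPU_MASK=$(printf '0x%x' "$mask")
echo "$PMD_CPU_MASK"   # 0x6
```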
>>
>> -   The new character device filename must be specified in the ovsdb:
>> +  5. Create bridge & add DPDK devices
>>
>> -        `./utilities/ovs-vsctl --no-wait set Open_vSwitch . \
>> -                          other_config:cuse-dev-name=my-vhost-net`
>> +     create a bridge with datapath_type "netdev" in the configuration database
>>
>> -   In the example above, the character device to be used will be
>> -   `/dev/my-vhost-net`.
>> +     `ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev`
>>
>> -2. This step is only needed if reusing the standard character device. It will
>> -   conflict with the kernel vhost character device so the user must first
>> -   remove it.
>> +     Now you can add DPDK devices. OVS expects DPDK device names to start
>> +     with "dpdk" and end with a portid. vswitchd should print (in the log
>> +     file) the number of dpdk devices found.
>>
>> -       `rm -rf /dev/vhost-net`
>> +     ```
>> +     ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
>> +     ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk
>> +     ```
>>
>> -3a. Configure virtio-net adaptors:
>> -   The following parameters must be passed to the QEMU binary:
>> +     After the DPDK ports are added to the switch, a polling thread
>> +     continuously polls the DPDK devices and consumes 100% of its core, as
>> +     can be checked with the 'top' and 'ps' commands.
>>
>>       ```
>> -     -netdev tap,id=<id>,script=no,downscript=no,ifname=<name>,vhost=on
>> -     -device virtio-net-pci,netdev=net1,mac=<mac>
>> +     top -H
>> +     ps -eLo pid,psr,comm | grep pmd
>>       ```
>>
>> -     Repeat the above parameters for multiple devices.
>> -
>> -     The DPDK vhost library will negotiate its own features, so they
>> -     need not be passed in as command line params. Note that as offloads
>> -     are disabled this is the equivalent of setting:
>> +     Note: creating bonds of DPDK interfaces is slightly different to creating
>> +     bonds of system interfaces.  For DPDK, the interface type must be
>> +     explicitly set, for example:
>>
>> -     `csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off`
>> +     ```
>> +     ovs-vsctl add-bond br0 dpdkbond dpdk0 dpdk1 -- set Interface dpdk0 type=dpdk -- set Interface dpdk1 type=dpdk
>> +     ```
>>
>> -3b. If using an alternative character device. It must be also explicitly
>> -    passed to QEMU using the `vhostfd` argument:
>> +  6. PMD thread statistics
>>
>>       ```
>> -     -netdev tap,id=<id>,script=no,downscript=no,ifname=<name>,vhost=on,
>> -     vhostfd=<open_fd>
>> -     -device virtio-net-pci,netdev=net1,mac=<mac>
>> -     ```
>> +     # Check current stats
>> +       ovs-appctl dpif-netdev/pmd-stats-show
>>
>> -     The open file descriptor must be passed to QEMU running as a child
>> -     process. This could be done with a simple python script.
>> +     # Show port/rxq assignment
>> +       ovs-appctl dpif-netdev/pmd-rxq-show
>>
>> -       ```
>> -       #!/usr/bin/python
>> -       import os
>> -       import subprocess
>> -       fd = os.open("/dev/usvhost", os.O_RDWR)
>> -       subprocess.call("qemu-system-x86_64 .... -netdev tap,id=vhostnet0,\
>> -                        vhost=on,vhostfd=" + str(fd) + " ...", shell=True)
>> +     # Clear previous stats
>> +       ovs-appctl dpif-netdev/pmd-stats-clear
>> +     ```
>>
>> -   Alternatively the `qemu-wrap.py` script can be used to automate the
>> -   requirements specified above and can be used in conjunction with libvirt
>> -   if desired. See the "DPDK vhost VM configuration with QEMU wrapper"
>> -   section below.
>> +  7. Stop vswitchd & Delete bridge
>>
>> -4. Configure huge pages:
>> -   QEMU must allocate the VM's memory on hugetlbfs. Vhost ports access a
>> -   virtio-net device's virtual rings and packet buffers mapping the VM's
>> -   physical memory on hugetlbfs. To enable vhost-ports to map the VM's
>> -   memory into their process address space, pass the following parameters
>> -   to QEMU:
>> +     ```
>> +     ovs-appctl -t ovs-vswitchd exit
>> +     ovs-appctl -t ovsdb-server exit
>> +     ovs-vsctl del-br br0
>> +     ```
>
>
>I think you need to delete br0 before stopping ovsdb-server.
>
>
>>
>> -     `-object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,
>> -      share=on -numa node,memdev=mem -mem-prealloc`
>> +## <a name="builddpdk"></a> 4. DPDK in the VM
>>
>> -   Note: For use with an earlier QEMU version such as v1.6.2, use the
>> -   following to configure hugepages instead:
>> +The DPDK 'testpmd' application can be run in the Guest VM for high speed
>> +packet forwarding between vhostuser ports. This requires DPDK and testpmd
>> +to be compiled along with the kernel modules. Below are the steps for
>> +setting up the testpmd application in the VM. More information on
>> +vhostuser ports can be found in [Vhost Walkthrough].
>
>
>This looks way too complicated for a beginner's guide.  I think you can
>assume that the VM has networking connectivity or even better that the
>user knows how to put a tarball inside of the VM and then take from there.

Point taken. Will simplify this.

>
>
>>
>> -     `-mem-path /dev/hugepages -mem-prealloc`
>> +  * Export the DPDK location $DPDK_LOC to the Guest VM (/dev/sdb on VM)
>> +    and instantiate the Guest.
>> -DPDK vhost-cuse VM configuration with libvirt and QEMU wrapper:
>> -----------------------------------------------------------
>> +       # Dump flows
>> +       ovs-ofctl dump-flows br0
>> +       ```
>>
>> -To use the qemu-wrapper script in conjuntion with libvirt, follow the
>> -steps in the previous section before proceeding with the following steps:
>> +  3. Instantiate Guest VM using Qemu cmdline
>>
>> -  1. Place `qemu-wrap.py` in libvirtd's binary search PATH ($PATH)
>> -     Ideally in the same directory that the QEMU binary is located.
>> +       Guest Configuration
>>
>> -  2. Ensure that the script has the same owner/group and file permissions
>> -     as the QEMU binary.
>> +       ```
>> +       | configuration        | values | comments
>> +       |----------------------|--------|-----------------
>> +       | qemu version         | 2.2.0  |
>> +       | qemu thread affinity | core 5 | taskset 0x20
>> +       | memory               | 4GB    | -
>> +       | cores                | 2      | -
>> +       | Qcow2 image          | CentOS7| -
>> +       | mrg_rxbuf            | off    | -
>> +       | export DPDK sources  | yes    | -drive file=fat:rw:$DPDK_LOC (seen as /dev/sdb in VM)
>> +       ```
>>
>> -  3. Update the VM xml file using "virsh edit VM.xml"
>> +       ```
>
>You had a subsection called 'Guest configuration', I think this deserves
>another subsection, e.g.: 'Guest Starting Command'

Good observation. I will create another subsection here.

>
>> +       export VM_NAME=vhost-vm
>> +       export GUEST_MEM=3072M
>> +       export QCOW2_IMAGE=/root/CentOS7_x86_64.qcow2
>> +       export DPDK_LOC=/usr/src/dpdk-16.04
>> +       export VHOST_SOCK_DIR=/usr/local/var/run/openvswitch
>>
>> -       1. Set the VM to use the launch script.
>> -          Set the emulator path contained in the `<emulator><emulator/>` tags.
>> -          For example, replace:
>> +       taskset 0x20 qemu-system-x86_64 -name $VM_NAME -cpu host -enable-kvm \
>> +         -m $GUEST_MEM -object memory-backend-file,id=mem,size=$GUEST_MEM,mem-path=/dev/hugepages,share=on \
>> +         -numa node,memdev=mem -mem-prealloc -smp sockets=1,cores=2 \
>> +         -drive file=$QCOW2_IMAGE -drive file=fat:rw:$DPDK_LOC,snapshot=off \
>> +         -chardev socket,id=char0,path=$VHOST_SOCK_DIR/dpdkvhostuser0 \
>> +         -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
>> +         -device virtio-net-

Regards,
Bhanu Prakash.


