[ovs-dev] [PATCH 2/5] doc: Split dpdk, dpdk-advanced into multiple docs

Stephen Finucane stephen at that.guru
Tue Dec 13 17:40:39 UTC 2016


Combined, the dpdk and dpdk-advanced installation documents provide a
lot of useful information, but most of this information is unrelated to
installation. Rework these documents, completely breaking up the
dpdk-advanced document into multiple smaller documents in other sections
and moving non-install aspects of the dpdk document into these sections.
This aims to tie the DPDK docs into the documentation structure.

Signed-off-by: Stephen Finucane <stephen at that.guru>
---
 Documentation/automake.mk                          |   6 +-
 Documentation/howto/dpdk.rst                       | 603 +++++++++++++
 Documentation/howto/index.rst                      |   1 +
 Documentation/index.rst                            |  13 +-
 Documentation/intro/install/dpdk-advanced.rst      | 938 ---------------------
 Documentation/intro/install/dpdk.rst               | 584 ++++++-------
 Documentation/intro/install/index.rst              |   5 -
 Documentation/topics/dpdk/index.rst                |  32 +
 .../topics/{dpdk.rst => dpdk/ivshmem.rst}          |   6 +-
 Documentation/topics/dpdk/vhost-user.rst           | 396 +++++++++
 Documentation/topics/index.rst                     |   3 +-
 Documentation/topics/testing.rst                   |  38 +
 12 files changed, 1368 insertions(+), 1257 deletions(-)
 create mode 100644 Documentation/howto/dpdk.rst
 delete mode 100644 Documentation/intro/install/dpdk-advanced.rst
 create mode 100644 Documentation/topics/dpdk/index.rst
 rename Documentation/topics/{dpdk.rst => dpdk/ivshmem.rst} (93%)
 create mode 100644 Documentation/topics/dpdk/vhost-user.rst
 create mode 100644 Documentation/topics/testing.rst

diff --git a/Documentation/automake.mk b/Documentation/automake.mk
index b02d63e..ffb8ae3 100644
--- a/Documentation/automake.mk
+++ b/Documentation/automake.mk
@@ -9,7 +9,6 @@ EXTRA_DIST += \
 	Documentation/intro/install/index.rst \
 	Documentation/intro/install/bash-completion.rst \
 	Documentation/intro/install/debian.rst \
-	Documentation/intro/install/dpdk-advanced.rst \
 	Documentation/intro/install/dpdk.rst \
 	Documentation/intro/install/fedora.rst \
 	Documentation/intro/install/general.rst \
@@ -25,7 +24,10 @@ EXTRA_DIST += \
 	Documentation/topics/bonding.rst \
 	Documentation/topics/datapath.rst \
 	Documentation/topics/design.rst \
-	Documentation/topics/dpdk.rst \
+	Documentation/topics/dpdk/index.rst \
+	Documentation/topics/dpdk/vhost-user.rst \
+	Documentation/topics/dpdk/ivshmem.rst \
+	Documentation/topics/testing.rst \
 	Documentation/topics/high-availability.rst \
 	Documentation/topics/integration.rst \
 	Documentation/topics/openflow.rst \
diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
new file mode 100644
index 0000000..f55ae3b
--- /dev/null
+++ b/Documentation/howto/dpdk.rst
@@ -0,0 +1,603 @@
+..
+      Licensed under the Apache License, Version 2.0 (the "License"); you may
+      not use this file except in compliance with the License. You may obtain
+      a copy of the License at
+
+          http://www.apache.org/licenses/LICENSE-2.0
+
+      Unless required by applicable law or agreed to in writing, software
+      distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+      WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+      License for the specific language governing permissions and limitations
+      under the License.
+
+      Convention for heading levels in Open vSwitch documentation:
+
+      =======  Heading 0 (reserved for the title in a document)
+      -------  Heading 1
+      ~~~~~~~  Heading 2
+      +++++++  Heading 3
+      '''''''  Heading 4
+
+      Avoid deeper levels because they do not render well.
+
+============================
+Using Open vSwitch with DPDK
+============================
+
+This document describes how to use Open vSwitch with the DPDK datapath.
+
+.. important::
+
+   Using the DPDK datapath requires building OVS with DPDK support. Refer to
+   :doc:`/intro/install/dpdk` for more information.
+
+Ports and Bridges
+-----------------
+
+ovs-vsctl can be used to set up bridges and other Open vSwitch features.
+Bridges should be created with a ``datapath_type=netdev``::
+
+    $ ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
+
+ovs-vsctl can also be used to add DPDK devices. OVS expects DPDK device names
+to start with ``dpdk`` and end with a portid. ovs-vswitchd should print the
+number of dpdk devices found in the log file::
+
+    $ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
+    $ ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk
+
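+You can verify that the devices were found by checking the ovs-vswitchd log.
+For example, assuming a default source install (prefix ``/usr/local``), the
+relevant log entries can be filtered with something like::
+
+    $ grep -i dpdk /usr/local/var/log/openvswitch/ovs-vswitchd.log
+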
+Once the DPDK ports are added to the switch, a polling thread continuously
+polls the DPDK devices and consumes 100% of its core, as can be checked with
+the ``top`` and ``ps`` commands::
+
+    $ top -H
+    $ ps -eLo pid,psr,comm | grep pmd
+
+Creating bonds of DPDK interfaces is slightly different to creating bonds of
+system interfaces. For DPDK, the interface type must be explicitly set. For
+example::
+
+    $ ovs-vsctl add-bond br0 dpdkbond dpdk0 dpdk1 \
+        -- set Interface dpdk0 type=dpdk \
+        -- set Interface dpdk1 type=dpdk
+
+To stop ovs-vswitchd and delete the bridge, run::
+
+    $ ovs-appctl -t ovs-vswitchd exit
+    $ ovs-appctl -t ovsdb-server exit
+    $ ovs-vsctl del-br br0
+
+PMD Thread Statistics
+---------------------
+
+To show current stats::
+
+    $ ovs-appctl dpif-netdev/pmd-stats-show
+
+To clear previous stats::
+
+    $ ovs-appctl dpif-netdev/pmd-stats-clear
+
+Port/RXQ Assignment to PMD Threads
+----------------------------------
+
+To show port/rxq assignment::
+
+    $ ovs-appctl dpif-netdev/pmd-rxq-show
+
+To change the default rxq assignment to pmd threads, rxqs may be manually
+pinned to desired cores using::
+
+    $ ovs-vsctl set Interface <iface> \
+        other_config:pmd-rxq-affinity=<rxq-affinity-list>
+
+where:
+
+- ``<rxq-affinity-list>`` is a CSV list of ``<queue-id>:<core-id>`` values
+
+For example::
+
+    $ ovs-vsctl set interface dpdk0 options:n_rxq=4 \
+        other_config:pmd-rxq-affinity="0:3,1:7,3:8"
+
+This will ensure:
+
+- Queue #0 pinned to core 3
+- Queue #1 pinned to core 7
+- Queue #2 not pinned
+- Queue #3 pinned to core 8
+
+After that, PMD threads on cores where RX queues were pinned will become
+``isolated``. This means that these threads will poll only pinned RX queues.
+
+.. warning::
+  If there are no ``non-isolated`` PMD threads, ``non-pinned`` RX queues will
+  not be polled. Also, if the provided ``core_id`` is not available (e.g. the
+  ``core_id`` is not in ``pmd-cpu-mask``), the RX queue will not be polled by
+  any PMD thread.
+
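+If you want to confirm how the queues ended up distributed after pinning, a
+quick way (re-using the command shown above) is to re-check the assignment::
+
+    $ ovs-appctl dpif-netdev/pmd-rxq-show
+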
+QoS
+---
+
+Assuming you have a vhost-user port transmitting traffic consisting of packets
+of size 64 bytes, the following command would limit the egress transmission
+rate of the port to ~1,000,000 packets per second::
+
+    $ ovs-vsctl set port vhost-user0 qos=@newqos -- \
+        --id=@newqos create qos type=egress-policer other-config:cir=46000000 \
+        other-config:cbs=2048
+
+Here, the ``cir`` value of 46000000 is a rate in bytes per second,
+corresponding to roughly 46 bytes counted for each of the ~1,000,000 packets
+per second.
+
+To examine the QoS configuration of the port, run::
+
+    $ ovs-appctl -t ovs-vswitchd qos/show vhost-user0
+
+To clear the QoS configuration from the port and ovsdb, run::
+
+    $ ovs-vsctl destroy QoS vhost-user0 -- clear Port vhost-user0 qos
+
+Refer to vswitch.xml for more details on egress-policer.
+
+Rate Limiting
+--------------
+
+Here is an example of ingress policing usage. Assuming you have a vhost-user
+port receiving traffic consisting of packets of size 64 bytes, the following
+command would limit the reception rate of the port to ~1,000,000 packets per
+second::
+
+    $ ovs-vsctl set interface vhost-user0 ingress_policing_rate=368000 \
+        ingress_policing_burst=1000
+
+Here, a rate of 368000 kbit/s equates to 46,000,000 bytes per second, or
+roughly 46 bytes counted for each of the ~1,000,000 packets per second.
+
+To examine the ingress policer configuration of the port::
+
+    $ ovs-vsctl list interface vhost-user0
+
+To clear the ingress policer configuration from the port::
+
+    $ ovs-vsctl set interface vhost-user0 ingress_policing_rate=0
+
+Refer to vswitch.xml for more details on ingress-policer.
+
+Flow Control
+------------
+
+Flow control can be enabled only on DPDK physical ports. To enable flow control
+support at tx side while adding a port, run::
+
+    $ ovs-vsctl add-port br0 dpdk0 -- \
+        set Interface dpdk0 type=dpdk options:tx-flow-ctrl=true
+
+Similarly, to enable rx flow control, run::
+
+    $ ovs-vsctl add-port br0 dpdk0 -- \
+        set Interface dpdk0 type=dpdk options:rx-flow-ctrl=true
+
+To enable flow control auto-negotiation, run::
+
+    $ ovs-vsctl add-port br0 dpdk0 -- \
+        set Interface dpdk0 type=dpdk options:flow-ctrl-autoneg=true
+
+To turn on tx flow control at run time for an existing port, run::
+
+    $ ovs-vsctl set Interface dpdk0 options:tx-flow-ctrl=true
+
+The flow control parameters can be turned off by setting the respective
+parameter to ``false``. For example, to disable tx flow control, run::
+
+    $ ovs-vsctl set Interface dpdk0 options:tx-flow-ctrl=false
+
+pdump
+-----
+
+pdump allows you to listen on DPDK ports and view the traffic that is passing
+on them. To use this utility, one must have libpcap installed on the system.
+Furthermore, DPDK must be built with ``CONFIG_RTE_LIBRTE_PDUMP=y`` and
+``CONFIG_RTE_LIBRTE_PMD_PCAP=y``.
+
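+If these options are not already enabled in your DPDK build, one way to set
+them, assuming the standard ``config/common_base`` layout and the
+``x86_64-native-linuxapp-gcc`` target used elsewhere in these documents, is::
+
+    $ cd $DPDK_DIR
+    $ sed -i 's/CONFIG_RTE_LIBRTE_PDUMP=n/CONFIG_RTE_LIBRTE_PDUMP=y/' config/common_base
+    $ sed -i 's/CONFIG_RTE_LIBRTE_PMD_PCAP=n/CONFIG_RTE_LIBRTE_PMD_PCAP=y/' config/common_base
+    $ make install T=x86_64-native-linuxapp-gcc DESTDIR=install
+
+DPDK, and OVS linked against it, must then be rebuilt for the change to take
+effect.
+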
+.. warning::
+  A performance decrease is expected when using a monitoring application like
+  the DPDK pdump app.
+
+To use pdump, simply launch OVS as usual, then navigate to the ``app/pdump``
+directory in DPDK, ``make`` the application and run like so::
+
+    $ sudo ./build/app/dpdk-pdump -- \
+        --pdump port=0,queue=0,rx-dev=/tmp/pkts.pcap \
+        --server-socket-path=/usr/local/var/run/openvswitch
+
+The above command captures traffic received on queue 0 of port 0 and stores it
+in ``/tmp/pkts.pcap``. Other combinations of port numbers, queue numbers and
+pcap locations are also available. For example, to capture all
+packets that traverse port 0 in a single pcap file::
+
+    $ sudo ./build/app/dpdk-pdump -- \
+        --pdump 'port=0,queue=*,rx-dev=/tmp/pkts.pcap,tx-dev=/tmp/pkts.pcap' \
+        --server-socket-path=/usr/local/var/run/openvswitch
+
+``server-socket-path`` must be set to the value of ``ovs_rundir()``, which
+typically resolves to ``/usr/local/var/run/openvswitch``.
+
+Many tools are available to view the contents of the pcap file. One example is
+tcpdump. Issue the following command to view the contents of ``pkts.pcap``::
+
+    $ tcpdump -r pkts.pcap
+
+More information on the pdump app and its usage can be found in the `DPDK docs
+<http://dpdk.org/doc/guides/sample_app_ug/pdump.html>`__.
+
+Jumbo Frames
+------------
+
+By default, DPDK ports are configured with standard Ethernet MTU (1500B). To
+enable Jumbo Frames support for a DPDK port, change the Interface's
+``mtu_request`` attribute to a sufficiently large value. For example, to add a
+DPDK physical port with an MTU of 9000::
+
+    $ ovs-vsctl add-port br0 dpdk0 \
+      -- set Interface dpdk0 type=dpdk \
+      -- set Interface dpdk0 mtu_request=9000
+
+Similarly, to change the MTU of an existing port to 6200::
+
+    $ ovs-vsctl set Interface dpdk0 mtu_request=6200
+
+Some additional configuration is needed to take advantage of jumbo frames with
+vHost ports:
+
+1. *mergeable buffers* must be enabled for vHost ports, as demonstrated in the
+   QEMU command line snippet below::
+
+       -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
+       -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mrg_rxbuf=on
+
+2. Where virtio devices are bound to the Linux kernel driver in a guest
+   environment (i.e. interfaces are not bound to an in-guest DPDK driver), the
+   MTU of those logical network interfaces must also be increased to a
+   sufficiently large value. This avoids segmentation of Jumbo Frames received
+   in the guest. Note that 'MTU' refers to the length of the IP packet only,
+   and not that of the entire frame.
+
+   To calculate the exact MTU of a standard IPv4 frame, subtract the L2 header
+   and CRC lengths (i.e. 18B) from the max supported frame size.  So, to set
+   the MTU for a 9018B Jumbo Frame::
+
+       $ ifconfig eth1 mtu 9000
+
+When Jumbo Frames are enabled, the size of a DPDK port's mbuf segments is
+increased, such that a full Jumbo Frame of a specific size may be accommodated
+within a single mbuf segment.
+
+Jumbo frame support has been validated against 9728B frames, which is the
+largest frame size supported by the Fortville NIC using the DPDK i40e driver,
+but larger frames and other DPDK NIC drivers may also be supported. Such
+configurations are typically found in use cases involving East-West traffic
+only.
+
+.. _dpdk-ovs-in-guest:
+
+OVS with DPDK Inside VMs
+------------------------
+
+Additional configuration is required if you want to run ovs-vswitchd with the
+DPDK backend inside a QEMU virtual machine. ovs-vswitchd creates separate DPDK
+TX queues for each available CPU core. This operation fails inside a QEMU
+virtual machine because, by default, the VirtIO NIC provided to the guest
+supports only a single TX queue and a single RX queue. To change this
+behavior, you need to turn on the ``mq`` (multiqueue) property of all
+``virtio-net-pci`` devices emulated by QEMU and used by DPDK. You may do this
+manually (by changing the QEMU command line, as shown below) or, if you use
+libvirt, by adding the following string to the ``<interface>`` sections of all
+network devices used by DPDK::
+
+    <driver name='vhost' queues='N'/>
+
+where:
+
+``N``
+  determines how many queues can be used by the guest.
+
+This requires QEMU >= 2.2.
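+
+If you manage the guest directly with QEMU rather than libvirt, a minimal
+sketch of the equivalent command line change, assuming a tap backend with
+vhost enabled (other tap options such as ``ifname`` omitted) and illustrative
+``id`` and ``mac`` values, would look like::
+
+    -netdev tap,id=mynet1,vhost=on,queues=2
+    -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mq=on,vectors=6
+
+As elsewhere in this document, the ``vectors`` value should be set to "number
+of queues x 2 + 2".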
+
+.. _dpdk-phy-phy:
+
+PHY-PHY
+-------
+
+Add a userspace bridge and two ``dpdk`` (PHY) ports::
+
+    # Add userspace bridge
+    $ ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
+
+    # Add two dpdk ports
+    $ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
+    $ ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk
+
+Add test flows to forward packets between DPDK port 0 and port 1::
+
+    # Clear current flows
+    $ ovs-ofctl del-flows br0
+
+    # Add flows between port 1 (dpdk0) to port 2 (dpdk1)
+    $ ovs-ofctl add-flow br0 in_port=1,action=output:2
+    $ ovs-ofctl add-flow br0 in_port=2,action=output:1
+
+Transmit traffic into either port. You should see it returned via the other.
+
+.. _dpdk-vhost-loopback:
+
+PHY-VM-PHY (vHost Loopback)
+---------------------------
+
+Add a userspace bridge, two ``dpdk`` (PHY) ports, and two ``dpdkvhostuser``
+ports::
+
+    # Add userspace bridge
+    $ ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
+
+    # Add two dpdk ports
+    $ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
+    $ ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk
+
+    # Add two dpdkvhostuser ports
+    $ ovs-vsctl add-port br0 dpdkvhostuser0 \
+        -- set Interface dpdkvhostuser0 type=dpdkvhostuser
+    $ ovs-vsctl add-port br0 dpdkvhostuser1 \
+        -- set Interface dpdkvhostuser1 type=dpdkvhostuser
+
+Add test flows to forward packets between DPDK devices and VM ports::
+
+    # Clear current flows
+    $ ovs-ofctl del-flows br0
+
+    # Add flows
+    $ ovs-ofctl add-flow br0 in_port=1,action=output:3
+    $ ovs-ofctl add-flow br0 in_port=3,action=output:1
+    $ ovs-ofctl add-flow br0 in_port=4,action=output:2
+    $ ovs-ofctl add-flow br0 in_port=2,action=output:4
+
+    # Dump flows
+    $ ovs-ofctl dump-flows br0
+
+Create a VM using the following configuration:
+
++----------------------+--------+-----------------+
+| configuration        | values | comments        |
++======================+========+=================+
+| qemu version         | 2.2.0  | n/a             |
++----------------------+--------+-----------------+
+| qemu thread affinity | core 5 | taskset 0x20    |
++----------------------+--------+-----------------+
+| memory               | 4GB    | n/a             |
++----------------------+--------+-----------------+
+| cores                | 2      | n/a             |
++----------------------+--------+-----------------+
+| Qcow2 image          | CentOS7| n/a             |
++----------------------+--------+-----------------+
+| mrg_rxbuf            | off    | n/a             |
++----------------------+--------+-----------------+
+
+You can do this directly with QEMU via the ``qemu-system-x86_64`` application::
+
+    $ export VM_NAME=vhost-vm
+    $ export GUEST_MEM=3072M
+    $ export QCOW2_IMAGE=/root/CentOS7_x86_64.qcow2
+    $ export VHOST_SOCK_DIR=/usr/local/var/run/openvswitch
+
+    $ taskset 0x20 qemu-system-x86_64 -name $VM_NAME -cpu host -enable-kvm \
+      -m $GUEST_MEM -drive file=$QCOW2_IMAGE --nographic -snapshot \
+      -numa node,memdev=mem -mem-prealloc -smp sockets=1,cores=2 \
+      -object memory-backend-file,id=mem,size=$GUEST_MEM,mem-path=/dev/hugepages,share=on \
+      -chardev socket,id=char0,path=$VHOST_SOCK_DIR/dpdkvhostuser0 \
+      -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
+      -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mrg_rxbuf=off \
+      -chardev socket,id=char1,path=$VHOST_SOCK_DIR/dpdkvhostuser1 \
+      -netdev type=vhost-user,id=mynet2,chardev=char1,vhostforce \
+      -device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2,mrg_rxbuf=off
+
+For an explanation of this command, along with alternative approaches such as
+booting the VM via libvirt, refer to :doc:`/topics/dpdk/vhost-user`.
+
+Once the guest is configured and booted, configure DPDK packet forwarding
+within the guest. To accomplish this, build the ``testpmd`` application as
+described in :ref:`dpdk-testpmd`. Once compiled, run the application, then
+configure forwarding and start it at the ``testpmd`` prompt::
+
+    $ cd $DPDK_DIR/app/test-pmd
+    $ ./testpmd -c 0x3 -n 4 --socket-mem 1024 -- \
+        --burst=64 -i --txqflags=0xf00 --disable-hw-vlan
+    testpmd> set fwd mac retry
+    testpmd> start
+
+When you finish testing, bind the vNICs back to the kernel driver::
+
+    $ $DPDK_DIR/tools/dpdk-devbind.py --bind=virtio-pci 0000:00:03.0
+    $ $DPDK_DIR/tools/dpdk-devbind.py --bind=virtio-pci 0000:00:04.0
+
+.. note::
+
+  Valid PCI IDs must be passed in the above example. The PCI IDs can be
+  retrieved like so::
+
+      $ $DPDK_DIR/tools/dpdk-devbind.py --status
+
+More information on the dpdkvhostuser ports can be found in
+:doc:`/topics/dpdk/vhost-user`.
+
+PHY-VM-PHY (vHost Loopback) (Kernel Forwarding)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+:ref:`dpdk-vhost-loopback` details the steps for the PHY-VM-PHY loopback
+testcase and packet forwarding using the DPDK testpmd application in the guest
+VM. To do packet forwarding using the kernel stack instead, run the following
+commands in the guest::
+
+    $ ifconfig eth1 1.1.1.2/24
+    $ ifconfig eth2 1.1.2.2/24
+    $ systemctl stop firewalld.service
+    $ systemctl stop iptables.service
+    $ sysctl -w net.ipv4.ip_forward=1
+    $ sysctl -w net.ipv4.conf.all.rp_filter=0
+    $ sysctl -w net.ipv4.conf.eth1.rp_filter=0
+    $ sysctl -w net.ipv4.conf.eth2.rp_filter=0
+    $ route add -net 1.1.2.0/24 eth2
+    $ route add -net 1.1.1.0/24 eth1
+    $ arp -s 1.1.2.99 DE:AD:BE:EF:CA:FE
+    $ arp -s 1.1.1.99 DE:AD:BE:EF:CA:EE
+
+PHY-VM-PHY (vHost Multiqueue)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+vHost Multiqueue functionality can also be validated using the PHY-VM-PHY
+configuration. To begin, follow the steps described in :ref:`dpdk-phy-phy` to
+create and initialize the database, start ovs-vswitchd and add ``dpdk``-type
+devices to bridge ``br0``. Once complete, follow the steps below:
+
+1. Configure PMD and RXQs.
+
+   For example, set the number of dpdk port rx queues to at least 2. The
+   number of rx queues at the vhost-user interface gets automatically
+   configured after virtio device connection and doesn't need manual
+   configuration::
+
+       $ ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0xc
+       $ ovs-vsctl set Interface dpdk0 options:n_rxq=2
+       $ ovs-vsctl set Interface dpdk1 options:n_rxq=2
+
+2. Instantiate the guest VM using the QEMU cmdline
+
+   Appropriate software versions must be used to ensure this feature is
+   supported.
+
+   .. list-table:: Guest VM Configuration
+      :header-rows: 1
+
+      * - Setting
+        - Value
+      * - QEMU version
+        - 2.5.0
+      * - QEMU thread affinity
+        - 2 cores (taskset 0x30)
+      * - Memory
+        - 4 GB
+      * - Cores
+        - 2
+      * - Distro
+        - Fedora 22
+      * - Multiqueue
+        - Enabled
+
+   To do this, instantiate the guest as follows::
+
+       $ export VM_NAME=vhost-vm
+       $ export GUEST_MEM=4096M
+       $ export QCOW2_IMAGE=/root/Fedora22_x86_64.qcow2
+       $ export VHOST_SOCK_DIR=/usr/local/var/run/openvswitch
+       $ taskset 0x30 qemu-system-x86_64 -cpu host -smp 2,cores=2 -m 4096M \
+           -drive file=$QCOW2_IMAGE --enable-kvm -name $VM_NAME \
+           -nographic -numa node,memdev=mem -mem-prealloc \
+           -object memory-backend-file,id=mem,size=$GUEST_MEM,mem-path=/dev/hugepages,share=on \
+           -chardev socket,id=char1,path=$VHOST_SOCK_DIR/dpdkvhostuser0 \
+           -netdev type=vhost-user,id=mynet1,chardev=char1,vhostforce,queues=2 \
+           -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mq=on,vectors=6 \
+           -chardev socket,id=char2,path=$VHOST_SOCK_DIR/dpdkvhostuser1 \
+           -netdev type=vhost-user,id=mynet2,chardev=char2,vhostforce,queues=2 \
+           -device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2,mq=on,vectors=6
+
+   .. note::
+     The queue value above should match the number of queues configured in
+     OVS, while the vector value should be set to "number of queues x 2 + 2".
+
+3. Configure the guest interface
+
+   Assuming there are 2 interfaces in the guest named eth0 and eth1, check the
+   channel configuration and set the number of combined channels to 2 for
+   virtio devices::
+
+       $ ethtool -l eth0
+       $ ethtool -L eth0 combined 2
+       $ ethtool -L eth1 combined 2
+
+   More information can be found in :doc:`/topics/dpdk/vhost-user`.
+
+4. Configure kernel packet forwarding
+
+   Configure IP and enable interfaces::
+
+       $ ifconfig eth0 5.5.5.1/24 up
+       $ ifconfig eth1 90.90.90.1/24 up
+
+   Configure IP forwarding and add route entries::
+
+       $ sysctl -w net.ipv4.ip_forward=1
+       $ sysctl -w net.ipv4.conf.all.rp_filter=0
+       $ sysctl -w net.ipv4.conf.eth0.rp_filter=0
+       $ sysctl -w net.ipv4.conf.eth1.rp_filter=0
+       $ ip route add 2.1.1.0/24 dev eth1
+       $ route add default gw 2.1.1.2 eth1
+       $ route add default gw 90.90.90.90 eth1
+       $ arp -s 90.90.90.90 DE:AD:BE:EF:CA:FE
+       $ arp -s 2.1.1.2 DE:AD:BE:EF:CA:FA
+
+   Check traffic on multiple queues::
+
+       $ cat /proc/interrupts | grep virtio
+
+PHY-VM-PHY (IVSHMEM loopback)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+IVSHMEM can also be validated using the PHY-VM-PHY configuration. To begin, add
+a userspace bridge, two ``dpdk`` (PHY) ports, and a single ``dpdkr`` port::
+
+    # Add userspace bridge
+    $ ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
+
+    # Add two dpdk ports
+    $ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
+    $ ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk
+
+    # Add one dpdkr port
+    $ ovs-vsctl add-port br0 dpdkr0 -- set Interface dpdkr0 type=dpdkr
+
+.. TODO(stephenfin): What flows should the user configure?
+
+QEMU must be patched to enable IVSHMEM support::
+
+    $ cd /usr/src/
+    $ wget http://wiki.qemu.org/download/qemu-2.2.1.tar.bz2
+    $ tar -jxvf qemu-2.2.1.tar.bz2
+    $ cd /usr/src/qemu-2.2.1
+    $ wget https://raw.githubusercontent.com/netgroup-polito/un-orchestrator/master/orchestrator/compute_controller/plugins/kvm-libvirt/patches/ivshmem-qemu-2.2.1.patch
+    $ patch -p1 < ivshmem-qemu-2.2.1.patch
+    $ ./configure --target-list=x86_64-softmmu --enable-debug --extra-cflags='-g'
+    $ make -j 4
+
+In addition, the ``cmdline_generator`` utility must be downloaded and built::
+
+    $ mkdir -p /usr/src/cmdline_generator
+    $ cd /usr/src/cmdline_generator
+    $ wget https://raw.githubusercontent.com/netgroup-polito/un-orchestrator/master/orchestrator/compute_controller/plugins/kvm-libvirt/cmdline_generator/cmdline_generator.c
+    $ wget https://raw.githubusercontent.com/netgroup-polito/un-orchestrator/master/orchestrator/compute_controller/plugins/kvm-libvirt/cmdline_generator/Makefile
+    $ export RTE_SDK=/usr/src/dpdk-16.11
+    $ export RTE_TARGET=x86_64-ivshmem-linuxapp-gcc
+    $ make
+
+Once both the patched QEMU and the ``cmdline_generator`` utility have been
+built, run ``cmdline_generator`` to generate a suitable QEMU command line, and
+use this to instantiate a guest. For example::
+
+    $ ./build/cmdline_generator -m -p dpdkr0 XXX
+    $ cmdline=`cat OVSMEMPOOL`
+    $ export VM_NAME=ivshmem-vm
+    $ export QCOW2_IMAGE=/root/CentOS7_x86_64.qcow2
+    $ export QEMU_BIN=/usr/src/qemu-2.2.1/x86_64-softmmu/qemu-system-x86_64
+    $ taskset 0x20 $QEMU_BIN -cpu host -smp 2,cores=2 -hda $QCOW2_IMAGE \
+        -m 4096 --enable-kvm -name $VM_NAME -nographic -vnc :2 \
+        -pidfile /tmp/vm1.pid $cmdline
+
+When the guest has started, connect to it, then build and run the sample
+``dpdkr`` app. This application simply loops back packets received over the
+DPDK ring port::
+
+    # Allocate hugepages and mount hugetlbfs, if not already done
+    $ echo 1024 > /proc/sys/vm/nr_hugepages
+    $ mount -t hugetlbfs nodev /dev/hugepages
+
+    # Build the DPDK ring application in the VM
+    $ export RTE_SDK=/root/dpdk-16.11
+    $ export RTE_TARGET=x86_64-ivshmem-linuxapp-gcc
+    $ make
+
+    # Run dpdkring application
+    $ ./build/dpdkr -c 1 -n 4 -- -n 0
+    # where "-n 0" refers to ring '0', i.e. dpdkr0
diff --git a/Documentation/howto/index.rst b/Documentation/howto/index.rst
index fe85a34..0eb3d75 100644
--- a/Documentation/howto/index.rst
+++ b/Documentation/howto/index.rst
@@ -40,6 +40,7 @@ topics covered herein, refer to :doc:`/topics/index`.
    lisp
    native-tunneling
    vtep
+   dpdk
 
 .. toctree::
    :maxdepth: 1
diff --git a/Documentation/index.rst b/Documentation/index.rst
index f15993f..ca367f1 100644
--- a/Documentation/index.rst
+++ b/Documentation/index.rst
@@ -61,7 +61,18 @@ vSwitch? Start here.
 Deeper Dive
 -----------
 
-**TODO**
+- **Architecture** :doc:`topics/design` |
+  :doc:`topics/openflow` |
+  :doc:`topics/integration` |
+  :doc:`topics/porting`
+
+- **DPDK** :doc:`howto/dpdk` |
+  :doc:`topics/dpdk/vhost-user` |
+  :doc:`topics/dpdk/ivshmem`
+
+- **Windows** :doc:`topics/windows`
+
+- **Testing** :doc:`topics/testing`
 
 The Open vSwitch Project
 ------------------------
diff --git a/Documentation/intro/install/dpdk-advanced.rst b/Documentation/intro/install/dpdk-advanced.rst
deleted file mode 100644
index 44d1cd7..0000000
--- a/Documentation/intro/install/dpdk-advanced.rst
+++ /dev/null
@@ -1,938 +0,0 @@
-..
-      Licensed under the Apache License, Version 2.0 (the "License"); you may
-      not use this file except in compliance with the License. You may obtain
-      a copy of the License at
-
-          http://www.apache.org/licenses/LICENSE-2.0
-
-      Unless required by applicable law or agreed to in writing, software
-      distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
-      WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
-      License for the specific language governing permissions and limitations
-      under the License.
-
-      Convention for heading levels in Open vSwitch documentation:
-
-      =======  Heading 0 (reserved for the title in a document)
-      -------  Heading 1
-      ~~~~~~~  Heading 2
-      +++++++  Heading 3
-      '''''''  Heading 4
-
-      Avoid deeper levels because they do not render well.
-
-=================================
-Open vSwitch with DPDK (Advanced)
-=================================
-
-The Advanced Install Guide explains how to improve OVS performance when using
-DPDK datapath. This guide provides information on tuning, system configuration,
-troubleshooting, static code analysis and testcases.
-
-Building as a Shared Library
-----------------------------
-
-DPDK can be built as a static or a shared library and shall be linked by
-applications using DPDK datapath. When building OVS with DPDK, you can link
-Open vSwitch against the shared DPDK library.
-
-.. note::
-  Minor performance loss is seen with OVS when using shared DPDK library as
-  compared to static library.
-
-To build Open vSwitch using DPDK as a shared library, first refer to
-:doc:`/intro/install/dpdk` for download instructions for DPDK and OVS.
-
-Once DPDK and OVS have been downloaded, you must configure the DPDK library
-accordingly. Simply set ``CONFIG_RTE_BUILD_SHARED_LIB=y`` in
-``config/common_base``, then build and install DPDK. Once done, DPDK can be
-built as usual. For example::
-
-    $ export DPDK_TARGET=x86_64-native-linuxapp-gcc
-    $ export DPDK_BUILD=$DPDK_DIR/$DPDK_TARGET
-    $ make install T=$DPDK_TARGET DESTDIR=install
-
-Once DPDK is built, export the DPDK shared library location and setup OVS as
-detailed in :doc:`/intro/install/dpdk`::
-
-    $ export LD_LIBRARY_PATH=$DPDK_DIR/x86_64-native-linuxapp-gcc/lib
-
-System Configuration
---------------------
-
-To achieve optimal OVS performance, the system can be configured and that
-includes BIOS tweaks, Grub cmdline additions, better understanding of NUMA
-nodes and apt selection of PCIe slots for NIC placement.
-
-Recommended BIOS Settings
-~~~~~~~~~~~~~~~~~~~~~~~~~
-
-.. list-table:: Recommended BIOS Settings
-   :header-rows: 1
-
-   * - Setting
-     - Value
-   * - C3 Power State
-     - Disabled
-   * - C6 Power State
-     - Disabled
-   * - MLC Streamer
-     - Enabled
-   * - MLC Spacial Prefetcher
-     - Enabled
-   * - DCU Data Prefetcher
-     - Enabled
-   * - DCA
-     - Enabled
-   * - CPU Power and Performance
-     - Performance
-   * - Memeory RAS and Performance Config -> NUMA optimized
-     - Enabled
-
-PCIe Slot Selection
-~~~~~~~~~~~~~~~~~~~
-
-The fastpath performance can be affected by factors related to the placement of
-the NIC, such as channel speeds between PCIe slot and CPU or the proximity of
-PCIe slot to the CPU cores running the DPDK application. Listed below are the
-steps to identify right PCIe slot.
-
-#. Retrieve host details using ``dmidecode``. For example::
-
-       $ dmidecode -t baseboard | grep "Product Name"
-
-#. Download the technical specification for product listed, e.g: S2600WT2
-
-#. Check the Product Architecture Overview on the Riser slot placement, CPU
-   sharing info and also PCIe channel speeds
-
-   For example: On S2600WT, CPU1 and CPU2 share Riser Slot 1 with Channel speed
-   between CPU1 and Riser Slot1 at 32GB/s, CPU2 and Riser Slot1 at 16GB/s.
-   Running DPDK app on CPU1 cores and NIC inserted in to Riser card Slots will
-   optimize OVS performance in this case.
-
-#. Check the Riser Card #1 - Root Port mapping information, on the available
-   slots and individual bus speeds. In S2600WT slot 1, slot 2 has high bus
-   speeds and are potential slots for NIC placement.
-
-Advanced Hugepage Setup
-~~~~~~~~~~~~~~~~~~~~~~~
-
-Allocate and mount 1 GB hugepages.
-
-- For persistent allocation of huge pages, add the following options to the
-  kernel bootline::
-
-      default_hugepagesz=1GB hugepagesz=1G hugepages=N
-
-  For platforms supporting multiple huge page sizes, add multiple options::
-
-      default_hugepagesz=<size> hugepagesz=<size> hugepages=N
-
-  where:
-
-  ``N``
-    number of huge pages requested
-  ``size``
-    huge page size with an optional suffix ``[kKmMgG]``
-
-- For run-time allocation of huge pages::
-
-      $ echo N > /sys/devices/system/node/nodeX/hugepages/hugepages-1048576kB/nr_hugepages
-
-  where:
-
-  ``N``
-    number of huge pages requested
-  ``X``
-    NUMA Node
-
-  .. note::
-    For run-time allocation of 1G huge pages, Contiguous Memory Allocator
-    (``CONFIG_CMA``) has to be supported by kernel, check your Linux distro.
-
-Now mount the huge pages, if not already done so::
-
-    $ mount -t hugetlbfs -o pagesize=1G none /dev/hugepages
-
-Enable HyperThreading
-~~~~~~~~~~~~~~~~~~~~~
-
-With HyperThreading, or SMT, enabled, a physical core appears as two logical
-cores. SMT can be utilized to spawn worker threads on logical cores of the same
-physical core there by saving additional cores.
-
-With DPDK, when pinning pmd threads to logical cores, care must be taken to set
-the correct bits of the ``pmd-cpu-mask`` to ensure that the pmd threads are
-pinned to SMT siblings.
-
-Take a sample system configuration, with 2 sockets, 2 * 10 core processors, HT
-enabled. This gives us a total of 40 logical cores. To identify the physical
-core shared by two logical cores, run::
-
-    $ cat /sys/devices/system/cpu/cpuN/topology/thread_siblings_list
-
-where ``N`` is the logical core number.
-
-In this example, it would show that cores ``1`` and ``21`` share the same
-physical core. As cores are counted from 0, the ``pmd-cpu-mask`` can be used
-to enable these two pmd threads running on these two logical cores (one
-physical core) is::
-
-    $ ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x200002
-
-Isolate Cores
-~~~~~~~~~~~~~
-
-The ``isolcpus`` option can be used to isolate cores from the Linux scheduler.
-The isolated cores can then be used to dedicatedly run HPC applications or
-threads.  This helps in better application performance due to zero context
-switching and minimal cache thrashing. To run platform logic on core 0 and
-isolate cores between 1 and 19 from scheduler, add  ``isolcpus=1-19`` to GRUB
-cmdline.
-
-.. note::
-  It has been verified that core isolation has minimal advantage due to mature
-  Linux scheduler in some circumstances.
-
-NUMA/Cluster-on-Die
-~~~~~~~~~~~~~~~~~~~
-
-Ideally inter-NUMA datapaths should be avoided where possible as packets will
-go across QPI and there may be a slight performance penalty when compared with
-intra NUMA datapaths. On Intel Xeon Processor E5 v3, Cluster On Die is
-introduced on models that have 10 cores or more.  This makes it possible to
-logically split a socket into two NUMA regions and again it is preferred where
-possible to keep critical datapaths within the one cluster.
-
-It is good practice to ensure that threads that are in the datapath are pinned
-to cores in the same NUMA area. e.g. pmd threads and QEMU vCPUs responsible for
-forwarding. If DPDK is built with ``CONFIG_RTE_LIBRTE_VHOST_NUMA=y``, vHost
-User ports automatically detect the NUMA socket of the QEMU vCPUs and will be
-serviced by a PMD from the same node provided a core on this node is enabled in
-the ``pmd-cpu-mask``. ``libnuma`` packages are required for this feature.
-
-Compiler Optimizations
-~~~~~~~~~~~~~~~~~~~~~~
-
-The default compiler optimization level is ``-O2``. Changing this to more
-aggressive compiler optimization such as ``-O3 -march=native`` with
-gcc (verified on 5.3.1) can produce performance gains though not siginificant.
-``-march=native`` will produce optimized code on local machine and should be
-used when software compilation is done on Testbed.
-
-Performance Tuning
-------------------
-
-Affinity
-~~~~~~~~
-
-For superior performance, DPDK pmd threads and Qemu vCPU threads needs to be
-affinitized accordingly.
-
-- PMD thread Affinity
-
-  A poll mode driver (pmd) thread handles the I/O of all DPDK interfaces
-  assigned to it. A pmd thread shall poll the ports for incoming packets,
-  switch the packets and send to tx port.  pmd thread is CPU bound, and needs
-  to be affinitized to isolated cores for optimum performance.
-
-  By setting a bit in the mask, a pmd thread is created and pinned to the
-  corresponding CPU core. e.g. to run a pmd thread on core 2::
-
-      $ ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x4
-
-  .. note::
-    pmd thread on a NUMA node is only created if there is at least one DPDK
-    interface from that NUMA node added to OVS.
-
-- QEMU vCPU thread Affinity
-
-  A VM performing simple packet forwarding or running complex packet pipelines
-  has to ensure that the vCPU threads performing the work has as much CPU
-  occupancy as possible.
-
-  For example, on a multicore VM, multiple QEMU vCPU threads shall be spawned.
-  When the DPDK ``testpmd`` application that does packet forwarding is invoked,
-  the ``taskset`` command should be used to affinitize the vCPU threads to the
-  dedicated isolated cores on the host system.
-
-Multiple Poll-Mode Driver Threads
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-With pmd multi-threading support, OVS creates one pmd thread for each NUMA node
-by default. However, in cases where there are multiple ports/rxq's producing
-traffic, performance can be improved by creating multiple pmd threads running
-on separate cores. These pmd threads can share the workload by each being
-responsible for different ports/rxq's. Assignment of ports/rxq's to pmd threads
-is done automatically.
-
-A set bit in the mask means a pmd thread is created and pinned to the
-corresponding CPU core. For example, to run pmd threads on core 1 and 2::
-
-    $ ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x6
-
-When using dpdk and dpdkvhostuser ports in a bi-directional VM loopback as
-shown below, spreading the workload over 2 or 4 pmd threads shows significant
-improvements as there will be more total CPU occupancy available::
-
-    NIC port0 <-> OVS <-> VM <-> OVS <-> NIC port 1
-
-DPDK Physical Port Rx Queues
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-::
-
-    $ ovs-vsctl set Interface <DPDK interface> options:n_rxq=<integer>
-
-The command above sets the number of rx queues for DPDK physical interface.
-The rx queues are assigned to pmd threads on the same NUMA node in a
-round-robin fashion.
-
-DPDK Physical Port Queue Sizes
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-::
-
-    $ ovs-vsctl set Interface dpdk0 options:n_rxq_desc=<integer>
-    $ ovs-vsctl set Interface dpdk0 options:n_txq_desc=<integer>
-
-The command above sets the number of rx/tx descriptors that the NIC associated
-with dpdk0 will be initialised with.
-
-Different ``n_rxq_desc`` and ``n_txq_desc`` configurations yield different
-benefits in terms of throughput and latency for different scenarios.
-Generally, smaller queue sizes can have a positive impact for latency at the
-expense of throughput. The opposite is often true for larger queue sizes.
-Note: increasing the number of rx descriptors eg. to 4096  may have a negative
-impact on performance due to the fact that non-vectorised DPDK rx functions may
-be used. This is dependant on the driver in use, but is true for the commonly
-used i40e and ixgbe DPDK drivers.
-
-Exact Match Cache
-~~~~~~~~~~~~~~~~~
-
-Each pmd thread contains one Exact Match Cache (EMC). After initial flow setup
-in the datapath, the EMC contains a single table and provides the lowest level
-(fastest) switching for DPDK ports. If there is a miss in the EMC then the next
-level where switching will occur is the datapath classifier.  Missing in the
-EMC and looking up in the datapath classifier incurs a significant performance
-penalty.  If lookup misses occur in the EMC because it is too small to handle
-the number of flows, its size can be increased. The EMC size can be modified by
-editing the define ``EM_FLOW_HASH_SHIFT`` in ``lib/dpif-netdev.c``.
-
-As mentioned above, an EMC is per pmd thread. An alternative way of increasing
-the aggregate amount of possible flow entries in EMC and avoiding datapath
-classifier lookups is to have multiple pmd threads running.
-
-Rx Mergeable Buffers
-~~~~~~~~~~~~~~~~~~~~
-
-Rx mergeable buffers is a virtio feature that allows chaining of multiple
-virtio descriptors to handle large packet sizes. Large packets are handled by
-reserving and chaining multiple free descriptors together. Mergeable buffer
-support is negotiated between the virtio driver and virtio device and is
-supported by the DPDK vhost library.  This behavior is supported and enabled by
-default, however in the case where the user knows that rx mergeable buffers are
-not needed i.e. jumbo frames are not needed, it can be forced off by adding
-``mrg_rxbuf=off`` to the QEMU command line options. By not reserving multiple
-chains of descriptors it will make more individual virtio descriptors available
-for rx to the guest using dpdkvhost ports and this can improve performance.
-
-OVS Testcases
--------------
-
-PHY-VM-PHY (vHost Loopback)
-~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-:doc:`/intro/install/dpdk` details steps for PHY-VM-PHY loopback testcase and
-packet forwarding using DPDK testpmd application in the Guest VM. For users
-wishing to do packet forwarding using kernel stack below, you need to run the
-below commands on the guest::
-
-    $ ifconfig eth1 1.1.1.2/24
-    $ ifconfig eth2 1.1.2.2/24
-    $ systemctl stop firewalld.service
-    $ systemctl stop iptables.service
-    $ sysctl -w net.ipv4.ip_forward=1
-    $ sysctl -w net.ipv4.conf.all.rp_filter=0
-    $ sysctl -w net.ipv4.conf.eth1.rp_filter=0
-    $ sysctl -w net.ipv4.conf.eth2.rp_filter=0
-    $ route add -net 1.1.2.0/24 eth2
-    $ route add -net 1.1.1.0/24 eth1
-    $ arp -s 1.1.2.99 DE:AD:BE:EF:CA:FE
-    $ arp -s 1.1.1.99 DE:AD:BE:EF:CA:EE
-
-PHY-VM-PHY (IVSHMEM)
-~~~~~~~~~~~~~~~~~~~~
-
-IVSHMEM can also be validated using the PHY-VM-PHY configuration. To begin,
-follow the steps described in the :doc:`/intro/install/dpdk` to create and
-initialize the database, start ovs-vswitchd and add ``dpdk``-type devices to
-bridge ``br0``. Once complete, follow the below steps:
-
-1. Add DPDK ring port to the bridge::
-
-       $ ovs-vsctl add-port br0 dpdkr0 -- set Interface dpdkr0 type=dpdkr
-
-2. Build modified QEMU
-
-   QEMU must be patched to enable IVSHMEM support::
-
-       $ cd /usr/src/
-       $ wget http://wiki.qemu.org/download/qemu-2.2.1.tar.bz2
-       $ tar -jxvf qemu-2.2.1.tar.bz2
-       $ cd /usr/src/qemu-2.2.1
-       $ wget https://raw.githubusercontent.com/netgroup-polito/un-orchestrator/master/orchestrator/compute_controller/plugins/kvm-libvirt/patches/ivshmem-qemu-2.2.1.patch
-       $ patch -p1 < ivshmem-qemu-2.2.1.patch
-       $ ./configure --target-list=x86_64-softmmu --enable-debug --extra-cflags='-g'
-       $ make -j 4
-
-3. Generate QEMU commandline::
-
-       $ mkdir -p /usr/src/cmdline_generator
-       $ cd /usr/src/cmdline_generator
-       $ wget https://raw.githubusercontent.com/netgroup-polito/un-orchestrator/master/orchestrator/compute_controller/plugins/kvm-libvirt/cmdline_generator/cmdline_generator.c
-       $ wget https://raw.githubusercontent.com/netgroup-polito/un-orchestrator/master/orchestrator/compute_controller/plugins/kvm-libvirt/cmdline_generator/Makefile
-       $ export RTE_SDK=/usr/src/dpdk-16.11
-       $ export RTE_TARGET=x86_64-ivshmem-linuxapp-gcc
-       $ make
-       $ ./build/cmdline_generator -m -p dpdkr0 XXX
-       $ cmdline=`cat OVSMEMPOOL`
-
-4. Start guest VM::
-
-       $ export VM_NAME=ivshmem-vm
-       $ export QCOW2_IMAGE=/root/CentOS7_x86_64.qcow2
-       $ export QEMU_BIN=/usr/src/qemu-2.2.1/x86_64-softmmu/qemu-system-x86_64
-       $ taskset 0x20 $QEMU_BIN -cpu host -smp 2,cores=2 -hda $QCOW2_IMAGE \
-           -m 4096 --enable-kvm -name $VM_NAME -nographic -vnc :2 \
-           -pidfile /tmp/vm1.pid $cmdline
-
-5. Build and run the sample ``dpdkr`` app in VM::
-
-       $ echo 1024 > /proc/sys/vm/nr_hugepages
-       $ mount -t hugetlbfs nodev /dev/hugepages (if not already mounted)
-
-       # Build the DPDK ring application in the VM
-       $ export RTE_SDK=/root/dpdk-16.11
-       $ export RTE_TARGET=x86_64-ivshmem-linuxapp-gcc
-       $ make
-
-       # Run dpdkring application
-       $ ./build/dpdkr -c 1 -n 4 -- -n 0
-       # where "-n 0" refers to ring '0' i.e dpdkr0
-
-PHY-VM-PHY (vHost Multiqueue)
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-vHost Multique functionality can also be validated using the PHY-VM-PHY
-configuration. To begin, follow the steps described in
-:doc:`/intro/install/dpdk` to create and initialize the database, start
-ovs-vswitchd and add ``dpdk``-type devices to bridge ``br0``. Once complete,
-follow the below steps:
-
-1. Configure PMD and RXQs.
-
-   For example, set the number of dpdk port rx queues to at least 2  The number
-   of rx queues at vhost-user interface gets automatically configured after
-   virtio device connection and doesn't need manual configuration::
-
-       $ ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0xC
-       $ ovs-vsctl set Interface dpdk0 options:n_rxq=2
-       $ ovs-vsctl set Interface dpdk1 options:n_rxq=2
-
-2. Instantiate Guest VM using QEMU cmdline
-
-   We must configure with appropriate software versions to ensure this feature
-   is supported.
-
-   .. list-table:: Recommended BIOS Settings
-      :header-rows: 1
-
-      * - Setting
-        - Value
-      * - QEMU version
-        - 2.5.0
-      * - QEMU thread affinity
-        - 2 cores (taskset 0x30)
-      * - Memory
-        - 4 GB
-      * - Cores
-        - 2
-      * - Distro
-        - Fedora 22
-      * - Multiqueue
-        - Enabled
-
-   To do this, instantiate the guest as follows::
-
-       $ export VM_NAME=vhost-vm
-       $ export GUEST_MEM=4096M
-       $ export QCOW2_IMAGE=/root/Fedora22_x86_64.qcow2
-       $ export VHOST_SOCK_DIR=/usr/local/var/run/openvswitch
-       $ taskset 0x30 qemu-system-x86_64 -cpu host -smp 2,cores=2 -m 4096M \
-           -drive file=$QCOW2_IMAGE --enable-kvm -name $VM_NAME \
-           -nographic -numa node,memdev=mem -mem-prealloc \
-           -object memory-backend-file,id=mem,size=$GUEST_MEM,mem-path=/dev/hugepages,share=on \
-           -chardev socket,id=char1,path=$VHOST_SOCK_DIR/dpdkvhostuser0 \
-           -netdev type=vhost-user,id=mynet1,chardev=char1,vhostforce,queues=2 \
-           -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mq=on,vectors=6 \
-           -chardev socket,id=char2,path=$VHOST_SOCK_DIR/dpdkvhostuser1 \
-           -netdev type=vhost-user,id=mynet2,chardev=char2,vhostforce,queues=2 \
-           -device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2,mq=on,vectors=6
-
-   .. note::
-     Queue value above should match the queues configured in OVS, The vector
-     value should be set to "number of queues x 2 + 2"
-
-3. Configure the guest interface
-
-   Assuming there are 2 interfaces in the guest named eth0, eth1 check the
-   channel configuration and set the number of combined channels to 2 for
-   virtio devices::
-
-       $ ethtool -l eth0
-       $ ethtool -L eth0 combined 2
-       $ ethtool -L eth1 combined 2
-
-   More information can be found in vHost walkthrough section.
-
-4. Configure kernel packet forwarding
-
-   Configure IP and enable interfaces::
-
-       $ ifconfig eth0 5.5.5.1/24 up
-       $ ifconfig eth1 90.90.90.1/24 up
-
-   Configure IP forwarding and add route entries::
-
-       $ sysctl -w net.ipv4.ip_forward=1
-       $ sysctl -w net.ipv4.conf.all.rp_filter=0
-       $ sysctl -w net.ipv4.conf.eth0.rp_filter=0
-       $ sysctl -w net.ipv4.conf.eth1.rp_filter=0
-       $ ip route add 2.1.1.0/24 dev eth1
-       $ route add default gw 2.1.1.2 eth1
-       $ route add default gw 90.90.90.90 eth1
-       $ arp -s 90.90.90.90 DE:AD:BE:EF:CA:FE
-       $ arp -s 2.1.1.2 DE:AD:BE:EF:CA:FA
-
-   Check traffic on multiple queues::
-
-       $ cat /proc/interrupts | grep virtio
-
-vHost Walkthrough
------------------
-
-Two types of vHost User ports are available in OVS:
-
-- vhost-user (``dpdkvhostuser``)
-
-- vhost-user-client (``dpdkvhostuserclient``)
-
-vHost User uses a client-server model. The server creates/manages/destroys the
-vHost User sockets, and the client connects to the server. Depending on which
-port type you use, ``dpdkvhostuser`` or ``dpdkvhostuserclient``, a different
-configuration of the client-server model is used.
-
-For vhost-user ports, Open vSwitch acts as the server and QEMU the client.  For
-vhost-user-client ports, Open vSwitch acts as the client and QEMU the server.
-
-vhost-user
-~~~~~~~~~~
-
-1. Install the prerequisites:
-
-   - QEMU version >= 2.2
-
-2. Add vhost-user ports to the switch.
-
-   Unlike DPDK ring ports, DPDK vhost-user ports can have arbitrary names,
-   except that forward and backward slashes are prohibited in the names.
-
-   For vhost-user, the name of the port type is ``dpdkvhostuser``::
-
-       $ ovs-vsctl add-port br0 vhost-user-1 -- set Interface vhost-user-1 \
-           type=dpdkvhostuser
-
-   This action creates a socket located at
-   ``/usr/local/var/run/openvswitch/vhost-user-1``, which you must provide to
-   your VM on the QEMU command line. More instructions on this can be found in
-   the next section "Adding vhost-user ports to VM"
-
-   .. note::
-     If you wish for the vhost-user sockets to be created in a sub-directory of
-     ``/usr/local/var/run/openvswitch``, you may specify this directory in the
-     ovsdb like so::
-
-         $ ovs-vsctl --no-wait \
-             set Open_vSwitch . other_config:vhost-sock-dir=subdir`
-
-3. Add vhost-user ports to VM
-
-   1. Configure sockets
-
-      Pass the following parameters to QEMU to attach a vhost-user device::
-
-          -chardev socket,id=char1,path=/usr/local/var/run/openvswitch/vhost-user-1
-          -netdev type=vhost-user,id=mynet1,chardev=char1,vhostforce
-          -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1
-
-      where ``vhost-user-1`` is the name of the vhost-user port added to the
-      switch.
-
-      Repeat the above parameters for multiple devices, changing the chardev
-      ``path`` and ``id`` as necessary. Note that a separate and different
-      chardev ``path`` needs to be specified for each vhost-user device. For
-      example you have a second vhost-user port named ``vhost-user-2``, you
-      append your QEMU command line with an additional set of parameters::
-
-          -chardev socket,id=char2,path=/usr/local/var/run/openvswitch/vhost-user-2
-          -netdev type=vhost-user,id=mynet2,chardev=char2,vhostforce
-          -device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2
-
-    2. Configure hugepages
-
-       QEMU must allocate the VM's memory on hugetlbfs. vhost-user ports access
-       a virtio-net device's virtual rings and packet buffers mapping the VM's
-       physical memory on hugetlbfs. To enable vhost-user ports to map the VM's
-       memory into their process address space, pass the following parameters
-       to QEMU::
-
-           -object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on
-           -numa node,memdev=mem -mem-prealloc
-
-    3. Enable multiqueue support (optional)
-
-       QEMU needs to be configured to use multiqueue::
-
-           -chardev socket,id=char2,path=/usr/local/var/run/openvswitch/vhost-user-2
-           -netdev type=vhost-user,id=mynet2,chardev=char2,vhostforce,queues=$q
-           -device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2,mq=on,vectors=$v
-
-       where:
-
-       ``$q``
-         The number of queues
-       ``$v``
-         The number of vectors, which is ``$q`` * 2 + 2
-
-       The vhost-user interface will be automatically reconfigured with
-       required number of rx and tx queues after connection of virtio device.
-       Manual configuration of ``n_rxq`` is not supported because OVS will work
-       properly only if ``n_rxq`` will match number of queues configured in
-       QEMU.
-
-       A least 2 PMDs should be configured for the vswitch when using
-       multiqueue.  Using a single PMD will cause traffic to be enqueued to the
-       same vhost queue rather than being distributed among different vhost
-       queues for a vhost-user interface.
-
-       If traffic destined for a VM configured with multiqueue arrives to the
-       vswitch via a physical DPDK port, then the number of rxqs should also be
-       set to at least 2 for that physical DPDK port. This is required to
-       increase the probability that a different PMD will handle the multiqueue
-       transmission to the guest using a different vhost queue.
-
-       If one wishes to use multiple queues for an interface in the guest, the
-       driver in the guest operating system must be configured to do so. It is
-       recommended that the number of queues configured be equal to ``$q``.
-
-       For example, this can be done for the Linux kernel virtio-net driver
-       with::
-
-           $ ethtool -L <DEV> combined <$q>
-
-       where:
-
-       ``-L``
-         Changes the numbers of channels of the specified network device
-       ``combined``
-         Changes the number of multi-purpose channels.
-
-Configure the VM using libvirt
-++++++++++++++++++++++++++++++
-
-You can also build and configure the VM using libvirt rather than QEMU by
-itself.
-
-1. Change the user/group, access control policty and restart libvirtd.
-
-   - In ``/etc/libvirt/qemu.conf`` add/edit the following lines::
-
-         user = "root"
-         group = "root"
-
-   - Disable SELinux or set to permissive mode::
-
-         $ setenforce 0
-
-   - Restart the libvirtd process, For example, on Fedora::
-
-         $ systemctl restart libvirtd.service
-
-2. Instantiate the VM
-
-   - Copy the XML configuration described in :doc:`/intro/install/dpdk`
-
-   - Start the VM::
-
-         $ virsh create demovm.xml
-
-   - Connect to the guest console::
-
-         $ virsh console demovm
-
-3. Configure the VM
-
-   The demovm xml configuration is aimed at achieving out of box performance on
-   VM.
-
-   - The vcpus are pinned to the cores of the CPU socket 0 using ``vcpupin``.
-
-   - Configure NUMA cell and memory shared using ``memAccess='shared'``.
-
-   - Disable ``mrg_rxbuf='off'``
-
-Refer to the `libvirt documentation <http://libvirt.org/formatdomain.html>`__
-for more information.
-
-vhost-user-client
-~~~~~~~~~~~~~~~~~
-
-1. Install the prerequisites:
-
-   - QEMU version >= 2.7
-
-2. Add vhost-user-client ports to the switch.
-
-   Unlike vhost-user ports, the name given to port does not govern the name of
-   the socket device. ``vhost-server-path`` reflects the full path of the
-   socket that has been or will be created by QEMU for the given vHost User
-   client port.
-
-   For vhost-user-client, the name of the port type is
-   ``dpdkvhostuserclient``::
-
-       $ VHOST_USER_SOCKET_PATH=/path/to/socker
-       $ ovs-vsctl add-port br0 vhost-client-1 \
-           -- set Interface vhost-client-1 type=dpdkvhostuserclient \
-                options:vhost-server-path=$VHOST_USER_SOCKET_PATH
-
-3. Add vhost-user-client ports to VM
-
-   1. Configure sockets
-
-      Pass the following parameters to QEMU to attach a vhost-user device::
-
-          -chardev socket,id=char1,path=$VHOST_USER_SOCKET_PATH,server
-          -netdev type=vhost-user,id=mynet1,chardev=char1,vhostforce
-          -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1
-
-      where ``vhost-user-1`` is the name of the vhost-user port added to the
-      switch.
-
-      If the corresponding dpdkvhostuserclient port has not yet been configured
-      in OVS with ``vhost-server-path=/path/to/socket``, QEMU will print a log
-      similar to the following::
-
-          QEMU waiting for connection on: disconnected:unix:/path/to/socket,server
-
-      QEMU will wait until the port is created sucessfully in OVS to boot the VM.
-
-      One benefit of using this mode is the ability for vHost ports to
-      'reconnect' in event of the switch crashing or being brought down. Once
-      it is brought back up, the vHost ports will reconnect automatically and
-      normal service will resume.
-
-DPDK Backend Inside VM
-~~~~~~~~~~~~~~~~~~~~~~
-
-Additional configuration is required if you want to run ovs-vswitchd with DPDK
-backend inside a QEMU virtual machine. Ovs-vswitchd creates separate DPDK TX
-queues for each CPU core available. This operation fails inside QEMU virtual
-machine because, by default, VirtIO NIC provided to the guest is configured to
-support only single TX queue and single RX queue. To change this behavior, you
-need to turn on ``mq`` (multiqueue) property of all ``virtio-net-pci`` devices
-emulated by QEMU and used by DPDK.  You may do it manually (by changing QEMU
-command line) or, if you use Libvirt, by adding the following string to
-``<interface>`` sections of all network devices used by DPDK::
-
-    <driver name='vhost' queues='N'/>
-
-Where:
-
-``N``
-  determines how many queues can be used by the guest.
-
-This requires QEMU >= 2.2.
-
-QoS
----
-
-Assuming you have a vhost-user port transmitting traffic consisting of packets
-of size 64 bytes, the following command would limit the egress transmission
-rate of the port to ~1,000,000 packets per second::
-
-    $ ovs-vsctl set port vhost-user0 qos=@newqos -- \
-        --id=@newqos create qos type=egress-policer other-config:cir=46000000 \
-        other-config:cbs=2048`
-
-To examine the QoS configuration of the port, run::
-
-    $ ovs-appctl -t ovs-vswitchd qos/show vhost-user0
-
-To clear the QoS configuration from the port and ovsdb, run::
-
-    $ ovs-vsctl destroy QoS vhost-user0 -- clear Port vhost-user0 qos
-
-Refer to vswitch.xml for more details on egress-policer.
-
-Rate Limiting
---------------
-
-Here is an example on Ingress Policing usage.  Assuming you have a vhost-user
-port receiving traffic consisting of packets of size 64 bytes, the following
-command would limit the reception rate of the port to ~1,000,000 packets per
-second::
-
-    $ ovs-vsctl set interface vhost-user0 ingress_policing_rate=368000 \
-        ingress_policing_burst=1000`
-
-To examine the ingress policer configuration of the port::
-
-    $ ovs-vsctl list interface vhost-user0
-
-To clear the ingress policer configuration from the port::
-
-    $ ovs-vsctl set interface vhost-user0 ingress_policing_rate=0
-
-Refer to vswitch.xml for more details on ingress-policer.
-
-Flow Control
-------------
-
-Flow control can be enabled only on DPDK physical ports.  To enable flow
-control support at tx side while adding a port, run::
-
-    $ ovs-vsctl add-port br0 dpdk0 -- \
-        set Interface dpdk0 type=dpdk options:tx-flow-ctrl=true
-
-Similarly, to enable rx flow control, run::
-
-    $ ovs-vsctl add-port br0 dpdk0 -- \
-        set Interface dpdk0 type=dpdk options:rx-flow-ctrl=true
-
-To enable flow control auto-negotiation, run::
-
-    $ ovs-vsctl add-port br0 dpdk0 -- \
-        set Interface dpdk0 type=dpdk options:flow-ctrl-autoneg=true
-
-To turn ON the tx flow control at run time(After the port is being added to
-OVS)::
-
-    $ ovs-vsctl set Interface dpdk0 options:tx-flow-ctrl=true
-
-The flow control parameters can be turned off by setting ``false`` to the
-respective parameter. To disable the flow control at tx side, run::
-
-    $ ovs-vsctl set Interface dpdk0 options:tx-flow-ctrl=false
-
-pdump
------
-
-Pdump allows you to listen on DPDK ports and view the traffic that is passing
-on them. To use this utility, one must have libpcap installed on the system.
-Furthermore, DPDK must be built with ``CONFIG_RTE_LIBRTE_PDUMP=y`` and
-``CONFIG_RTE_LIBRTE_PMD_PCAP=y``.
-
-.. warning::
-  A performance decrease is expected when using a monitoring application like
-  the DPDK pdump app.
-
-To use pdump, simply launch OVS as usual. Then, navigate to the ``app/pdump``
-directory in DPDK, ``make`` the application and run like so::
-
-    $ sudo ./build/app/dpdk-pdump -- \
-        --pdump port=0,queue=0,rx-dev=/tmp/pkts.pcap \
-        --server-socket-path=/usr/local/var/run/openvswitch
-
-The above command captures traffic received on queue 0 of port 0 and stores it
-in ``/tmp/pkts.pcap``. Other combinations of port numbers, queues numbers and
-pcap locations are of course also available to use. For example, to capture all
-packets that traverse port 0 in a single pcap file::
-
-    $ sudo ./build/app/dpdk-pdump -- \
-        --pdump 'port=0,queue=*,rx-dev=/tmp/pkts.pcap,tx-dev=/tmp/pkts.pcap' \
-        --server-socket-path=/usr/local/var/run/openvswitch
-
-``server-socket-path`` must be set to the value of ovs_rundir() which typically
-resolves to ``/usr/local/var/run/openvswitch``.
-
-Many tools are available to view the contents of the pcap file. Once example is
-tcpdump. Issue the following command to view the contents of ``pkts.pcap``::
-
-    $ tcpdump -r pkts.pcap
-
-More information on the pdump app and its usage can be found in the `DPDK docs
-<http://dpdk.org/doc/guides/sample_app_ug/pdump.html>`__.
-
-Jumbo Frames
-------------
-
-By default, DPDK ports are configured with standard Ethernet MTU (1500B). To
-enable Jumbo Frames support for a DPDK port, change the Interface's
-``mtu_request`` attribute to a sufficiently large value. For example, to add a
-DPDK Phy port with MTU of 9000::
-
-    $ ovs-vsctl add-port br0 dpdk0 \
-      -- set Interface dpdk0 type=dpdk \
-      -- set Interface dpdk0 mtu_request=9000`
-
-Similarly, to change the MTU of an existing port to 6200::
-
-    $ ovs-vsctl set Interface dpdk0 mtu_request=6200
-
-Some additional configuration is needed to take advantage of jumbo frames with
-vHost ports:
-
-1. *mergeable buffers* must be enabled for vHost ports, as demonstrated in the
-   QEMU command line snippet below::
-
-       -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
-       -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mrg_rxbuf=on
-
-2. Where virtio devices are bound to the Linux kernel driver in a guest
-   environment (i.e. interfaces are not bound to an in-guest DPDK driver), the
-   MTU of those logical network interfaces must also be increased to a
-   sufficiently large value. This avoids segmentation of Jumbo Frames received
-   in the guest. Note that 'MTU' refers to the length of the IP packet only,
-   and not that of the entire frame.
-
-   To calculate the exact MTU of a standard IPv4 frame, subtract the L2 header
-   and CRC lengths (i.e. 18B) from the max supported frame size.  So, to set
-   the MTU for a 9018B Jumbo Frame::
-
-       $ ifconfig eth1 mtu 9000
-
-When Jumbo Frames are enabled, the size of a DPDK port's mbuf segments are
-increased, such that a full Jumbo Frame of a specific size may be accommodated
-within a single mbuf segment.
-
-Jumbo frame support has been validated against 9728B frames, which is the
-largest frame size supported by Fortville NIC using the DPDK i40e driver, but
-larger frames and other DPDK NIC drivers may be supported. These cases are
-common for use cases involving East-West traffic only.
-
-vsperf
-------
-
-The vsperf project aims to develop a vSwitch test framework that can be used to
-validate the suitability of different vSwitch implementations in a telco
-deployment environment. More information can be found on the `OPNFV wiki
-<https://wiki.opnfv.org/display/vsperf/VSperf+Home>`__.
-
-Bug Reporting
--------------
-
-Report problems to bugs at openvswitch.org.
diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst
index 7724c8a..87dc830 100644
--- a/Documentation/intro/install/dpdk.rst
+++ b/Documentation/intro/install/dpdk.rst
@@ -53,10 +53,7 @@ vSwitch with DPDK will require the following:
   present, it will be necessary to upgrade your kernel or build a custom kernel
   with these flags enabled.
 
-.. TODO(stephenfin): drag the below information in from dpdk-advanced
-
-Detailed system requirements can be found at `DPDK requirements`_, while more
-detailed install information can be found in :doc:`dpdk-advanced`.
+Detailed system requirements can be found at `DPDK requirements`_.
 
 .. _DPDK supported NIC: http://dpdk.org/doc/nics
 .. _DPDK requirements: http://dpdk.org/doc/guides/linux_gsg/sys_reqs.html
@@ -64,10 +61,10 @@ detailed install information can be found in :doc:`dpdk-advanced`.
 Installing
 ----------
 
-DPDK
-~~~~
+Install DPDK
+~~~~~~~~~~~~
 
-1. Download the `DPDK sources`_, extract the file and set ``DPDK_DIR``::
+#. Download the `DPDK sources`_, extract the file and set ``DPDK_DIR``::
 
        $ cd /usr/src/
        $ wget http://fast.dpdk.org/rel/dpdk-16.11.tar.xz
@@ -75,7 +72,18 @@ DPDK
        $ export DPDK_DIR=/usr/src/dpdk-16.11
        $ cd $DPDK_DIR
 
-2. Configure and install DPDK
+#. (Optional) Configure DPDK as a shared library
+
+   DPDK can be built as either a static library or a shared library.  By
+   default, it is configured for the former. If you wish to use the latter, set
+   ``CONFIG_RTE_BUILD_SHARED_LIB=y`` in ``$DPDK_DIR/config/common_base``.
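+
+   One way to do this is with ``sed`` (a sketch; editing the file by hand
+   works equally well)::
+
+       $ sed -i 's/CONFIG_RTE_BUILD_SHARED_LIB=n/CONFIG_RTE_BUILD_SHARED_LIB=y/' \
+           $DPDK_DIR/config/common_base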
+
+   .. note::
+
+      Minor performance loss is expected when using OVS with a shared DPDK
+      library compared to a static DPDK library.
+
+#. Configure and install DPDK
 
    Build and install the DPDK library::
 
@@ -87,6 +95,13 @@ DPDK
 
        $ export DPDK_TARGET=x86_64-ivshmem-linuxapp-gcc
 
+#. (Optional) Export the DPDK shared library location
+
+   If DPDK was built as a shared library, export the path to this library for
+   use when building OVS::
+
+       $ export LD_LIBRARY_PATH=$DPDK_DIR/x86_64-native-linuxapp-gcc/lib
+
 .. _DPDK sources: http://dpdk.org/rel
 
 Install OVS
@@ -101,12 +116,12 @@ has to be configured with DPDK support (``--with-dpdk``).
 
 .. _OVS sources: http://openvswitch.org/releases/
 
-1. Ensure the standard OVS requirements, described in
+#. Ensure the standard OVS requirements, described in
    :ref:`general-build-reqs`, are installed
 
-2. Bootstrap, if required, as described in :ref:`general-bootstrapping`
+#. Bootstrap, if required, as described in :ref:`general-bootstrapping`
 
-3. Configure the package using the ``--with-dpdk`` flag::
+#. Configure the package using the ``--with-dpdk`` flag::
 
        $ ./configure --with-dpdk=$DPDK_BUILD
 
@@ -117,7 +132,7 @@ has to be configured with DPDK support (``--with-dpdk``).
      While ``--with-dpdk`` is required, you can pass any other configuration
      option described in :ref:`general-configuring`.
 
-4. Build and install OVS, as described in :ref:`general-building`
+#. Build and install OVS, as described in :ref:`general-building`
 
 Additional information can be found in :doc:`general`.
 
@@ -225,7 +240,7 @@ threads and pin them to cores 1,2, run::
 
     $ ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x6
 
-For details on using ivshmem with DPDK, refer to :doc:`dpdk-advanced`.
+For details on using IVSHMEM with DPDK, refer to :doc:`/topics/dpdk/ivshmem`.
 
 Refer to ovs-vswitchd.conf.db(5) for additional information on configuration
 options.
@@ -237,345 +252,300 @@ options.
 Validating
 ----------
 
-Creating bridges and ports
-~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-You can now use ovs-vsctl to set up bridges and other Open vSwitch features.
-Bridges should be created with a ``datapath_type=netdev``::
+At this point you can use ovs-vsctl to set up bridges and other Open vSwitch
+features. Since we have configured the DPDK datapath, we will use DPDK-type
+ports. For example, to create a userspace bridge named ``br0`` and add two
+``dpdk`` ports to it, run::
 
     $ ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
-
-Now you can add DPDK devices. OVS expects DPDK device names to start with
-``dpdk`` and end with a portid. ovs-vswitchd should print the number of dpdk
-devices found in the log file::
-
     $ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
     $ ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk
 
-After the DPDK ports get added to switch, a polling thread continuously polls
-DPDK devices and consumes 100% of the core, as can be checked from 'top' and
-'ps' cmds::
+Refer to ovs-vsctl(8) and :doc:`/howto/dpdk` for more details.
 
-    $ top -H
-    $ ps -eLo pid,psr,comm | grep pmd
+Performance Tuning
+------------------
 
-Creating bonds of DPDK interfaces is slightly different to creating bonds of
-system interfaces. For DPDK, the interface type must be explicitly set. For
-example::
+Achieving optimal OVS performance requires some system-level configuration,
+including BIOS tweaks, GRUB cmdline additions, an understanding of NUMA
+topology, and careful selection of PCIe slots for NIC placement.
 
-    $ ovs-vsctl add-bond br0 dpdkbond dpdk0 dpdk1 \
-        -- set Interface dpdk0 type=dpdk \
-        -- set Interface dpdk1 type=dpdk
+.. note::
 
-To stop ovs-vswitchd & delete bridge, run::
+   This section is optional. Once installed as described above, OVS with DPDK
+   will work out of the box.
 
-    $ ovs-appctl -t ovs-vswitchd exit
-    $ ovs-appctl -t ovsdb-server exit
-    $ ovs-vsctl del-br br0
+Recommended BIOS Settings
+~~~~~~~~~~~~~~~~~~~~~~~~~
 
-PMD thread statistics
-~~~~~~~~~~~~~~~~~~~~~
+.. list-table:: Recommended BIOS Settings
+   :header-rows: 1
 
-To show current stats::
+   * - Setting
+     - Value
+   * - C3 Power State
+     - Disabled
+   * - C6 Power State
+     - Disabled
+   * - MLC Streamer
+     - Enabled
+   * - MLC Spatial Prefetcher
+     - Enabled
+   * - DCU Data Prefetcher
+     - Enabled
+   * - DCA
+     - Enabled
+   * - CPU Power and Performance
+     - Performance
+   * - Memory RAS and Performance Config -> NUMA optimized
+     - Enabled
 
-    $ ovs-appctl dpif-netdev/pmd-stats-show
+PCIe Slot Selection
+~~~~~~~~~~~~~~~~~~~
 
-To clear previous stats::
+The fastpath performance can be affected by factors related to the placement of
+the NIC, such as the channel speed between the PCIe slot and CPU, or the
+proximity of the PCIe slot to the CPU cores running the DPDK application.
+Listed below are the steps to identify the right PCIe slot.
 
-    $ ovs-appctl dpif-netdev/pmd-stats-clear
+#. Retrieve host details using ``dmidecode``. For example::
 
-Port/rxq assigment to PMD threads
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+       $ dmidecode -t baseboard | grep "Product Name"
 
-To show port/rxq assignment::
+#. Download the technical specification for the product listed, e.g. S2600WT2
 
-    $ ovs-appctl dpif-netdev/pmd-rxq-show
+#. Check the Product Architecture Overview for the riser slot placement, CPU
+   sharing info and PCIe channel speeds
 
-To change default rxq assignment to pmd threads, rxqs may be manually pinned to
-desired cores using::
+   For example: on S2600WT, CPU1 and CPU2 share Riser Slot 1, with the channel
+   speed between CPU1 and Riser Slot 1 at 32GB/s and between CPU2 and Riser
+   Slot 1 at 16GB/s. Running the DPDK app on CPU1 cores with the NIC inserted
+   into the Riser Slot 1 card will optimize OVS performance in this case.
 
-    $ ovs-vsctl set Interface <iface> \
-        other_config:pmd-rxq-affinity=<rxq-affinity-list>
+#. Check the Riser Card #1 - Root Port mapping information for the available
+   slots and individual bus speeds. On S2600WT, slots 1 and 2 have high bus
+   speeds and are potential slots for NIC placement.
 
-where:
+Advanced Hugepage Setup
+~~~~~~~~~~~~~~~~~~~~~~~
 
-- ``<rxq-affinity-list>`` ::= ``NULL`` | ``<non-empty-list>``
-- ``<non-empty-list>`` ::= ``<affinity-pair>`` |
-                           ``<affinity-pair>`` , ``<non-empty-list>``
-- ``<affinity-pair>`` ::= ``<queue-id>`` : ``<core-id>``
+Allocate and mount 1 GB hugepages.
 
-For example::
+- For persistent allocation of huge pages, add the following options to the
+  kernel bootline::
 
-    $ ovs-vsctl set interface dpdk0 options:n_rxq=4 \
-        other_config:pmd-rxq-affinity="0:3,1:7,3:8"
+      default_hugepagesz=1GB hugepagesz=1G hugepages=N
 
-This will ensure:
+  For platforms supporting multiple huge page sizes, add multiple options::
 
-- Queue #0 pinned to core 3
-- Queue #1 pinned to core 7
-- Queue #2 not pinned
-- Queue #3 pinned to core 8
+      default_hugepagesz=<size> hugepagesz=<size> hugepages=N
 
-After that PMD threads on cores where RX queues was pinned will become
-``isolated``. This means that this thread will poll only pinned RX queues.
+  where:
 
-.. warning::
-  If there are no ``non-isolated`` PMD threads, ``non-pinned`` RX queues will
-  not be polled. Also, if provided ``core_id`` is not available (ex. this
-  ``core_id`` not in ``pmd-cpu-mask``), RX queue will not be polled by any PMD
-  thread.
+  ``N``
+    number of huge pages requested
+  ``size``
+    huge page size with an optional suffix ``[kKmMgG]``
 
-.. _dpdk-guest-setup:
+- For run-time allocation of huge pages::
 
-DPDK in the VM
---------------
+      $ echo N > /sys/devices/system/node/nodeX/hugepages/hugepages-1048576kB/nr_hugepages
 
-DPDK 'testpmd' application can be run in the Guest VM for high speed packet
-forwarding between vhostuser ports. DPDK and testpmd application has to be
-compiled on the guest VM. Below are the steps for setting up the testpmd
-application in the VM. More information on the vhostuser ports can be found in
-:doc:`dpdk-advanced`.
+  where:
 
-.. note::
-  Support for DPDK in the guest requires QEMU >= 2.2.0.
-
-To being, instantiate the guest::
-
-    $ export VM_NAME=Centos-vm export GUEST_MEM=3072M
-    $ export QCOW2_IMAGE=/root/CentOS7_x86_64.qcow2
-    $ export VHOST_SOCK_DIR=/usr/local/var/run/openvswitch
-
-    $ qemu-system-x86_64 -name $VM_NAME -cpu host -enable-kvm \
-        -m $GUEST_MEM -drive file=$QCOW2_IMAGE --nographic -snapshot \
-        -numa node,memdev=mem -mem-prealloc -smp sockets=1,cores=2 \
-        -object memory-backend-file,id=mem,size=$GUEST_MEM,mem-path=/dev/hugepages,share=on \
-        -chardev socket,id=char0,path=$VHOST_SOCK_DIR/dpdkvhostuser0 \
-        -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
-        -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mrg_rxbuf=off \
-        -chardev socket,id=char1,path=$VHOST_SOCK_DIR/dpdkvhostuser1 \
-        -netdev type=vhost-user,id=mynet2,chardev=char1,vhostforce \
-        -device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2,mrg_rxbuf=off \
-
-Download the DPDK sourcs to VM and build DPDK::
-
-    $ cd /root/dpdk/
-    $ wget http://fast.dpdk.org/rel/dpdk-16.11.tar.xz
-    $ tar xf dpdk-16.11.tar.xz
-    $ export DPDK_DIR=/root/dpdk/dpdk-16.11
-    $ export DPDK_TARGET=x86_64-native-linuxapp-gcc
-    $ export DPDK_BUILD=$DPDK_DIR/$DPDK_TARGET
-    $ cd $DPDK_DIR
-    $ make install T=$DPDK_TARGET DESTDIR=install
-
-Build the test-pmd application::
-
-    $ cd app/test-pmd
-    $ export RTE_SDK=$DPDK_DIR
-    $ export RTE_TARGET=$DPDK_TARGET
-    $ make
-
-Setup huge pages and DPDK devices using UIO::
-
-    $ sysctl vm.nr_hugepages=1024
-    $ mkdir -p /dev/hugepages
-    $ mount -t hugetlbfs hugetlbfs /dev/hugepages  # only if not already mounted
-    $ modprobe uio
-    $ insmod $DPDK_BUILD/kmod/igb_uio.ko
-    $ $DPDK_DIR/tools/dpdk-devbind.py --status
-    $ $DPDK_DIR/tools/dpdk-devbind.py -b igb_uio 00:03.0 00:04.0
+  ``N``
+    number of huge pages requested
+  ``X``
+    NUMA Node
 
-.. note::
+  .. note::
+    For run-time allocation of 1G huge pages, the Contiguous Memory Allocator
+    (``CONFIG_CMA``) has to be supported by the kernel; check your Linux
+    distro.
 
-  vhost ports pci ids can be retrieved using::
+Now mount the huge pages, if this has not already been done::
 
-      lspci | grep Ethernet
+    $ mount -t hugetlbfs -o pagesize=1G none /dev/hugepages
 
-Testing
--------
+Enable HyperThreading
+~~~~~~~~~~~~~~~~~~~~~
 
-Below are few testcases and the list of steps to be followed. Before beginning,
-ensure a userspace bridge has been created and two DPDK ports added::
+With HyperThreading, or SMT, enabled, a physical core appears as two logical
+cores. SMT can be utilized to spawn worker threads on logical cores of the same
+physical core, thereby saving additional cores.
 
-    $ ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
-    $ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
-    $ ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk
+With DPDK, when pinning pmd threads to logical cores, care must be taken to set
+the correct bits of the ``pmd-cpu-mask`` to ensure that the pmd threads are
+pinned to SMT siblings.
 
-PHY-PHY
-~~~~~~~
-
-Add test flows to forward packets betwen DPDK port 0 and port 1::
-
-    # Clear current flows
-    $ ovs-ofctl del-flows br0
-
-    # Add flows between port 1 (dpdk0) to port 2 (dpdk1)
-    $ ovs-ofctl add-flow br0 in_port=1,action=output:2
-    $ ovs-ofctl add-flow br0 in_port=2,action=output:1
-
-Transmit traffic into either port. You should see it returned via the other.
-
-PHY-VM-PHY (vhost loopback)
-~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Add two ``dpdkvhostuser`` ports to bridge ``br0``::
-
-    $ ovs-vsctl add-port br0 dpdkvhostuser0 \
-        -- set Interface dpdkvhostuser0 type=dpdkvhostuser
-    $ ovs-vsctl add-port br0 dpdkvhostuser1 \
-        -- set Interface dpdkvhostuser1 type=dpdkvhostuser
-
-Add test flows to forward packets betwen DPDK devices and VM ports::
-
-    # Clear current flows
-    $ ovs-ofctl del-flows br0
-
-    # Add flows
-    $ ovs-ofctl add-flow br0 in_port=1,action=output:3
-    $ ovs-ofctl add-flow br0 in_port=3,action=output:1
-    $ ovs-ofctl add-flow br0 in_port=4,action=output:2
-    $ ovs-ofctl add-flow br0 in_port=2,action=output:4
-
-    # Dump flows
-    $ ovs-ofctl dump-flows br0
-
-Create a VM using the following configuration:
-
-+----------------------+--------+-----------------+
-| configuration        | values | comments        |
-+----------------------+--------+-----------------+
-| qemu version         | 2.2.0  | n/a             |
-| qemu thread affinity | core 5 | taskset 0x20    |
-| memory               | 4GB    | n/a             |
-| cores                | 2      | n/a             |
-| Qcow2 image          | CentOS7| n/a             |
-| mrg_rxbuf            | off    | n/a             |
-+----------------------+--------+-----------------+
-
-You can do this directly with QEMU via the ``qemu-system-x86_64``
-application::
-
-    $ export VM_NAME=vhost-vm
-    $ export GUEST_MEM=3072M
-    $ export QCOW2_IMAGE=/root/CentOS7_x86_64.qcow2
-    $ export VHOST_SOCK_DIR=/usr/local/var/run/openvswitch
-
-    $ taskset 0x20 qemu-system-x86_64 -name $VM_NAME -cpu host -enable-kvm \
-      -m $GUEST_MEM -drive file=$QCOW2_IMAGE --nographic -snapshot \
-      -numa node,memdev=mem -mem-prealloc -smp sockets=1,cores=2 \
-      -object memory-backend-file,id=mem,size=$GUEST_MEM,mem-path=/dev/hugepages,share=on \
-      -chardev socket,id=char0,path=$VHOST_SOCK_DIR/dpdkvhostuser0 \
-      -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
-      -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mrg_rxbuf=off \
-      -chardev socket,id=char1,path=$VHOST_SOCK_DIR/dpdkvhostuser1 \
-      -netdev type=vhost-user,id=mynet2,chardev=char1,vhostforce \
-      -device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2,mrg_rxbuf=off
-
-Alternatively, you can configure the guest using libvirt. Below is an XML
-configuration for a 'demovm' guest that can be instantiated using `virsh`::
-
-    <domain type='kvm'>
-      <name>demovm</name>
-      <uuid>4a9b3f53-fa2a-47f3-a757-dd87720d9d1d</uuid>
-      <memory unit='KiB'>4194304</memory>
-      <currentMemory unit='KiB'>4194304</currentMemory>
-      <memoryBacking>
-        <hugepages>
-          <page size='2' unit='M' nodeset='0'/>
-        </hugepages>
-      </memoryBacking>
-      <vcpu placement='static'>2</vcpu>
-      <cputune>
-        <shares>4096</shares>
-        <vcpupin vcpu='0' cpuset='4'/>
-        <vcpupin vcpu='1' cpuset='5'/>
-        <emulatorpin cpuset='4,5'/>
-      </cputune>
-      <os>
-        <type arch='x86_64' machine='pc'>hvm</type>
-        <boot dev='hd'/>
-      </os>
-      <features>
-        <acpi/>
-        <apic/>
-      </features>
-      <cpu mode='host-model'>
-        <model fallback='allow'/>
-        <topology sockets='2' cores='1' threads='1'/>
-        <numa>
-          <cell id='0' cpus='0-1' memory='4194304' unit='KiB' memAccess='shared'/>
-        </numa>
-      </cpu>
-      <on_poweroff>destroy</on_poweroff>
-      <on_reboot>restart</on_reboot>
-      <on_crash>destroy</on_crash>
-      <devices>
-        <emulator>/usr/bin/qemu-kvm</emulator>
-        <disk type='file' device='disk'>
-          <driver name='qemu' type='qcow2' cache='none'/>
-          <source file='/root/CentOS7_x86_64.qcow2'/>
-          <target dev='vda' bus='virtio'/>
-        </disk>
-        <disk type='dir' device='disk'>
-          <driver name='qemu' type='fat'/>
-          <source dir='/usr/src/dpdk-16.11'/>
-          <target dev='vdb' bus='virtio'/>
-          <readonly/>
-        </disk>
-        <interface type='vhostuser'>
-          <mac address='00:00:00:00:00:01'/>
-          <source type='unix' path='/usr/local/var/run/openvswitch/dpdkvhostuser0' mode='client'/>
-           <model type='virtio'/>
-          <driver queues='2'>
-            <host mrg_rxbuf='off'/>
-          </driver>
-        </interface>
-        <interface type='vhostuser'>
-          <mac address='00:00:00:00:00:02'/>
-          <source type='unix' path='/usr/local/var/run/openvswitch/dpdkvhostuser1' mode='client'/>
-          <model type='virtio'/>
-          <driver queues='2'>
-            <host mrg_rxbuf='off'/>
-          </driver>
-        </interface>
-        <serial type='pty'>
-          <target port='0'/>
-        </serial>
-        <console type='pty'>
-          <target type='serial' port='0'/>
-        </console>
-      </devices>
-    </domain>
-
-Once the guest is configured and booted, configure DPDK packet forwarding
-within the guest. To accomplish this, DPDK and testpmd application have to
-be first compiled on the VM as described in **Guest Setup**. Once compiled, run
-the ``test-pmd`` application::
-
-    $ cd $DPDK_DIR/app/test-pmd;
-    $ ./testpmd -c 0x3 -n 4 --socket-mem 1024 -- \
-        --burst=64 -i --txqflags=0xf00 --disable-hw-vlan
-    $ set fwd mac retry
-    $ start
-
-When you finish testing, bind the vNICs back to kernel::
-
-    $ $DPDK_DIR/tools/dpdk-devbind.py --bind=virtio-pci 0000:00:03.0
-    $ $DPDK_DIR/tools/dpdk-devbind.py --bind=virtio-pci 0000:00:04.0
+Take a sample system configuration, with 2 sockets, 2 * 10 core processors, HT
+enabled. This gives us a total of 40 logical cores. To identify the physical
+core shared by two logical cores, run::
 
-.. note::
-  Appropriate PCI IDs to be passed in above example. The PCI IDs can be
-  retrieved like so::
+    $ cat /sys/devices/system/cpu/cpuN/topology/thread_siblings_list
+
+where ``N`` is the logical core number.
+
+In this example, it would show that cores ``1`` and ``21`` share the same
+physical core. Since cores are counted from 0, the ``pmd-cpu-mask`` that
+enables two pmd threads running on these two logical cores (one physical core)
+is::
+
+    $ ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x200002
 
-      $ $DPDK_DIR/tools/dpdk-devbind.py --status
+Isolate Cores
+~~~~~~~~~~~~~
+
+The ``isolcpus`` option can be used to isolate cores from the Linux scheduler.
+The isolated cores can then be dedicated to running HPC applications or
+threads.  This helps improve application performance due to zero context
+switching and minimal cache thrashing. To run platform logic on core 0 and
+isolate cores 1 to 19 from the scheduler, add ``isolcpus=1-19`` to the GRUB
+cmdline.
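+
+A minimal sketch of how this might be done on a grub2-based distribution (file
+and command names vary between distributions)::
+
+    # in /etc/default/grub, append to the existing kernel command line
+    GRUB_CMDLINE_LINUX="... isolcpus=1-19"
+
+    $ grub2-mkconfig -o /boot/grub2/grub.cfg
+    $ reboot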
 
 .. note::
-  More information on the dpdkvhostuser ports can be found in
-  :doc:`dpdk-advanced`.
+  It has been verified that, in some circumstances, core isolation offers only
+  a minimal advantage due to the maturity of the Linux scheduler.
 
-PHY-VM-PHY (IVSHMEM loopback)
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+NUMA/Cluster-on-Die
+~~~~~~~~~~~~~~~~~~~
+
+Ideally inter-NUMA datapaths should be avoided where possible as packets will
+go across QPI and there may be a slight performance penalty when compared with
+intra-NUMA datapaths. On Intel Xeon Processor E5 v3, Cluster On Die is
+introduced on models that have 10 cores or more.  This makes it possible to
+logically split a socket into two NUMA regions, and again it is preferable,
+where possible, to keep critical datapaths within a single cluster.
+
+It is good practice to ensure that threads that are in the datapath are pinned
+to cores in the same NUMA node, e.g. pmd threads and QEMU vCPUs responsible for
+forwarding. If DPDK is built with ``CONFIG_RTE_LIBRTE_VHOST_NUMA=y``, vHost
+User ports automatically detect the NUMA socket of the QEMU vCPUs and will be
+serviced by a PMD from the same node provided a core on this node is enabled in
+the ``pmd-cpu-mask``. ``libnuma`` packages are required for this feature.
+
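+As a rough check before choosing a ``pmd-cpu-mask``, the NUMA node of a given
+NIC and the NUMA node of each CPU core can be read from sysfs and ``lscpu``
+(the PCI address below is only an example)::
+
+    $ cat /sys/bus/pci/devices/0000:05:00.0/numa_node
+    $ lscpu | grep NUMA
+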
+Compiler Optimizations
+~~~~~~~~~~~~~~~~~~~~~~
+
+The default compiler optimization level is ``-O2``. Changing this to a more
+aggressive optimization such as ``-O3 -march=native`` with gcc (verified on
+5.3.1) can produce performance gains, though they are not significant.
+``-march=native`` will produce code optimized for the local machine and should
+only be used when the software is compiled on the machine it will run on.
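+
+As a sketch, these flags could be passed when configuring OVS (the exact flags
+shown are illustrative)::
+
+    $ ./configure --with-dpdk=$DPDK_BUILD CFLAGS="-O3 -march=native"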
+
+Affinity
+~~~~~~~~
+
+For superior performance, DPDK pmd threads and QEMU vCPU threads need to be
+affinitized accordingly.
+
+- PMD thread Affinity
+
+  A poll mode driver (pmd) thread handles the I/O of all DPDK interfaces
+  assigned to it. A pmd thread polls the ports for incoming packets, switches
+  the packets and sends them to the tx port.  A pmd thread is CPU-bound, and
+  needs to be affinitized to isolated cores for optimum performance.
+
+  By setting a bit in the mask, a pmd thread is created and pinned to the
+  corresponding CPU core. e.g. to run a pmd thread on core 2::
+
+      $ ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x4
+
+  .. note::
+    A pmd thread on a NUMA node is only created if there is at least one DPDK
+    interface from that NUMA node added to OVS.
+
+- QEMU vCPU thread Affinity
+
+  A VM performing simple packet forwarding or running complex packet pipelines
+  has to ensure that the vCPU threads performing the work have as much CPU
+  occupancy as possible.
+
+  For example, on a multicore VM, multiple QEMU vCPU threads shall be spawned.
+  When the DPDK ``testpmd`` application that does packet forwarding is invoked,
+  the ``taskset`` command should be used to affinitize the vCPU threads to the
+  dedicated isolated cores on the host system.
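+
+  As a hedged sketch, assuming isolated host cores 6 and 7, the vCPU thread
+  IDs can be found and pinned as follows (the thread IDs are placeholders)::
+
+      $ pstree -p $(pidof qemu-system-x86_64)   # identify vCPU thread IDs
+      $ taskset -pc 6 <vcpu0-thread-id>
+      $ taskset -pc 7 <vcpu1-thread-id>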
+
+Multiple Poll-Mode Driver Threads
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+With pmd multi-threading support, OVS creates one pmd thread for each NUMA node
+by default. However, in cases where there are multiple ports/rxq's producing
+traffic, performance can be improved by creating multiple pmd threads running
+on separate cores. These pmd threads can share the workload by each being
+responsible for different ports/rxq's. Assignment of ports/rxq's to pmd threads
+is done automatically.
+
+A set bit in the mask means a pmd thread is created and pinned to the
+corresponding CPU core. For example, to run pmd threads on core 1 and 2::
+
+    $ ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x6
+
+When using dpdk and dpdkvhostuser ports in a bi-directional VM loopback as
+shown below, spreading the workload over 2 or 4 pmd threads shows significant
+improvements as there will be more total CPU occupancy available::
+
+    NIC port0 <-> OVS <-> VM <-> OVS <-> NIC port 1
+
+DPDK Physical Port Rx Queues
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+::
+
+    $ ovs-vsctl set Interface <DPDK interface> options:n_rxq=<integer>
+
+The above command sets the number of rx queues for the specified DPDK physical
+interface. The rx queues are assigned to pmd threads on the same NUMA node in
+a round-robin fashion.
+
+DPDK Physical Port Queue Sizes
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+::
+
+    $ ovs-vsctl set Interface dpdk0 options:n_rxq_desc=<integer>
+    $ ovs-vsctl set Interface dpdk0 options:n_txq_desc=<integer>
+
+The above command sets the number of rx/tx descriptors that the NIC associated
+with dpdk0 will be initialised with.
+
+Different ``n_rxq_desc`` and ``n_txq_desc`` configurations yield different
+benefits in terms of throughput and latency for different scenarios.
+Generally, smaller queue sizes can have a positive impact for latency at the
+expense of throughput. The opposite is often true for larger queue sizes.
+Note: increasing the number of rx descriptors, e.g. to 4096, may have a
+negative impact on performance due to the fact that non-vectorised DPDK rx
+functions may be used. This is dependent on the driver in use, but is true for
+the commonly used i40e and ixgbe DPDK drivers.
+
+Exact Match Cache
+~~~~~~~~~~~~~~~~~
+
+Each pmd thread contains one Exact Match Cache (EMC). After initial flow setup
+in the datapath, the EMC contains a single table and provides the lowest level
+(fastest) switching for DPDK ports. If there is a miss in the EMC then the next
+level where switching will occur is the datapath classifier.  Missing in the
+EMC and looking up in the datapath classifier incurs a significant performance
+penalty.  If lookup misses occur in the EMC because it is too small to handle
+the number of flows, its size can be increased. The EMC size can be modified by
+editing the define ``EM_FLOW_HASH_SHIFT`` in ``lib/dpif-netdev.c``.
+
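+As a sketch, the current value can be located as follows, edited upward, and
+OVS rebuilt (the default of 13, i.e. 8192 entries per EMC, is what recent
+trees ship with but may differ in your version)::
+
+    $ grep 'define EM_FLOW_HASH_SHIFT' lib/dpif-netdev.c
+    #define EM_FLOW_HASH_SHIFT 13
+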
+As mentioned above, an EMC is per pmd thread. An alternative way of increasing
+the aggregate amount of possible flow entries in EMC and avoiding datapath
+classifier lookups is to have multiple pmd threads running.
+
+Rx Mergeable Buffers
+~~~~~~~~~~~~~~~~~~~~
 
-Refer to the :doc:`dpdk-advanced`.
+Rx mergeable buffers is a virtio feature that allows chaining of multiple
+virtio descriptors to handle large packet sizes. Large packets are handled by
+reserving and chaining multiple free descriptors together. Mergeable buffer
+support is negotiated between the virtio driver and virtio device and is
+supported by the DPDK vhost library.  This behavior is supported and enabled by
+default; however, if the user knows that rx mergeable buffers are not needed,
+i.e. jumbo frames are not needed, it can be forced off by adding
+``mrg_rxbuf=off`` to the QEMU command line options. By not reserving multiple
+chains of descriptors, more individual virtio descriptors are available for rx
+to the guest using dpdkvhost ports, and this can improve performance.
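+
+For example, mergeable buffers can be forced off on a given device by
+appending ``mrg_rxbuf=off`` to its QEMU device definition::
+
+    -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mrg_rxbuf=off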
 
 Limitations
 ------------
diff --git a/Documentation/intro/install/index.rst b/Documentation/intro/install/index.rst
index 2366388..0d3ea06 100644
--- a/Documentation/intro/install/index.rst
+++ b/Documentation/intro/install/index.rst
@@ -33,10 +33,6 @@ different environments and using different configurations.
 Installation from Source
 ------------------------
 
-.. TODO(stephenfin): The DPDK-ADVANCED doc is mostly usage material. The
-   install related instructions should be moved to the main doc, while the
-   rest should be moved to howto and topic docs
-
 .. TODO(stephenfin): Based on the title alone, the NetBSD doc should probably
    be merged into the general install doc
 
@@ -49,7 +45,6 @@ Installation from Source
    xenserver
    userspace
    dpdk
-   dpdk-advanced
    bash-completion
 
 Installation from Packages
diff --git a/Documentation/topics/dpdk/index.rst b/Documentation/topics/dpdk/index.rst
new file mode 100644
index 0000000..3c98a9a
--- /dev/null
+++ b/Documentation/topics/dpdk/index.rst
@@ -0,0 +1,32 @@
+..
+      Licensed under the Apache License, Version 2.0 (the "License"); you may
+      not use this file except in compliance with the License. You may obtain
+      a copy of the License at
+
+          http://www.apache.org/licenses/LICENSE-2.0
+
+      Unless required by applicable law or agreed to in writing, software
+      distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+      WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+      License for the specific language governing permissions and limitations
+      under the License.
+
+      Convention for heading levels in Open vSwitch documentation:
+
+      =======  Heading 0 (reserved for the title in a document)
+      -------  Heading 1
+      ~~~~~~~  Heading 2
+      +++++++  Heading 3
+      '''''''  Heading 4
+
+      Avoid deeper levels because they do not render well.
+
+=================
+The DPDK Datapath
+=================
+
+.. toctree::
+   :maxdepth: 2
+
+   vhost-user
+   ivshmem
diff --git a/Documentation/topics/dpdk.rst b/Documentation/topics/dpdk/ivshmem.rst
similarity index 93%
rename from Documentation/topics/dpdk.rst
rename to Documentation/topics/dpdk/ivshmem.rst
index 74e0266..bd4dd99 100644
--- a/Documentation/topics/dpdk.rst
+++ b/Documentation/topics/dpdk/ivshmem.rst
@@ -21,8 +21,8 @@
 
       Avoid deeper levels because they do not render well.
 
-================
-DPDK Integration
-================
+==================
+DPDK IVSHMEM Ports
+==================
 
 **TODO**
diff --git a/Documentation/topics/dpdk/vhost-user.rst b/Documentation/topics/dpdk/vhost-user.rst
new file mode 100644
index 0000000..5448bd2
--- /dev/null
+++ b/Documentation/topics/dpdk/vhost-user.rst
@@ -0,0 +1,396 @@
+..
+      Licensed under the Apache License, Version 2.0 (the "License"); you may
+      not use this file except in compliance with the License. You may obtain
+      a copy of the License at
+
+          http://www.apache.org/licenses/LICENSE-2.0
+
+      Unless required by applicable law or agreed to in writing, software
+      distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+      WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+      License for the specific language governing permissions and limitations
+      under the License.
+
+      Convention for heading levels in Open vSwitch documentation:
+
+      =======  Heading 0 (reserved for the title in a document)
+      -------  Heading 1
+      ~~~~~~~  Heading 2
+      +++++++  Heading 3
+      '''''''  Heading 4
+
+      Avoid deeper levels because they do not render well.
+
+=====================
+DPDK vHost User Ports
+=====================
+
+The DPDK datapath provides DPDK-backed vHost user ports as a primary way to
+interact with guests. For more information on vHost User, refer to the `QEMU
+documentation`_.
+
+Quick Example
+-------------
+
+This example demonstrates how to add two ``dpdkvhostuser`` ports to an existing
+bridge called ``br0``::
+
+    $ ovs-vsctl add-port br0 dpdkvhostuser0 \
+        -- set Interface dpdkvhostuser0 type=dpdkvhostuser
+    $ ovs-vsctl add-port br0 dpdkvhostuser1 \
+        -- set Interface dpdkvhostuser1 type=dpdkvhostuser
+
+vhost-user vs. vhost-user-client
+--------------------------------
+
+Open vSwitch provides two types of vHost User ports:
+
+- vhost-user (``dpdkvhostuser``)
+
+- vhost-user-client (``dpdkvhostuserclient``)
+
+vHost User uses a client-server model. The server creates/manages/destroys the
+vHost User sockets, and the client connects to the server. Depending on which
+port type you use, ``dpdkvhostuser`` or ``dpdkvhostuserclient``, a different
+configuration of the client-server model is used.
+
+For vhost-user ports, Open vSwitch acts as the server and QEMU the client.  For
+vhost-user-client ports, Open vSwitch acts as the client and QEMU the server.
+
+.. _dpdk-vhost-user:
+
+vhost-user
+----------
+
+.. important::
+
+   Use of vhost-user ports requires QEMU >= 2.2
+
+To use vhost-user ports, you must first add said ports to the switch. Unlike
+DPDK ring ports, DPDK vhost-user ports can have arbitrary names, except that
+forward and backward slashes are prohibited in the names. For vhost-user, the
+port type is ``dpdkvhostuser``::
+
+    $ ovs-vsctl add-port br0 vhost-user-1 -- set Interface vhost-user-1 \
+        type=dpdkvhostuser
+
+This action creates a socket located at
+``/usr/local/var/run/openvswitch/vhost-user-1``, which you must provide to your
+VM on the QEMU command line.
+
+.. note::
+
+   If you wish for the vhost-user sockets to be created in a sub-directory of
+   ``/usr/local/var/run/openvswitch``, you may specify this directory in the
+   ovsdb like so::
+
+       $ ovs-vsctl --no-wait \
+           set Open_vSwitch . other_config:vhost-sock-dir=subdir
+
+Once the vhost-user ports have been added to the switch, they must be added to
+the guest. There are two ways to do this: using QEMU directly, or using
+libvirt.
+
+Adding vhost-user ports to the guest (QEMU)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To begin, you must attach the vhost-user device sockets to the guest. To do
+this, you must pass the following parameters to QEMU::
+
+    -chardev socket,id=char1,path=/usr/local/var/run/openvswitch/vhost-user-1
+    -netdev type=vhost-user,id=mynet1,chardev=char1,vhostforce
+    -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1
+
+where ``vhost-user-1`` is the name of the vhost-user port added to the switch.
+
+Repeat the above parameters for multiple devices, changing the chardev ``path``
+and ``id`` as necessary. Note that a separate and different chardev ``path``
+needs to be specified for each vhost-user device. For example, if you have a
+second vhost-user port named ``vhost-user-2``, you append your QEMU command
+line with
+an additional set of parameters::
+
+    -chardev socket,id=char2,path=/usr/local/var/run/openvswitch/vhost-user-2
+    -netdev type=vhost-user,id=mynet2,chardev=char2,vhostforce
+    -device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2
+
+In addition, QEMU must allocate the VM's memory on hugetlbfs. vhost-user
+ports access a virtio-net device's virtual rings and packet buffers mapping the
+VM's physical memory on hugetlbfs. To enable vhost-user ports to map the VM's
+memory into their process address space, pass the following parameters to
+QEMU::
+
+    -object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on
+    -numa node,memdev=mem -mem-prealloc
+
+Finally, you may wish to enable multiqueue support. This is optional but,
+should you wish to enable it, pass the following parameters to QEMU::
+
+    -chardev socket,id=char2,path=/usr/local/var/run/openvswitch/vhost-user-2
+    -netdev type=vhost-user,id=mynet2,chardev=char2,vhostforce,queues=$q
+    -device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2,mq=on,vectors=$v
+
+where:
+
+``$q``
+  The number of queues
+``$v``
+  The number of vectors, which is ``$q`` * 2 + 2
+
+The vhost-user interface will be automatically reconfigured with the required
+number of rx and tx queues after connection of the virtio device.  Manual
+configuration of ``n_rxq`` is not supported because OVS will work properly only
+if ``n_rxq`` matches the number of queues configured in QEMU.
+
+At least 2 PMDs should be configured for the vswitch when using multiqueue.
+Using a single PMD will cause traffic to be enqueued to the same vhost queue
+rather than being distributed among different vhost queues for a vhost-user
+interface.
+
+If traffic destined for a VM configured with multiqueue arrives to the vswitch
+via a physical DPDK port, then the number of rxqs should also be set to at
+least 2 for that physical DPDK port. This is required to increase the
+probability that a different PMD will handle the multiqueue transmission to the
+guest using a different vhost queue.
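+
+For example, a sketch of setting two rx queues on a physical port named
+``dpdk0``::
+
+    $ ovs-vsctl set Interface dpdk0 options:n_rxq=2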
+
+If one wishes to use multiple queues for an interface in the guest, the driver
+in the guest operating system must be configured to do so. It is recommended
+that the number of queues configured be equal to ``$q``.
+
+For example, this can be done for the Linux kernel virtio-net driver with::
+
+    $ ethtool -L <DEV> combined <$q>
+
+where:
+
+``-L``
+  Changes the numbers of channels of the specified network device
+``combined``
+  Changes the number of multi-purpose channels.
+
+Adding vhost-user ports to the guest (libvirt)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. TODO(stephenfin): This seems like something that wouldn't be acceptable in
+   production. Is this really required?
+
+To begin, you must change the user and group that libvirt runs under, configure
+access control policy and restart libvirtd.
+
+- In ``/etc/libvirt/qemu.conf`` add/edit the following lines::
+
+      user = "root"
+      group = "root"
+
+- Disable SELinux or set to permissive mode::
+
+      $ setenforce 0
+
+- Finally, restart the libvirtd process. For example, on Fedora::
+
+      $ systemctl restart libvirtd.service
+
+Once complete, instantiate the VM. A sample XML configuration file is provided
+at the :ref:`end of this file <dpdk-vhost-user-xml>`. Save this file, then
+create a VM using this file::
+
+    $ virsh create demovm.xml
+
+Once created, you can connect to the guest console::
+
+    $ virsh console demovm
+
+The demovm xml configuration is aimed at achieving out-of-the-box performance
+on the VM. These enhancements include:
+
+- The vcpus are pinned to the cores of the CPU socket 0 using ``vcpupin``.
+
+- The NUMA cell and memory are shared using ``memAccess='shared'``.
+
+- Mergeable buffers are disabled using ``mrg_rxbuf='off'``.
+
+Refer to the `libvirt documentation <http://libvirt.org/formatdomain.html>`__
+for more information.
+
+.. _dpdk-vhost-user-client:
+
+vhost-user-client
+-----------------
+
+.. important::
+
+   Use of vhost-user-client ports requires QEMU >= 2.7
+
+To use vhost-user-client ports, you must first add said ports to the switch.
+Like DPDK vhost-user ports, DPDK vhost-user-client ports can have mostly
+arbitrary names. However, the name given to the port does not govern the name
+of the
+socket device. Instead, this must be configured by the user by way of a
+``vhost-server-path`` option. For vhost-user-client, the port type is
+``dpdkvhostuserclient``::
+
+    $ VHOST_USER_SOCKET_PATH=/path/to/socket
+    $ ovs-vsctl add-port br0 vhost-client-1 \
+        -- set Interface vhost-client-1 type=dpdkvhostuserclient \
+             options:vhost-server-path=$VHOST_USER_SOCKET_PATH
+
+Once the vhost-user-client ports have been added to the switch, they must be
+added to the guest. Like vhost-user ports, there are two ways to do this: using
+QEMU directly, or using libvirt. Only the QEMU case is covered here.
+
+Adding vhost-user-client ports to the guest (QEMU)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Attach the vhost-user device sockets to the guest. To do this, you must pass
+the following parameters to QEMU::
+
+    -chardev socket,id=char1,path=$VHOST_USER_SOCKET_PATH,server
+    -netdev type=vhost-user,id=mynet1,chardev=char1,vhostforce
+    -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1
+
+where ``vhost-user-1`` is the name of the vhost-user port added to the switch.
+
+If the corresponding ``dpdkvhostuserclient`` port has not yet been configured
+in OVS with ``vhost-server-path=/path/to/socket``, QEMU will print a log
+similar to the following::
+
+    QEMU waiting for connection on: disconnected:unix:/path/to/socket,server
+
+QEMU will wait until the port is created successfully in OVS to boot the VM.
+One benefit of using this mode is the ability for vHost ports to 'reconnect' in
+the event of the switch crashing or being brought down. Once it is brought back
+up, the vHost ports will reconnect automatically and normal service will
+resume.
+
+.. _dpdk-testpmd:
+
+DPDK in the Guest
+-----------------
+
+The DPDK ``testpmd`` application can be run in guest VMs for high speed packet
+forwarding between vhostuser ports. DPDK and the testpmd application have to
+be compiled on the guest VM. Below are the steps for setting up the testpmd
+application in the VM.
+
+.. note::
+
+  Support for DPDK in the guest requires QEMU >= 2.2
+
+To begin, instantiate a guest as described in :ref:`dpdk-vhost-user` or
+:ref:`dpdk-vhost-user-client`. Once started, connect to the VM, download the
+DPDK sources to the VM and build DPDK::
+
+    $ cd /root/dpdk/
+    $ wget http://fast.dpdk.org/rel/dpdk-16.11.tar.xz
+    $ tar xf dpdk-16.11.tar.xz
+    $ export DPDK_DIR=/root/dpdk/dpdk-16.11
+    $ export DPDK_TARGET=x86_64-native-linuxapp-gcc
+    $ export DPDK_BUILD=$DPDK_DIR/$DPDK_TARGET
+    $ cd $DPDK_DIR
+    $ make install T=$DPDK_TARGET DESTDIR=install
+
+Build the test-pmd application::
+
+    $ cd app/test-pmd
+    $ export RTE_SDK=$DPDK_DIR
+    $ export RTE_TARGET=$DPDK_TARGET
+    $ make
+
+Setup huge pages and DPDK devices using UIO::
+
+    $ sysctl vm.nr_hugepages=1024
+    $ mkdir -p /dev/hugepages
+    $ mount -t hugetlbfs hugetlbfs /dev/hugepages  # only if not already mounted
+    $ modprobe uio
+    $ insmod $DPDK_BUILD/kmod/igb_uio.ko
+    $ $DPDK_DIR/tools/dpdk-devbind.py --status
+    $ $DPDK_DIR/tools/dpdk-devbind.py -b igb_uio 00:03.0 00:04.0
+
+.. note::
+
+  The PCI IDs of the vhost ports can be retrieved using::
+
+      lspci | grep Ethernet
+
+Finally, start the application::
+
+    $ cd $DPDK_DIR/app/test-pmd
+    $ ./testpmd -c 0x3 -n 4 --socket-mem 1024 -- \
+        --burst=64 -i --txqflags=0xf00 --disable-hw-vlan
+    $ set fwd mac retry
+    $ start
+
+.. _dpdk-vhost-user-xml:
+
+Sample XML
+----------
+
+::
+
+    <domain type='kvm'>
+      <name>demovm</name>
+      <uuid>4a9b3f53-fa2a-47f3-a757-dd87720d9d1d</uuid>
+      <memory unit='KiB'>4194304</memory>
+      <currentMemory unit='KiB'>4194304</currentMemory>
+      <memoryBacking>
+        <hugepages>
+          <page size='2' unit='M' nodeset='0'/>
+        </hugepages>
+      </memoryBacking>
+      <vcpu placement='static'>2</vcpu>
+      <cputune>
+        <shares>4096</shares>
+        <vcpupin vcpu='0' cpuset='4'/>
+        <vcpupin vcpu='1' cpuset='5'/>
+        <emulatorpin cpuset='4,5'/>
+      </cputune>
+      <os>
+        <type arch='x86_64' machine='pc'>hvm</type>
+        <boot dev='hd'/>
+      </os>
+      <features>
+        <acpi/>
+        <apic/>
+      </features>
+      <cpu mode='host-model'>
+        <model fallback='allow'/>
+        <topology sockets='2' cores='1' threads='1'/>
+        <numa>
+          <cell id='0' cpus='0-1' memory='4194304' unit='KiB' memAccess='shared'/>
+        </numa>
+      </cpu>
+      <on_poweroff>destroy</on_poweroff>
+      <on_reboot>restart</on_reboot>
+      <on_crash>destroy</on_crash>
+      <devices>
+        <emulator>/usr/bin/qemu-kvm</emulator>
+        <disk type='file' device='disk'>
+          <driver name='qemu' type='qcow2' cache='none'/>
+          <source file='/root/CentOS7_x86_64.qcow2'/>
+          <target dev='vda' bus='virtio'/>
+        </disk>
+        <disk type='dir' device='disk'>
+          <driver name='qemu' type='fat'/>
+          <source dir='/usr/src/dpdk-16.11'/>
+          <target dev='vdb' bus='virtio'/>
+          <readonly/>
+        </disk>
+        <interface type='vhostuser'>
+          <mac address='00:00:00:00:00:01'/>
+          <source type='unix' path='/usr/local/var/run/openvswitch/dpdkvhostuser0' mode='client'/>
+           <model type='virtio'/>
+          <driver queues='2'>
+            <host mrg_rxbuf='off'/>
+          </driver>
+        </interface>
+        <interface type='vhostuser'>
+          <mac address='00:00:00:00:00:02'/>
+          <source type='unix' path='/usr/local/var/run/openvswitch/dpdkvhostuser1' mode='client'/>
+          <model type='virtio'/>
+          <driver queues='2'>
+            <host mrg_rxbuf='off'/>
+          </driver>
+        </interface>
+        <serial type='pty'>
+          <target port='0'/>
+        </serial>
+        <console type='pty'>
+          <target type='serial' port='0'/>
+        </console>
+      </devices>
+    </domain>
+
+.. _QEMU documentation: http://git.qemu-project.org/?p=qemu.git;a=blob;f=docs/specs/vhost-user.txt;h=7890d7169;hb=HEAD
diff --git a/Documentation/topics/index.rst b/Documentation/topics/index.rst
index 30f74fe..e5a8b4d 100644
--- a/Documentation/topics/index.rst
+++ b/Documentation/topics/index.rst
@@ -40,8 +40,9 @@ that way.
    openflow
    bonding
    ovsdb-replication
-   dpdk
+   dpdk/index
    windows
+   testing
 
 .. toctree::
    :maxdepth: 2
diff --git a/Documentation/topics/testing.rst b/Documentation/topics/testing.rst
new file mode 100644
index 0000000..5265ab1
--- /dev/null
+++ b/Documentation/topics/testing.rst
@@ -0,0 +1,38 @@
+..
+      Licensed under the Apache License, Version 2.0 (the "License"); you may
+      not use this file except in compliance with the License. You may obtain
+      a copy of the License at
+
+          http://www.apache.org/licenses/LICENSE-2.0
+
+      Unless required by applicable law or agreed to in writing, software
+      distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+      WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+      License for the specific language governing permissions and limitations
+      under the License.
+
+      Convention for heading levels in Open vSwitch documentation:
+
+      =======  Heading 0 (reserved for the title in a document)
+      -------  Heading 1
+      ~~~~~~~  Heading 2
+      +++++++  Heading 3
+      '''''''  Heading 4
+
+      Avoid deeper levels because they do not render well.
+
+=======
+Testing
+=======
+
+.. TODO(stephenfin): Flesh this out with information from the general
+   installation guide, among others.
+
+vsperf
+------
+
+The vsperf project aims to develop a vSwitch test framework that can be used to
+validate the suitability of different vSwitch implementations in a telco
+deployment environment. More information can be found on the `OPNFV wiki`_.
+
+.. _OPNFV wiki: https://wiki.opnfv.org/display/vsperf/VSperf+Home
-- 
2.9.3


