[ovs-git] [openvswitch/ovs] 140dd6: dpif-netdev: Incremental addition/deletion of PMD ...

GitHub noreply at github.com
Wed Aug 2 17:54:26 UTC 2017


  Branch: refs/heads/master
  Home:   https://github.com/openvswitch/ovs
  Commit: 140dd699463a61923538dab70e3b3eeec6fb695d
      https://github.com/openvswitch/ovs/commit/140dd699463a61923538dab70e3b3eeec6fb695d
  Author: Ilya Maximets <i.maximets at samsung.com>
  Date:   2017-08-02 (Wed, 02 Aug 2017)

  Changed paths:
    M lib/dpif-netdev.c
    M tests/pmd.at

  Log Message:
  -----------
  dpif-netdev: Incremental addition/deletion of PMD threads.

Currently, changing 'pmd-cpu-mask' is a very heavy operation.
It requires destroying all the PMD threads and creating
them again. After that, all the threads sleep until the
redistribution of ports has finished.

This patch adds the ability to adjust the number/placement of
PMD threads without stopping the datapath. All unaffected
threads keep forwarding traffic without any additional latency.

An id-pool is created for static tx queue ids to keep them
sequential in a flexible way. The non-PMD thread will always
have static_tx_qid = 0, as before.
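As an illustration only, the idea of keeping ids sequential while threads
come and go can be sketched with a toy pool that always hands out the lowest
free id. The names toy_id_pool, toy_id_pool_alloc and toy_id_pool_free and
the fixed capacity are assumptions made for this sketch; they are not the
id-pool used by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_POOL_SIZE 64   /* Hypothetical capacity, chosen for the sketch. */

/* Toy pool: bit i set in 'used' means id i is currently allocated. */
struct toy_id_pool {
    uint64_t used;
};

/* Stores the lowest free id in '*id' and returns true, or returns false
 * if the pool is exhausted.  Reusing the lowest free id keeps the set of
 * allocated ids as dense as possible after threads are removed. */
static bool
toy_id_pool_alloc(struct toy_id_pool *pool, uint32_t *id)
{
    for (uint32_t i = 0; i < TOY_POOL_SIZE; i++) {
        if (!(pool->used & (UINT64_C(1) << i))) {
            pool->used |= UINT64_C(1) << i;
            *id = i;
            return true;
        }
    }
    return false;
}

/* Returns 'id' to the pool so that a later allocation can reuse it. */
static void
toy_id_pool_free(struct toy_id_pool *pool, uint32_t id)
{
    assert(id < TOY_POOL_SIZE);
    pool->used &= ~(UINT64_C(1) << id);
}
```

In this sketch, whichever thread allocates first gets id 0, and an id
released by a deleted thread is the next one handed out, so no gap is left
behind.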

Signed-off-by: Ilya Maximets <i.maximets at samsung.com>
Tested-by: Mark Kavanagh <mark.b.kavanagh at intel.com>
Acked-by: Mark Kavanagh <mark.b.kavanagh at intel.com>
Signed-off-by: Darrell Ball <dlu998 at gmail.com>
Signed-off-by: Ben Pfaff <blp at ovn.org>


  Commit: e215018b0bc2f9f8aab5f7613ff013323a0018ab
      https://github.com/openvswitch/ovs/commit/e215018b0bc2f9f8aab5f7613ff013323a0018ab
  Author: Ilya Maximets <i.maximets at samsung.com>
  Date:   2017-08-02 (Wed, 02 Aug 2017)

  Changed paths:
    M lib/dpif-netdev.c

  Log Message:
  -----------
  dpif-netdev: Don't uninit emc on reload.

There are many reasons for reloading PMD threads:
	* Reconfiguration of one of the ports.
	* Adjustment of static_tx_qid.
	* Addition of new tx/rx ports.

In many cases the EMC is still useful after a reload, and uninit
will only lead to unnecessary upcalls/classifier lookups.

Such behaviour slows down the datapath. Uninit itself slows
down the reload path. All these factors lead to additional
unexpected latencies/drops on events not directly connected
to the current PMD thread.

Let's not uninitialize the EMC on the reload path.
'emc_cache_slow_sweep()' and replacements should free all
the old/unwanted entries.

Signed-off-by: Ilya Maximets <i.maximets at samsung.com>
Acked-by: Cian Ferriter <cian.ferriter at intel.com>
Tested-by: Cian Ferriter <cian.ferriter at intel.com>
Signed-off-by: Darrell Ball <dlu998 at gmail.com>
Signed-off-by: Ben Pfaff <blp at ovn.org>


  Commit: c37813fdb030b4270d05ad61943754f67021a50d
      https://github.com/openvswitch/ovs/commit/c37813fdb030b4270d05ad61943754f67021a50d
  Author: Billy O'Mahony <billy.o.mahony at intel.com>
  Date:   2017-08-02 (Wed, 02 Aug 2017)

  Changed paths:
    M Documentation/intro/install/dpdk.rst
    M lib/dpif-netdev.c

  Log Message:
  -----------
  dpif-netdev: Assign ports to pmds on non-local numa node.

Previously, if there was no available (non-isolated) pmd on the numa node
for a port, the port was not polled at all. This can result in a
non-operational system until such time as the NICs are physically
repositioned. It is preferable to operate with a pmd on the 'wrong' numa
node, albeit with lower performance. Local pmds are still chosen when
available.
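The selection rule described above (prefer a local pmd, otherwise fall back
to a pmd on another numa node) can be sketched roughly as follows. The
struct toy_pmd and the function toy_pick_pmd are hypothetical names invented
for this sketch; the real assignment logic in lib/dpif-netdev.c is more
involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a PMD thread for this sketch. */
struct toy_pmd {
    int core_id;
    int numa_id;
    bool isolated;   /* Isolated PMDs are never picked automatically. */
};

/* Returns a non-isolated PMD for a port that lives on 'port_numa'.
 * A PMD on the same numa node is preferred; if there is none, the first
 * non-isolated PMD on any other node is used.  Returns NULL only when
 * every PMD is isolated (or there are no PMDs at all). */
static const struct toy_pmd *
toy_pick_pmd(const struct toy_pmd *pmds, size_t n, int port_numa)
{
    const struct toy_pmd *fallback = NULL;

    assert(n == 0 || pmds != NULL);
    for (size_t i = 0; i < n; i++) {
        if (pmds[i].isolated) {
            continue;
        }
        if (pmds[i].numa_id == port_numa) {
            return &pmds[i];          /* Local PMD: best choice. */
        }
        if (!fallback) {
            fallback = &pmds[i];      /* Remember a non-local candidate. */
        }
    }
    return fallback;
}
```

With this rule a port is left unpolled only when no non-isolated pmd exists
anywhere, rather than whenever its own numa node has none.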

Signed-off-by: Billy O'Mahony <billy.o.mahony at intel.com>
Signed-off-by: Ilya Maximets <i.maximets at samsung.com>
Co-authored-by: Ilya Maximets <i.maximets at samsung.com>
Tested-by: Ian Stokes <ian.stokes at intel.com>
Acked-by: Ian Stokes <ian.stokes at intel.com>
Signed-off-by: Darrell Ball <dlu998 at gmail.com>
Signed-off-by: Ben Pfaff <blp at ovn.org>


  Commit: 67fe6d635193761439f791e48652acfd60076cfb
      https://github.com/openvswitch/ovs/commit/67fe6d635193761439f791e48652acfd60076cfb
  Author: Mark Kavanagh <mark.b.kavanagh at intel.com>
  Date:   2017-08-02 (Wed, 02 Aug 2017)

  Changed paths:
    M lib/netdev-dpdk.c

  Log Message:
  -----------
  netdev-dpdk: use rte_eth_dev_set_mtu.

DPDK provides an API to set the MTU of compatible physical devices -
rte_eth_dev_set_mtu(). Prior to DPDK v16.07 however, this API was not
implemented in some DPDK PMDs (i40e, specifically). To allow the use
of jumbo frames with affected NICs in OvS-DPDK, MTU configuration was
achieved by setting the jumbo frame flag, and corresponding maximum
permitted Rx frame size, in an rte_eth_conf structure for the NIC
port, and subsequently invoking rte_eth_dev_configure() with that
configuration.

However, that method does not set the MTU field of the underlying DPDK
structure (rte_eth_dev) for the corresponding physical device;
consequently, rte_eth_dev_get_mtu() reports an incorrect MTU for an
OvS-DPDK phy device with a non-standard MTU.

Resolve this issue by invoking rte_eth_dev_set_mtu() when setting up
or modifying the MTU of a DPDK phy port.

Fixes: 0072e93 ("netdev-dpdk: add support for jumbo frames")
Reported-by: Aaron Conole <aconole at redhat.com>
Reported-by: Vipin Varghese <vipin.varghese at intel.com>
Reviewed-by: Aaron Conole <aconole at redhat.com>
Acked-by: Sugesh Chandran <sugesh.chandran at intel.com>
Tested-by: Sugesh Chandran <sugesh.chandran at intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh at intel.com>
Signed-off-by: Darrell Ball <dlu998 at gmail.com>
Signed-off-by: Ben Pfaff <blp at ovn.org>


  Commit: f3e7ec254738364101eed8f04b1d954cb510615c
      https://github.com/openvswitch/ovs/commit/f3e7ec254738364101eed8f04b1d954cb510615c
  Author: Michal Weglicki <michalx.weglicki at intel.com>
  Date:   2017-08-02 (Wed, 02 Aug 2017)

  Changed paths:
    M .travis/linux-build.sh
    M Documentation/faq/releases.rst
    M Documentation/howto/dpdk.rst
    M Documentation/intro/install/dpdk.rst
    M Documentation/topics/dpdk/vhost-user.rst
    M NEWS
    M lib/netdev-dpdk.c
    M rhel/openvswitch-fedora.spec.in
    M tests/dpdk/ring_client.c

  Log Message:
  -----------
  Update relevant artifacts to add support for DPDK 17.05.1.

Upgrading to the DPDK 17.05.1 stable release adds
significant new features relevant to OVS, including,
but not limited to:
- tun/tap PMD,
- VFIO hotplug support,
- Generic flow API.

The following changes are applied:
- netdev-dpdk: Changes required by DPDK API modifications.
- doc: Because of DPDK API changes, backward compatibility
  with previous DPDK releases will be broken, thus all
  relevant documentation entries are updated.
- .travis: DPDK version change from 16.11.1 to 17.05.1.
- rhel/openvswitch-fedora.spec.in: DPDK version change
  from 16.11 to 17.05.1

Signed-off-by: Michal Weglicki <michalx.weglicki at intel.com>
Acked-by: Kevin Traynor <ktraynor at redhat.com>
Acked-by: Mark Kavanagh <mark.b.kavanagh at intel.com>
Tested-by: Ian Stokes <ian.stokes at intel.com>
Acked-by: Aaron Conole <aconole at redhat.com>
Signed-off-by: Darrell Ball <dlu998 at gmail.com>
Signed-off-by: Ben Pfaff <blp at ovn.org>


  Commit: 0ee821c2e604628b3ef3ab09c45bdd941be7702e
      https://github.com/openvswitch/ovs/commit/0ee821c2e604628b3ef3ab09c45bdd941be7702e
  Author: Darrell Ball <dlu998 at gmail.com>
  Date:   2017-08-02 (Wed, 02 Aug 2017)

  Changed paths:
    M Documentation/howto/dpdk.rst
    M lib/netdev-dpdk.c

  Log Message:
  -----------
  dpdk: Fix device cleanup.

Commit 5dcde09c80a8 was introduced to make detaching more
automatic without using an additional command beyond
ovs-vsctl del-port <br> <port>.

Sometimes, since commit 5dcde09c80a8, dpdk devices are
not detached when del-port is issued; command example:

sudo ovs-vsctl del-port br0 dpdk1

This can happen when vswitchd is (re)started with an existing
database and devices are already bound to dpdk.

A minimal recipe to reproduce the issue is:

1/ Starting with

darrell at prmh-nsx-perf-server125:~$ sudo ovs-vsctl show
1c50d8ee-b17f-4fac-a595-03b0da8c8275
    Bridge "br0"
        Port "br0"
            Interface "br0"
                type: internal
        Port "dpdk1"
            Interface "dpdk1"
                type: dpdk
                options: {dpdk-devargs="0000:04:00.1"}
        Port "dpdk0"
            Interface "dpdk0"
                type: dpdk
                options: {dpdk-devargs="0000:04:00.0"}

darrell at prmh-nsx-perf-server125:~$ /usr/src/dpdk-16.11/tools/dpdk-devbind.py --status

Network devices using DPDK-compatible driver
============================================
0000:04:00.0 'Ethernet Controller 10-Gigabit X540-AT2' drv=uio_pci_generic unused=ixgbe,vfio-pci
0000:04:00.1 'Ethernet Controller 10-Gigabit X540-AT2' drv=uio_pci_generic unused=ixgbe,vfio-pci

2/ restart vswitchd

3/ run
 sudo ovs-vsctl del-port br0 dpdk1

and find the interface is NOT detached; there is
no info log ‘Device '0000:04:00.1' detached’.

A more verbose discussion is here:
https://mail.openvswitch.org/pipermail/ovs-dev/2017-June/333462.html
along with another possible solution.

Since we are nearing the end of a release, a safe approach is needed
at this time.
One approach is to revert 5dcde09c80a8.  This patch does not do that,
but reinstates the command ovs-appctl netdev-dpdk/detach to handle
cases where del-port does not work.

To detach the device, run the reinstated command:
ovs-appctl netdev-dpdk/detach 0000:04:00.1
and observe the console output:
‘Device '0000:04:00.1' has been detached’

Fixes: 5dcde09c80a8 ("netdev-dpdk: Fix device leak on port deletion.")
CC: Ilya Maximets <i.maximets at samsung.com>
Acked-by: Aaron Conole <aconole at redhat.com>
Acked-by: Fischetti, Antonio <antonio.fischetti at intel.com>
Signed-off-by: Darrell Ball <dlu998 at gmail.com>
Signed-off-by: Ben Pfaff <blp at ovn.org>


  Commit: af697f26b51b67fb4b8db8358bb4c6268bcce0f4
      https://github.com/openvswitch/ovs/commit/af697f26b51b67fb4b8db8358bb4c6268bcce0f4
  Author: Daniele Di Proietto <diproiettod at vmware.com>
  Date:   2017-08-02 (Wed, 02 Aug 2017)

  Changed paths:
    M lib/packets.h

  Log Message:
  -----------
  packets: Do not initialize ct_orig_tuple.

Commit "odp: Support conntrack orig tuple key." introduced new fields
in struct 'pkt_metadata'.  pkt_metadata_init() is called for every
packet in the userspace datapath.  When testing a simple single
flow case with DPDK, we observe lower throughput after the above
commit (it was 14.88 Mpps before, it is 13 Mpps after).

This patch skips initializing ct_orig_tuple in pkt_metadata_init().
It should be enough to initialize ct_state, because nobody should look
at ct_orig_tuple unless ct_state != 0.
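The invariant relied on here (ct_orig_tuple is meaningful only when
ct_state != 0) can be sketched as below. The struct layout and the names
toy_metadata, toy_metadata_init and toy_metadata_orig_tuple are simplified
assumptions for illustration, not the actual definitions in lib/packets.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, much smaller stand-in for the original-direction tuple. */
struct toy_orig_tuple {
    uint32_t src, dst;
    uint16_t sport, dport;
    uint8_t proto;
};

/* Hypothetical per-packet metadata. */
struct toy_metadata {
    uint32_t recirc_id;
    uint8_t ct_state;                     /* 0 means "not tracked". */
    struct toy_orig_tuple ct_orig_tuple;  /* Valid only if ct_state != 0. */
};

/* Per-packet init: only the cheap fields are written.  'ct_orig_tuple'
 * is deliberately left untouched, which saves stores on the fast path. */
static void
toy_metadata_init(struct toy_metadata *md)
{
    assert(md != NULL);
    md->recirc_id = 0;
    md->ct_state = 0;
}

/* Readers must go through the 'ct_state' check before touching the tuple. */
static const struct toy_orig_tuple *
toy_metadata_orig_tuple(const struct toy_metadata *md)
{
    return md->ct_state ? &md->ct_orig_tuple : NULL;
}
```

The accessor makes the rule explicit: code that respects ct_state never
reads the uninitialized bytes.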

It's discussed at:
https://mail.openvswitch.org/pipermail/ovs-dev/2017-May/332419.html

Fixes: daf4d3c18da4 ("odp: Support conntrack orig tuple key.")
Signed-off-by: Daniele Di Proietto <diproiettod at vmware.com>
Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy at intel.com>
Co-authored-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy at intel.com>
Signed-off-by: Darrell Ball <dlu998 at gmail.com>
Signed-off-by: Ben Pfaff <blp at ovn.org>


  Commit: 7451af618e0dd55d072821caed95a86d43a54d30
      https://github.com/openvswitch/ovs/commit/7451af618e0dd55d072821caed95a86d43a54d30
  Author: Sugesh Chandran <sugesh.chandran at intel.com>
  Date:   2017-08-02 (Wed, 02 Aug 2017)

  Changed paths:
    M lib/dp-packet.h

  Log Message:
  -----------
  dp-packet : Update DPDK rx checksum validation functions.

DPDK ports use masks when reporting rx checksum flags. OVS should use these
masks along with the reported checksum flags when validating a good checksum.

Add two new functions to validate a bad checksum reported by a DPDK NIC port.
These two functions will be used in the following patch to enable rx checksum
offload in the conntrack module.
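To show why the mask matters, here is a minimal sketch in which the checksum
status is a two-bit field, so a single flag bit is ambiguous on its own. The
TOY_* constants and the helper names are assumptions made for this sketch;
they are not the DPDK macros or the functions added to lib/dp-packet.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-bit status field inside the offload flags:
 *   neither bit set -> status unknown
 *   BAD only        -> checksum is wrong
 *   GOOD only       -> checksum is valid
 *   both bits set   -> no usable checksum information */
#define TOY_RX_IP_CKSUM_BAD  (UINT64_C(1) << 4)
#define TOY_RX_IP_CKSUM_GOOD (UINT64_C(1) << 7)
#define TOY_RX_IP_CKSUM_MASK (TOY_RX_IP_CKSUM_BAD | TOY_RX_IP_CKSUM_GOOD)

static_assert((TOY_RX_IP_CKSUM_BAD & TOY_RX_IP_CKSUM_GOOD) == 0,
              "status bits must not overlap");

/* True only when the masked field equals GOOD exactly.  Testing
 * 'flags & GOOD' alone would also accept the "both bits set" state. */
static bool
toy_ip_cksum_good(uint64_t ol_flags)
{
    return (ol_flags & TOY_RX_IP_CKSUM_MASK) == TOY_RX_IP_CKSUM_GOOD;
}

/* True only when the masked field equals BAD exactly. */
static bool
toy_ip_cksum_bad(uint64_t ol_flags)
{
    return (ol_flags & TOY_RX_IP_CKSUM_MASK) == TOY_RX_IP_CKSUM_BAD;
}
```

Comparing the masked value, rather than testing one bit, keeps the "unknown"
and "no information" states from being mistaken for good or bad.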

Signed-off-by: Sugesh Chandran <sugesh.chandran at intel.com>
Co-authored-by: Darrell Ball <dball at vmware.com>
Signed-off-by: Darrell Ball <dball at vmware.com>
Acked-by: Antonio Fischetti <antonio.fischetti at intel.com>
Signed-off-by: Ben Pfaff <blp at ovn.org>


  Commit: 324459a39d17559d1adcb42b9d141b0288a75127
      https://github.com/openvswitch/ovs/commit/324459a39d17559d1adcb42b9d141b0288a75127
  Author: Sugesh Chandran <sugesh.chandran at intel.com>
  Date:   2017-08-02 (Wed, 02 Aug 2017)

  Changed paths:
    M lib/conntrack.c

  Log Message:
  -----------
  conntrack : Use Rx checksum offload feature on DPDK ports for conntrack.

Avoid checksum validation in the conntrack module if the checksum has
already been verified by a DPDK physical NIC port.

Signed-off-by: Sugesh Chandran <sugesh.chandran at intel.com>
Co-authored-by: Darrell Ball <dball at vmware.com>
Signed-off-by: Darrell Ball <dball at vmware.com>
Acked-by: Antonio Fischetti <antonio.fischetti at intel.com>
Signed-off-by: Ben Pfaff <blp at ovn.org>


  Commit: ded30c74b1e57af84416cdf6c5babd66b1f48ee6
      https://github.com/openvswitch/ovs/commit/ded30c74b1e57af84416cdf6c5babd66b1f48ee6
  Author: Antonio Fischetti <antonio.fischetti at intel.com>
  Date:   2017-08-02 (Wed, 02 Aug 2017)

  Changed paths:
    M lib/conntrack.c
    M lib/conntrack.h
    M lib/ct-dpif.c
    M lib/ct-dpif.h
    M lib/dpctl.c
    M lib/dpctl.man
    M lib/dpif-netdev.c
    M lib/dpif-netlink.c
    M lib/dpif-provider.h
    M lib/netlink-conntrack.c
    M lib/netlink-conntrack.h
    M tests/test-netlink-conntrack.c
    M utilities/ovs-dpctl.c

  Log Message:
  -----------
  dpctl: Add new 'ct-bkts' command.

The command:
 ovs-appctl dpctl/ct-bkts
shows the number of connections per bucket.

With a threshold:
 ovs-appctl dpctl/ct-bkts gt=N
it shows, for each bucket, the number of connections only when
that number is greater than N.
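The 'gt=N' filtering can be pictured with the small sketch below, which
selects the buckets whose connection count is strictly greater than a
threshold. The function toy_buckets_gt and its array-based interface are
assumptions for illustration; the real command walks the conntrack buckets
and formats its own output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes into 'out' the indexes of all buckets whose connection count is
 * strictly greater than 'gt' and returns how many indexes were written.
 * 'conns' holds one count per bucket; 'out' must have room for 'n'
 * entries. */
static size_t
toy_buckets_gt(const uint32_t *conns, size_t n, uint32_t gt, size_t *out)
{
    size_t n_out = 0;

    assert(n == 0 || (conns != NULL && out != NULL));
    for (size_t i = 0; i < n; i++) {
        if (conns[i] > gt) {
            out[n_out++] = i;
        }
    }
    return n_out;
}
```

A threshold like this is handy for spotting a few overloaded buckets without
printing every bucket.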

Signed-off-by: Antonio Fischetti <antonio.fischetti at intel.com>
Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy at intel.com>
Co-authored-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy at intel.com>
Signed-off-by: Darrell Ball <dlu998 at gmail.com>
Signed-off-by: Ben Pfaff <blp at ovn.org>


  Commit: ca62bb16aba7734d83d752b3e4710111bd1eadc0
      https://github.com/openvswitch/ovs/commit/ca62bb16aba7734d83d752b3e4710111bd1eadc0
  Author: Bhanuprakash Bodireddy <bhanuprakash.bodireddy at intel.com>
  Date:   2017-08-02 (Wed, 02 Aug 2017)

  Changed paths:
    M lib/dpif-netdev.c

  Log Message:
  -----------
  dpif-netdev: Reorder elements in dp_netdev_port structure.

By reordering the elements in the dp_netdev_port structure, pad bytes can
be reduced, thereby saving a cache line. A marginal performance improvement
is also observed with this change.

Before: structure size: 136, holes: 7, sum padbytes:7, cachelines:3
After : structure size: 128, holes: 6, sum padbytes:0, cachelines:2
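The effect of member ordering on padding can be reproduced with a small
sketch. The two structs below use hypothetical fields and are not the real
dp_netdev_port layout; they only show how grouping members of similar
alignment removes holes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Small members interleaved with 8-byte members: the compiler has to
 * insert padding after each bool so the next uint64_t stays aligned. */
struct toy_port_before {
    bool dynamic_txqs;
    uint64_t port_no;
    bool need_reconfigure;
    uint64_t n_rxq;
};

/* Same members, largest alignment first: the two bools share the tail of
 * the struct, so only trailing padding remains. */
struct toy_port_after {
    uint64_t port_no;
    uint64_t n_rxq;
    bool dynamic_txqs;
    bool need_reconfigure;
};

/* Reordering must never make the struct bigger. */
static_assert(sizeof(struct toy_port_after) <= sizeof(struct toy_port_before),
              "reordered layout should not grow");

/* Returns how many bytes the reordering saves on the current ABI. */
static size_t
toy_port_bytes_saved(void)
{
    return sizeof(struct toy_port_before) - sizeof(struct toy_port_after);
}
```

The exact sizes depend on the ABI, so the sketch only checks that the
reordered layout is not larger than the original one.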

Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy at intel.com>
Reviewed-by: Greg Rose <gvrose8192 at gmail.com>
Tested-by: Greg Rose <gvrose8192 at gmail.com>
Signed-off-by: Darrell Ball <dlu998 at gmail.com>
Signed-off-by: Ben Pfaff <blp at ovn.org>


Compare: https://github.com/openvswitch/ovs/compare/c1f272f9cd49...ca62bb16aba7

