[ovs-dev] [PATCH RFC v3 1/1] netdev-dpdk: Add support for DPDK 16.07

Daniele Di Proietto diproiettod at ovn.org
Wed Aug 3 00:14:04 UTC 2016


Given that using vhost PMD doesn't seem viable in the very short term, I
think we should stick with the vhost lib.

I sent a patch for ovsrcu to add a new RCU protected array index.

http://openvswitch.org/pipermail/dev/2016-August/077097.html

Thanks,

Daniele

2016-07-28 6:26 GMT-07:00 Loftus, Ciara <ciara.loftus at intel.com>:

> >
> > Thanks for the patch.
> > I have another concern with this.  If we're still going to rely on RCU
> to protect
> > the vhost device (and as pointed out by Ilya, I think we should) we need
> to
> > use RCU-like semantics on the vid array index. I'm not sure a boolean
> flag is
> > going to be enough.
> > CCing Jarno:
> > We have this int, which is an index into an array of vhost devices (the
> array is
> > inside the DPDK library).  We want to make sure that when
> > ovsrcu_synchronize() returns nobody is using the old index anymore.
> > Should we introduce an RCU type for indexing into arrays?  I found some
> > negative opinions here:
> >
> > https://git.kernel.org/cgit/linux/kernel/git/next/linux-
> > next.git/tree/Documentation/RCU/arrayRCU.txt?id=refs/tags/next-
> > 20160722#n13
> > but I think using atomics should prevent the compiler from playing
> tricks with
> > the index.
> >
> > How about something like the code below?
> > Thanks,
> > Daniele
>
> I think the best way forward here is to avoid the RCU mechanisms by
> merging the vHost PMD first as you have previously suggested. What do you
> think?
> If we don't go with that, I think we need to make a decision ASAP on how
> to handle the RCU (ie. is below code needed?) as both DPDK and 2.6 releases
> are imminent.
>
> Thanks,
> Ciara
>
> >
> >
> > diff --git a/lib/ovs-rcu.h b/lib/ovs-rcu.h
> > index dc75749..d1a57f6 100644
> > --- a/lib/ovs-rcu.h
> > +++ b/lib/ovs-rcu.h
> > @@ -130,6 +130,41 @@
> >  #include "compiler.h"
> >  #include "ovs-atomic.h"
> >
> > +typedef struct { atomic_int v; } ovsrcu_int;
> > +
> > +static inline int ovsrcu_int_get__(const ovsrcu_int *i, memory_order
> order)
> > +{
> > +    int ret;
> > +    atomic_read_explicit(CONST_CAST(atomic_int *, &i->v), &ret, order);
> > +    return ret;
> > +}
> > +
> > +static inline int ovsrcu_int_get(const ovsrcu_int *i)
> > +{
> > +    return ovsrcu_int_get__(i, memory_order_consume);
> > +}
> > +
> > +static inline int ovsrcu_int_get_protected(const ovsrcu_int *i)
> > +{
> > +    return ovsrcu_int_get__(i, memory_order_relaxed);
> > +}
> > +
> > +static inline void ovsrcu_int_set__(ovsrcu_int *i, int value,
> > +                                    memory_order order)
> > +{
> > +    atomic_store_explicit(&i->v, value, order);
> > +}
> > +
> > +static inline void ovsrcu_int_set(ovsrcu_int *i, int value)
> > +{
> > +    ovsrcu_int_set__(i, value, memory_order_release);
> > +}
> > +
> > +static inline void ovsrcu_int_set_protected(ovsrcu_int *i, int value)
> > +{
> > +    ovsrcu_int_set__(i, value, memory_order_relaxed);
> > +}
> > +
> >  #if __GNUC__
> >  #define OVSRCU_TYPE(TYPE) struct { ATOMIC(TYPE) p; }
> >  #define OVSRCU_INITIALIZER(VALUE) { ATOMIC_VAR_INIT(VALUE) }
> >
> >
> > 2016-07-22 8:55 GMT-07:00 Ciara Loftus <ciara.loftus at intel.com>:
> > This commit introduces support for DPDK 16.07 and consequently breaks
> > compatibility with DPDK 16.04.
> >
> > DPDK 16.07 introduces some changes to various APIs. These have been
> > updated in OVS, including:
> > * xstats API: changes to structure of xstats
> > * vhost API:  replace virtio-net references with 'vid'
> >
> > Signed-off-by: Ciara Loftus <ciara.loftus at intel.com>
> > Tested-by: Maxime Coquelin <maxime.coquelin at redhat.com>
> >
> > v3:
> > - fixed style issues
> > - fixed & simplified xstats frees
> > - use xcalloc & free instead of rte_mzalloc & rte_free for stats
> > - remove libnuma include
> > - fixed & simplified vHost NUMA set
> > - added flag to indicate device reconfigured at least once
> > - re-add call to rcu synchronise in destroy_device
> > - define IF_NAME_SZ and use instead of PATH_MAX
> >
> > v2:
> > - rebase with DPDK rc2
> > - rebase with OVS master
> > - fix vhost cuse compilation
> > ---
> >  .travis/linux-build.sh   |   2 +-
> >  INSTALL.DPDK-ADVANCED.md |   8 +-
> >  INSTALL.DPDK.md          |  20 ++---
> >  NEWS                     |   1 +
> >  lib/netdev-dpdk.c        | 220
> +++++++++++++++++++++++-----------------------
> > -
> >  5 files changed, 126 insertions(+), 125 deletions(-)
> >
> > diff --git a/.travis/linux-build.sh b/.travis/linux-build.sh
> > index 065de39..1b3d43d 100755
> > --- a/.travis/linux-build.sh
> > +++ b/.travis/linux-build.sh
> > @@ -68,7 +68,7 @@ fi
> >
> >  if [ "$DPDK" ]; then
> >      if [ -z "$DPDK_VER" ]; then
> > -        DPDK_VER="16.04"
> > +        DPDK_VER="16.07"
> >      fi
> >      install_dpdk $DPDK_VER
> >      if [ "$CC" = "clang" ]; then
> > diff --git a/INSTALL.DPDK-ADVANCED.md b/INSTALL.DPDK-ADVANCED.md
> > index 9ae536d..ec1de29 100644
> > --- a/INSTALL.DPDK-ADVANCED.md
> > +++ b/INSTALL.DPDK-ADVANCED.md
> > @@ -43,7 +43,7 @@ for DPDK and OVS.
> >      For IVSHMEM case, set `export DPDK_TARGET=x86_64-ivshmem-
> > linuxapp-gcc`
> >
> >      ```
> > -    export DPDK_DIR=/usr/src/dpdk-16.04
> > +    export DPDK_DIR=/usr/src/dpdk-16.07
> >      export DPDK_BUILD=$DPDK_DIR/$DPDK_TARGET
> >      make install T=$DPDK_TARGET DESTDIR=install
> >      ```
> > @@ -339,7 +339,7 @@ For users wanting to do packet forwarding using
> > kernel stack below are the steps
> >         cd /usr/src/cmdline_generator
> >         wget https://raw.githubusercontent.com/netgroup-polito/un-
> > orchestrator/master/orchestrator/compute_controller/plugins/kvm-
> > libvirt/cmdline_generator/cmdline_generator.c
> >         wget https://raw.githubusercontent.com/netgroup-polito/un-
> > orchestrator/master/orchestrator/compute_controller/plugins/kvm-
> > libvirt/cmdline_generator/Makefile
> > -       export RTE_SDK=/usr/src/dpdk-16.04
> > +       export RTE_SDK=/usr/src/dpdk-16.07
> >         export RTE_TARGET=x86_64-ivshmem-linuxapp-gcc
> >         make
> >         ./build/cmdline_generator -m -p dpdkr0 XXX
> > @@ -363,7 +363,7 @@ For users wanting to do packet forwarding using
> > kernel stack below are the steps
> >         mount -t hugetlbfs nodev /dev/hugepages (if not already mounted)
> >
> >         # Build the DPDK ring application in the VM
> > -       export RTE_SDK=/root/dpdk-16.04
> > +       export RTE_SDK=/root/dpdk-16.07
> >         export RTE_TARGET=x86_64-ivshmem-linuxapp-gcc
> >         make
> >
> > @@ -374,7 +374,7 @@ For users wanting to do packet forwarding using
> > kernel stack below are the steps
> >
> >  ## <a name="vhost"></a> 6. Vhost Walkthrough
> >
> > -DPDK 16.04 supports two types of vhost:
> > +DPDK 16.07 supports two types of vhost:
> >
> >  1. vhost-user - enabled default
> >
> > diff --git a/INSTALL.DPDK.md b/INSTALL.DPDK.md
> > index 5407794..9022ad8 100644
> > --- a/INSTALL.DPDK.md
> > +++ b/INSTALL.DPDK.md
> > @@ -21,7 +21,7 @@ The DPDK support of Open vSwitch is considered
> > 'experimental'.
> >
> >  ### Prerequisites
> >
> > -* Required: DPDK 16.04, libnuma
> > +* Required: DPDK 16.07, libnuma
> >  * Hardware: [DPDK Supported NICs] when physical ports in use
> >
> >  ## <a name="build"></a> 2. Building and Installation
> > @@ -42,10 +42,10 @@ advanced install guide [INSTALL.DPDK-
> > ADVANCED.md]
> >
> >       ```
> >       cd /usr/src/
> > -     wget http://dpdk.org/browse/dpdk/snapshot/dpdk-16.04.zip
> > -     unzip dpdk-16.04.zip
> > +     wget http://dpdk.org/browse/dpdk/snapshot/dpdk-16.07.zip
> > +     unzip dpdk-16.07.zip
> >
> > -     export DPDK_DIR=/usr/src/dpdk-16.04
> > +     export DPDK_DIR=/usr/src/dpdk-16.07
> >       cd $DPDK_DIR
> >       ```
> >
> > @@ -329,9 +329,9 @@ can be found in [Vhost Walkthrough].
> >
> >    ```
> >    cd /root/dpdk/
> > -  wget http://dpdk.org/browse/dpdk/snapshot/dpdk-16.04.zip
> > -  unzip dpdk-16.04.zip
> > -  export DPDK_DIR=/root/dpdk/dpdk-16.04
> > +  wget http://dpdk.org/browse/dpdk/snapshot/dpdk-16.07.zip
> > +  unzip dpdk-16.07.zip
> > +  export DPDK_DIR=/root/dpdk/dpdk-16.07
> >    export DPDK_TARGET=x86_64-native-linuxapp-gcc
> >    export DPDK_BUILD=$DPDK_DIR/$DPDK_TARGET
> >    cd $DPDK_DIR
> > @@ -487,7 +487,7 @@ can be found in [Vhost Walkthrough].
> >             </disk>
> >             <disk type='dir' device='disk'>
> >               <driver name='qemu' type='fat'/>
> > -             <source dir='/usr/src/dpdk-16.04'/>
> > +             <source dir='/usr/src/dpdk-16.07'/>
> >               <target dev='vdb' bus='virtio'/>
> >               <readonly/>
> >             </disk>
> > @@ -557,9 +557,9 @@ can be found in [Vhost Walkthrough].
> >      DPDK. It is recommended that users update Network Interface firmware
> > to
> >      match what has been validated for the DPDK release.
> >
> > -    For DPDK 16.04, the list of validated firmware versions can be
> found at:
> > +    For DPDK 16.07, the list of validated firmware versions can be
> found at:
> >
> > -    http://dpdk.org/doc/guides/rel_notes/release_16_04.html
> > +    http://dpdk.org/doc/guides/rel_notes/release_16.07.html
> >
> >
> >  Bug Reporting:
> > diff --git a/NEWS b/NEWS
> > index 1e324dc..4d7255f 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -59,6 +59,7 @@ Post-v2.5.0
> >       * PMD threads servicing vHost User ports can now come from the NUMA
> >         node that device memory is located on if
> > CONFIG_RTE_LIBRTE_VHOST_NUMA
> >         is enabled in DPDK.
> > +     * Support for DPDK 16.07
> >     - Increase number of registers to 16.
> >     - ovs-benchmark: This utility has been removed due to lack of use and
> >       bitrot.
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> > index 7fb6457..ff5716a 100644
> > --- a/lib/netdev-dpdk.c
> > +++ b/lib/netdev-dpdk.c
> > @@ -30,7 +30,6 @@
> >  #include <sys/types.h>
> >  #include <sys/stat.h>
> >  #include <getopt.h>
> > -#include <numaif.h>
> >
> >  #include "dirs.h"
> >  #include "dp-packet.h"
> > @@ -143,6 +142,7 @@ static char *cuse_dev_name = NULL;    /* Character
> > device cuse_dev_name. */
> >  static char *vhost_sock_dir = NULL;   /* Location of vhost-user sockets
> */
> >
> >  #define VHOST_ENQ_RETRY_NUM 8
> > +#define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
> >
> >  static const struct rte_eth_conf port_conf = {
> >      .rxmode = {
> > @@ -356,8 +356,11 @@ struct netdev_dpdk {
> >       * always true.  */
> >      bool txq_needs_locking;
> >
> > -    /* virtio-net structure for vhost device */
> > -    OVSRCU_TYPE(struct virtio_net *) virtio_dev;
> > +    /* virtio identifier for vhost devices */
> > +    int vid;
> > +
> > +    /* True if vHost device is 'up' and has been reconfigured at least
> once */
> > +    bool vhost_reconfigured;
> >
> >      /* Identifier used to distinguish vhost devices from each other */
> >      char vhost_id[PATH_MAX];
> > @@ -393,8 +396,6 @@ static bool dpdk_thread_is_pmd(void);
> >
> >  static int netdev_dpdk_construct(struct netdev *);
> >
> > -struct virtio_net * netdev_dpdk_get_virtio(const struct netdev_dpdk
> > *dev);
> > -
> >  struct ingress_policer *
> >  netdev_dpdk_get_ingress_policer(const struct netdev_dpdk *dev);
> >
> > @@ -750,6 +751,8 @@ netdev_dpdk_init(struct netdev *netdev, unsigned
> > int port_no,
> >      dev->flags = 0;
> >      dev->mtu = ETHER_MTU;
> >      dev->max_packet_len = MTU_TO_FRAME_LEN(dev->mtu);
> > +    dev->vid = -1;
> > +    dev->vhost_reconfigured = false;
> >
> >      buf_size = dpdk_buf_size(dev->mtu);
> >      dev->dpdk_mp = dpdk_mp_get(dev->socket_id,
> > FRAME_LEN_TO_MTU(buf_size));
> > @@ -847,6 +850,7 @@ netdev_dpdk_vhost_user_construct(struct netdev
> > *netdev)
> >      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> >      const char *name = netdev->name;
> >      int err;
> > +    uint64_t flags = 0;
> >
> >      /* 'name' is appended to 'vhost_sock_dir' and used to create a
> socket in
> >       * the file system. '/' or '\' would traverse directories, so
> they're not
> > @@ -869,7 +873,7 @@ netdev_dpdk_vhost_user_construct(struct netdev
> > *netdev)
> >      snprintf(dev->vhost_id, sizeof(dev->vhost_id), "%s/%s",
> >               vhost_sock_dir, name);
> >
> > -    err = rte_vhost_driver_register(dev->vhost_id);
> > +    err = rte_vhost_driver_register(dev->vhost_id, flags);
> >      if (err) {
> >          VLOG_ERR("vhost-user socket device setup failure for socket
> %s\n",
> >                   dev->vhost_id);
> > @@ -930,7 +934,7 @@ netdev_dpdk_vhost_destruct(struct netdev
> > *netdev)
> >      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> >
> >      /* Guest becomes an orphan if still attached. */
> > -    if (netdev_dpdk_get_virtio(dev) != NULL) {
> > +    if (dev->vid >= 0) {
> >          VLOG_ERR("Removing port '%s' while vhost device still
> attached.",
> >                   netdev->name);
> >          VLOG_ERR("To restore connectivity after re-adding of port, VM on
> > socket"
> > @@ -1145,9 +1149,9 @@ ingress_policer_run(struct ingress_policer
> *policer,
> > struct rte_mbuf **pkts,
> >  }
> >
> >  static bool
> > -is_vhost_running(struct virtio_net *virtio_dev)
> > +is_vhost_running(struct netdev_dpdk *dev)
> >  {
> > -    return (virtio_dev != NULL && (virtio_dev->flags &
> > VIRTIO_DEV_RUNNING));
> > +    return (dev->vid >= 0 && dev->vhost_reconfigured);
> >  }
> >
> >  static inline void
> > @@ -1219,18 +1223,17 @@ netdev_dpdk_vhost_rxq_recv(struct
> > netdev_rxq *rxq,
> >                             struct dp_packet_batch *batch)
> >  {
> >      struct netdev_dpdk *dev = netdev_dpdk_cast(rxq->netdev);
> > -    struct virtio_net *virtio_dev = netdev_dpdk_get_virtio(dev);
> >      int qid = rxq->queue_id;
> >      struct ingress_policer *policer =
> netdev_dpdk_get_ingress_policer(dev);
> >      uint16_t nb_rx = 0;
> >      uint16_t dropped = 0;
> >
> > -    if (OVS_UNLIKELY(!is_vhost_running(virtio_dev)
> > +    if (OVS_UNLIKELY(!is_vhost_running(dev)
> >                       || !(dev->flags & NETDEV_UP))) {
> >          return EAGAIN;
> >      }
> >
> > -    nb_rx = rte_vhost_dequeue_burst(virtio_dev, qid * VIRTIO_QNUM +
> > VIRTIO_TXQ,
> > +    nb_rx = rte_vhost_dequeue_burst(dev->vid, qid * VIRTIO_QNUM +
> > VIRTIO_TXQ,
> >                                      dev->dpdk_mp->mp,
> >                                      (struct rte_mbuf **) batch->packets,
> >                                      NETDEV_MAX_BURST);
> > @@ -1331,7 +1334,6 @@ __netdev_dpdk_vhost_send(struct netdev
> > *netdev, int qid,
> >                           bool may_steal)
> >  {
> >      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> > -    struct virtio_net *virtio_dev = netdev_dpdk_get_virtio(dev);
> >      struct rte_mbuf **cur_pkts = (struct rte_mbuf **) pkts;
> >      unsigned int total_pkts = cnt;
> >      unsigned int qos_pkts = cnt;
> > @@ -1339,7 +1341,7 @@ __netdev_dpdk_vhost_send(struct netdev
> > *netdev, int qid,
> >
> >      qid = dev->tx_q[qid % netdev->n_txq].map;
> >
> > -    if (OVS_UNLIKELY(!is_vhost_running(virtio_dev) || qid < 0
> > +    if (OVS_UNLIKELY(!is_vhost_running(dev) || qid < 0
> >                       || !(dev->flags & NETDEV_UP))) {
> >          rte_spinlock_lock(&dev->stats_lock);
> >          dev->stats.tx_dropped+= cnt;
> > @@ -1357,7 +1359,7 @@ __netdev_dpdk_vhost_send(struct netdev
> > *netdev, int qid,
> >          int vhost_qid = qid * VIRTIO_QNUM + VIRTIO_RXQ;
> >          unsigned int tx_pkts;
> >
> > -        tx_pkts = rte_vhost_enqueue_burst(virtio_dev, vhost_qid,
> > +        tx_pkts = rte_vhost_enqueue_burst(dev->vid, vhost_qid,
> >                                            cur_pkts, cnt);
> >          if (OVS_LIKELY(tx_pkts)) {
> >              /* Packets have been sent.*/
> > @@ -1699,60 +1701,50 @@ netdev_dpdk_vhost_get_stats(const struct
> > netdev *netdev,
> >
> >  static void
> >  netdev_dpdk_convert_xstats(struct netdev_stats *stats,
> > -                           const struct rte_eth_xstats *xstats,
> > +                           const struct rte_eth_xstat *xstats,
> > +                           const struct rte_eth_xstat_name *names,
> >                             const unsigned int size)
> >  {
> > -    /* XXX Current implementation is simple search through an array
> > -     * to find hardcoded counter names. In future DPDK release (TBD)
> > -     * XSTATS API will change so each counter will be represented by
> > -     * unique ID instead of String. */
> > -
> >      for (unsigned int i = 0; i < size; i++) {
> > -        if (strcmp(XSTAT_RX_64_PACKETS, xstats[i].name) == 0) {
> > +        if (strcmp(XSTAT_RX_64_PACKETS, names[i].name) == 0) {
> >              stats->rx_1_to_64_packets = xstats[i].value;
> > -        } else if (strcmp(XSTAT_RX_65_TO_127_PACKETS, xstats[i].name)
> == 0) {
> > +        } else if (strcmp(XSTAT_RX_65_TO_127_PACKETS, names[i].name) ==
> 0)
> > {
> >              stats->rx_65_to_127_packets = xstats[i].value;
> > -        } else if (strcmp(XSTAT_RX_128_TO_255_PACKETS, xstats[i].name)
> == 0)
> > {
> > +        } else if (strcmp(XSTAT_RX_128_TO_255_PACKETS, names[i].name) ==
> > 0) {
> >              stats->rx_128_to_255_packets = xstats[i].value;
> > -        } else if (strcmp(XSTAT_RX_256_TO_511_PACKETS, xstats[i].name)
> == 0)
> > {
> > +        } else if (strcmp(XSTAT_RX_256_TO_511_PACKETS, names[i].name) ==
> > 0) {
> >              stats->rx_256_to_511_packets = xstats[i].value;
> > -        } else if (strcmp(XSTAT_RX_512_TO_1023_PACKETS,
> > -                          xstats[i].name) == 0) {
> > +        } else if (strcmp(XSTAT_RX_512_TO_1023_PACKETS, names[i].name)
> ==
> > 0) {
> >              stats->rx_512_to_1023_packets = xstats[i].value;
> > -        } else if (strcmp(XSTAT_RX_1024_TO_1522_PACKETS,
> > -                          xstats[i].name) == 0) {
> > +        } else if (strcmp(XSTAT_RX_1024_TO_1522_PACKETS, names[i].name)
> > == 0) {
> >              stats->rx_1024_to_1522_packets = xstats[i].value;
> > -        } else if (strcmp(XSTAT_RX_1523_TO_MAX_PACKETS,
> > -                          xstats[i].name) == 0) {
> > +        } else if (strcmp(XSTAT_RX_1523_TO_MAX_PACKETS, names[i].name)
> > == 0) {
> >              stats->rx_1523_to_max_packets = xstats[i].value;
> > -        } else if (strcmp(XSTAT_TX_64_PACKETS, xstats[i].name) == 0) {
> > +        } else if (strcmp(XSTAT_TX_64_PACKETS, names[i].name) == 0) {
> >              stats->tx_1_to_64_packets = xstats[i].value;
> > -        } else if (strcmp(XSTAT_TX_65_TO_127_PACKETS, xstats[i].name)
> == 0) {
> > +        } else if (strcmp(XSTAT_TX_65_TO_127_PACKETS, names[i].name) ==
> 0)
> > {
> >              stats->tx_65_to_127_packets = xstats[i].value;
> > -        } else if (strcmp(XSTAT_TX_128_TO_255_PACKETS, xstats[i].name)
> == 0)
> > {
> > +        } else if (strcmp(XSTAT_TX_128_TO_255_PACKETS, names[i].name) ==
> > 0) {
> >              stats->tx_128_to_255_packets = xstats[i].value;
> > -        } else if (strcmp(XSTAT_TX_256_TO_511_PACKETS, xstats[i].name)
> == 0)
> > {
> > +        } else if (strcmp(XSTAT_TX_256_TO_511_PACKETS, names[i].name) ==
> > 0) {
> >              stats->tx_256_to_511_packets = xstats[i].value;
> > -        } else if (strcmp(XSTAT_TX_512_TO_1023_PACKETS,
> > -                          xstats[i].name) == 0) {
> > +        } else if (strcmp(XSTAT_TX_512_TO_1023_PACKETS, names[i].name)
> ==
> > 0) {
> >              stats->tx_512_to_1023_packets = xstats[i].value;
> > -        } else if (strcmp(XSTAT_TX_1024_TO_1522_PACKETS,
> > -                          xstats[i].name) == 0) {
> > +        } else if (strcmp(XSTAT_TX_1024_TO_1522_PACKETS, names[i].name)
> > == 0) {
> >              stats->tx_1024_to_1522_packets = xstats[i].value;
> > -        } else if (strcmp(XSTAT_TX_1523_TO_MAX_PACKETS,
> > -                          xstats[i].name) == 0) {
> > +        } else if (strcmp(XSTAT_TX_1523_TO_MAX_PACKETS, names[i].name)
> > == 0) {
> >              stats->tx_1523_to_max_packets = xstats[i].value;
> > -        } else if (strcmp(XSTAT_TX_MULTICAST_PACKETS, xstats[i].name)
> == 0)
> > {
> > +        } else if (strcmp(XSTAT_TX_MULTICAST_PACKETS, names[i].name) ==
> > 0) {
> >              stats->tx_multicast_packets = xstats[i].value;
> > -        } else if (strcmp(XSTAT_RX_BROADCAST_PACKETS, xstats[i].name)
> == 0)
> > {
> > +        } else if (strcmp(XSTAT_RX_BROADCAST_PACKETS, names[i].name) ==
> > 0) {
> >              stats->rx_broadcast_packets = xstats[i].value;
> > -        } else if (strcmp(XSTAT_TX_BROADCAST_PACKETS, xstats[i].name)
> == 0)
> > {
> > +        } else if (strcmp(XSTAT_TX_BROADCAST_PACKETS, names[i].name) ==
> > 0) {
> >              stats->tx_broadcast_packets = xstats[i].value;
> > -        } else if (strcmp(XSTAT_RX_UNDERSIZED_ERRORS, xstats[i].name)
> == 0)
> > {
> > +        } else if (strcmp(XSTAT_RX_UNDERSIZED_ERRORS, names[i].name) ==
> > 0) {
> >              stats->rx_undersized_errors = xstats[i].value;
> > -        } else if (strcmp(XSTAT_RX_FRAGMENTED_ERRORS, xstats[i].name) ==
> > 0) {
> > +        } else if (strcmp(XSTAT_RX_FRAGMENTED_ERRORS, names[i].name) ==
> > 0) {
> >              stats->rx_fragmented_errors = xstats[i].value;
> > -        } else if (strcmp(XSTAT_RX_JABBER_ERRORS, xstats[i].name) == 0)
> {
> > +        } else if (strcmp(XSTAT_RX_JABBER_ERRORS, names[i].name) == 0) {
> >              stats->rx_jabber_errors = xstats[i].value;
> >          }
> >      }
> > @@ -1768,7 +1760,8 @@ netdev_dpdk_get_stats(const struct netdev
> > *netdev, struct netdev_stats *stats)
> >      netdev_dpdk_get_carrier(netdev, &gg);
> >      ovs_mutex_lock(&dev->mutex);
> >
> > -    struct rte_eth_xstats *rte_xstats;
> > +    struct rte_eth_xstat *rte_xstats = NULL;
> > +    struct rte_eth_xstat_name *rte_xstats_names = NULL;
> >      int rte_xstats_len, rte_xstats_ret;
> >
> >      if (rte_eth_stats_get(dev->port_id, &rte_stats)) {
> > @@ -1777,20 +1770,37 @@ netdev_dpdk_get_stats(const struct netdev
> > *netdev, struct netdev_stats *stats)
> >          return EPROTO;
> >      }
> >
> > -    rte_xstats_len = rte_eth_xstats_get(dev->port_id, NULL, 0);
> > -    if (rte_xstats_len > 0) {
> > -        rte_xstats = dpdk_rte_mzalloc(sizeof(*rte_xstats) *
> rte_xstats_len);
> > -        memset(rte_xstats, 0xff, sizeof(*rte_xstats) * rte_xstats_len);
> > -        rte_xstats_ret = rte_eth_xstats_get(dev->port_id, rte_xstats,
> > -                                            rte_xstats_len);
> > -        if (rte_xstats_ret > 0 && rte_xstats_ret <= rte_xstats_len) {
> > -            netdev_dpdk_convert_xstats(stats, rte_xstats,
> rte_xstats_ret);
> > -        }
> > -        rte_free(rte_xstats);
> > +    /* Get length of statistics */
> > +    rte_xstats_len = rte_eth_xstats_get_names(dev->port_id, NULL, 0);
> > +    if (rte_xstats_len < 0) {
> > +        VLOG_WARN("Cannot get XSTATS values for port: %i",
> dev->port_id);
> > +        goto out;
> > +    }
> > +    /* Reserve memory for xstats names and values */
> > +    rte_xstats_names = xcalloc(rte_xstats_len,
> sizeof(*rte_xstats_names));
> > +    rte_xstats = xcalloc(rte_xstats_len, sizeof(*rte_xstats));
> > +
> > +    /* Retreive xstats names */
> > +    if (rte_xstats_len != rte_eth_xstats_get_names(dev->port_id,
> > +                                       rte_xstats_names,
> rte_xstats_len)) {
> > +        VLOG_WARN("Cannot get XSTATS names for port: %i.",
> dev->port_id);
> > +        goto out;
> > +    }
> > +    /* Retreive xstats values */
> > +    memset(rte_xstats, 0xff, sizeof(*rte_xstats) * rte_xstats_len);
> > +    rte_xstats_ret = rte_eth_xstats_get(dev->port_id, rte_xstats,
> > +                                        rte_xstats_len);
> > +    if (rte_xstats_ret > 0 && rte_xstats_ret <= rte_xstats_len) {
> > +        netdev_dpdk_convert_xstats(stats, rte_xstats, rte_xstats_names,
> > +                                   rte_xstats_len);
> >      } else {
> > -        VLOG_WARN("Can't get XSTATS counters for port: %i.",
> dev->port_id);
> > +        VLOG_WARN("Cannot get XSTATS values for port: %i.",
> dev->port_id);
> >      }
> >
> > +out:
> > +    free(rte_xstats);
> > +    free(rte_xstats_names);
> > +
> >      stats->rx_packets = rte_stats.ipackets;
> >      stats->tx_packets = rte_stats.opackets;
> >      stats->rx_bytes = rte_stats.ibytes;
> > @@ -1798,7 +1808,6 @@ netdev_dpdk_get_stats(const struct netdev
> > *netdev, struct netdev_stats *stats)
> >      /* DPDK counts imissed as errors, but count them here as dropped
> instead
> > */
> >      stats->rx_errors = rte_stats.ierrors - rte_stats.imissed;
> >      stats->tx_errors = rte_stats.oerrors;
> > -    stats->multicast = rte_stats.imcasts;
> >
> >      rte_spinlock_lock(&dev->stats_lock);
> >      stats->tx_dropped = dev->stats.tx_dropped;
> > @@ -1965,11 +1974,10 @@ static int
> >  netdev_dpdk_vhost_get_carrier(const struct netdev *netdev, bool
> *carrier)
> >  {
> >      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> > -    struct virtio_net *virtio_dev = netdev_dpdk_get_virtio(dev);
> >
> >      ovs_mutex_lock(&dev->mutex);
> >
> > -    if (is_vhost_running(virtio_dev)) {
> > +    if (is_vhost_running(dev)) {
> >          *carrier = 1;
> >      } else {
> >          *carrier = 0;
> > @@ -2038,10 +2046,9 @@ netdev_dpdk_update_flags__(struct
> > netdev_dpdk *dev,
> >          /* If DPDK_DEV_VHOST device's NETDEV_UP flag was changed and
> > vhost is
> >           * running then change netdev's change_seq to trigger link state
> >           * update. */
> > -        struct virtio_net *virtio_dev = netdev_dpdk_get_virtio(dev);
> >
> >          if ((NETDEV_UP & ((*old_flagsp ^ on) | (*old_flagsp ^ off)))
> > -            && is_vhost_running(virtio_dev)) {
> > +            && is_vhost_running(dev)) {
> >              netdev_change_seq_changed(&dev->up);
> >
> >              /* Clear statistics if device is getting up. */
> > @@ -2169,15 +2176,15 @@ netdev_dpdk_set_admin_state(struct
> > unixctl_conn *conn, int argc,
> >   * Set virtqueue flags so that we do not receive interrupts.
> >   */
> >  static void
> > -set_irq_status(struct virtio_net *virtio_dev)
> > +set_irq_status(int vid)
> >  {
> >      uint32_t i;
> >      uint64_t idx;
> >
> > -    for (i = 0; i < virtio_dev->virt_qp_nb; i++) {
> > +    for (i = 0; i < rte_vhost_get_queue_num(vid); i++) {
> >          idx = i * VIRTIO_QNUM;
> > -        rte_vhost_enable_guest_notification(virtio_dev, idx +
> VIRTIO_RXQ, 0);
> > -        rte_vhost_enable_guest_notification(virtio_dev, idx +
> VIRTIO_TXQ, 0);
> > +        rte_vhost_enable_guest_notification(vid, idx + VIRTIO_RXQ, 0);
> > +        rte_vhost_enable_guest_notification(vid, idx + VIRTIO_TXQ, 0);
> >      }
> >  }
> >
> > @@ -2226,26 +2233,25 @@ netdev_dpdk_remap_txqs(struct netdev_dpdk
> > *dev)
> >   * A new virtio-net device is added to a vhost port.
> >   */
> >  static int
> > -new_device(struct virtio_net *virtio_dev)
> > +new_device(int vid)
> >  {
> >      struct netdev_dpdk *dev;
> >      bool exists = false;
> >      int newnode = 0;
> > -    long err = 0;
> > +    char ifname[IF_NAME_SZ];
> > +
> > +    rte_vhost_get_ifname(vid, ifname, sizeof(ifname));
> >
> >      ovs_mutex_lock(&dpdk_mutex);
> >      /* Add device to the vhost port with the same name as that passed
> down.
> > */
> >      LIST_FOR_EACH(dev, list_node, &dpdk_list) {
> > -        if (strncmp(virtio_dev->ifname, dev->vhost_id, IF_NAME_SZ) ==
> 0) {
> > -            uint32_t qp_num = virtio_dev->virt_qp_nb;
> > +        if (strncmp(ifname, dev->vhost_id, IF_NAME_SZ) == 0) {
> > +            uint32_t qp_num = rte_vhost_get_queue_num(vid);
> >
> >              ovs_mutex_lock(&dev->mutex);
> >              /* Get NUMA information */
> > -            err = get_mempolicy(&newnode, NULL, 0, virtio_dev,
> > -                                MPOL_F_NODE | MPOL_F_ADDR);
> > -            if (err) {
> > -                VLOG_INFO("Error getting NUMA info for vHost Device
> '%s'",
> > -                        virtio_dev->ifname);
> > +            newnode = rte_vhost_get_numa_node(vid);
> > +            if (newnode == -1) {
> >                  newnode = dev->socket_id;
> >              }
> >
> > @@ -2254,11 +2260,11 @@ new_device(struct virtio_net *virtio_dev)
> >              dev->requested_n_txq = qp_num;
> >              netdev_request_reconfigure(&dev->up);
> >
> > -            ovsrcu_set(&dev->virtio_dev, virtio_dev);
> > +            dev->vid = vid;
> >              exists = true;
> >
> >              /* Disable notifications. */
> > -            set_irq_status(virtio_dev);
> > +            set_irq_status(dev->vid);
> >              netdev_change_seq_changed(&dev->up);
> >              ovs_mutex_unlock(&dev->mutex);
> >              break;
> > @@ -2267,14 +2273,14 @@ new_device(struct virtio_net *virtio_dev)
> >      ovs_mutex_unlock(&dpdk_mutex);
> >
> >      if (!exists) {
> > -        VLOG_INFO("vHost Device '%s' %"PRIu64" can't be added - name
> not "
> > -                  "found", virtio_dev->ifname, virtio_dev->device_fh);
> > +        VLOG_INFO("vHost Device '%s' can't be added - name not found",
> > ifname);
> >
> >          return -1;
> >      }
> >
> > -    VLOG_INFO("vHost Device '%s' %"PRIu64" has been added on numa node
> > %i",
> > -              virtio_dev->ifname, virtio_dev->device_fh, newnode);
> > +    VLOG_INFO("vHost Device '%s' has been added on numa node %i",
> > +              ifname, newnode);
> > +
> >      return 0;
> >  }
> >
> > @@ -2297,18 +2303,21 @@ netdev_dpdk_txq_map_clear(struct
> > netdev_dpdk *dev)
> >   *  the device.
> >   */
> >  static void
> > -destroy_device(volatile struct virtio_net *virtio_dev)
> > +destroy_device(int vid)
> >  {
> >      struct netdev_dpdk *dev;
> >      bool exists = false;
> > +    char ifname[IF_NAME_SZ];
> > +
> > +    rte_vhost_get_ifname(vid, ifname, sizeof(ifname));
> >
> >      ovs_mutex_lock(&dpdk_mutex);
> >      LIST_FOR_EACH (dev, list_node, &dpdk_list) {
> > -        if (netdev_dpdk_get_virtio(dev) == virtio_dev) {
> > +        if (dev->vid == vid) {
> >
> >              ovs_mutex_lock(&dev->mutex);
> > -            virtio_dev->flags &= ~VIRTIO_DEV_RUNNING;
> > -            ovsrcu_set(&dev->virtio_dev, NULL);
> > +            dev->vhost_reconfigured = false;
> > +            dev->vid = -1;
> >              /* Clear tx/rx queue settings. */
> >              netdev_dpdk_txq_map_clear(dev);
> >              dev->requested_n_rxq = NR_QUEUE;
> > @@ -2324,7 +2333,7 @@ destroy_device(volatile struct virtio_net
> > *virtio_dev)
> >
> >      ovs_mutex_unlock(&dpdk_mutex);
> >
> > -    if (exists == true) {
> > +    if (exists) {
> >          /*
> >           * Wait for other threads to quiesce after setting the
> 'virtio_dev'
> >           * to NULL, before returning.
> > @@ -2335,21 +2344,21 @@ destroy_device(volatile struct virtio_net
> > *virtio_dev)
> >           * put thread back into quiescent state before returning.
> >           */
> >          ovsrcu_quiesce_start();
> > -        VLOG_INFO("vHost Device '%s' %"PRIu64" has been removed",
> > -                  virtio_dev->ifname, virtio_dev->device_fh);
> > +        VLOG_INFO("vHost Device '%s' has been removed", ifname);
> >      } else {
> > -        VLOG_INFO("vHost Device '%s' %"PRIu64" not found", virtio_dev-
> > >ifname,
> > -                  virtio_dev->device_fh);
> > +        VLOG_INFO("vHost Device '%s' not found", ifname);
> >      }
> >  }
> >
> >  static int
> > -vring_state_changed(struct virtio_net *virtio_dev, uint16_t queue_id,
> > -                    int enable)
> > +vring_state_changed(int vid, uint16_t queue_id, int enable)
> >  {
> >      struct netdev_dpdk *dev;
> >      bool exists = false;
> >      int qid = queue_id / VIRTIO_QNUM;
> > +    char ifname[IF_NAME_SZ];
> > +
> > +    rte_vhost_get_ifname(vid, ifname, sizeof(ifname));
> >
> >      if (queue_id % VIRTIO_QNUM == VIRTIO_TXQ) {
> >          return 0;
> > @@ -2357,7 +2366,7 @@ vring_state_changed(struct virtio_net *virtio_dev,
> > uint16_t queue_id,
> >
> >      ovs_mutex_lock(&dpdk_mutex);
> >      LIST_FOR_EACH (dev, list_node, &dpdk_list) {
> > -        if (strncmp(virtio_dev->ifname, dev->vhost_id, IF_NAME_SZ) ==
> 0) {
> > +        if (strncmp(ifname, dev->vhost_id, IF_NAME_SZ) == 0) {
> >              ovs_mutex_lock(&dev->mutex);
> >              if (enable) {
> >                  dev->tx_q[qid].map = qid;
> > @@ -2373,25 +2382,17 @@ vring_state_changed(struct virtio_net
> > *virtio_dev, uint16_t queue_id,
> >      ovs_mutex_unlock(&dpdk_mutex);
> >
> >      if (exists) {
> > -        VLOG_INFO("State of queue %d ( tx_qid %d ) of vhost device '%s'
> %"
> > -                  PRIu64" changed to \'%s\'", queue_id, qid,
> > -                  virtio_dev->ifname, virtio_dev->device_fh,
> > +        VLOG_INFO("State of queue %d ( tx_qid %d ) of vhost device '%s'"
> > +                  "changed to \'%s\'", queue_id, qid, ifname,
> >                    (enable == 1) ? "enabled" : "disabled");
> >      } else {
> > -        VLOG_INFO("vHost Device '%s' %"PRIu64" not found", virtio_dev-
> > >ifname,
> > -                  virtio_dev->device_fh);
> > +        VLOG_INFO("vHost Device '%s' not found", ifname);
> >          return -1;
> >      }
> >
> >      return 0;
> >  }
> >
> > -struct virtio_net *
> > -netdev_dpdk_get_virtio(const struct netdev_dpdk *dev)
> > -{
> > -    return ovsrcu_get(struct virtio_net *, &dev->virtio_dev);
> > -}
> > -
> >  struct ingress_policer *
> >  netdev_dpdk_get_ingress_policer(const struct netdev_dpdk *dev)
> >  {
> > @@ -2842,7 +2843,6 @@ static int
> >  netdev_dpdk_vhost_user_reconfigure(struct netdev *netdev)
> >  {
> >      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> > -    struct virtio_net *virtio_dev = netdev_dpdk_get_virtio(dev);
> >      int err = 0;
> >
> >      ovs_mutex_lock(&dpdk_mutex);
> > @@ -2868,8 +2868,8 @@ netdev_dpdk_vhost_user_reconfigure(struct
> > netdev *netdev)
> >          }
> >      }
> >
> > -    if (virtio_dev) {
> > -        virtio_dev->flags |= VIRTIO_DEV_RUNNING;
> > +    if (dev->vid >= 0) {
> > +        dev->vhost_reconfigured = true;
> >      }
> >
> >      ovs_mutex_unlock(&dev->mutex);
> > @@ -3336,7 +3336,7 @@ dpdk_init__(const struct smap *ovs_other_config)
> >      /* Register CUSE device to handle IOCTLs.
> >       * Unless otherwise specified, cuse_dev_name is set to vhost-net.
> >       */
> > -    err = rte_vhost_driver_register(cuse_dev_name);
> > +    err = rte_vhost_driver_register(cuse_dev_name, 0);
> >
> >      if (err != 0) {
> >          VLOG_ERR("CUSE device setup failure.");
> > --
> > 2.4.3
> >
> > _______________________________________________
> > dev mailing list
> > dev at openvswitch.org
> > http://openvswitch.org/mailman/listinfo/dev
>
>



More information about the dev mailing list