[ovs-dev] [PATCH v1 branch-2.9] netdev-dpdk: Reintroduce shared mempools.

Jan Scheurich jan.scheurich at ericsson.com
Tue Feb 13 15:56:09 UTC 2018


Thanks, Ian!
This will give us time to come up with a proper solution for 2.10. Let's work on that now.
/Jan

> -----Original Message-----
> From: ovs-dev-bounces at openvswitch.org [mailto:ovs-dev-bounces at openvswitch.org] On Behalf Of Stokes, Ian
> Sent: Tuesday, 13 February, 2018 16:52
> To: Stokes, Ian <ian.stokes at intel.com>; Kevin Traynor <ktraynor at redhat.com>; dev at openvswitch.org
> Cc: Ilya Maximets <i.maximets at samsung.com>; Antonio Fischetti <antonio.fischetti at gmail.com>
> Subject: Re: [ovs-dev] [PATCH v1 branch-2.9] netdev-dpdk: Reintroduce shared mempools.
> 
> > >
> > > On 02/13/2018 10:59 AM, Ian Stokes wrote:
> > > > This commit manually reverts the current per port mempool model to
> > > > the previous shared mempool model for DPDK ports.
> > > >
> > > > OVS previously used a shared mempool model for ports with the same
> > > > MTU configuration. This was replaced by a per port mempool model to
> > > > address issues flagged by users such as:
> > > >
> > > > https://mail.openvswitch.org/pipermail/ovs-discuss/2016-September/04
> > > > 25
> > > > 60.html
> > > >
> > > > However the per port model has a number of issues including:
> > > >
> > > > 1. Requires an increase in memory resource requirements to support
> > > > the same number of ports as the shared port model.
> > > > 2. Incorrect algorithm for mbuf provisioning for each mempool.
> > > >
> > > > These are considered blocking factors for current deployments of OVS
> > > > when upgrading to OVS 2.9 as a  user may have to redimension memory
> > > > for the same deployment configuration. This may not be possible for
> > > users.
> > > >
> > > > For clarity, the commits whose changes are removed include the
> > > > following:
> > > >
> > > > netdev-dpdk: Create separate memory pool for each port: d555d9b
> > > > netdev-dpdk: fix management of pre-existing mempools: b6b26021d Fix
> > > > mempool names to reflect socket id: f06546a
> > > > netdev-dpdk: skip init for existing mempools: 837c176
> > > > netdev-dpdk: manage failure in mempool name creation: 65056fd
> > > > netdev-dpdk: Reword mp_size as n_mbufs: ad9b5b9
> > > > netdev-dpdk: Rename dpdk_mp_put as dpdk_mp_free: a08a115
> > > > netdev-dpdk: Fix mp_name leak on snprintf failure: ec6edc8
> > > > netdev-dpdk: Fix dpdk_mp leak in case of EEXIST: 173ef76
> > > > netdev-dpdk: Factor out struct dpdk_mp: 24e78f9
> > > > netdev-dpdk: Remove unused MAX_NB_MBUF: bc57ed9
> > > > netdev-dpdk: Fix mempool creation with large MTU: af5b0da
> > > >
> > > > Due to the number of commits and period of time they were introduced
> > > > over, a simple revert was not possible. All code from the commits
> > > > above is removed and the shared mempool code reintroduced as it was
> > > > before its replacement.
> > > >
> > > > Code introduced by commit
> > > >
> > > > netdev-dpdk: Add debug appctl to get mempool information: be48173
> > > >
> > > > has been modified to work with the shared mempool model.
> > > >
> > >
> > > There is a couple of small changes for coding stds to this version
> > > from the rfc, but one is not correct (see below). Apart from that...
> >
> > Thanks for flagging this.
> 
> I've pushed this to dpdk_merge_2_9 and its part of the branch 2.9 pull request.
> 
> https://mail.openvswitch.org/pipermail/ovs-dev/2018-February/344413.html
> 
> Note this revert only applies to branch 2.9, I have not applied it to master.
> 
> Thanks
> Ian
> 
> >
> > >
> > > Acked-by: Kevin Traynor <ktraynor at redhat.com>
> > > Tested-by: Kevin Traynor <ktraynor at redhat.com>
> > >
> >
> > ...
> >
> > > >      ovs_mutex_lock(&dpdk_mp_mutex);
> > > > -    VLOG_DBG("Releasing \"%s\" mempool", mp->name);
> > > > -    rte_mempool_free(mp);
> > > > +    ovs_assert(dmp->refcount);
> > > > +
> > > > +    if (! --dmp->refcount) {
> > >
> > > The space should not be added between ! and --
> > >
> > > http://docs.openvswitch.org/en/latest/internals/contributing/coding-
> > > style/#expressions
> >
> > Good catch, I can fix this before committing if there are no other
> > comments.
> >
> > Thanks
> > Ian
> > >
> > > > +        ovs_list_remove(&dmp->list_node);
> > > > +        rte_mempool_free(dmp->mp);
> > > > +        rte_free(dmp);
> > > > +     }
> > > >      ovs_mutex_unlock(&dpdk_mp_mutex);  }
> > > >
> > > > -/* Tries to allocate a new mempool - or re-use an existing one
> > > > where
> > > > - * appropriate - on requested_socket_id with a size determined by
> > > > - * requested_mtu and requested Rx/Tx queues.
> > > > - * On success - or when re-using an existing mempool - the new
> > > > configuration
> > > > - * will be applied.
> > > > +/* Tries to allocate new mempool on requested_socket_id with
> > > > + * mbuf size corresponding to requested_mtu.
> > > > + * On success new configuration will be applied.
> > > >   * On error, device will be left unchanged. */  static int
> > > > netdev_dpdk_mempool_configure(struct netdev_dpdk *dev)
> > > >      OVS_REQUIRES(dev->mutex)
> > > >  {
> > > >      uint32_t buf_size = dpdk_buf_size(dev->requested_mtu);
> > > > -    struct rte_mempool *mp;
> > > > -    int ret = 0;
> > > > +    struct dpdk_mp *mp;
> > > >
> > > > -    mp = dpdk_mp_create(dev, FRAME_LEN_TO_MTU(buf_size));
> > > > +    mp = dpdk_mp_get(dev->requested_socket_id,
> > > > + FRAME_LEN_TO_MTU(buf_size));
> > > >      if (!mp) {
> > > >          VLOG_ERR("Failed to create memory pool for netdev "
> > > >                   "%s, with MTU %d on socket %d: %s\n",
> > > >                   dev->up.name, dev->requested_mtu, dev-
> > > >requested_socket_id,
> > > > -                 rte_strerror(rte_errno));
> > > > -        ret = rte_errno;
> > > > +        rte_strerror(rte_errno));
> > > > +        return rte_errno;
> > > >      } else {
> > > > -        /* If a new MTU was requested and its rounded value equals
> > the
> > > one
> > > > -         * that is currently used, then the existing mempool is
> > > returned. */
> > > > -        if (dev->mp != mp) {
> > > > -            /* A new mempool was created, release the previous one.
> > */
> > > > -            dpdk_mp_free(dev->mp);
> > > > -        } else {
> > > > -            ret = EEXIST;
> > > > -        }
> > > > -        dev->mp = mp;
> > > > +        dpdk_mp_put(dev->dpdk_mp);
> > > > +        dev->dpdk_mp = mp;
> > > >          dev->mtu = dev->requested_mtu;
> > > >          dev->socket_id = dev->requested_socket_id;
> > > >          dev->max_packet_len = MTU_TO_FRAME_LEN(dev->mtu);
> > > >      }
> > > >
> > > > -    return ret;
> > > > +    return 0;
> > > >  }
> > > >
> > > >  static void
> > > > @@ -742,7 +765,8 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk
> > > > *dev, int n_rxq, int n_txq)
> > > >
> > > >          for (i = 0; i < n_rxq; i++) {
> > > >              diag = rte_eth_rx_queue_setup(dev->port_id, i, dev-
> > > >rxq_size,
> > > > -                                          dev->socket_id, NULL, dev-
> > > >mp);
> > > > +                                          dev->socket_id, NULL,
> > > > +                                          dev->dpdk_mp->mp);
> > > >              if (diag) {
> > > >                  VLOG_INFO("Interface %s rxq(%d) setup error: %s",
> > > >                            dev->up.name, i, rte_strerror(-diag)); @@
> > > > -826,7 +850,7 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
> > > >      memcpy(dev->hwaddr.ea, eth_addr.addr_bytes, ETH_ADDR_LEN);
> > > >      rte_eth_link_get_nowait(dev->port_id, &dev->link);
> > > >
> > > > -    mbp_priv = rte_mempool_get_priv(dev->mp);
> > > > +    mbp_priv = rte_mempool_get_priv(dev->dpdk_mp->mp);
> > > >      dev->buf_size = mbp_priv->mbuf_data_room_size -
> > > > RTE_PKTMBUF_HEADROOM;
> > > >
> > > >      /* Get the Flow control configuration for DPDK-ETH */ @@
> > > > -1079,7
> > > > +1103,7 @@ common_destruct(struct netdev_dpdk *dev)
> > > >      OVS_EXCLUDED(dev->mutex)
> > > >  {
> > > >      rte_free(dev->tx_q);
> > > > -    dpdk_mp_free(dev->mp);
> > > > +    dpdk_mp_put(dev->dpdk_mp);
> > > >
> > > >      ovs_list_remove(&dev->list_node);
> > > >      free(ovsrcu_get_protected(struct ingress_policer *, @@ -1823,7
> > > > +1847,7 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
> > > >      }
> > > >
> > > >      nb_rx = rte_vhost_dequeue_burst(vid, qid * VIRTIO_QNUM +
> > > VIRTIO_TXQ,
> > > > -                                    dev->mp,
> > > > +                                    dev->dpdk_mp->mp,
> > > >                                      (struct rte_mbuf **) batch-
> > > >packets,
> > > >                                      NETDEV_MAX_BURST);
> > > >      if (!nb_rx) {
> > > > @@ -2043,7 +2067,7 @@ dpdk_do_tx_copy(struct netdev *netdev, int
> > > >qid,
> > > struct dp_packet_batch *batch)
> > > >              continue;
> > > >          }
> > > >
> > > > -        pkts[txcnt] = rte_pktmbuf_alloc(dev->mp);
> > > > +        pkts[txcnt] = rte_pktmbuf_alloc(dev->dpdk_mp->mp);
> > > >          if (OVS_UNLIKELY(!pkts[txcnt])) {
> > > >              dropped += cnt - i;
> > > >              break;
> > > > @@ -2919,7 +2943,7 @@ netdev_dpdk_get_mempool_info(struct
> > > > unixctl_conn
> > > *conn,
> > > >          ovs_mutex_lock(&dev->mutex);
> > > >          ovs_mutex_lock(&dpdk_mp_mutex);
> > > >
> > > > -        rte_mempool_dump(stream, dev->mp);
> > > > +        rte_mempool_dump(stream, dev->dpdk_mp->mp);
> > > >
> > > >          ovs_mutex_unlock(&dpdk_mp_mutex);
> > > >          ovs_mutex_unlock(&dev->mutex); @@ -3556,9 +3580,12 @@
> > > > netdev_dpdk_reconfigure(struct netdev *netdev)
> > > >
> > > >      rte_eth_dev_stop(dev->port_id);
> > > >
> > > > -    err = netdev_dpdk_mempool_configure(dev);
> > > > -    if (err && err != EEXIST) {
> > > > -        goto out;
> > > > +    if (dev->mtu != dev->requested_mtu
> > > > +        || dev->socket_id != dev->requested_socket_id) {
> > > > +        err = netdev_dpdk_mempool_configure(dev);
> > > > +        if (err) {
> > > > +            goto out;
> > > > +        }
> > > >      }
> > > >
> > > >      netdev->n_txq = dev->requested_n_txq; @@ -3596,13 +3623,16 @@
> > > > dpdk_vhost_reconfigure_helper(struct netdev_dpdk *dev)
> > > >
> > > >      netdev_dpdk_remap_txqs(dev);
> > > >
> > > > -    err = netdev_dpdk_mempool_configure(dev);
> > > > -    if (!err) {
> > > > -        /* A new mempool was created. */
> > > > -        netdev_change_seq_changed(&dev->up);
> > > > -    } else if (err != EEXIST){
> > > > -        return err;
> > > > +    if (dev->requested_socket_id != dev->socket_id
> > > > +        || dev->requested_mtu != dev->mtu) {
> > > > +        err = netdev_dpdk_mempool_configure(dev);
> > > > +        if (err) {
> > > > +            return err;
> > > > +        } else {
> > > > +            netdev_change_seq_changed(&dev->up);
> > > > +        }
> > > >      }
> > > > +
> > > >      if (netdev_dpdk_get_vid(dev) >= 0) {
> > > >          if (dev->vhost_reconfigured == false) {
> > > >              dev->vhost_reconfigured = true;
> > > >
> >
> > _______________________________________________
> > dev mailing list
> > dev at openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev


More information about the dev mailing list