[ovs-dev] [PATCH v2 2/2] netdev-dpdk.c: Add ingress-policing functionality.

Stokes, Ian ian.stokes at intel.com
Tue May 24 16:33:15 UTC 2016


Thanks for taking the time to review this Daniele, I will submit a new V3 this evening with the required changes.

Thanks
Ian

From: Daniele Di Proietto [mailto:diproiettod at ovn.org]
Sent: Tuesday, May 24, 2016 1:42 AM
To: Stokes, Ian
Cc: dev at openvswitch.org
Subject: Re: [ovs-dev] [PATCH v2 2/2] netdev-dpdk.c: Add ingress-policing functionality.

Thanks for the patch, it's simpler than I expected (all the infrastructure was already there).

Minor comments inline, otherwise this looks good to me.
If you agree with the comments, would you mind sending another version?

Daniele

2016-05-10 2:19 GMT-07:00 Ian Stokes <ian.stokes at intel.com<mailto:ian.stokes at intel.com>>:
This patch provides the modifications required in netdev-dpdk.c and
vswitch.xml to enable ingress policing for DPDK interfaces.

This patch implements the necessary netdev functions to netdev-dpdk.c as
well as various helper functions required for ingress policing.

The vswitch.xml has been modified to explain the expected parameters and
behaviour when using ingress policing.

The INSTALL.DPDK.md<http://INSTALL.DPDK.md> guide has been modified to provide an example
configuration of ingress policing.

Signed-off-by: Ian Stokes <ian.stokes at intel.com<mailto:ian.stokes at intel.com>>
---

v2:
- Rebase to latest master.

*netdev-dpdk.c
- Modied 'netdev_dpdk_set_policing()' to set default value of of
  policer_burst to 8000 kbits.
- Modified 'netdev_dpdk_policer_construct__()' to use '1000/8' for clarity
  instead of '125' when converting kbits to bytes for rate and burst.

v1:

*INSTALL.DPDK.md<http://INSTALL.DPDK.md>
- Add ingress_policing usage example.

*NEWS
- Add support for ingress_policing to DPDK section.

*netdev-dpdk.c
- Add ingress_policer struct.
- Modify struct netdev-dpdk to include a ovsrcu pointer to an ingress
  policer.
- Modify netdev_dpdk_init() to initialise ingress policer to null.
- Add function 'ingress_policer_run()'to process packets with a given
  netdev, rte_mbuf and packet count.
- Modified 'netdev_dpdk_vhost_update_rx_counters()' to accept new
  parameter int dropped representing number of packets dropped.
- Modified 'netdev_dpdk_vhost_update_rx_counters()' to update number of
  dropped packets in stats->rx_dropped.
- Modified 'netdev_dpdk_vhost_rxq_recv()' to check and call
  ingress_policing functionality with 'ingress_policer_run()'.
- Modified 'netdev_dpdk_rxq_recv()' to check and call
  ingress_policing functionality with 'ingress_policer_run()'.
- Modified 'netdev_dpdk_rxq_recv()' to update rx dropped stats.
- Modified 'netdev_dpdk_vhost_get_stats()' to add support for rx_dropped.
- Add function 'netdev_dpdk_policer_construct__()' to create an
  ingress_policer.
- Add function 'netdev_dpdk_policer_destruct__()' to destroy an ingress
  policer.
- Add function 'netdev_dpdk_set_policing()' responsible for calling
  helper functions to create and destroy an ingress policer for the
  device.
- Add function 'netdev_dpdk_get_ingress_policer()' to return a pointer
  to the ingress_policer using the ovsrcu_get function.

*vswitch.xml
- Modify ingress policing section to include support in OVS with DPDK.
---
 INSTALL.DPDK.md<http://INSTALL.DPDK.md>      |   19 ++++++
 NEWS                 |    1 +
 lib/netdev-dpdk.c    |  173 ++++++++++++++++++++++++++++++++++++++++++++++++-
 vswitchd/vswitch.xml |    4 +-
 4 files changed, 191 insertions(+), 6 deletions(-)

diff --git a/INSTALL.DPDK.md<http://INSTALL.DPDK.md> b/INSTALL.DPDK.md<http://INSTALL.DPDK.md>
index 93f92e4..68735cc 100644
--- a/INSTALL.DPDK.md<http://INSTALL.DPDK.md>
+++ b/INSTALL.DPDK.md<http://INSTALL.DPDK.md>
@@ -267,6 +267,25 @@ Using the DPDK with ovs-vswitchd:
    For more details regarding egress-policer parameters please refer to the
    vswitch.xml.

+9. Ingress Policing Example
+
+   Assuming you have a vhost-user port receiving traffic consisting of
+   packets of size 64 bytes, the following command would limit the reception
+   rate of the port to ~1,000,000 packets per second:
+
+   `ovs-vsctl set interface vhost-user0 ingress_policing_rate=368000
+    ingress_policing_burst=1000`
+
+   To examine the ingress policer configuration of the port:
+
+   `ovs-vsctl list interface vhost-user0`
+
+   To clear the ingress policer configuration from the port use the following:
+
+   `ovs-vsctl set interface vhost-user0 ingress_policing_rate=0`
+
+   For more details regarding ingress-policer see the vswitch.xml.
+
 Performance Tuning:
 -------------------

diff --git a/NEWS b/NEWS
index 4e81cad..ba201cf 100644
--- a/NEWS
+++ b/NEWS
@@ -32,6 +32,7 @@ Post-v2.5.0
      * DB entries have been added for many of the DPDK EAL command line
        arguments. Additional arguments can be passed via the dpdk-extra
        entry.
+     * Add ingress policing functionality.
    - ovs-benchmark: This utility has been removed due to lack of use and
      bitrot.
    - ovs-appctl:
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 0ed909c..dd4fa0a 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -328,6 +328,12 @@ struct dpdk_ring {
     struct ovs_list list_node OVS_GUARDED_BY(dpdk_mutex);
 };

+struct ingress_policer {
+    struct rte_meter_srtcm_params app_srtcm_params;
+    struct rte_meter_srtcm in_policer;
+    rte_spinlock_t policer_lock;
+};
+
 struct netdev_dpdk {
     struct netdev up;
     int port_id;
@@ -373,6 +379,10 @@ struct netdev_dpdk {
     struct qos_conf *qos_conf;
     rte_spinlock_t qos_lock;

+    /* Ingress Policer */
+    OVSRCU_TYPE(struct ingress_policer *) ingress_policer;
+    uint32_t policer_rate;
+    uint32_t policer_burst;
 };

 struct netdev_rxq_dpdk {
@@ -386,6 +396,9 @@ static int netdev_dpdk_construct(struct netdev *);

 struct virtio_net * netdev_dpdk_get_virtio(const struct netdev_dpdk *dev);

+struct ingress_policer *
+netdev_dpdk_get_ingress_policer(const struct netdev_dpdk *dev);
+
 static bool
 is_dpdk_class(const struct netdev_class *class)
 {
@@ -759,6 +772,11 @@ netdev_dpdk_init(struct netdev *netdev, unsigned int port_no,
     dev->qos_conf = NULL;
     rte_spinlock_init(&dev->qos_lock);

+    /* Initialise rcu pointer for ingress policer to NULL */
+    ovsrcu_init(&dev->ingress_policer, NULL);
+    dev->policer_rate = 0;
+    dev->policer_burst = 0;
+
     netdev->n_txq = NR_QUEUE;
     netdev->n_rxq = NR_QUEUE;
     netdev->requested_n_rxq = NR_QUEUE;
@@ -1198,6 +1216,22 @@ netdev_dpdk_policer_run__(struct rte_meter_srtcm *meter,
     return cnt;
 }

+static int
+ingress_policer_run(struct netdev_dpdk *dev, struct rte_mbuf **pkts,
+                        int pkt_cnt)
+{
+    int cnt = 0;
+    struct ingress_policer *policer = netdev_dpdk_get_ingress_policer(dev);
+
+    if (policer) {
+        rte_spinlock_lock(&policer->policer_lock);
+        cnt = netdev_dpdk_policer_run__(&policer->in_policer, pkts, pkt_cnt);
+        rte_spinlock_unlock(&policer->policer_lock);
+    }
+
+    return cnt;
+}
+

The callers to the above function already have to check for 'policer'. Maybe it'd be clearer to
pass 'policer' instead of 'dev' and drop the if?

 static bool
 is_vhost_running(struct virtio_net *virtio_dev)
 {
@@ -1232,13 +1266,15 @@ netdev_dpdk_vhost_update_rx_size_counters(struct netdev_stats *stats,

 static inline void
 netdev_dpdk_vhost_update_rx_counters(struct netdev_stats *stats,
-                                     struct dp_packet **packets, int count)
+                                     struct dp_packet **packets, int count,
+                                     int dropped)
 {
     int i;
     unsigned int packet_size;
     struct dp_packet *packet;

     stats->rx_packets += count;
+    stats->rx_dropped += dropped;
     for (i = 0; i < count; i++) {
         packet = packets[i];
         packet_size = dp_packet_size(packet);
@@ -1273,7 +1309,9 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
     struct netdev_dpdk *dev = netdev_dpdk_cast(rxq->netdev);
     struct virtio_net *virtio_dev = netdev_dpdk_get_virtio(dev);
     int qid = rxq->queue_id;
+    struct ingress_policer *policer = netdev_dpdk_get_ingress_policer(dev);
     uint16_t nb_rx = 0;
+    uint16_t dropped = 0;

     if (OVS_UNLIKELY(!is_vhost_running(virtio_dev))) {
         return EAGAIN;
@@ -1291,8 +1329,14 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
         return EAGAIN;
     }

+    if (policer != NULL) {

Minor: I would drop the "!= NULL"

+        dropped = nb_rx;
+        nb_rx = ingress_policer_run(dev, (struct rte_mbuf **)packets, nb_rx);
+        dropped -= nb_rx;
+    }
+
     rte_spinlock_lock(&dev->stats_lock);
-    netdev_dpdk_vhost_update_rx_counters(&dev->stats, packets, nb_rx);
+    netdev_dpdk_vhost_update_rx_counters(&dev->stats, packets, nb_rx, dropped);
     rte_spinlock_unlock(&dev->stats_lock);

     *c = (int) nb_rx;
@@ -1305,7 +1349,9 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct dp_packet **packets,
 {
     struct netdev_rxq_dpdk *rx = netdev_rxq_dpdk_cast(rxq);
     struct netdev_dpdk *dev = netdev_dpdk_cast(rxq->netdev);
+    struct ingress_policer *policer = netdev_dpdk_get_ingress_policer(dev);
     int nb_rx;
+    int dropped = 0;

     /* There is only one tx queue for this core.  Do not flush other
      * queues.
@@ -1323,6 +1369,19 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct dp_packet **packets,
         return EAGAIN;
     }

+    if (policer != NULL) {

same here

+        dropped = nb_rx;
+        nb_rx = ingress_policer_run(dev, (struct rte_mbuf **) packets, nb_rx);
+        dropped -= nb_rx;
+    }
+
+    /* Update stats to reflect dropped packets */
+    if (OVS_UNLIKELY(dropped)) {
+        rte_spinlock_lock(&dev->stats_lock);
+        dev->stats.rx_dropped += dropped;
+        rte_spinlock_unlock(&dev->stats_lock);
+    }
+
     *c = nb_rx;

     return 0;
@@ -1756,6 +1815,7 @@ netdev_dpdk_vhost_get_stats(const struct netdev *netdev,
     /* Supported Stats */
     stats->rx_packets += dev->stats.rx_packets;
     stats->tx_packets += dev->stats.tx_packets;
+    stats->rx_dropped = dev->stats.rx_dropped;
     stats->tx_dropped += dev->stats.tx_dropped;
     stats->multicast = dev->stats.multicast;
     stats->rx_bytes = dev->stats.rx_bytes;
@@ -1882,11 +1942,12 @@ netdev_dpdk_get_stats(const struct netdev *netdev, struct netdev_stats *stats)

     rte_spinlock_lock(&dev->stats_lock);
     stats->tx_dropped = dev->stats.tx_dropped;
+    stats->rx_dropped = dev->stats.rx_dropped;
     rte_spinlock_unlock(&dev->stats_lock);

     /* These are the available DPDK counters for packets not received due to
      * local resource constraints in DPDK and NIC respectively. */
-    stats->rx_dropped = rte_stats.rx_nombuf + rte_stats.imissed;
+    stats->rx_dropped += rte_stats.rx_nombuf + rte_stats.imissed;
     stats->rx_missed_errors = rte_stats.imissed;

     ovs_mutex_unlock(&dev->mutex);
@@ -1941,6 +2002,104 @@ netdev_dpdk_get_features(const struct netdev *netdev,
 }

 static int
+netdev_dpdk_policer_construct__(struct netdev *netdev_, uint32_t rate,
+                                uint32_t burst)

I would drop the "__" part from the name.

+{
+    struct netdev_dpdk *netdev = netdev_dpdk_cast(netdev_);
+    struct ingress_policer *policer;
+    uint64_t rate_bytes;
+    uint64_t burst_bytes;
+    int err = 0;
+
+    policer = xmalloc(sizeof *policer);
+    rte_spinlock_init(&policer->policer_lock);
+
+    /* rte_meter requires bytes so convert kbits rate and burst to bytes. */
+    rate_bytes = rate * 1000/8;
+    burst_bytes = burst * 1000/8;
+
+    policer->app_srtcm_params.cir = rate_bytes;
+    policer->app_srtcm_params.cbs = burst_bytes;
+    policer->app_srtcm_params.ebs = 0;
+    err = rte_meter_srtcm_config(&policer->in_policer,
+                                    &policer->app_srtcm_params);
+    ovsrcu_set(&netdev->ingress_policer, policer);
+
+    return err;
+}

IMHO it will be clearer if this function returns 'policer', instead of setting it in
'netdev_'

+
+static int netdev_dpdk_policer_destruct__(struct ingress_policer * policer)
+{
+    if (policer == NULL) {
+        return 0;
+    }
+    else {
+        free(policer);
+    }
+    return 0;
+}

There's no need for this function. free(NULL) is perfectly valid (plus, the caller already checks).

+
+static int
+netdev_dpdk_set_policing(struct netdev* netdev_, uint32_t policer_rate,
+                            uint32_t policer_burst)
+{
+    struct netdev_dpdk *netdev = netdev_dpdk_cast(netdev_);

Not a big deal, but in the rest of the functions we use 'netdev' for 'struct netdev *',
and 'dev' for 'struct netdev_dpdk *' (since commit d46285a2206f)

+    struct ingress_policer *policer;
+    int err = 0;

There's no way for this function to fail, so this variable is unnecessary.

+
+    /* Force to 0 if no rate specified,
+     * default to 8000 kbits if burst is 0,
+     * else stick with user-specified value.
+     */
+    policer_burst = (!policer_rate ? 0
+                    : !policer_burst ? 8000
+                    : policer_burst);
+
+    ovs_mutex_lock(&netdev->mutex);
+
+    policer = ovsrcu_get_protected(struct ingress_policer *,
+                                    &netdev->ingress_policer);
+
+    if (netdev->policer_rate == policer_rate &&
+    netdev->policer_burst == policer_burst) {
+        /* Assume that settings haven't changed since we last set them. */
+        ovs_mutex_unlock(&netdev->mutex);
+        return err;
+    }
+



+    /* Destroy any existing ingress policer for the device if one exists
+     * and the policer_rate and policer_burst are being set to 0
+     */
+    if ((policer != NULL) && (policer_rate == 0) && (policer_burst == 0)) {
+        /* User has set rate and burst to 0, destroy the policer */
+        ovsrcu_postpone(netdev_dpdk_policer_destruct__, policer);
+        ovsrcu_set(&netdev->ingress_policer, NULL);
+        netdev->policer_rate = 0;
+        netdev->policer_burst = 0;
+        ovs_mutex_unlock(&netdev->mutex);
+        return err;
+    }
+
+    /* Destroy existing policer before adding new policer */
+    if (policer != NULL) {
+        ovsrcu_postpone(netdev_dpdk_policer_destruct__, policer);
+        ovsrcu_set(&netdev->ingress_policer, NULL);
+        netdev->policer_rate = 0;
+        netdev->policer_burst = 0;
+    }
+
+    /* Add an ingress policer */
+    err = netdev_dpdk_policer_construct__(netdev_, policer_rate,
+                                            policer_burst);
+
+    /* Update policer_rate and policer_burst for the netdev */
+    netdev->policer_rate = policer_rate;
+    netdev->policer_burst = policer_burst;
+    ovs_mutex_unlock(&netdev->mutex);

The above block seems more complicated than necessary.
How about something like this?

    if (policer) {
        ovsrcu_postpone(free, policer);
    }

    if (policer_rate != 0) {
        new_policer = netdev_dpdk_policer_construct__(rate, burst);
    } else {
        new_policer = NULL;
    }
    ovsrcu_set(&netdev->ingress_policer, new_policer);
    netdev->policer_rate = policer_rate;
    netdev->policer_burst = policer_burst;
    ovs_mutex_unlock(&netdev->mutex);

+    return err;
+}
+
+static int
 netdev_dpdk_get_ifindex(const struct netdev *netdev)
 {
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
@@ -2387,6 +2546,12 @@ netdev_dpdk_get_virtio(const struct netdev_dpdk *dev)
     return ovsrcu_get(struct virtio_net *, &dev->virtio_dev);
 }

+struct ingress_policer *
+netdev_dpdk_get_ingress_policer(const struct netdev_dpdk *dev)
+{
+    return ovsrcu_get(struct ingress_policer *, &dev->ingress_policer);
+}
+
 /*
  * These callbacks allow virtio-net devices to be added to vhost ports when
  * configuration has been fully complete.
@@ -2827,7 +2992,7 @@ static const struct dpdk_qos_ops egress_policer_ops = {
     GET_FEATURES,                                             \
     NULL,                       /* set_advertisements */      \
                                                               \
-    NULL,                       /* set_policing */            \
+    netdev_dpdk_set_policing,   /* set_policing */            \

Minor: since the member is not NULL anymore we can remove the comment

     netdev_dpdk_get_qos_types,                                \
     NULL,                       /* get_qos_capabilities */    \
     netdev_dpdk_get_qos,                                      \
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 944d8ec..0c9e60c 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -2502,8 +2502,8 @@
         table="Queue"/> tables).
       </p>
       <p>
-        Policing is currently implemented only on Linux.  The Linux
-        implementation uses a simple ``token bucket'' approach:
+        Policing is currently implemented on Linux and OVS with DPDK. Both
+        implementations use a simple ``token bucket'' approach:
       </p>
       <ul>
         <li>
--
1.7.4.1

_______________________________________________
dev mailing list
dev at openvswitch.org<mailto:dev at openvswitch.org>
http://openvswitch.org/mailman/listinfo/dev



More information about the dev mailing list