[ovs-dev] [PATCH] netdev_dpdk.c: Add QoS functionality.

Stokes, Ian ian.stokes at intel.com
Mon Oct 12 17:02:25 UTC 2015


Hi Daniele,

Thanks for providing feedback, answers inline.

> -----Original Message-----
> From: Daniele Di Proietto [mailto:diproiettod at vmware.com]
> Sent: Friday, October 09, 2015 6:53 PM
> To: Stokes, Ian
> Cc: dev at openvswitch.org
> Subject: Re: [ovs-dev] [PATCH] netdev_dpdk.c: Add QoS functionality.
> 
> Hi,
> 
> Thanks for the patch, the implementation looks simpler than I expected.
> 
> General questions:
> 
> * It appears that we're using the srCTM, but
>   * We're only sending the green packets and dropping the rest.
>   * The `input_color` is always green.
> 
>   Therefore
>   * `other-config:ebs` is ignored (isn't it?). I would rather not put in
> the
>     database a value that is ignored.

Although we do not use the ebs value it is a requirement for the 
rte_meter initialization function. DPDK expects an ebs value 
greater than 0 or it fails to initialize the rte_meter. 

We can remove the other_config:ebs from the command line but
will have to add a define in netdev_dpdk so as to supply some ebs 
value.

Would this be acceptable?

>   * The `us_policer_table` seems redundant
>   * We just need two `us_policer_action`s, or perhaps just a boolean.
> 
>   Can we just call this a token bucket? Are there any good reasons not
> to?
>   (Maybe you're planning to expand this, I'm not sure)

Is this in relation to the name 'us-policer'or the name for the cbs and
ebs paramaters? ?

I think we should change the us-policer name anyhow to signify that it
is egress. Would 'egress_policer' be acceptable? Failing that we ca use
token_bucket can be used. Once it's clear to the end user.

> 
> * Out of curiosity, are there plans to support different output queues?

There are plans to introduce user space queues at a later date to allow 
buffering of traffic. This would be included as maybe a rate limiting type
implementation where traffic is not dropped but instead buffered. This 
will be a separate submission in the future however.
 
> 
> * I did't experience any performance degradation when QoS is not
> configured,
>   but one can imagine that taking a spinlock might have some performance
> impact
>   when multiple threads insist on the same NIC. Have you considered
> using
> RCU
>   for the `qos_conf` pointer? (I'm not saying that we absolutely should
> at
> this
>   point)
> 
I haven't tried an RCU for this. I can look at this but if it's not a
blocking issue perhaps it could be implemented at a later stage?
> Minor style issues:
> 
> * Lines should be limited to 79 characters
> * We use 4 spaces to indent, not tabs.
>
Noted, will make these changes for version 2 submission.
 
> That said the code looks good, a couple of more comments inline
> 
> Thanks!
> 
> Daniele
> 
> On 30/09/2015 13:45, "Ian Stokes" <ian.stokes at intel.com> wrote:
> 
> >This patch provides the modifications required in netdev-dpdk.c and
> >vswitch.xml to allow for a DPDK user space QoS algorithm.
> >
> >This patch adds a QoS configuration structure for netdev-dpdk and
> >expected QoS operations 'dpdk_qos_ops'. Various helper functions
> >are also supplied.
> >
> >Also included are the modifications required for vswitch.xml to allow a
> >new QoS implementation for netdev-dpdk devices. This includes a new QoS
> >type
> >`us-policer` as well as its expected QoS table entries.
> >
> >The QoS functionality implemented for DPDK devices is `us-policer`.
> >This is an egress policer used to drop packets at configurable rate.
> >
> >The INSTALL.DPDK.md guide has also been modified to provide an example
> >configuration of `us-policer` QoS.
> >
> >Signed-off-by: Ian Stokes <ian.stokes at intel.com>
> >---
> > INSTALL.DPDK.md      |   52 +++++++
> > lib/netdev-dpdk.c    |  414
> >+++++++++++++++++++++++++++++++++++++++++++++++++-
> > vswitchd/vswitch.xml |   40 +++++
> > 3 files changed, 500 insertions(+), 6 deletions(-)
> [...]
> >
> >@@ -142,6 +145,122 @@ static int rte_eal_init_ret = ENODEV;
> >
> > static struct ovs_mutex dpdk_mutex = OVS_MUTEX_INITIALIZER;
> >
> >+/* Quality of Service */
> >+
> >+/* An instance of a QoS configuration.  Always associated with a
> >particular
> >+ * network device.
> >+ *
> >+ * Each QoS implementation subclasses this with whatever additional
> data
> >it
> >+ * needs.
> >+ */
> >+struct qos_conf{
> 
> A space is needed before open brace
> 
> >+    const struct dpdk_qos_ops *ops;
> >+};
> >+
> >+/* A particular implementation of dpdk QoS operations.
> >+ *
> >+ * The functions below return 0 if successful or a positive errno
> value
> >on
> >+ * failure, except where otherwise noted. All of them must be
> provided,
> >except
> >+ * where otherwise noted.
> >+ */
> >+struct dpdk_qos_ops{
> 
> Same here
> 
> [...]
> >@@ -1289,9 +1432,18 @@ netdev_dpdk_send__(struct netdev_dpdk *dev, int
> >qid,
> >             }
> >         }
> 
> Right before this, there is another call to dpdk_queue_pkts, that should
> go through qos_alg_process as well. Have you considered adding a helper
> function?
> 
> >         if (next_tx_idx != cnt) {
> >-           dpdk_queue_pkts(dev, qid,
> >-                            (struct rte_mbuf **)&pkts[next_tx_idx],
> >-                            cnt-next_tx_idx);
> >+            cnt -= next_tx_idx;
> >+
> >+            /* Check if QoS has been configured for this netdev. */
> >+            rte_spinlock_lock(&dev->qos_lock);
> >+            if (dev->qos_conf != NULL) {
> >+                struct netdev *netdev = &dev->up;
> >+                cnt = dev->qos_conf->ops->qos_alg_process(netdev,
> >+                        (struct rte_mbuf**)pkts, cnt);
> >+            }
> >+            rte_spinlock_unlock(&dev->qos_lock);
> >+
> >+            dpdk_queue_pkts(dev, qid, (struct rte_mbuf
> >**)&pkts[next_tx_idx], cnt);
> >         }
> >
> >         if (OVS_UNLIKELY(dropped)) {
> [...]
> >+    ovs_mutex_lock(&netdev->mutex);
> >+    if(netdev->qos_conf){
> 
> A space is needed here
> 
> [...]
> >+static inline int
> >+__us_policer_pkt_handle(struct rte_meter_srtcm *meter, struct rte_mbuf
> >*pkt, uint64_t time)
> 
> CodingStyle.md says:
> 
>   - Do not use names that begin with _.  If you need a name for
>     "internal use only", use __ as a suffix instead of a prefix.
> 
> Unfortunately we have many exceptions, even in this file, but I would
> 
> prefer to have it changed

Thanks for these catches, will make these changes for version 2.

Regards
Ian



More information about the dev mailing list