[ovs-dev] [PATCH] netdev_dpdk.c: Add QoS functionality.
Daniele Di Proietto
diproiettod at vmware.com
Fri Oct 9 17:52:36 UTC 2015
Hi,
Thanks for the patch, the implementation looks simpler than I expected.
General questions:
* It appears that we're using the srCTM, but
* We're only sending the green packets and dropping the rest.
* The `input_color` is always green.
Therefore
* `other-config:ebs` is ignored (isn't it?). I would rather not put in
the
database a value that is ignored.
* The `us_policer_table` seems redundant
* We just need two `us_policer_action`s, or perhaps just a boolean.
Can we just call this a token bucket? Are there any good reasons not to?
(Maybe you're planning to expand this, I'm not sure)
* Out of curiosity, are there plans to support different output queues?
* I did't experience any performance degradation when QoS is not
configured,
but one can imagine that taking a spinlock might have some performance
impact
when multiple threads insist on the same NIC. Have you considered using
RCU
for the `qos_conf` pointer? (I'm not saying that we absolutely should at
this
point)
Minor style issues:
* Lines should be limited to 79 characters
* We use 4 spaces to indent, not tabs.
That said the code looks good, a couple of more comments inline
Thanks!
Daniele
On 30/09/2015 13:45, "Ian Stokes" <ian.stokes at intel.com> wrote:
>This patch provides the modifications required in netdev-dpdk.c and
>vswitch.xml to allow for a DPDK user space QoS algorithm.
>
>This patch adds a QoS configuration structure for netdev-dpdk and
>expected QoS operations 'dpdk_qos_ops'. Various helper functions
>are also supplied.
>
>Also included are the modifications required for vswitch.xml to allow a
>new QoS implementation for netdev-dpdk devices. This includes a new QoS
>type
>`us-policer` as well as its expected QoS table entries.
>
>The QoS functionality implemented for DPDK devices is `us-policer`.
>This is an egress policer used to drop packets at configurable rate.
>
>The INSTALL.DPDK.md guide has also been modified to provide an example
>configuration of `us-policer` QoS.
>
>Signed-off-by: Ian Stokes <ian.stokes at intel.com>
>---
> INSTALL.DPDK.md | 52 +++++++
> lib/netdev-dpdk.c | 414
>+++++++++++++++++++++++++++++++++++++++++++++++++-
> vswitchd/vswitch.xml | 40 +++++
> 3 files changed, 500 insertions(+), 6 deletions(-)
[...]
>
>@@ -142,6 +145,122 @@ static int rte_eal_init_ret = ENODEV;
>
> static struct ovs_mutex dpdk_mutex = OVS_MUTEX_INITIALIZER;
>
>+/* Quality of Service */
>+
>+/* An instance of a QoS configuration. Always associated with a
>particular
>+ * network device.
>+ *
>+ * Each QoS implementation subclasses this with whatever additional data
>it
>+ * needs.
>+ */
>+struct qos_conf{
A space is needed before open brace
>+ const struct dpdk_qos_ops *ops;
>+};
>+
>+/* A particular implementation of dpdk QoS operations.
>+ *
>+ * The functions below return 0 if successful or a positive errno value
>on
>+ * failure, except where otherwise noted. All of them must be provided,
>except
>+ * where otherwise noted.
>+ */
>+struct dpdk_qos_ops{
Same here
[...]
>@@ -1289,9 +1432,18 @@ netdev_dpdk_send__(struct netdev_dpdk *dev, int
>qid,
> }
> }
Right before this, there is another call to dpdk_queue_pkts, that should
go through qos_alg_process as well. Have you considered adding a helper
function?
> if (next_tx_idx != cnt) {
>- dpdk_queue_pkts(dev, qid,
>- (struct rte_mbuf **)&pkts[next_tx_idx],
>- cnt-next_tx_idx);
>+ cnt -= next_tx_idx;
>+
>+ /* Check if QoS has been configured for this netdev. */
>+ rte_spinlock_lock(&dev->qos_lock);
>+ if (dev->qos_conf != NULL) {
>+ struct netdev *netdev = &dev->up;
>+ cnt = dev->qos_conf->ops->qos_alg_process(netdev,
>+ (struct rte_mbuf**)pkts, cnt);
>+ }
>+ rte_spinlock_unlock(&dev->qos_lock);
>+
>+ dpdk_queue_pkts(dev, qid, (struct rte_mbuf
>**)&pkts[next_tx_idx], cnt);
> }
>
> if (OVS_UNLIKELY(dropped)) {
[...]
>+ ovs_mutex_lock(&netdev->mutex);
>+ if(netdev->qos_conf){
A space is needed here
[...]
>+static inline int
>+__us_policer_pkt_handle(struct rte_meter_srtcm *meter, struct rte_mbuf
>*pkt, uint64_t time)
CodingStyle.md says:
- Do not use names that begin with _. If you need a name for
"internal use only", use __ as a suffix instead of a prefix.
Unfortunately we have many exceptions, even in this file, but I would
prefer to have it changed
More information about the dev
mailing list