[ovs-dev] [PATCH] netdev_dpdk.c: Add QoS functionality.

Stokes, Ian ian.stokes at intel.com
Wed Oct 14 13:14:12 UTC 2015

> -----Original Message-----
> From: Ben Pfaff [mailto:blp at nicira.com]
> Sent: Tuesday, October 13, 2015 5:24 PM
> To: Stokes, Ian
> Cc: dev at openvswitch.org
> Subject: Re: [ovs-dev] [PATCH] netdev_dpdk.c: Add QoS functionality.
> On Wed, Sep 30, 2015 at 01:45:15PM +0100, Ian Stokes wrote:
> > This patch provides the modifications required in netdev-dpdk.c and
> > vswitch.xml to allow for a DPDK user space QoS algorithm.
> >
> > This patch adds a QoS configuration structure for netdev-dpdk and
> > expected QoS operations 'dpdk_qos_ops'. Various helper functions are
> > also supplied.
> >
> > Also included are the modifications required for vswitch.xml to allow
> > a new QoS implementation for netdev-dpdk devices. This includes a new
> > QoS type `us-policer` as well as its expected QoS table entries.
> >
> > The QoS functionality implemented for DPDK devices is `us-policer`.
> > This is an egress policer used to drop packets at configurable rate.
> >
> > The INSTALL.DPDK.md guide has also been modified to provide an example
> > configuration of `us-policer` QoS.
> >
> > Signed-off-by: Ian Stokes <ian.stokes at intel.com>
> Hi Ian.  I've reviewed the documentation changes and the conceptual
> design, but not the code itself.

Hi Ben, thanks for taking the time to review this.
> First, I'm not sure why it's a good idea to introduce a policer to begin
> with.  Our experience is that ingress policing does not produce useful
> results.  Perhaps it will be more effective for egress, but I have my
> doubts about that; otherwise why would shaping generally be preferred
> for egress?

I understand and agree. From our side when we originally proposed the QoS API (netdev functions, QoS function pointers + helper functions) it was suggested that we provide a simple QoS algorithm along with it just to demonstrate how all these components are used.

The policer is a simple implementation and, as such, is limited in its application as you describe above. This is certainly true for the physical port types; however, there is a use case for virtual ports attached to virtual machines.

It can be the case that the virtual machine performs some type of packet processing and will receive traffic from multiple ports to process. A user may make a conscious decision to limit traffic from one of these ports in order to allow traffic from other attached ports to be prioritized. This requires some pre-knowledge of the ports and their traffic types, but it is a use case we have heard echoed by both OVS and OpenStack users.

Ultimately our goal is to implement an egress rate limiter. However, there are still design decisions regarding how it should be implemented that we need to discuss with the community first (user space queues for buffering, algorithms for reading buffered packets, etc.).

Our aim with this patch was to provide the required QoS API along with a simple QoS implementation, so as to enable others in the community to begin contributing on the QoS front as soon as possible.
> With what kinds of traffic has this implementation been tested?  How
> accurate is it?  TCP and UDP respond quite differently to policing.
> Additionally, policing can be especially hard on traffic that involves
> IP fragments, since dropping a single fragment causes the entire IP
> datagram to be discarded after soaking up considerable CPU time on the
> destination host for reassembly.
We've tested with UDP and TCP. No provisions are made for IP fragment and reassembly cases. The situation you describe above can occur, but this is the nature of policing, so that type of traffic will not be an ideal fit. The user should be aware (or made aware) of this; perhaps it could be flagged in vswitch.xml?
> Since DPDK is all about performance, I'd expect some commentary on the
> performance of this technique.  How much CPU overhead does it require on
> the sender?
There is an overhead cost, as rte_meter will consume CPU cycles. How this can be measured is up for debate: the PMD will report 100% usage due to the nature of DPDK, so the extra overhead will not be immediately visible as a change in CPU utilization. We can instead report the expected drop in traffic-processing performance with QoS enabled versus disabled on the port?

> I don't like this trend toward putting essentially all of the DPDK
> documentation in INSTALL.DPDK.md.  This commit adds far more of the
> configuration details to that file than to vswitch.xml.  I'd prefer it
> to be the opposite, only adding an example or two to INSTALL.DPDK.md and
> the bulk of the information to vswitch.xml.

Agreed, and apologies for putting this info in INSTALL.DPDK.md. I will move the bulk of it to vswitch.xml as per your recommendation, leaving just an example or two in INSTALL.DPDK.md.
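For illustration, the kind of short example that would remain in INSTALL.DPDK.md might look like the following (the type name `us-policer` is from this patch; the port name and the `cir`/`cbs` parameter names follow the rte_meter srTCM configuration and are subject to change during review):

```shell
# Illustrative only: create a us-policer QoS record and attach it to a
# DPDK port. cir (committed information rate, bytes/sec) and cbs
# (committed burst size, bytes) mirror the rte_meter srTCM parameters.
ovs-vsctl set port dpdk0 qos=@newqos -- \
    --id=@newqos create qos type=us-policer \
    other-config:cir=126000 other-config:cbs=2048
```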
> I would have guessed that Intel NICs have hardware queuing and QoS
> features.  If so, they are undoubtedly more CPU-efficient than software
> versions.  Are there plans to make use of those features in DPDK?
This would be correct, for the physical ports at least; however, this support is not present in DPDK today. There are plans for this type of support, but I'm not sure when it will be implemented.

However, a software solution can still be used for virtual port types, and it could also cover more basic physical port types that do not support offloading QoS functionality to hardware.

> Thanks,
> Ben.
