[ovs-dev] [PATCH v5 2/2] netdev-dpdk: Add new DPDK RFC 4115 egress policer

Stokes, Ian ian.stokes at intel.com
Wed Jan 15 13:19:24 UTC 2020



On 1/15/2020 12:19 PM, Ilya Maximets wrote:
> On 15.01.2020 12:28, Stokes, Ian wrote:
>>
>>
>> On 1/14/2020 4:12 PM, Eelco Chaudron wrote:
>>> This patch adds a new policer to the DPDK datapath based on RFC 4115's
>>> Two-Rate, Three-Color marker. It's a two-level hierarchical policer
>>> which first does a color-blind marking of the traffic at the queue
>>> level, followed by a color-aware marking at the port level. At the end
>>> traffic marked as Green or Yellow is forwarded, Red is dropped. For
>>> details on how traffic is marked, see RFC 4115.
>>>
>>> This egress policer can be used to limit traffic at different rates
>>> based on the queues the traffic is in. In addition, it can also be used
>>> to prioritize certain traffic over others at a port level.
>>>
>>> For example, the following configuration will limit the traffic rate at a
>>> port level to a maximum of 2000 packets a second (64 bytes IPv4 packets).
>>> 1000pps as CIR (Committed Information Rate) and 1000pps as EIR (Excess
>>> Information Rate). High priority traffic is routed to queue 10, which marks
>>> all traffic as CIR, i.e. Green. All low priority traffic, queue 20, is
>>> marked as EIR, i.e. Yellow.
>>>
>>> ovs-vsctl --timeout=5 set port dpdk1 qos=@myqos -- \
>>>     --id=@myqos create qos type=trtcm-policer \
>>>     other-config:cir=52000 other-config:cbs=2048 \
>>>     other-config:eir=52000 other-config:ebs=2048  \
>>>     queues:10=@dpdk1Q10 queues:20=@dpdk1Q20 -- \
>>>     --id=@dpdk1Q10 create queue \
>>>       other-config:cir=41600000 other-config:cbs=2048 \
>>>       other-config:eir=0 other-config:ebs=0 -- \
>>>     --id=@dpdk1Q20 create queue \
>>>       other-config:cir=0 other-config:cbs=0 \
>>>       other-config:eir=41600000 other-config:ebs=2048
>>>
>>> This configuration guarantees the high priority traffic a bandwidth of
>>> CIR (1000pps) egressing the port, but it can also use the EIR, so a
>>> total of 2000pps at most. These additional 1000pps are shared with the
>>> low priority traffic. The low priority traffic can use at most
>>> 1000pps.
>>>
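The two-level marking described above can be sketched in a few lines of C. This is a simplified illustration only, not the code from the patch and not the DPDK rte_meter implementation: the names (struct trtcm, trtcm_mark, police) are hypothetical, fractional tokens are dropped on refill, and the boundary comparison is chosen for simplicity. The 52-byte packet length used with it is inferred from cir=52000 corresponding to 1000pps in the example configuration.

```c
#include <stdint.h>

enum color { GREEN, YELLOW, RED };

/* One meter: rates in bytes per second, burst sizes in bytes. */
struct trtcm {
    uint64_t cir, cbs;   /* committed rate and committed burst size */
    uint64_t eir, ebs;   /* excess rate and excess burst size */
    uint64_t tc, te;     /* tokens currently in the C and E buckets */
    uint64_t last_us;    /* time of the last refill, in microseconds */
};

static void trtcm_init(struct trtcm *m, uint64_t cir, uint64_t cbs,
                       uint64_t eir, uint64_t ebs)
{
    m->cir = cir; m->cbs = cbs;
    m->eir = eir; m->ebs = ebs;
    m->tc = cbs;         /* both buckets start full */
    m->te = ebs;
    m->last_us = 0;
}

/* Add tokens for the elapsed time, capped at the burst sizes.
 * Simplification: fractional tokens are dropped. */
static void trtcm_refill(struct trtcm *m, uint64_t now_us)
{
    if (now_us <= m->last_us) {
        return;
    }
    uint64_t dt = now_us - m->last_us;
    uint64_t c = m->tc + m->cir * dt / 1000000;
    uint64_t e = m->te + m->eir * dt / 1000000;
    m->tc = c > m->cbs ? m->cbs : c;
    m->te = e > m->ebs ? m->ebs : e;
    m->last_us = now_us;
}

/* Color-aware marking; pass GREEN as the input color for color-blind use. */
static enum color trtcm_mark(struct trtcm *m, uint64_t now_us,
                             uint32_t len, enum color in)
{
    trtcm_refill(m, now_us);
    if (in == GREEN && m->tc >= len) {
        m->tc -= len;
        return GREEN;
    }
    if (in != RED && m->te >= len) {
        m->te -= len;
        return YELLOW;
    }
    return RED;
}

/* Queue level is color-blind, port level is color-aware.
 * GREEN and YELLOW results are forwarded, RED is dropped. */
static enum color police(struct trtcm *queue, struct trtcm *port,
                         uint64_t now_us, uint32_t len)
{
    enum color c = trtcm_mark(queue, now_us, len, GREEN);
    return trtcm_mark(port, now_us, len, c);
}
```

With the example values, a queue configured with cir only (queue 10) can yield Green or Red, while a queue configured with eir only (queue 20) can yield Yellow or Red, so only queue 10 traffic can draw on the port's committed bucket.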
>> Thanks for the patch Eelco, minor comment below.
>>
>> <snip>
>>
>>>      Rate Limiting (Ingress Policing)
>>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>>> index 128963f..1ed4a47 100644
>>> --- a/lib/netdev-dpdk.c
>>> +++ b/lib/netdev-dpdk.c
>>> @@ -26,6 +26,12 @@
>>>    #include <sys/socket.h>
>>>    #include <linux/if.h>
>>>    +/* Include rte_compat.h first to allow experimental API's needed for the
>>> + * rte_meter.h rfc4115 functions. Once they are no longer marked as
>>> + * experimental the #define and rte_compat.h include can be removed.
>>> + */
>>> +#define ALLOW_EXPERIMENTAL_API
>>> +#include <rte_compat.h>
>>
>> I guess the risk here, as I understand it, is that this approach is all or nothing in terms of experimental APIs from the included headers: other experimental DPDK APIs could be used going forward without causing a warning?
>>
>> If so, extra diligence would have to be taken when reviewing future patches, along with a discussion of whether an experimental API should wait until it is marked as non-experimental. (In this case the TRTCM API looks stable and is highly unlikely to be removed, so it's not an issue IMO.)
>>
>> @Ilya/Kevin: Would you agree with above? Thoughts?
> 
> I think it makes sense to wait for the API to become non-experimental
> if we have no easy way to enable it only for the functions we need.
> I agree that having experimental APIs widely enabled might produce
> additional issues and will require more careful review of all the
> DPDK related patches.

So I'm OK with making an exception for the feature in this case; I think
it was an oversight, but the API is stable. But what I'm hearing is that,
in general, the rule should be that new features must not rely on
experimental APIs. If an API is experimental, it would have to have its
experimental tag removed by the next DPDK upgrade (in this case 20.11, for
argument's sake) before the feature can be upstreamed to OVS master.
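
For reference, the all-or-nothing behaviour comes from how the experimental tag works: in DPDK's rte_compat.h, __rte_experimental carries a deprecated attribute unless ALLOW_EXPERIMENTAL_API is defined, so defining it silences the warning for every experimental symbol in the file. The sketch below uses a stand-in macro (MY_EXPERIMENTAL, a hypothetical name) and a dummy function to show one possible narrower scoping with GCC/Clang diagnostic pragmas. This was not proposed in this thread, and whether it would be acceptable for OVS is an assumption.

```c
/* Stand-in for DPDK's __rte_experimental tag (hypothetical name).
 * Without ALLOW_EXPERIMENTAL_API, every use of a tagged symbol warns. */
#ifndef ALLOW_EXPERIMENTAL_API
#define MY_EXPERIMENTAL \
    __attribute__((deprecated("Symbol is not yet part of stable ABI")))
#else
#define MY_EXPERIMENTAL
#endif

/* Dummy "experimental" function standing in for an experimental API call. */
MY_EXPERIMENTAL
static inline int
experimental_meter_check(int tokens, int len)
{
    return tokens >= len;
}

/* Wrapper that suppresses the warning for this one call site only, so
 * other experimental symbols used elsewhere in the file still warn. */
static int
call_experimental_scoped(int tokens, int len)
{
    int ok;

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
    ok = experimental_meter_check(tokens, len);
#pragma GCC diagnostic pop

    return ok;
}
```

The trade-off is that each experimental call needs its own wrapper, which makes the use of an experimental API explicit at review time.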

> 
> BTW, another thought I have in mind about all the release management is:
> Shouldn't we hold OVS updates to new DPDK LTS until the first correction
> release is out?  I mean, for example, Ubuntu triggers updates from one
> LTS release to another only after the .1 correction release is out (users
> of Ubuntu 18.04 will receive upgrade notifications only after 20.04.1
> is released).  Shouldn't we do the same thing?  Shouldn't we upgrade
> to the next DPDK LTS only after XX.11.1 is ready?  This might make sense
> in order to not have obviously broken functionality in OVS releases but
> at the same time might just defer actual revealing of DPDK issues, so
> I'm not fully sure about this.  Since OVS is not the only user of DPDK,
> this still might make sense anyway.  Would like to hear some thoughts.
> 

Yes, I've thought about this as well. There is certainly an advantage to
moving to a .1 release instead of a .0 in terms of stability. However, when
I thought about the release, two things came to mind.

(i) By moving to the .0 release, is OVS in a better position to
contribute feedback for the .1 release and ensure that relevant fixes for
OVS with DPDK are upstreamed in the .1 (rather than a .2)? That feedback
would come not just from developers but also from end users.

(ii) With the timing of the OVS and .1 releases, does it make a massive
difference? For example, with OVS 2.13 being released in late February and
19.11.1 being released in early March, I would think OVS would move to .1
pretty quickly to benefit from the latest fixes on OVS master. This
still allows our OVS release to use the latest stable DPDK without
waiting for OVS 2.14 in August.

There are definitely advantages to both approaches, and they are worth
further discussion. I'd be interested in hearing what others think.

BR
Ian

> Best regards, Ilya Maximets.
> 

