[ovs-dev] [PATCH v4 2/2] netdev-dpdk: Add new DPDK RFC 4115 egress policer

Eelco Chaudron echaudro at redhat.com
Tue Jan 14 11:25:37 UTC 2020



On 14 Jan 2020, at 12:23, Stokes, Ian wrote:

> On 1/13/2020 8:32 PM, Stokes, Ian wrote:
>>
>>
>> On 1/13/2020 3:56 PM, Eelco Chaudron wrote:
>>> This patch adds a new policer to the DPDK datapath based on RFC 4115's
>>> Two-Rate, Three-Color marker. It's a two-level hierarchical policer
>>> which first does a color-blind marking of the traffic at the queue
>>> level, followed by a color-aware marking at the port level. At the end
>>> traffic marked as Green or Yellow is forwarded, Red is dropped. For
>>> details on how traffic is marked, see RFC 4115.
>>>
>>> This egress policer can be used to limit traffic at different rates
>>> based on the queues the traffic is in. In addition, it can also be used
>>> to prioritize certain traffic over others at a port level.
>>>
>>> For example, the following configuration will limit the traffic rate at a
>>> port level to a maximum of 2000 packets a second (64 bytes IPv4 packets).
>>> 1000pps as CIR (Committed Information Rate) and 1000pps as EIR (Excess
>>> Information Rate). High priority traffic is routed to queue 10, which marks
>>> all traffic as CIR, i.e. Green. All low priority traffic, queue 20, is
>>> marked as EIR, i.e. Yellow.
>>>
>>> ovs-vsctl --timeout=5 set port dpdk1 qos=@myqos -- \
>>>    --id=@myqos create qos type=trtcm-policer \
>>>    other-config:cir=52000 other-config:cbs=2048 \
>>>    other-config:eir=52000 other-config:ebs=2048  \
>>>    queues:10=@dpdk1Q10 queues:20=@dpdk1Q20 -- \
>>>    --id=@dpdk1Q10 create queue \
>>>      other-config:cir=41600000 other-config:cbs=2048 \
>>>      other-config:eir=0 other-config:ebs=0 -- \
>>>    --id=@dpdk1Q20 create queue \
>>>      other-config:cir=0 other-config:cbs=0 \
>>>      other-config:eir=41600000 other-config:ebs=2048
>>>
>>> This configuration ensures that the high priority traffic has a
>>> guaranteed bandwidth egressing the ports at CIR (1000pps), but it can also
>>> use the EIR, so a total of 2000pps at max. This additional 1000pps is
>>> shared with the low priority traffic. The low priority traffic can use at
>>> maximum 1000pps.
>>
>> Thanks for the patch Eelco, LGTM, I just want to finish a little more 
>> testing but I intend to merge this tomorrow to master if nothing else 
>> comes up/is raised.
>>
>
> Hi Eelco, I'm seeing a crash in OVS while running this with just a 
> port and a default queue 0 (phy to phy setup). It seems related to the 
> call to rte_meter_trtcm_rfc4115_color_blind_check. I've provided more 
> detail below in the trtcm_policer_run_single_packet function, just 
> wondering if you've come across it?

Looks like something that was fixed before in DPDK, I’ll take a look a 
bit later…

//Eelco

> Here's the output for the qos configuration I'm using:
>
> -bash-4.4$ovs-appctl -t ovs-vswitchd qos/show dpdk1
> QoS: dpdk1 trtcm-policer
> eir: 52000
> cbs: 2048
> ebs: 2048
> cir: 52000
>
> Default:
>   eir: 52000
>   cbs: 2048
>   ebs: 2048
>   cir: 52000
>   tx_packets: 672150
>   tx_bytes: 30918900
>   tx_errors: 489562233
>
> I'll try to investigate further with DPDK and GDB also.
>
>> Regards
>> Ian
>>
>>>
>>> Signed-off-by: Eelco Chaudron <echaudro at redhat.com>
>>> ---
>>>   Documentation/topics/dpdk/qos.rst |   43 ++++
>>>   lib/netdev-dpdk.c                 |  358 +++++++++++++++++++++++++++++++++++--
>>>   vswitchd/vswitch.xml              |   34 ++++
>>>   3 files changed, 420 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/Documentation/topics/dpdk/qos.rst b/Documentation/topics/dpdk/qos.rst
>>> index c0aec5d..1034954 100644
>>> --- a/Documentation/topics/dpdk/qos.rst
>>> +++ b/Documentation/topics/dpdk/qos.rst
>>> @@ -33,6 +33,9 @@ datapath. These are referred to as *QoS* and *Rate Limiting*, respectively.
>>>   QoS (Egress Policing)
>>>   ---------------------
>>> +Single Queue Policer
>>> +~~~~~~~~~~~~~~~~~~~~
>>> +
>>>   Assuming you have a :doc:`vhost-user port <vhost-user>` transmitting traffic
>>>   consisting of packets of size 64 bytes, the following command would limit the
>>>   egress transmission rate of the port to ~1,000,000 packets per second::
>>> @@ -49,6 +52,46 @@ To clear the QoS configuration from the port and ovsdb, run::
>>>       $ ovs-vsctl destroy QoS vhost-user0 -- clear Port vhost-user0 qos
>>> +
>>> +Multi Queue Policer
>>> +~~~~~~~~~~~~~~~~~~~
>>> +
>>> +In addition to the egress-policer, OVS-DPDK also has support for RFC
>>> +4115's Two-Rate, Three-Color marker meter. It's a two-level hierarchical
>>> +policer which first does a color-blind marking of the traffic at the queue
>>> +level, followed by a color-aware marking at the port level. At the end
>>> +traffic marked as Green or Yellow is forwarded, Red is dropped. For
>>> +details on how traffic is marked, see RFC 4115.
>>> +
>>> +This egress policer can be used to limit traffic at different rates
>>> +based on the queues the traffic is in. In addition, it can also be used
>>> +to prioritize certain traffic over others at a port level.
>>> +
>>> +For example, the following configuration will limit the traffic rate at a
>>> +port level to a maximum of 2000 packets a second (64 bytes IPv4 packets).
>>> +1000pps as CIR (Committed Information Rate) and 1000pps as EIR (Excess
>>> +Information Rate). High priority traffic is routed to queue 10, which marks
>>> +all traffic as CIR, i.e. Green. All low priority traffic, queue 20, is
>>> +marked as EIR, i.e. Yellow::
>>> +
>>> +    $ ovs-vsctl --timeout=5 set port dpdk1 qos=@myqos -- \
>>> +        --id=@myqos create qos type=trtcm-policer \
>>> +        other-config:cir=52000 other-config:cbs=2048 \
>>> +        other-config:eir=52000 other-config:ebs=2048  \
>>> +        queues:10=@dpdk1Q10 queues:20=@dpdk1Q20 -- \
>>> +         --id=@dpdk1Q10 create queue \
>>> +          other-config:cir=41600000 other-config:cbs=2048 \
>>> +          other-config:eir=0 other-config:ebs=0 -- \
>>> +         --id=@dpdk1Q20 create queue \
>>> +           other-config:cir=0 other-config:cbs=0 \
>>> +           other-config:eir=41600000 other-config:ebs=2048
>>> +
>>> +This configuration ensures that the high priority traffic has a
>>> +guaranteed bandwidth egressing the ports at CIR (1000pps), but it can also
>>> +use the EIR, so a total of 2000pps at max. This additional 1000pps is
>>> +shared with the low priority traffic. The low priority traffic can use at
>>> +maximum 1000pps.
>>> +
>>>   Refer to ``vswitch.xml`` for more details on egress policer.
>>>   Rate Limiting (Ingress Policing)
>>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>>> index 128963f..3f164c8 100644
>>> --- a/lib/netdev-dpdk.c
>>> +++ b/lib/netdev-dpdk.c
>>> @@ -26,7 +26,10 @@
>>>   #include <sys/socket.h>
>>>   #include <linux/if.h>
>>> +/* The below is needed as rte_meter.h gets included through rte_bus_pci.h. */
>>> +#define ALLOW_EXPERIMENTAL_API
>>>   #include <rte_bus_pci.h>
>>> +#undef ALLOW_EXPERIMENTAL_API
>>>   #include <rte_config.h>
>>>   #include <rte_cycles.h>
>>>   #include <rte_errno.h>
>>> @@ -35,7 +38,9 @@
>>>   #include <rte_flow.h>
>>>   #include <rte_malloc.h>
>>>   #include <rte_mbuf.h>
>>> +#define ALLOW_EXPERIMENTAL_API
>>>   #include <rte_meter.h>
>>> +#undef ALLOW_EXPERIMENTAL_API
>>>   #include <rte_pci.h>
>>>   #include <rte_version.h>
>>>   #include <rte_vhost.h>
>>> @@ -329,8 +334,9 @@ struct dpdk_qos_ops {
>>>                                        struct netdev_dpdk_queue_state *state);
>>>   };
>>> -/* dpdk_qos_ops for each type of user space QoS implementation */
>>> +/* dpdk_qos_ops for each type of user space QoS implementation. */
>>>   static const struct dpdk_qos_ops egress_policer_ops;
>>> +static const struct dpdk_qos_ops trtcm_policer_ops;
>>>   /*
>>>    * Array of dpdk_qos_ops, contains pointer to all supported QoS
>>> @@ -338,6 +344,7 @@ static const struct dpdk_qos_ops egress_policer_ops;
>>>    */
>>>   static const struct dpdk_qos_ops *const qos_confs[] = {
>>>       &egress_policer_ops,
>>> +    &trtcm_policer_ops,
>>>       NULL
>>>   };
>>> @@ -2162,9 +2169,9 @@ netdev_dpdk_eth_tx_burst(struct netdev_dpdk *dev, int qid,
>>>   }
>>>   static inline bool
>>> -netdev_dpdk_policer_pkt_handle(struct rte_meter_srtcm *meter,
>>> -                               struct rte_meter_srtcm_profile *profile,
>>> -                               struct rte_mbuf *pkt, uint64_t time)
>>> +netdev_dpdk_srtcm_policer_pkt_handle(struct rte_meter_srtcm *meter,
>>> +                                     struct rte_meter_srtcm_profile *profile,
>>> +                                     struct rte_mbuf *pkt, uint64_t time)
>>>   {
>>>       uint32_t pkt_len = rte_pktmbuf_pkt_len(pkt) - sizeof(struct rte_ether_hdr);
>>> @@ -2173,10 +2180,10 @@ netdev_dpdk_policer_pkt_handle(struct rte_meter_srtcm *meter,
>>>   }
>>>   static int
>>> -netdev_dpdk_policer_run(struct rte_meter_srtcm *meter,
>>> -                        struct rte_meter_srtcm_profile *profile,
>>> -                        struct rte_mbuf **pkts, int pkt_cnt,
>>> -                        bool should_steal)
>>> +srtcm_policer_run_single_packet(struct rte_meter_srtcm *meter,
>>> +                                struct rte_meter_srtcm_profile *profile,
>>> +                                struct rte_mbuf **pkts, int pkt_cnt,
>>> +                                bool should_steal)
>>>   {
>>>       int i = 0;
>>>       int cnt = 0;
>>> @@ -2186,8 +2193,8 @@ netdev_dpdk_policer_run(struct rte_meter_srtcm *meter,
>>>       for (i = 0; i < pkt_cnt; i++) {
>>>           pkt = pkts[i];
>>>           /* Handle current packet */
>>> -        if (netdev_dpdk_policer_pkt_handle(meter, profile,
>>> -                                           pkt, current_time)) {
>>> +        if (netdev_dpdk_srtcm_policer_pkt_handle(meter, profile,
>>> +                                                 pkt, current_time)) {
>>>               if (cnt != i) {
>>>                   pkts[cnt] = pkt;
>>>               }
>>> @@ -2209,8 +2216,9 @@ ingress_policer_run(struct ingress_policer *policer, struct rte_mbuf **pkts,
>>>       int cnt = 0;
>>>       rte_spinlock_lock(&policer->policer_lock);
>>> -    cnt = netdev_dpdk_policer_run(&policer->in_policer, &policer->in_prof,
>>> -                                  pkts, pkt_cnt, should_steal);
>>> +    cnt = srtcm_policer_run_single_packet(&policer->in_policer,
>>> +                                          &policer->in_prof,
>>> +                                          pkts, pkt_cnt, should_steal);
>>>       rte_spinlock_unlock(&policer->policer_lock);
>>>       return cnt;
>>> @@ -4480,9 +4488,9 @@ egress_policer_run(struct qos_conf *conf, struct rte_mbuf **pkts, int pkt_cnt,
>>>       struct egress_policer *policer =
>>>           CONTAINER_OF(conf, struct egress_policer, qos_conf);
>>> -    cnt = netdev_dpdk_policer_run(&policer->egress_meter,
>>> -                                  &policer->egress_prof, pkts,
>>> -                                  pkt_cnt, should_steal);
>>> +    cnt = srtcm_policer_run_single_packet(&policer->egress_meter,
>>> +                                          &policer->egress_prof, pkts,
>>> +                                          pkt_cnt, should_steal);
>>>       return cnt;
>>>   }
>>> @@ -4496,6 +4504,326 @@ static const struct dpdk_qos_ops egress_policer_ops = {
>>>       .qos_run = egress_policer_run
>>>   };
>>> +/* trtcm-policer details */
>>> +
>>> +struct trtcm_policer {
>>> +    struct qos_conf qos_conf;
>>> +    struct rte_meter_trtcm_rfc4115_params meter_params;
>>> +    struct rte_meter_trtcm_rfc4115_profile meter_profile;
>>> +    struct rte_meter_trtcm_rfc4115 meter;
>>> +    struct netdev_queue_stats stats;
>>> +    struct hmap queues;
>>> +};
>>> +
>>> +struct trtcm_policer_queue {
>>> +    struct hmap_node hmap_node;
>>> +    uint32_t queue_id;
>>> +    struct rte_meter_trtcm_rfc4115_params meter_params;
>>> +    struct rte_meter_trtcm_rfc4115_profile meter_profile;
>>> +    struct rte_meter_trtcm_rfc4115 meter;
>>> +    struct netdev_queue_stats stats;
>>> +};
>>> +
>>> +static void
>>> +trtcm_policer_details_to_param(const struct smap *details,
>>> +                               struct rte_meter_trtcm_rfc4115_params *params)
>>> +{
>>> +    memset(params, 0, sizeof *params);
>>> +    params->cir = smap_get_ullong(details, "cir", 0);
>>> +    params->eir = smap_get_ullong(details, "eir", 0);
>>> +    params->cbs = smap_get_ullong(details, "cbs", 0);
>>> +    params->ebs = smap_get_ullong(details, "ebs", 0);
>>> +}
>>> +
>>> +static void
>>> +trtcm_policer_param_to_detail(
>>> +    const struct rte_meter_trtcm_rfc4115_params *params,
>>> +    struct smap *details)
>>> +{
>>> +    smap_add_format(details, "cir", "%"PRIu64, params->cir);
>>> +    smap_add_format(details, "eir", "%"PRIu64, params->eir);
>>> +    smap_add_format(details, "cbs", "%"PRIu64, params->cbs);
>>> +    smap_add_format(details, "ebs", "%"PRIu64, params->ebs);
>>> +}
>>> +
>>> +
>>> +static int
>>> +trtcm_policer_qos_construct(const struct smap *details,
>>> +                            struct qos_conf **conf)
>>> +{
>>> +    struct trtcm_policer *policer;
>>> +    int err = 0;
>>> +
>>> +    policer = xmalloc(sizeof *policer);
>>> +    qos_conf_init(&policer->qos_conf, &trtcm_policer_ops);
>>> +    trtcm_policer_details_to_param(details, &policer->meter_params);
>>> +    err = rte_meter_trtcm_rfc4115_profile_config(&policer->meter_profile,
>>> +                                                 &policer->meter_params);
>>> +    if (!err) {
>>> +        err = rte_meter_trtcm_rfc4115_config(&policer->meter,
>>> +                                             &policer->meter_profile);
>>> +    }
>>> +
>>> +    if (!err) {
>>> +        *conf = &policer->qos_conf;
>>> +        memset(&policer->stats, 0, sizeof policer->stats);
>>> +        hmap_init(&policer->queues);
>>> +    } else {
>>> +        free(policer);
>>> +        *conf = NULL;
>>> +        err = -err;
>>> +    }
>>> +
>>> +    return err;
>>> +}
>>> +
>>> +static void
>>> +trtcm_policer_qos_destruct(struct qos_conf *conf)
>>> +{
>>> +    struct trtcm_policer_queue *queue, *next_queue;
>>> +    struct trtcm_policer *policer = CONTAINER_OF(conf, struct trtcm_policer,
>>> +                                                 qos_conf);
>>> +
>>> +    HMAP_FOR_EACH_SAFE (queue, next_queue, hmap_node, &policer->queues) {
>>> +        hmap_remove(&policer->queues, &queue->hmap_node);
>>> +        free(queue);
>>> +    }
>>> +    hmap_destroy(&policer->queues);
>>> +    free(policer);
>>> +}
>>> +
>>> +static int
>>> +trtcm_policer_qos_get(const struct qos_conf *conf, struct smap *details)
>>> +{
>>> +    struct trtcm_policer *policer = CONTAINER_OF(conf, struct trtcm_policer,
>>> +                                                 qos_conf);
>>> +
>>> +    trtcm_policer_param_to_detail(&policer->meter_params, details);
>>> +    return 0;
>>> +}
>>> +
>>> +static bool
>>> +trtcm_policer_qos_is_equal(const struct qos_conf *conf,
>>> +                           const struct smap *details)
>>> +{
>>> +    struct trtcm_policer *policer = CONTAINER_OF(conf, struct trtcm_policer,
>>> +                                                 qos_conf);
>>> +    struct rte_meter_trtcm_rfc4115_params params;
>>> +
>>> +    trtcm_policer_details_to_param(details, &params);
>>> +
>>> +    return !memcmp(&params, &policer->meter_params, sizeof params);
>>> +}
>>> +
>>> +static struct trtcm_policer_queue *
>>> +trtcm_policer_qos_find_queue(struct trtcm_policer *policer, uint32_t queue_id)
>>> +{
>>> +    struct trtcm_policer_queue *queue;
>>> +    HMAP_FOR_EACH_WITH_HASH (queue, hmap_node, hash_2words(queue_id, 0),
>>> +                             &policer->queues) {
>>> +        if (queue->queue_id == queue_id) {
>>> +            return queue;
>>> +        }
>>> +    }
>>> +    return NULL;
>>> +}
>>> +
>>> +static inline bool
>>> +trtcm_policer_run_single_packet(struct trtcm_policer *policer,
>>> +                                struct rte_mbuf *pkt, uint64_t time)
>>> +{
>>> +    enum rte_color pkt_color;
>>> +    struct trtcm_policer_queue *queue;
>>> +    uint32_t pkt_len = rte_pktmbuf_pkt_len(pkt) - sizeof(struct rte_ether_hdr);
>>> +    struct dp_packet *dpkt = CONTAINER_OF(pkt, struct dp_packet, mbuf);
>>> +
>>> +    queue = trtcm_policer_qos_find_queue(policer, dpkt->md.skb_priority);
>>> +    if (!queue) {
>>> +        /* If no queue is found, use the default queue, which MUST exist. */
>>> +        queue = trtcm_policer_qos_find_queue(policer, 0);
>>> +        if (!queue) {
>>> +            return false;
>>> +        }
>>> +    }
>>> +
>>> +    pkt_color = rte_meter_trtcm_rfc4115_color_blind_check(&queue->meter,
>
> A few times during testing I have seen OVS crash with the following
>
> ./launch_vswitch.sh: line 66: 11694 Floating point exception sudo $OVS_DIR/vswitchd/ovs-vswitchd unix:$DB_SOCK --pidfile
>
> Looking into it with GDB it seems related to the
> rte_meter_trtcm_rfc4115_color_blind_check above. See GDB output below.
>
> Thread 12 "pmd-c03/id:9" received signal SIGFPE, Arithmetic exception.
> [Switching to Thread 0x7f3dce734700 (LWP 26465)]
> 0x0000000000d4b92d in rte_meter_trtcm_rfc4115_color_blind_check 
> (m=0x328e178, p=0x328e148, time=29107058565136113, pkt_len=46) at 
> /opt/istokes/dpdk-19.11//x86_64-native-linuxapp-gcc/include/rte_meter.h:599
> 599             n_periods_te = time_diff_te / p->eir_period;
> (gdb) bt
> #0  0x0000000000d4b92d in rte_meter_trtcm_rfc4115_color_blind_check 
> (m=0x328e178, p=0x328e148, time=29107058565136113, pkt_len=46) at 
> /opt/istokes/dpdk-19.11//x86_64-native-linuxapp-gcc/include/rte_meter.h:599
> #1  0x0000000000d4abc1 in trtcm_policer_run_single_packet 
> (policer=0x2774200, pkt=0x1508fbd40, time=29107058565136113) at 
> lib/netdev-dpdk.c:4649
> #2  0x0000000000d4ad24 in trtcm_policer_run (conf=0x2774200, 
> pkts=0x7f3db8005100, pkt_cnt=32, should_steal=true) at 
> lib/netdev-dpdk.c:4691
> #3  0x0000000000d45299 in netdev_dpdk_qos_run (dev=0x17fd68840, 
> pkts=0x7f3db8005100, cnt=32, should_steal=true) at 
> lib/netdev-dpdk.c:2421
> #4  0x0000000000d45db0 in netdev_dpdk_send__ (dev=0x17fd68840, qid=1, 
> batch=0x7f3db80050f0, concurrent_txq=false) at lib/netdev-dpdk.c:2683
> #5  0x0000000000d45ee9 in netdev_dpdk_eth_send (netdev=0x17fd688c0, 
> qid=1, batch=0x7f3db80050f0, concurrent_txq=false) at 
> lib/netdev-dpdk.c:2710
> #6  0x0000000000c342ba in netdev_send (netdev=0x17fd688c0, qid=1, 
> batch=0x7f3db80050f0, concurrent_txq=false) at lib/netdev.c:814
> #7  0x0000000000beb3de in dp_netdev_pmd_flush_output_on_port 
> (pmd=0x7f3dce735010, p=0x7f3db80050c0) at lib/dpif-netdev.c:4224
> #8  0x0000000000beb5c4 in dp_netdev_pmd_flush_output_packets 
> (pmd=0x7f3dce735010, force=false) at lib/dpif-netdev.c:4264
> #9  0x0000000000beb814 in dp_netdev_process_rxq_port 
> (pmd=0x7f3dce735010, rxq=0x328d930, port_no=2) at 
> lib/dpif-netdev.c:4319
> #10 0x0000000000bef432 in pmd_thread_main (f_=0x7f3dce735010) at 
> lib/dpif-netdev.c:5556
> #11 0x0000000000cb24e5 in ovsthread_wrapper (aux_=0x326d220) at 
> lib/ovs-thread.c:383
> #12 0x00007f3de11d236d in start_thread (arg=0x7f3dce734700) at 
> pthread_create.c:456
> #13 0x00007f3de06bab4f in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
>
> Regards
> Ian
>>> +                                                          &queue->meter_profile,
>>> +                                                          time,
>>> +                                                          pkt_len);
>>> +
>>> +    if (pkt_color == RTE_COLOR_RED) {
>>> +        queue->stats.tx_errors++;
>>> +    } else {
>>> +        queue->stats.tx_bytes += pkt_len;
>>> +        queue->stats.tx_packets++;
>>> +    }
>>> +
>>> +    pkt_color = rte_meter_trtcm_rfc4115_color_aware_check(&policer->meter,
>>> +                                                          &policer->meter_profile,
>>> +                                                          time, pkt_len,
>>> +                                                          pkt_color);
>>> +
>>> +    if (pkt_color == RTE_COLOR_RED) {
>>> +        policer->stats.tx_errors++;
>>> +        return false;
>>> +    }
>>> +
>>> +    policer->stats.tx_bytes += pkt_len;
>>> +    policer->stats.tx_packets++;
>>> +    return true;
>>> +}
>>> +
>>> +static int
>>> +trtcm_policer_run(struct qos_conf *conf, struct rte_mbuf **pkts, int pkt_cnt,
>>> +                  bool should_steal)
>>> +{
>>> +    int i = 0;
>>> +    int cnt = 0;
>>> +    struct rte_mbuf *pkt = NULL;
>>> +    uint64_t current_time = rte_rdtsc();
>>> +
>>> +    struct trtcm_policer *policer = CONTAINER_OF(conf, struct trtcm_policer,
>>> +                                                 qos_conf);
>>> +
>>> +    for (i = 0; i < pkt_cnt; i++) {
>>> +        pkt = pkts[i];
>>> +
>>> +        if (trtcm_policer_run_single_packet(policer, pkt, current_time)) {
>>> +            if (cnt != i) {
>>> +                pkts[cnt] = pkt;
>>> +            }
>>> +            cnt++;
>>> +        } else {
>>> +            if (should_steal) {
>>> +                rte_pktmbuf_free(pkt);
>>> +            }
>>> +        }
>>> +    }
>>> +    return cnt;
>>> +}
>>> +
>>> +static int
>>> +trtcm_policer_qos_queue_construct(const struct smap *details,
>>> +                                  uint32_t queue_id, struct qos_conf *conf)
>>> +{
>>> +    int err = 0;
>>> +    struct trtcm_policer_queue *queue;
>>> +    struct trtcm_policer *policer = CONTAINER_OF(conf, struct trtcm_policer,
>>> +                                                 qos_conf);
>>> +
>>> +    queue = trtcm_policer_qos_find_queue(policer, queue_id);
>>> +    if (!queue) {
>>> +        queue = xmalloc(sizeof *queue);
>>> +        queue->queue_id = queue_id;
>>> +        memset(&queue->stats, 0, sizeof queue->stats);
>>> +        queue->stats.created = time_msec();
>>> +        hmap_insert(&policer->queues, &queue->hmap_node,
>>> +                    hash_2words(queue_id, 0));
>>> +    }
>>> +    if (queue_id == 0 && smap_is_empty(details)) {
>>> +        /* No default queue configured, use port values */
>>> +        memcpy(&queue->meter_params, &policer->meter_params,
>>> +               sizeof queue->meter_params);
>>> +    } else {
>>> +        trtcm_policer_details_to_param(details, &queue->meter_params);
>>> +    }
>>> +
>>> +    err = rte_meter_trtcm_rfc4115_profile_config(&queue->meter_profile,
>>> +                                                 &queue->meter_params);
>>> +
>>> +    if (!err) {
>>> +        err = rte_meter_trtcm_rfc4115_config(&queue->meter,
>>> +                                             &queue->meter_profile);
>>> +    }
>>> +    if (err) {
>>> +        hmap_remove(&policer->queues, &queue->hmap_node);
>>> +        free(queue);
>>> +        err = -err;
>>> +    }
>>> +    return err;
>>> +}
>>> +
>>> +static void
>>> +trtcm_policer_qos_queue_destruct(struct qos_conf *conf, uint32_t queue_id)
>>> +{
>>> +    struct trtcm_policer_queue *queue;
>>> +    struct trtcm_policer *policer = CONTAINER_OF(conf, struct trtcm_policer,
>>> +                                                 qos_conf);
>>> +
>>> +    queue = trtcm_policer_qos_find_queue(policer, queue_id);
>>> +    if (queue) {
>>> +        hmap_remove(&policer->queues, &queue->hmap_node);
>>> +        free(queue);
>>> +    }
>>> +}
>>> +
>>> +static int
>>> +trtcm_policer_qos_queue_get(struct smap *details, uint32_t queue_id,
>>> +                            const struct qos_conf *conf)
>>> +{
>>> +    struct trtcm_policer_queue *queue;
>>> +    struct trtcm_policer *policer = CONTAINER_OF(conf, struct trtcm_policer,
>>> +                                                 qos_conf);
>>> +
>>> +    queue = trtcm_policer_qos_find_queue(policer, queue_id);
>>> +    if (!queue) {
>>> +        return EINVAL;
>>> +    }
>>> +
>>> +    trtcm_policer_param_to_detail(&queue->meter_params, details);
>>> +    return 0;
>>> +}
>>> +
>>> +static int
>>> +trtcm_policer_qos_queue_get_stats(const struct qos_conf *conf,
>>> +                                  uint32_t queue_id,
>>> +                                  struct netdev_queue_stats *stats)
>>> +{
>>> +    struct trtcm_policer_queue *queue;
>>> +    struct trtcm_policer *policer = CONTAINER_OF(conf, struct trtcm_policer,
>>> +                                                 qos_conf);
>>> +
>>> +    queue = trtcm_policer_qos_find_queue(policer, queue_id);
>>> +    if (!queue) {
>>> +        return EINVAL;
>>> +    }
>>> +    memcpy(stats, &queue->stats, sizeof *stats);
>>> +    return 0;
>>> +}
>>> +
>>> +static int
>>> +trtcm_policer_qos_queue_dump_state_init(const struct qos_conf *conf,
>>> +                                        struct netdev_dpdk_queue_state *state)
>>> +{
>>> +    uint32_t i = 0;
>>> +    struct trtcm_policer_queue *queue;
>>> +    struct trtcm_policer *policer = CONTAINER_OF(conf, struct trtcm_policer,
>>> +                                                 qos_conf);
>>> +
>>> +    state->n_queues = hmap_count(&policer->queues);
>>> +    state->cur_queue = 0;
>>> +    state->queues = xmalloc(state->n_queues * sizeof *state->queues);
>>> +
>>> +    HMAP_FOR_EACH (queue, hmap_node, &policer->queues) {
>>> +        state->queues[i++] = queue->queue_id;
>>> +    }
>>> +    return 0;
>>> +}
>>> +
>>> +static const struct dpdk_qos_ops trtcm_policer_ops = {
>>> +    .qos_name = "trtcm-policer",
>>> +    .qos_construct = trtcm_policer_qos_construct,
>>> +    .qos_destruct = trtcm_policer_qos_destruct,
>>> +    .qos_get = trtcm_policer_qos_get,
>>> +    .qos_is_equal = trtcm_policer_qos_is_equal,
>>> +    .qos_run = trtcm_policer_run,
>>> +    .qos_queue_construct = trtcm_policer_qos_queue_construct,
>>> +    .qos_queue_destruct = trtcm_policer_qos_queue_destruct,
>>> +    .qos_queue_get = trtcm_policer_qos_queue_get,
>>> +    .qos_queue_get_stats = trtcm_policer_qos_queue_get_stats,
>>> +    .qos_queue_dump_state_init = trtcm_policer_qos_queue_dump_state_init
>>> +};
>>> +
>>>   static int
>>>   netdev_dpdk_reconfigure(struct netdev *netdev)
>>>   {
>>> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
>>> index 0ec726c..c43cb1a 100644
>>> --- a/vswitchd/vswitch.xml
>>> +++ b/vswitchd/vswitch.xml
>>> @@ -4441,6 +4441,19 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \
>>>             in performance will be noticed in terms of overall aggregate traffic
>>>             throughput.
>>>           </dd>
>>> +        <dt><code>trtcm-policer</code></dt>
>>> +        <dd>
>>> +          A DPDK egress policer algorithm using RFC 4115's Two-Rate,
>>> +          Three-Color marker. It's a two-level hierarchical policer
>>> +          which first does a color-blind marking of the traffic at the queue
>>> +          level, followed by a color-aware marking at the port level. At the
>>> +          end traffic marked as Green or Yellow is forwarded, Red is dropped.
>>> +          For details on how traffic is marked, see RFC 4115.
>>> +
>>> +          If the ``default queue'', 0, is not configured it's automatically
>>> +          created with the same <code>other_config</code> values as the
>>> +          physical port.
>>> +        </dd>
>>>         </dl>
>>>       </column>
>>> @@ -4508,6 +4521,27 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \
>>>           bytes/tokens of the packet. If there are not enough tokens in the cbs
>>>           bucket the packet will be dropped.
>>>         </column>
>>> +      <column name="other_config" key="eir" type='{"type": "integer"}'>
>>> +        The Excess Information Rate (EIR) is measured in bytes of IP
>>> +        packets per second, i.e. it includes the IP header, but not link
>>> +        specific (e.g. Ethernet) headers. This represents the bytes per second
>>> +        rate at which the token bucket will be updated. The eir value is
>>> +        calculated by (pps x packet data size). For example, assuming a user
>>> +        wishes to limit a stream consisting of 64 byte packets to 1 million
>>> +        packets per second, the EIR would be set to 46000000. This value
>>> +        can be broken into '1,000,000 x 46', where 1,000,000 is the policing
>>> +        rate for the number of packets per second and 46 represents the size
>>> +        of the packet data for a 64 byte IP packet.
>>> +      </column>
>>> +      <column name="other_config" key="ebs" type='{"type": "integer"}'>
>>> +        The Excess Burst Size (EBS) is measured in bytes and represents a
>>> +        token bucket. At a minimum this value should be set to the expected
>>> +        largest size packet in the traffic stream. In practice larger values
>>> +        may be used to increase the size of the token bucket. If a packet can
>>> +        be transmitted then the ebs will be decremented by the number of
>>> +        bytes/tokens of the packet. If there are not enough tokens in the ebs
>>> +        bucket the packet might be dropped.
>>> +      </column>
>>>       </group>
>>>       <group title="Configuration for linux-sfq">
>>>
>> _______________________________________________
>> dev mailing list
>> dev at openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev


