[ovs-dev] [PATCH 0/3] Output packet batching.

Bodireddy, Bhanuprakash bhanuprakash.bodireddy at intel.com
Mon Jul 3 10:15:28 UTC 2017


>This patch-set is inspired by [1] from Bhanuprakash Bodireddy.
>The implementation of [1] looks very complex and introduces many pitfalls for
>later code modifications, such as packets getting stuck in the queues.
>
>This version aims to implement simple and flexible output packet batching at a
>higher level, without touching the netdev layer (and even simplifying it).

I haven't tested the patches yet. In this series the batching is done at the dpif layer, whereas in [1] it is done in the netdev layer.
In [1], batching was implemented by introducing an intermediate queue in the netdev layer and had some added
complexity due to XPS.

However, I think [1] is more flexible and can easily be tweaked to suit different use cases, and some of its
APIs could potentially be consumed by future implementations.

1. Why is [1] more flexible?

A PMD thread polls the rxq[s] mapped to it and, after classification, transmits the packets on the tx ports.
For optimum performance, we need to queue packets and burst the maximum number of them in one go, to mitigate the per-transmit (MMIO) cost.
As it stands now (on master), we end up transmitting fewer packets per burst because of the current instant-send logic.

A bit of background on this patch series:
In *v1* of [1], we added an intermediate queue and tried flushing the packets once every 1024 PMD polling
cycles, as below.

----------------------pmd_thread_main()--------------------------------------------------
pmd_thread_main(void *f_) {

    /* Poll each rxq mapped to this PMD thread. */
    for (i = 0; i < poll_cnt; i++) {
        dp_netdev_process_rxq_port(pmd, poll_list[i].rx, poll_list[i].port_no);
    }

    /* Periodically drain the intermediate txq[s] of all send ports. */
    if (lc++ > 1024) {
        if ((now - prev) > DRAIN_TSC) {
            HMAP_FOR_EACH (tx_port, node, &pmd->send_port_cache) {
                dp_netdev_flush_txq_port(pmd, tx_port->port, now);
            }
        }
    }
..
}
---------------------------------------------------------------------------------------------------

Pros:
-  The idea behind bursting the packets once every 1024 polling cycles (lc > 1024), instead of per rxq port
    processed, was to queue more packets in the respective txq[s] and burst them, greatly improving throughput.

Cons:
-  Increases latency, as flushing happens only once every 1024 polling cycles.

Minimizing latency:
   To minimize latency, 'INTERIM_QUEUE_BURST_THRESHOLD' was introduced; it can be tuned
   based on the use case (throughput hungry vs. latency sensitive). In fact, we also published numbers with
   the burst threshold set to 16 and flushing triggered every 1024 polling cycles. This was done to
   allow users to tune the thresholds for their respective use cases. A rough sketch of the idea is below.
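For illustration only, here is a minimal sketch of the intermediate-queue idea (the structure and function
names, and the use of rte_eth_tx_burst() directly, are assumptions for the sketch, not the patch's exact code;
the real patch also retries before dropping unsent packets):

----------------------intermediate queue sketch (illustrative)------------------------------------
#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define INTERIM_QUEUE_BURST_THRESHOLD 16   /* tunable: throughput vs. latency */

/* Hypothetical per-txq intermediate queue. */
struct txq_intq {
    struct rte_mbuf *pkts[INTERIM_QUEUE_BURST_THRESHOLD];
    int count;                             /* packets currently queued */
};

/* Burst out everything queued so far; drop what the NIC cannot accept. */
static void
txq_intq_flush(struct txq_intq *q, uint16_t port_id, uint16_t queue_id)
{
    uint16_t sent = rte_eth_tx_burst(port_id, queue_id, q->pkts, q->count);

    while (sent < q->count) {
        rte_pktmbuf_free(q->pkts[sent++]);
    }
    q->count = 0;
}

/* Queue a packet and burst immediately once the threshold is reached, so the
 * worst-case queueing latency is bounded by the threshold rather than by the
 * 1024-iteration flush above. */
static void
txq_intq_queue(struct txq_intq *q, uint16_t port_id, uint16_t queue_id,
               struct rte_mbuf *pkt)
{
    q->pkts[q->count++] = pkt;
    if (q->count >= INTERIM_QUEUE_BURST_THRESHOLD) {
        txq_intq_flush(q, port_id, queue_id);
    }
}
---------------------------------------------------------------------------------------------------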

However, in *v3* of [1] the flushing was performed per rxq processed in order to get the patches accepted,
as latency had been raised as the primary concern.

2. Why flush APIs in the netdev layer?

The flush APIs were introduced for DPDK and vhost-user ports because they can be consumed in the future
by other implementations (e.g. QoS priority queues).

Also, the current queueing and flushing logic can be abstracted using the DPDK APIs rte_eth_tx_buffer() and
rte_eth_tx_buffer_flush(), further simplifying the code a lot. A sketch of what that could look like is below.
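A minimal sketch using DPDK's tx buffering API (the helper names, TXQ_BURST size and the single global
buffer are assumptions for the sketch; in practice there would be one buffer per (port, txq) pair, and the
default error callback simply drops packets the NIC could not accept):

----------------------rte_eth_tx_buffer sketch (illustrative)-------------------------------------
#include <rte_ethdev.h>
#include <rte_malloc.h>
#include <rte_mbuf.h>

#define TXQ_BURST 32                       /* illustrative buffer size */

static struct rte_eth_dev_tx_buffer *tx_buffer;

/* One-time setup: allocate and initialise the tx buffer. */
static int
txq_buffer_setup(int socket_id)
{
    tx_buffer = rte_zmalloc_socket("ovs_tx_buffer",
                                   RTE_ETH_TX_BUFFER_SIZE(TXQ_BURST),
                                   0, socket_id);
    if (!tx_buffer) {
        return -1;
    }
    return rte_eth_tx_buffer_init(tx_buffer, TXQ_BURST);
}

/* Fast path: queue one packet. The API transparently bursts the buffer
 * out once TXQ_BURST packets have accumulated. */
static inline void
txq_buffer_send(uint16_t port_id, uint16_t queue_id, struct rte_mbuf *pkt)
{
    rte_eth_tx_buffer(port_id, queue_id, tx_buffer, pkt);
}

/* Explicit flush, e.g. invoked from the PMD main loop so packets do not
 * linger in the buffer when traffic is light. */
static inline void
txq_buffer_flush(uint16_t port_id, uint16_t queue_id)
{
    rte_eth_tx_buffer_flush(port_id, queue_id, tx_buffer);
}
---------------------------------------------------------------------------------------------------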

We were also targeting some optimizations for the future, such as using the rte_* functions for buffering and
flushing, and introducing timer-based flushing logic that invokes a flush in a timely manner instead of per
rxq port, thereby striking a balance between throughput and latency.
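The timer-based part could look roughly like this (illustrative only: tx_flush_due() and FLUSH_INTERVAL_US
are hypothetical names; the point is simply that a flush fires once packets have waited long enough,
independently of how many rxq ports were processed in between):

----------------------timer-based flush sketch (illustrative)-------------------------------------
#include <stdbool.h>
#include <stdint.h>
#include <rte_cycles.h>

#define FLUSH_INTERVAL_US 50ULL   /* illustrative upper bound on added latency */

/* Returns true when enough time has passed since the previous flush that
 * the PMD loop should drain the queued tx packets. */
static inline bool
tx_flush_due(uint64_t *last_flush_tsc)
{
    uint64_t now = rte_get_tsc_cycles();
    uint64_t interval = rte_get_tsc_hz() * FLUSH_INTERVAL_US / 1000000ULL;

    if (now - *last_flush_tsc >= interval) {
        *last_flush_tsc = now;
        return true;
    }
    return false;
}
---------------------------------------------------------------------------------------------------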

>
>The patch set consists of 3 patches. All the functionality is introduced in the first
>patch. The other two are just cleanups of netdevs so that they don't do unnecessary things.

The cleanup of code in netdevs is good.

- Bhanuprakash.


