[ovs-dev] [PATCH] dpif-netdev: proper tx queue id

Ilya Maximets i.maximets at samsung.com
Thu Aug 6 10:41:30 UTC 2015



On 05.08.2015 20:04, Daniele Di Proietto wrote:
> 
> 
> On 05/08/2015 17:47, "Ilya Maximets" <i.maximets at samsung.com> wrote:
> 
>>
>>
>> On 05.08.2015 19:26, Daniele Di Proietto wrote:
>>>
>>>
>>> On 05/08/2015 16:42, "Ilya Maximets" <i.maximets at samsung.com> wrote:
>>>
>>>> Sorry, I agree that the example is incorrect. It doesn't actually
>>>> hold, because ovs_numa_get_n_cores() is used in the call to
>>>> netdev_set_multiq().
>>>> No, I didn't actually observe a bug.
>>>>
>>>> But there is another example:
>>>>
>>>> 	the same configuration (2 pmd threads with 1 port,
>>>> 	2 rxqs per port and pmd-cpu-mask = 00000014).
>>>>
>>>> 	pmd_1->tx_qid = 2, pmd_2->tx_qid = 4,
>>>> 	txq_needs_locking = true (if the device has fewer than
>>>> 	ovs_numa_get_n_cores() queues).
>>>>
>>>> 	Let netdev->real_n_txq = 2 (the device has 2 queues).
>>>>
>>>> 	In that case, after the truncation in netdev_dpdk_send__()
>>>> 		'qid = qid % dev->real_n_txq;'
>>>> 	pmd_1: qid = 2 % 2 = 0
>>>> 	pmd_2: qid = 4 % 2 = 0
>>>>
>>>> 	So, both threads will call dpdk_queue_pkts() with the same qid = 0.
>>>> 	This is unexpected behavior if there are 2 tx queues in the device:
>>>> 	queue #1 will not be used and both threads will lock queue #0
>>>> 	on each send.
>>>
>>> Yes, that is true.  In general it is hard to properly distribute
>>> transmission queues because potentially every pmd thread can send
>>> a packet to any netdev.
>>>
>>> I agree that we can do better than this. Currently we create a txq
>>> for every core (not for every pmd thread), because we don't want
>>> to reconfigure every device when a new thread is added (I remember
>>> discussing this with Alex, perhaps he can provide more insight).
>>> We can always change the code to create one txq for every pmd
>>> thread (I'd appreciate other opinions on this).
>>>
>>>>
>>>> About your example:
>>>> 	2 pmd threads can't call netdev_send() with the same tx_qid,
>>>> 	because pmd->tx_qid = pmd->core_id and there is only one thread
>>>> 	with any given core_id. See dp_netdev_configure_pmd().
>>>>
>>>> 	So,
>>>> 	pmd1 will call netdev_send(netdev=dpdk0, tx_qid= *pmd1->core_id* )
>>>> 	pmd2 will call netdev_send(netdev=dpdk0, tx_qid= *pmd2->core_id* )
>>>
>>> I agree, on current master they can't. I meant with this patch applied.
>>>
>>
>> Yes, with the patch applied calls with the same tx_qid can happen
>> concurrently, but there is rte_spinlock_lock(&dev->tx_q[qid].tx_lock)
>> for that case inside netdev_dpdk_send__(). Other netdevs don't use
>> the qid at all.
> 
> The lock is used only if the device doesn't have enough transmission
> queues.
> If the device has enough txqs, the concurrent calls won't take a lock.
> In general, concurrent calls to netdev_send must use different txq_ids
> (https://github.com/openvswitch/ovs/blob/master/lib/netdev.c#L678)

Yes, I understood. The problem is that in my patch tx_qid depends on the
total number of rx queues, which is bad. I'll work on it; the current
version shouldn't be applied.
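
For reference, the fallback path you describe is, roughly, this bit of
netdev_dpdk_send__() (a paraphrased sketch based on the behavior
discussed in this thread, not a verbatim copy of the code):

	/* Sketch: when the device has fewer real tx queues than the
	 * number of possible tx_qids, truncate the qid and serialize
	 * access to the shared queue with a spinlock. */
	if (OVS_UNLIKELY(dev->txq_needs_locking)) {
	    qid = qid % dev->real_n_txq;
	    rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
	}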
Idea 2: how about adding a cmap of tx queues to struct dp_netdev_pmd_thread,
holding all tx queues available to that pmd thread, with port_no as the
key (hash) for that cmap?
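
A rough sketch of that idea (the struct, its fields and the lookup helper
below are hypothetical, not existing code; only the cmap API and
hash_port_no() from dpif-netdev.c are real):

	/* Hypothetical per-pmd tx queue entry, kept in a cmap inside
	 * struct dp_netdev_pmd_thread and keyed by port number. */
	struct tx_port {
	    struct cmap_node node;   /* Node in pmd->tx_ports, by port_no. */
	    odp_port_t port_no;      /* Hash key. */
	    int tx_qid;              /* Queue id this pmd uses for the port. */
	};

	/* Find the tx queue this pmd thread should use for 'port_no',
	 * modeled on the existing dp->ports lookup. */
	static struct tx_port *
	pmd_tx_port_lookup(struct dp_netdev_pmd_thread *pmd, odp_port_t port_no)
	{
	    struct tx_port *tx;

	    CMAP_FOR_EACH_WITH_HASH (tx, node, hash_port_no(port_no),
	                             &pmd->tx_ports) {
	        if (tx->port_no == port_no) {
	            return tx;
	        }
	    }
	    return NULL;
	}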


About that example (recapping: both pmd threads, with tx_qid = 2 and
tx_qid = 4, end up on qid = 0 after 'qid = qid % dev->real_n_txq' on a
device with only 2 tx queues):

To fix that we could use pmd->index instead of core_id. Indexes are also
unique per pmd thread, but sequential, so they spread evenly across the
device's real tx queues.

Like this:

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 83e55e7..77bb4ab 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -2848,10 +2848,10 @@ dp_netdev_pmd_get_next(struct dp_netdev *dp, struct cmap_position *pos)
 }
 
 static int
-core_id_to_qid(unsigned core_id)
+index_to_qid(unsigned core_id, unsigned index)
 {
     if (core_id != NON_PMD_CORE_ID) {
-        return core_id;
+        return index;
     } else {
         return ovs_numa_get_n_cores();
     }
@@ -2865,7 +2865,7 @@ dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, struct dp_netdev *dp,
     pmd->dp = dp;
     pmd->index = index;
     pmd->core_id = core_id;
-    pmd->tx_qid = core_id_to_qid(core_id);
+    pmd->tx_qid = index_to_qid(core_id, index);
     pmd->numa_id = numa_id;
 
     ovs_refcount_init(&pmd->ref_cnt);
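
With this change, and assuming the two pmd threads from the example get
indexes 0 and 1, the truncation in netdev_dpdk_send__() no longer
collides:

	pmd_1: tx_qid = 0, qid = 0 % 2 = 0   -> queue #0
	pmd_2: tx_qid = 1, qid = 1 % 2 = 1   -> queue #1

Both tx queues are used, and the two threads no longer contend on the
same tx_lock.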

What do you think?


