[ovs-dev] dpif-netdev: Fix improper use of CMAP_FOR_EACH.

Ilya Maximets i.maximets at samsung.com
Thu Jan 28 06:26:37 UTC 2016


On 28.01.2016 04:47, Daniele Di Proietto wrote:
> 
> 
> On 27/01/2016 01:34, "Ilya Maximets" <i.maximets at samsung.com> wrote:
> 
>> Thanks for this patch.
>> I have seen this bug while testing my patches for rx queues, but I thought
>> that it was the problem in my specific testing system.
>>
>> It was easily reproducible. Now it's gone (with this patch applied).
>> Thanks, that you figured out the origin of the problem.
> 
> Yeah, I noticed that too while testing that series, but this bug is
> pre-existing and unrelated :-)
> 
>>
>> One comment inline.
>>
>> Other than this:
>> Tested-by: Ilya Maximets <i.maximets at samsung.com>
> 
> Thanks for testing this
> 
>>
>> On 27.01.2016 08:13, Daniele Di Proietto wrote:
>>> It is ok to iterate a cmap with CMAP_FOR_EACH and remove elements with
>>> cmap_remove(), but having quiescent states inside the loop might create
>>> problems, since some of the postponed cleanup done inside the cmap might
>>> be executed, leaving the iterator in an inconsisted state.
>>>
>>> We had several of these errors in dpif-netdev, because when we rearrange
>>> ports or threads we ofter need to wait on a condition variable (which
>>> implies a quiescent state).
>>>
>>> This problem caused iterations to skip elements or to list them twice,
>>> resulting in the main thread waiting on a condition without anyone else
>>> to signal.
>>>
>>> Fix these cases by moving the possible quiescent states outside
>>> CMAP_FOR_EACH loops.
>>>
>>> Signed-off-by: Daniele Di Proietto <diproiettod at vmware.com>
>>> ---
>>>  lib/dpif-netdev.c | 32 +++++++++++++++++++++++++++++---
>>>  1 file changed, 29 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>>> index ad6b202..1ec04d5 100644
>>> --- a/lib/dpif-netdev.c
>>> +++ b/lib/dpif-netdev.c
>>> @@ -933,6 +933,7 @@ dp_netdev_free(struct dp_netdev *dp)
>>>  
>>>      ovs_mutex_lock(&dp->port_mutex);
>>>      CMAP_FOR_EACH (port, node, &dp->ports) {
>>> +        /* PMD threads are destroyed here. do_del_port() cannot
>>> quiesce */
>>
>> This comment is wrong. There is a path to quiescent state in do_del_port:
>> do_del_port() -> dp_netdev_reload_pmd__() -> ovs_mutex_cond_wait() ->
>> 	-> ovsrcu_quiesce_start().
>> May be, this also should be fixed.
> 
> The pmd threads have been destroyed a few lines above, so this particular
> do_del_port() will never call dp_netdev_reload_pmd__().

Oh. Sorry, missed that.

Acked-by: Ilya Maximets <i.maximets at samsung.com>



More information about the dev mailing list