[ovs-dev] [PATCH] ofproto-dpif-upcall: Fix flow setup/delete race.

Paul Blakey paulb at mellanox.com
Wed Mar 29 06:59:53 UTC 2017



On 27/03/2017 22:55, Joe Stringer wrote:
> On 21 March 2017 at 02:44, Paul Blakey <paulb at mellanox.com> wrote:
>>
>>
>> On 21/03/2017 10:04, Paul Blakey wrote:
>>>
>>>
>>>
>>> On 20/03/2017 23:08, Joe Stringer wrote:
>>>>
>>>> If a handler thread takes a long time to set up a set of flows, it is
>>>> possible for one of the installed flows to be dumped and scheduled
>>>> for deletion by a revalidator thread before the handler is able to
>>>> transition the ukey into an operational state---Between the
>>>> dpif_operate() above this function and the ukey lock / transition logic
>>>> modified by this patch.
>>>>
>>>> Only transition the ukey for the flow if it wasn't already transitioned
>>>> to a later state by a revalidator thread.
>>>>
>>>> Fixes: 54ebeff4c03d ("upcall: Track ukey states.")
>>>> Reported-by: Paul Blakey <paulb at mellanox.com>
>>>> Signed-off-by: Joe Stringer <joe at ovn.org>
>>>> ---
>>>>  ofproto/ofproto-dpif-upcall.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/ofproto/ofproto-dpif-upcall.c
>>>> b/ofproto/ofproto-dpif-upcall.c
>>>> index 07086ee385cc..0854807e4482 100644
>>>> --- a/ofproto/ofproto-dpif-upcall.c
>>>> +++ b/ofproto/ofproto-dpif-upcall.c
>>>> @@ -1404,7 +1404,7 @@ handle_upcalls(struct udpif *udpif, struct
>>>> upcall *upcalls,
>>>>              ovs_mutex_lock(&ukey->mutex);
>>>>              if (ops[i].dop.error) {
>>>>                  transition_ukey(ukey, UKEY_EVICTED);
>>>> -            } else {
>>>> +            } else if (ukey->state < UKEY_OPERATIONAL) {
>>>>                  transition_ukey(ukey, UKEY_OPERATIONAL);
>>>>              }
>>>>              ovs_mutex_unlock(&ukey->mutex);
>>>>
>>>
>>> Hi Joe,
>>> As per other thread, I think there is a trouble locking the mutex in
>>> case there is no error, as the flow is installed it can be removed
>>> entirely by revalidator's revalidate/sweep:
>>>
>>> Here is the timing:
>>>
>>> Handler installs the flow
>>> Handler thread is scheduled out before trying to lock the mutex
>>> Revalidators revalidate dump the installed flow and decide to evict it
>>> Revalidators evict the flow (push_dp_ops in revalidate)
>>> Revalidator sweep delete the ukey ukey_delete(umap, ukey);
>>> Revalidator calling ukey_delete postpones ukey_delete__ to free the ukey
>>> and mutex
>>> Handler thread is scheduled again and tries to acquire the freed lock.
>>>
>>> Is this correct?
>>> If so, and I didn't miss something, I've suggested a alternate patch in
>>> the other thread (only lock the mutex and transition to EVICTED in case
>>> of an ops[i] error, leave OEPRATIONAL for dump)
>>>
>>
>> Hi,
>> The above timing/scheduling can't happen as the actual freeing of the ukey
>> (ukey_delete__) is postponed after the handler returns from handle_upcalls.
>> I've missed that with the other thread's patch because it used xsleep as you
>> said which calls ovsrcu_quiesce_start. With sleep instead it seems to be
>> postponed to after poll_block in the handler thread runs (and we don't have
>> references there anymore).
>>
>> I don't know how the handler thread actually signals it doesn't have any
>> references to the ukey anymore, can you expand on that mechanism?
>
> Hi Paul,
>
> Thanks for testing this out. I've been a bit busy last week so didn't
> get a chance to respond.
>
> The handler thread will not quiesce while installing a flow, so it is
> guaranteed to be within the same RCU "locked" period. Until it
> quiesces, any RCU-protected structures must not be freed.


It adds the
> ukey into the cmap which makes it available to be read from other
> threads, and retains a reference to it. The revalidator will dump the
> flow, lookup the entry and find it, then apparently decide to delete
> the ukey. It removes the ukey from the cmap right now, but it is not
> allowed to actually free it until the next RCU grace period. The
> revalidator thread may continue on and reach an RCU grace period,
> quiesce, and then from that thread's perspective it should be fine to
> free the ukey. However, all threads must quiesce before any of the
> memory currently referenced can be freed. So the handler thread must
> finish dealing with this upcall then quiesce before the ukey will be
> freed. Finally, once the handler quiesces, both threads have reached
> the end of that RCU "locked" period and the RCU thread may go through
> and clean up the ukey.
>
> I hope that clears things up?
>
> Cheers,
> Joe
>

Hi,
Yes it does, but where exactly does the handler thread enters a quiesce 
state after installing the flow? (Doesn't it have to call 
ovsrcu_quiesce_start? is the thread entering a sleeping state enough?)

Thanks.





More information about the dev mailing list