[ovs-dev] [PATCH] ofproto-dpif-upcall: Fix flow setup/delete race.

Paul Blakey paulb at mellanox.com
Tue Mar 21 09:44:01 UTC 2017



On 21/03/2017 10:04, Paul Blakey wrote:
>
>
> On 20/03/2017 23:08, Joe Stringer wrote:
>> If a handler thread takes a long time to set up a set of flows, it is
>> possible for one of the installed flows to be dumped and scheduled
>> for deletion by a revalidator thread before the handler is able to
>> transition the ukey into an operational state---Between the
>> dpif_operate() above this function and the ukey lock / transition logic
>> modified by this patch.
>>
>> Only transition the ukey for the flow if it wasn't already transitioned
>> to a later state by a revalidator thread.
>>
>> Fixes: 54ebeff4c03d ("upcall: Track ukey states.")
>> Reported-by: Paul Blakey <paulb at mellanox.com>
>> Signed-off-by: Joe Stringer <joe at ovn.org>
>> ---
>>  ofproto/ofproto-dpif-upcall.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/ofproto/ofproto-dpif-upcall.c
>> b/ofproto/ofproto-dpif-upcall.c
>> index 07086ee385cc..0854807e4482 100644
>> --- a/ofproto/ofproto-dpif-upcall.c
>> +++ b/ofproto/ofproto-dpif-upcall.c
>> @@ -1404,7 +1404,7 @@ handle_upcalls(struct udpif *udpif, struct
>> upcall *upcalls,
>>              ovs_mutex_lock(&ukey->mutex);
>>              if (ops[i].dop.error) {
>>                  transition_ukey(ukey, UKEY_EVICTED);
>> -            } else {
>> +            } else if (ukey->state < UKEY_OPERATIONAL) {
>>                  transition_ukey(ukey, UKEY_OPERATIONAL);
>>              }
>>              ovs_mutex_unlock(&ukey->mutex);
>>
>
> Hi Joe,
> As per other thread, I think there is a trouble locking the mutex in
> case there is no error, as the flow is installed it can be removed
> entirely by revalidator's revalidate/sweep:
>
> Here is the timing:
>
> Handler installs the flow
> Handler thread is scheduled out before trying to lock the mutex
> Revalidators revalidate dump the installed flow and decide to evict it
> Revalidators evict the flow (push_dp_ops in revalidate)
> Revalidator sweep delete the ukey ukey_delete(umap, ukey);
> Revalidator calling ukey_delete postpones ukey_delete__ to free the ukey
> and mutex
> Handler thread is scheduled again and tries to acquire the freed lock.
>
> Is this correct?
> If so, and I didn't miss something, I've suggested a alternate patch in
> the other thread (only lock the mutex and transition to EVICTED in case
> of an ops[i] error, leave OEPRATIONAL for dump)
>

Hi,
The above timing/scheduling can't happen as the actual freeing of the 
ukey  (ukey_delete__) is postponed after the handler returns from 
handle_upcalls. I've missed that with the other thread's patch because 
it used xsleep as you said which calls ovsrcu_quiesce_start. With sleep 
instead it seems to be postponed to after poll_block in the handler 
thread runs (and we don't have references there anymore).

I don't know how the handler thread actually signals it doesn't have any 
references to the ukey anymore, can you expand on that mechanism?

Thanks,
Paul.










More information about the dev mailing list