[ovs-dev] [PATCH 18/19] datapath: Hold mutex for DP while userspace is blocking on read.

Jesse Gross jesse at nicira.com
Thu Dec 9 21:34:42 UTC 2010


On Thu, Dec 9, 2010 at 10:22 AM, Ben Pfaff <blp at nicira.com> wrote:
> On Wed, Dec 08, 2010 at 10:14:16PM -0800, Jesse Gross wrote:
>> Currently we get a pointer to the DP in openvswitch_read() and
>> openvswitch_poll() and use it without any synchronization.  This means
>> that the DP could disappear from underneath us while we are using it.
>> Currently, this isn't a problem because userspace is single threaded but
>> it's better for the locking to be correct.
>>
>> With this change we hold the mutex while doing a blocking wait, which
>> means that no changes can be made, including adding/removing flows.  It's
>> possible to make this finer grained but for the time being that isn't done,
>> since current userspace doesn't care.
>>
>> Found with lockdep.
>>
>> Signed-off-by: Jesse Gross <jesse at nicira.com>
>
> I'm only OK with this because:
>
>        1. We're going to get rid of this function and thus this locking
>           strategy for it anyway with the move to Netlink.
>
>        2. Current userspace always opens these devices in nonblocking
>           mode anyway.

I agree, it's not great but I wasn't that interested in optimizing
something that's going away.  I did want to at least make it correct
though.

>
> It's really weird that I haven't seen this lockdep warning.  I always
> run with lockdep enabled.  I wonder if there's something wonky with it,
> it puzzled me why didn't report something else a few days ago too.

The version in master won't flag this, you need the extra checks
introduced in 2.6.37 that I've started using locally.  In this case,
get_dp() is documented as requiring either rcu_read_lock() or
dp_mutex, so I changed the dereference to:

rcu_dereference_check(dps[dp_idx], rcu_read_lock_held() ||
					 lockdep_is_held(&dp_mutex))

Also, by default lockdep only flags the first RCU warning and we have
a lot of false positives currently (like the one you mentioned the
other day), so it reported that and stopped.




More information about the dev mailing list