[ovs-dev] [PATCH 1/2] datapath: Always execute actions in BH context.

Jesse Gross jesse at nicira.com
Thu Oct 6 03:40:56 UTC 2011


On Wed, Oct 5, 2011 at 12:39 PM, Pravin B Shelar <pshelar at nicira.com> wrote:
>    Following patch make sure that BH is disabled while running
> execute_actions() on packets from userspace. RCU annotation for BH is
> used to fix false positive warning message.
>
> Signed-off-by: Pravin B Shelar <pshelar at nicira.com>
> Bug #7621

I spent quite a bit of time looking at this and reading the RCU
implementation today and I think that we actually have a correctness
problem that existed before this patch but this makes it worse.

In certain situations, the lockdep warning is actually not a false
positive because
 the difference between rcu_read_lock and rcu_read_lock_bh is more
than disabling bottom halves.  In most situations, we are fine because
rcu_read_lock amounts to preempt_disable() and the _bh version is
local_bh_disable().  In these cases our interchanged use of them
doesn't matter as long as you don't assume that bottom halves are
disabled, which we don't.

However, with CONFIG_PREEMPT_RCU rcu_read_lock turns into a counter,
while _bh continues to only disable bottom halves.  This means that
rcu_read_lock_bh is essentially invisible to call_rcu in this mode and
we can have a use after free bug.

As a result, there are two completely separate issues here:
* Disabling of bottom halves.  The previous assumptions here were
correct - all received packets with the exception of those coming from
our userspace have bottom halves disabled because transmit from the
network stack uses rcu_read_lock_bh and receive occurs in softirq
context.

* Handling of rcu_read_lock: We need to be more careful here about
mixing and matching.  For the most part it probably doesn't really
matter all that much since we largely already run with bottom halves
disabled and don't generally need faster grace periods.  The two
exceptions are:
 - Whether we want to disable bottom halves in all the various places
that access our data structures (such as reading port config).
- Getting faster grace periods for removed flows.  This is the one
area that it might be useful to have faster grace periods since we can
accumulate a lot of garbage with a DoS attack.  This may not matter
all that much because userspace has to delete the flows, which gives
us an opportunity for a grace period.  Running with a preemptible
kernel could be bad though because a lower priority task could hold
rcu_read_lock and prevent a grace period (hopefully fixed by using
CONFIG_RCU_BOOST).

Given all that, what I would do is:
 * Use the non-bh versions of the rcu functions as we currently do and
take rcu_read_lock around vport_receive() in internal_dev_xmit().  We
should then run lockdep to make sure that it doesn't flag warnings any
more.
 * Still disable bottom halves for packets coming from userspace as
you have done here.  I think the removal of _bh from locks can be
moved into this patch and there are some other places like the
in_interrupt() in vport-internal_dev.c can be removed.



More information about the dev mailing list