[ovs-dev] [PATCH net v2] net: openvswitch: fix to make sure flow_lookup() is not preempted

Sebastian Andrzej Siewior bigeasy at linutronix.de
Thu Oct 15 12:34:34 UTC 2020

On 2020-10-15 11:46:53 [+0200], Eelco Chaudron wrote:
> The flow_lookup() function uses per CPU variables, which must not be
> preempted. However, this is fine in the general napi use case where
> the local BH is disabled. But, it's also called in the netlink
> context, which is preemptible. The below patch makes sure that even
> in the netlink path, preemption is disabled.

I would suggest to rephrase it: the term preemption usually means
preempt_disable(). A preempt disabled section can be preempted /
interrupted by hardirq and softirq. The later is mentioned and I think
is confusing.

> In addition, the u64_stats_update_begin() sync point was not protected,
> making the sync point part of the per CPU variable fixed this.

I would rephrase it and mention the key details:
u64_stats_update_begin() requires a lock to ensure one writer which is
not ensured here. Making it per-CPU and disabling NAPI (softirq) ensures
that there is always only one writer.

Regarding the annotation which were mentioned here in the thread.
Basically the this_cpu_ptr() warning worked as expected and got us here.
I don't think it is wise to add annotation distinguished from the actual
problem like assert_the_softirq_is_switched_off() in flow_lookup(). The
assert may become obsolete once the reason is removed and gets overseen
and remains in the code. The commits

	c60c32a577561 ("posix-cpu-timers: Remove lockdep_assert_irqs_disabled()")
	f9dae5554aed4 ("dpaa2-eth: Remove preempt_disable() from seed_pool()")

are just two examples which came to mind while writing this.

Instead I would prefer lockdep annotation in u64_stats_update_begin()
which is around also in 64bit kernels and complains if it is seen
without disabled BH if observed in-serving-softirq.
PeterZ, wasn't this mentioned before?

> --- a/net/openvswitch/flow_table.c
> +++ b/net/openvswitch/flow_table.c
> @@ -851,9 +852,17 @@ struct sw_flow *ovs_flow_tbl_lookup(struct flow_table *tbl,
>  	struct mask_array *ma = rcu_dereference_ovsl(tbl->mask_array);
>  	u32 __always_unused n_mask_hit;
>  	u32 __always_unused n_cache_hit;
> +	struct sw_flow *flow;
>  	u32 index = 0;
> -	return flow_lookup(tbl, ti, ma, key, &n_mask_hit, &n_cache_hit, &index);
> +	/* This function gets called trough the netlink interface and therefore
> +	 * is preemptible. However, flow_lookup() function needs to be called
> +	 * with preemption disabled due to CPU specific variables.

preemption vs BH.

> +	 */
> +	local_bh_disable();
> +	flow = flow_lookup(tbl, ti, ma, key, &n_mask_hit, &n_cache_hit, &index);
> +	local_bh_enable();
> +	return flow;
>  }
>  struct sw_flow *ovs_flow_tbl_lookup_exact(struct flow_table *tbl,

Otherwise it looks good.


More information about the dev mailing list