[ovs-dev] [PATCH net v2] net: openvswitch: fix to make sure flow_lookup() is not preempted
Sebastian Andrzej Siewior
bigeasy at linutronix.de
Thu Oct 15 12:34:34 UTC 2020
On 2020-10-15 11:46:53 [+0200], Eelco Chaudron wrote:
> The flow_lookup() function uses per CPU variables, which must not be
> preempted. However, this is fine in the general napi use case where
> the local BH is disabled. But, it's also called in the netlink
> context, which is preemptible. The below patch makes sure that even
> in the netlink path, preemption is disabled.
I would suggest rephrasing it: the term preemption usually refers to
preempt_disable(). A preempt-disabled section can still be interrupted
by hardirq and softirq. The latter is mentioned and I think
> In addition, the u64_stats_update_begin() sync point was not protected,
> making the sync point part of the per CPU variable fixed this.
I would rephrase it and mention the key details:
u64_stats_update_begin() requires a lock to guarantee a single writer,
which is not ensured here. Making the sync point per-CPU and disabling
NAPI (softirq) ensures that there is always only one writer.
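
To illustrate why the single-writer rule matters, here is a minimal
userspace sketch of a sequence-counter protected 64-bit statistic. It is
only a model under stated assumptions: the names (stats_sync, flow_stats,
stats_add_hit() and so on) are made up for illustration and are not the
kernel's u64_stats API, and C11 atomics stand in for the kernel's
barriers.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical model of a seqcount-style sync point (not kernel code). */
struct stats_sync {
	atomic_uint seq;	/* odd while a writer is inside the section */
};

struct flow_stats {
	struct stats_sync syncp;
	uint64_t hits;
};

/*
 * Writer side. This is only correct with ONE writer at a time: two
 * concurrent writers could each add 1, making seq even again while both
 * are still modifying the data, so a reader would accept a torn value.
 */
void stats_update_begin(struct stats_sync *s)
{
	atomic_fetch_add_explicit(&s->seq, 1, memory_order_acq_rel);
}

void stats_update_end(struct stats_sync *s)
{
	atomic_fetch_add_explicit(&s->seq, 1, memory_order_release);
}

/* Reader side: wait until no writer is active, remember the sequence. */
unsigned int stats_fetch_begin(struct stats_sync *s)
{
	unsigned int seq;

	do {
		seq = atomic_load_explicit(&s->seq, memory_order_acquire);
	} while (seq & 1);
	return seq;
}

/* Returns nonzero if a writer ran since stats_fetch_begin(). */
int stats_fetch_retry(struct stats_sync *s, unsigned int seq)
{
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&s->seq, memory_order_relaxed) != seq;
}

/* In the kernel the caller would have to exclude other writers here. */
void stats_add_hit(struct flow_stats *st)
{
	stats_update_begin(&st->syncp);
	st->hits++;
	stats_update_end(&st->syncp);
}

uint64_t stats_read_hits(struct flow_stats *st)
{
	unsigned int seq;
	uint64_t v;

	do {
		seq = stats_fetch_begin(&st->syncp);
		v = st->hits;
	} while (stats_fetch_retry(&st->syncp, seq));
	return v;
}
```

In this model nothing prevents a second writer from entering
stats_add_hit() on the same stats_sync. That exclusion has to come from
outside, which is what the per-CPU sync point plus disabled BH provides
in the patch.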
Regarding the annotations that were mentioned here in the thread:
basically the this_cpu_ptr() warning worked as expected and got us here.
I don't think it is wise to add an annotation detached from the actual
problem, like assert_the_softirq_is_switched_off() in flow_lookup(). The
assert may become obsolete once the reason for it is removed, then gets
overlooked and remains in the code. The commits
c60c32a577561 ("posix-cpu-timers: Remove lockdep_assert_irqs_disabled()")
f9dae5554aed4 ("dpaa2-eth: Remove preempt_disable() from seed_pool()")
are just two examples which came to mind while writing this.
Instead I would prefer a lockdep annotation in u64_stats_update_begin(),
which is also present in 64-bit kernels and complains if the sync point
is seen without BH disabled after it was observed in serving-softirq
context.
PeterZ, wasn't this mentioned before?
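
A toy model of the kind of check I have in mind, purely as a sketch: the
check lives in the primitive itself and records per sync point how it was
used, so it fires only when the two usages conflict. All names below
(sync_class, call_ctx, model_update_begin()) are hypothetical, and this
is not how lockdep is implemented.

```c
#include <stdbool.h>

/* Hypothetical usage record kept per sync point (not lockdep code). */
struct sync_class {
	bool used_in_softirq;		/* writer ran in serving-softirq */
	bool used_with_bh_enabled;	/* writer ran in task context, BH on */
};

/* Hypothetical description of the calling context. */
struct call_ctx {
	bool in_softirq;	/* currently serving a softirq (e.g. NAPI) */
	int bh_disable_depth;	/* > 0 after a local_bh_disable() */
};

/*
 * Model of an annotated update_begin(): record the usage and return
 * false once the same sync point has been used both from softirq and
 * from a context where a softirq could interrupt the writer.
 */
bool model_update_begin(struct sync_class *c, const struct call_ctx *ctx)
{
	if (ctx->in_softirq)
		c->used_in_softirq = true;
	else if (ctx->bh_disable_depth == 0)
		c->used_with_bh_enabled = true;

	return !(c->used_in_softirq && c->used_with_bh_enabled);
}
```

Because the check is tied to the sync point rather than to
flow_lookup(), it cannot outlive the thing it protects.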
> --- a/net/openvswitch/flow_table.c
> +++ b/net/openvswitch/flow_table.c
> @@ -851,9 +852,17 @@ struct sw_flow *ovs_flow_tbl_lookup(struct flow_table *tbl,
> struct mask_array *ma = rcu_dereference_ovsl(tbl->mask_array);
> u32 __always_unused n_mask_hit;
> u32 __always_unused n_cache_hit;
> + struct sw_flow *flow;
> u32 index = 0;
> - return flow_lookup(tbl, ti, ma, key, &n_mask_hit, &n_cache_hit, &index);
> + /* This function gets called trough the netlink interface and therefore
> + * is preemptible. However, flow_lookup() function needs to be called
> + * with preemption disabled due to CPU specific variables.
preemption vs BH.
> + */
> + local_bh_disable();
> + flow = flow_lookup(tbl, ti, ma, key, &n_mask_hit, &n_cache_hit, &index);
> + local_bh_enable();
> + return flow;
> struct sw_flow *ovs_flow_tbl_lookup_exact(struct flow_table *tbl,
Otherwise it looks good.