[ovs-dev] [PATCH net-next] net: openvswitch: fix to make sure flow_lookup() is not preempted

Wed Oct 14 10:44:23 UTC 2020


On 13 Oct 2020, at 14:53, Sebastian Andrzej Siewior wrote:

> On 2020-10-13 14:44:19 [+0200], Eelco Chaudron wrote:
>> The flow_lookup() function uses per CPU variables, which must not be
>> preempted. However, this is fine in the general napi use case where
>> the local BH is disabled. But, it's also called in the netlink
>> context, which is preemptible. The below patch makes sure that even
>> in the netlink path, preemption is disabled.
>>
>> Fixes: eac87c413bf9 ("net: openvswitch: reorder masks array based on 
>> usage")
>> Reported-by: Juri Lelli <jlelli at redhat.com>
>> Signed-off-by: Eelco Chaudron <echaudro at redhat.com>
>> ---
>>  net/openvswitch/flow_table.c |   10 +++++++++-
>>  1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/openvswitch/flow_table.c 
>> b/net/openvswitch/flow_table.c
>> index 87c286ad660e..16289386632b 100644
>> --- a/net/openvswitch/flow_table.c
>> +++ b/net/openvswitch/flow_table.c
>> @@ -850,9 +850,17 @@ struct sw_flow *ovs_flow_tbl_lookup(struct 
>> flow_table *tbl,
>>  	struct mask_array *ma = rcu_dereference_ovsl(tbl->mask_array);
>>  	u32 __always_unused n_mask_hit;
>>  	u32 __always_unused n_cache_hit;
>> +	struct sw_flow *flow;
>>  	u32 index = 0;
>>
>> -	return flow_lookup(tbl, ti, ma, key, &n_mask_hit, &n_cache_hit, 
>> &index);
>> +	/* This function gets called trough the netlink interface and 
>> therefore
>> +	 * is preemptible. However, flow_lookup() function needs to be 
>> called
>> +	 * with preemption disabled due to CPU specific variables.
>> +	 */
>
> Once again. u64_stats_update_begin(). What protects you against
> concurrent access.

Thanks Sebastian for repeating this, as I thought I went over the 
seqcount code and thought it should be fine for my use case. However 
based on this comment I went over it again, and found the logic part I 
was constantly missing :)

My idea is to send a v2 patch and in addition to the preempt_disable() 
also make the seqcount part per CPU. I noticed other parts of the 
networking stack doing it the same way. So the patch would look 
something like:

@@ -731,7 +732,7 @@ static struct sw_flow *flow_lookup(struct flow_table 
*tbl,
                                    u32 *n_cache_hit,
                                    u32 *index)
  {
-       u64 *usage_counters = this_cpu_ptr(ma->masks_usage_cntr);
+       struct mask_array_stats *stats = 
this_cpu_ptr(ma->masks_usage_stats);
         struct sw_flow *flow;
         struct sw_flow_mask *mask;
         int i;
@@ -741,9 +742,9 @@ static struct sw_flow *flow_lookup(struct flow_table 
*tbl,
                 if (mask) {
                         flow = masked_flow_lookup(ti, key, mask, 
n_mask_hit);
                         if (flow) {
-                               u64_stats_update_begin(&ma->syncp);
-                               usage_counters[*index]++;
-                               u64_stats_update_end(&ma->syncp);
+                               u64_stats_update_begin(&stats->syncp);
+                               stats->usage_cntr[*index]++;
+                               u64_stats_update_end(&stats->syncp);
                                 (*n_cache_hit)++;
                                 return flow;
                         }

Let me know your thoughts.


Thanks,

Eelco