[ovs-dev] [PATCH] classifier: Prevent tries vs n_tries race leading to NULL dereference.
Eiichi Tsukata
eiichi.tsukata at nutanix.com
Mon Apr 20 00:08:38 UTC 2020
> On Apr 18, 2020, at 1:14, Ilya Maximets <i.maximets at ovn.org> wrote:
>
> On 4/15/20 2:24 AM, Eiichi Tsukata wrote:
>> Thanks for comments, Ilya.
>>
>>> On Apr 14, 2020, at 21:11, Ilya Maximets <i.maximets at ovn.org> wrote:
>>>
>>> On 4/14/20 6:00 AM, Eiichi Tsukata wrote:
>>>>
>>>>
>>>> @@ -1575,11 +1573,16 @@ check_tries(struct trie_ctx trie_ctx[CLS_MAX_TRIES], unsigned int n_tries,
>>>> * fields using the prefix tries. The trie checks are done only as
>>>> * needed to avoid folding in additional bits to the wildcards mask. */
>>>> for (j = 0; j < n_tries; j++) {
>>>> - /* Is the trie field relevant for this subtable, and
>>>> - is the trie field within the current range of fields? */
>>>> - if (field_plen[j] &&
>>>> - flowmap_is_set(&range_map, trie_ctx[j].be32ofs / 2)) {
>>>> - struct trie_ctx *ctx = &trie_ctx[j];
>>>> + struct trie_ctx *ctx = &trie_ctx[j];
>>>> + /* Check if trie field is relevant for this subtable and
>>>> + * synchronized with field_plen. */
>>>> + if (!field_plen[j] || !ctx->trie->field) {
>>>> + continue;
>>>> + }
>>>> +
>>>> + uint8_t be32ofs = ctx->trie->field->flow_be32ofs;
>>>
>>> What will prevent 'field' being cleared between above two lines and produce
>>> a NULL pointer dereference here or inside trie_lookup()?
>>
>>
>> Main thread calls ovsrcu_synchronize() before it sets cls->tries[i]->field = NULL.
>> Here ovsrcu_synchronize() waits for other readers(revalidator/handler thread)
>> of rcu protected pointers going through RCU quiescent state.
>>
>> So here, main thread who called ovsrcu_synchronize() waits for revalidator/handler thread
>> going through quiescent state. It means 'be32ofs = ctx->trie->field->flow_be32ofs;'
>> is executed before main thread executes 'cls->tries[i]->field = NULL;’.
>
> It's correct that ovsrcu_synchronize() waits for other threads to go through RCU
> quiescent state, but it doesn't work as a thread barrier. Threads are free to
> run, noone stops them and they could be at any point of their execution flow.
>
> And at the moment main thread sets field to NULL it's immediately visible to all
> threads in common case. And if one of the readers was right between above two
> lines it will crash.
The code checks "!field_plen[j]” (RCU protected subtable->plen[j]) before it checks !ctx->trie->field.
It means that reader threads do not see NULL field before main thread completes calling
osvrcu_synchronize().
>
>>
>>>
>>> trie->field is not protected by anything and not even an atomic variable.
>>> Since it always points to some static memory objects it doesn't need to
>>> be RCU protected, but it should be an atomic variable and it must be checked
>>> after any read. In fact it might be made rcu protected just to make things
>>> easier and more accurate.
>>>
>>
>> I also had thought that protecting trie->field by RCU is a good idea, but found the
>> following race condition:
>>
>> [main thread] [revalidator/handler thread]
>> ========================================================================
>> ovsrcu_set(&trie->field, NULL);
>> ovsrcu_synchronize();
>> Quiecense State;
>> for (int i = 0; i < cls->n_tries; i++)
>> /* field == NULL */
>> field = ovsrcu_get(struct mf_field *, &trie->field);
>> be32ofs = field->flow_be32ofs;
>
> If fields is an RCU protected variable, you must check the value after each
> ovsrcu_get() operation.
I see.
Now I’ve come to think that my initial approach is not easy to understand and fragile through
discussion with you. I’ll rewrite my patch to make mf_field RCU protected.
Please let me know if you have any other good idea.
Thanks
Eiichi
More information about the dev
mailing list