[ovs-dev] [PATCH] classifier: Prevent tries vs n_tries race leading to NULL dereference.

Ilya Maximets i.maximets at ovn.org
Tue Apr 14 12:11:05 UTC 2020


On 4/14/20 6:00 AM, Eiichi Tsukata wrote:
> Currently classifier tries and n_tries can be updated not atomically,
> there is a race condition which can lead to NULL dereference.
> The race can happen when main thread updates a classifier tries and
> n_tries in classifier_set_prefix_fields() and at the same time revalidator
> or handler thread try to lookup them in classifier_lookup__(). Such race
> can be triggered when user changes prefixes of flow_table.
> 
> Race(user changes flow_table prefixes: ip_dst,ip_src => none):
> 
>   [main thread]             [revalidator/handler thread]
>   ===========================================================
>                             /* cls->n_tries == 2 */
>                             for (int i = 0; i < cls->n_tries; i++) {
>   trie_init(cls, i, NULL);
>   /* n_tries == 0 */
>   cls->n_tries = n_tries;
>                             /* cls->tries[i]->feild is NULL */
>                             trie_ctx_init(&trie_ctx[i],&cls->tries[i]);
>                             /* trie->field is NULL */
>                             ctx->be32ofs = trie->field->flow_be32ofs;

Thanks, Eiichi!
Good catch.  The code here is definitely broken and needs to be fixed.

Some comments inline.

Best regards, Ilya Maximets.

> 
> To prevent the race, instead of re-introducing internal mutex
> implemented in the commit fccd7c092e09 ("classifier: Remove internal
> mutex."), this patch uses trie field after it checked subtable trie_plen
> is synchronized with field. This leaves classifier lookup lockless
> and prevent the NULL deref.
> 
> Fixes: fccd7c092e09 ("classifier: Remove internal mutex.")
> Signed-off-by: Eiichi Tsukata <eiichi.tsukata at nutanix.com>
> ---
>  lib/classifier.c | 21 ++++++++++++---------
>  1 file changed, 12 insertions(+), 9 deletions(-)
> 
> diff --git a/lib/classifier.c b/lib/classifier.c
> index 0fad953..9f70180 100644
> --- a/lib/classifier.c
> +++ b/lib/classifier.c
> @@ -839,7 +839,6 @@ classifier_remove_assert(struct classifier *cls,
>  struct trie_ctx {
>      const struct cls_trie *trie;
>      bool lookup_done;        /* Status of the lookup. */
> -    uint8_t be32ofs;         /* U32 offset of the field in question. */
>      unsigned int maskbits;   /* Prefix length needed to avoid false matches. */
>      union trie_prefix match_plens;  /* Bitmask of prefix lengths with possible
>                                       * matches. */
> @@ -849,7 +848,6 @@ static void
>  trie_ctx_init(struct trie_ctx *ctx, const struct cls_trie *trie)
>  {
>      ctx->trie = trie;
> -    ctx->be32ofs = trie->field->flow_be32ofs;
>      ctx->lookup_done = false;
>  }
>  
> @@ -1575,11 +1573,16 @@ check_tries(struct trie_ctx trie_ctx[CLS_MAX_TRIES], unsigned int n_tries,
>       * fields using the prefix tries.  The trie checks are done only as
>       * needed to avoid folding in additional bits to the wildcards mask. */
>      for (j = 0; j < n_tries; j++) {
> -        /* Is the trie field relevant for this subtable, and
> -           is the trie field within the current range of fields? */
> -        if (field_plen[j] &&
> -            flowmap_is_set(&range_map, trie_ctx[j].be32ofs / 2)) {
> -            struct trie_ctx *ctx = &trie_ctx[j];
> +        struct trie_ctx *ctx = &trie_ctx[j];
> +        /* Check if trie field is relevant for this subtable and
> +         * synchronized with field_plen. */
> +        if (!field_plen[j] || !ctx->trie->field) {
> +            continue;
> +        }
> +
> +        uint8_t be32ofs = ctx->trie->field->flow_be32ofs;

What will prevent 'field' being cleared between above two lines and produce
a NULL pointer dereference here or inside trie_lookup()?

trie->field is not protected by anything and not even an atomic variable.
Since it always points to some static memory objects it doesn't need to
be RCU protected, but it should be an atomic variable and it must be checked
after any read.  In fact it might be made rcu protected just to make things
easier and more accurate.

> +        /* Is the trie field within the current range of fields? */
> +        if (flowmap_is_set(&range_map, be32ofs / 2)) {
>  
>              /* On-demand trie lookup. */
>              if (!ctx->lookup_done) {
> @@ -1601,14 +1604,14 @@ check_tries(struct trie_ctx trie_ctx[CLS_MAX_TRIES], unsigned int n_tries,
>                   * than this subtable would otherwise. */
>                  if (ctx->maskbits <= field_plen[j]) {
>                      /* Unwildcard the bits and skip the rest. */
> -                    mask_set_prefix_bits(wc, ctx->be32ofs, ctx->maskbits);
> +                    mask_set_prefix_bits(wc, be32ofs, ctx->maskbits);
>                      /* Note: Prerequisite already unwildcarded, as the only
>                       * prerequisite of the supported trie lookup fields is
>                       * the ethertype, which is always unwildcarded. */
>                      return true;
>                  }
>                  /* Can skip if the field is already unwildcarded. */
> -                if (mask_prefix_bits_set(wc, ctx->be32ofs, ctx->maskbits)) {
> +                if (mask_prefix_bits_set(wc, be32ofs, ctx->maskbits)) {
>                      return true;
>                  }
>              }
> 



More information about the dev mailing list