[ovs-dev] [PATCH] dpif-netdev: Fix unsafe accesses to pmd polling lists.

Aaron Conole aconole at redhat.com
Mon Feb 11 19:15:00 UTC 2019


Ilya Maximets <i.maximets at samsung.com> writes:

> All accesses to 'pmd->poll_list' should be guarded by
> 'pmd->port_mutex'. Additionally, fixed inappropriate usage
> of 'HMAP_FOR_EACH_SAFE' (the hmap doesn't change inside the loop)
> and dropped the unneeded local variable 'proc_cycles'.
>
> CC: Nitin Katiyar <nitin.katiyar at ericsson.com>
> Fixes: 5bf84282482a ("Adding support for PMD auto load balancing")
> Signed-off-by: Ilya Maximets <i.maximets at samsung.com>
> ---
>  lib/dpif-netdev.c | 14 +++++++++-----
>  1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 4d5bc576a..914b2bb8b 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -3776,9 +3776,12 @@ set_pmd_auto_lb(struct dp_netdev *dp)
>              continue;
>          }
>  
> +        ovs_mutex_lock(&pmd->port_mutex);
>          if (hmap_count(&pmd->poll_list) > 1) {
>              multi_rxq = true;
>          }
> +        ovs_mutex_unlock(&pmd->port_mutex);

What does this mutex provide here?  I ask because as soon as we unlock,
the result is no longer assured, and we've used it to inform the
multi_rxq value.

Are you afraid that the read is non-atomic so you'll end up with the
multi_rxq condition true, even when it should not be?  That can happen
anyway - as soon as we unlock the count can change.

Maybe there's a better change to support that, where hmap_count does an
atomic read, or there's a version that can guarantee a full load with
no intervening writes.  But I don't see one here.

Maybe the pmd object's lifetime could end... but that wouldn't make
sense, because then the port_mutex itself would be invalid.

I'm confused about what adding a lock here buys us, for either safety
or logic.  I probably missed something.  Or maybe it's just trying to
be safe in case the hmap implementation changes in the future?  If
that's the case, I think there might be a better alternative to implement.

> +
>          if (cnt && multi_rxq) {
>                  enable_alb = true;
>                  break;
> @@ -5095,7 +5098,7 @@ pmd_rebalance_dry_run(struct dp_netdev *dp)
>      uint64_t improvement = 0;
>      uint32_t num_pmds;
>      uint32_t *pmd_corelist;
> -    struct rxq_poll *poll, *poll_next;
> +    struct rxq_poll *poll;
>      bool ret;
>  
>      num_pmds = cmap_count(&dp->poll_threads);
> @@ -5121,13 +5124,14 @@ pmd_rebalance_dry_run(struct dp_netdev *dp)
>          /* Estimate the cycles to cover all intervals. */
>          total_cycles *= PMD_RXQ_INTERVAL_MAX;
>  
> -        HMAP_FOR_EACH_SAFE (poll, poll_next, node, &pmd->poll_list) {
> -            uint64_t proc_cycles = 0;
> +        ovs_mutex_lock(&pmd->port_mutex);
> +        HMAP_FOR_EACH (poll, node, &pmd->poll_list) {
>              for (unsigned i = 0; i < PMD_RXQ_INTERVAL_MAX; i++) {
> -                proc_cycles += dp_netdev_rxq_get_intrvl_cycles(poll->rxq, i);
> +                total_proc += dp_netdev_rxq_get_intrvl_cycles(poll->rxq, i);
>              }
> -            total_proc += proc_cycles;
>          }
> +        ovs_mutex_unlock(&pmd->port_mutex);
> +

This lock, otoh, makes sense to me.

>          if (total_proc) {
>              curr_pmd_usage[num_pmds] = (total_proc * 100) / total_cycles;
>          }

