[ovs-dev] [PATCH v2] dpif-netdev: Allow PMD auto load balance with cross-numa.

David Marchand david.marchand at redhat.com
Tue Mar 16 12:46:15 UTC 2021


On Mon, Mar 15, 2021 at 4:44 PM Kevin Traynor <ktraynor at redhat.com> wrote:
>
> Previously, auto load balance did not trigger a reassignment when
> there was any cross-numa polling, because an rxq could be polled from a
> different numa after reassignment, which could impact the estimates.
>
> In the case where there is only one numa with pmds available, the
> same numa will always poll before and after reassignment, so estimates
> are valid. Allow PMD auto load balance to trigger a reassignment in
> this case.
>
> Signed-off-by: Kevin Traynor <ktraynor at redhat.com>
> Acked-by: Eelco Chaudron <echaudro at redhat.com>
>
> ---
> v2:
> - Same logic as v1, combined two "ifs" as per David's suggestion
> - Updated comments/logs
> - Updated the doc note that said it will not work for cross NUMA to
>   include new condition
> - Kept Eelco's Ack, as no logic changed
> ---
>  Documentation/topics/dpdk/pmd.rst |  9 ++++++---
>  lib/dpif-netdev.c                 | 16 +++++++++++++---
>  2 files changed, 19 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/topics/dpdk/pmd.rst b/Documentation/topics/dpdk/pmd.rst
> index caa7d97be..1f61bddb6 100644
> --- a/Documentation/topics/dpdk/pmd.rst
> +++ b/Documentation/topics/dpdk/pmd.rst
> @@ -238,7 +238,10 @@ If not set, the default variance improvement threshold is 25%.
>  .. note::
>
> -    PMD Auto Load Balancing doesn't currently work if queues are assigned
> -    cross NUMA as actual processing load could get worse after assignment
> -    as compared to what dry run predicts.
> +    PMD Auto Load Balancing doesn't request a reassignment if queues are
> +    assigned cross NUMA and there are multiple NUMA nodes available for
> +    reassignment. This is because reassignment to a different NUMA node could
> +    lead to an unpredictable change in processing cycles required for a queue.
> +    However, if there is only one cross NUMA node available then a dry run and
> +    possible request to reassign may continue as normal.
>
>  The minimum time between 2 consecutive PMD auto load balancing iterations can
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 816945375..29e74ee43 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -4888,4 +4888,10 @@ struct rr_numa {
>  };
>
> +static size_t
> +rr_numa_list_count(struct rr_numa_list *rr)
> +{
> +    return hmap_count(&rr->numas);
> +}
> +
>  static struct rr_numa *
>  rr_numa_list_lookup(struct rr_numa_list *rr, int numa_id)
> @@ -5600,8 +5606,12 @@ get_dry_run_variance(struct dp_netdev *dp, uint32_t *core_list,
>          int numa_id = netdev_get_numa_id(rxqs[i]->port->netdev);
>          numa = rr_numa_list_lookup(&rr, numa_id);
> +        /* If there is no available pmd on the local numa but there is only one
> +         * numa for cross-numa polling, we can estimate the dry run. */
> +        if (!numa && rr_numa_list_count(&rr) == 1) {
> +            numa = rr_numa_list_next(&rr, NULL);
> +        }
>          if (!numa) {
> -            /* Abort if cross NUMA polling. */
> -            VLOG_DBG("PMD auto lb dry run."
> -                     " Aborting due to cross-numa polling.");
> +            VLOG_DBG("PMD auto lb dry run. Aborting due to "
> +                     "multiple numa nodes available for cross-numa polling.");
>              goto cleanup;
>          }

Acked-by: David Marchand <david.marchand at redhat.com>
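
For readers following the thread, the selection logic in the
get_dry_run_variance() hunk can be sketched in isolation roughly as
below. This is only an illustrative model, not the OVS code: struct
numa_node, numa_lookup() and dry_run_pick_numa() are hypothetical
stand-ins for struct rr_numa, rr_numa_list_lookup() and the combined
"if" in the patch, and a plain array replaces the hmap-backed
rr_numa_list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for OVS's struct rr_numa: one NUMA node that
 * has pmds available for polling. */
struct numa_node {
    int numa_id;
};

/* Return the node whose id matches 'numa_id', or NULL if there is none.
 * Models rr_numa_list_lookup() over a plain array. */
static const struct numa_node *
numa_lookup(const struct numa_node *nodes, size_t n, int numa_id)
{
    for (size_t i = 0; i < n; i++) {
        if (nodes[i].numa_id == numa_id) {
            return &nodes[i];
        }
    }
    return NULL;
}

/* Pick the node used to estimate an rxq in the dry run.
 * - Local numa has pmds: use it.
 * - No local pmds, exactly one numa has pmds: use that one, since the
 *   same numa polls before and after reassignment.
 * - Otherwise return NULL, meaning the dry run should be aborted. */
static const struct numa_node *
dry_run_pick_numa(const struct numa_node *nodes, size_t n, int rxq_numa_id)
{
    const struct numa_node *numa = numa_lookup(nodes, n, rxq_numa_id);

    if (!numa && n == 1) {
        numa = &nodes[0];
    }
    return numa;
}
```

With a single node { 1 } and an rxq local to numa 0, the sketch returns
that single node; with nodes { 1, 2 } and the same rxq it returns NULL,
which corresponds to the "goto cleanup" path in the patch.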


-- 
David Marchand


