[ovs-dev] [PATCH] dpif-netdev: Allow PMD auto load balance with cross-numa.

Kevin Traynor ktraynor at redhat.com
Mon Mar 15 13:55:52 UTC 2021


Hi Eelco, thanks for reviewing.

On 15/03/2021 11:45, Eelco Chaudron wrote:
> 
> 
> On 11 Mar 2021, at 16:04, Kevin Traynor wrote:
> 
>> Previously auto load balance did not trigger a reassignment when
>> there was any cross-numa polling as an rxq could be polled from a
>> different numa after reassign and it could impact estimates.
>>
>> In the case where there is only one numa with pmds available, the
>> same numa will always poll before and after reassignment, so estimates
>> are valid. Allow PMD auto load balance to trigger a reassignment in
>> this case.
>>
>> Signed-off-by: Kevin Traynor <ktraynor at redhat.com>
>> ---
>>  lib/dpif-netdev.c | 20 ++++++++++++++++----
>>  1 file changed, 16 insertions(+), 4 deletions(-)
>>
>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>> index 816945375..19a6ac2d9 100644
>> --- a/lib/dpif-netdev.c
>> +++ b/lib/dpif-netdev.c
>> @@ -4888,4 +4888,10 @@ struct rr_numa {
>>  };
>>
>> +static size_t
>> +rr_numa_list_count(struct rr_numa_list *rr)
>> +{
>> +    return hmap_count(&rr->numas);
>> +}
>> +
>>  static struct rr_numa *
>>  rr_numa_list_lookup(struct rr_numa_list *rr, int numa_id)
>> @@ -5601,8 +5607,14 @@ get_dry_run_variance(struct dp_netdev *dp, 
>> uint32_t *core_list,
>>          numa = rr_numa_list_lookup(&rr, numa_id);
>>          if (!numa) {
>> -            /* Abort if cross NUMA polling. */
>> -            VLOG_DBG("PMD auto lb dry run."
>> -                     " Aborting due to cross-numa polling.");
>> -            goto cleanup;
>> +            /* Check if there is just one NUMA with pmds,
>> +             * in that case estimates will be valid. */
>> +            if (rr_numa_list_count(&rr) == 1) {
>> +                numa = rr_numa_list_next(&rr, NULL);
>> +            }
>> +            if (!numa) {
>> +                VLOG_DBG("PMD auto lb dry run."
>> +                        " Aborting due to cross-numa polling.");
>> +                goto cleanup;
>> +            }
> 
> The checks look a bit miss-ordered? Would just adding the check, and 
> remove it if we ever support it, be more clear? So something like this:
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 816945375..ed348782f 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -5598,14 +5598,21 @@ get_dry_run_variance(struct dp_netdev *dp, 
> uint32_t *core_list,
> 
>       for (int i = 0; i < n_rxqs; i++) {
>           int numa_id = netdev_get_numa_id(rxqs[i]->port->netdev);
> -        numa = rr_numa_list_lookup(&rr, numa_id);
> -        if (!numa) {
> +
> +        if (rr_numa_list_count(&rr) > 1) {


Having more than one numa is fine in general. It only becomes an issue
for estimates if we can't get a pmd on the local numa and we need to do
cross-numa polling. So we check it after we find that there are no local
numa pmd cores available.

>               /* Abort if cross NUMA polling. */
>               VLOG_DBG("PMD auto lb dry run."
>                        " Aborting due to cross-numa polling.");
>               goto cleanup;
>           }
> 
> +        numa = rr_numa_list_lookup(&rr, numa_id);
> +        if (!numa) {
> +            VLOG_DBG("PMD auto lb dry run."
> +                     " Aborting can't find numa id %d.”, numa_id);
> +            goto cleanup;
> +        }
> +
>           pmd = rr_numa_get_pmd(numa, true);
>           VLOG_DBG("PMD auto lb dry run. Predicted: Core %d on numa node 
> %d "
>                     "to be assigned port \'%s\' rx queue %d "
> 
> 
>>          }
>>
>> -- 
>> 2.29.2
>>
>> _______________________________________________
>> dev mailing list
>> dev at openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev



More information about the dev mailing list