[ovs-dev] [PATCH 3/4] dpif-netdev: Avoid port's reconfiguration on pmd-cpu-mask changes.

Ilya Maximets i.maximets at samsung.com
Mon May 29 15:37:37 UTC 2017


On 29.05.2017 18:00, Kevin Traynor wrote:
> On 05/29/2017 02:31 PM, Ilya Maximets wrote:
>> On 29.05.2017 16:26, Kevin Traynor wrote:
>>> On 05/29/2017 01:22 PM, Ilya Maximets wrote:
>>>> On 26.05.2017 20:14, Kevin Traynor wrote:
>>>>> On 05/26/2017 03:55 PM, Ilya Maximets wrote:
>>>>>> On 10.03.2017 07:27, Daniele Di Proietto wrote:
>>>>>>> 2017-02-21 6:49 GMT-08:00 Ilya Maximets <i.maximets at samsung.com>:
>>>>>>>> Reconfiguration of HW NICs may lead to packet drops.
>>>>>>>> In the current model, all physical ports are reconfigured each
>>>>>>>> time the number of PMD threads changes. Since we do not stop
>>>>>>>> the threads on pmd-cpu-mask changes, this patch helps to further
>>>>>>>> decrease port downtime by requesting the maximum possible number
>>>>>>>> of tx queues, avoiding unnecessary reconfigurations.
>>>>>>>>
>>>>>>>> Signed-off-by: Ilya Maximets <i.maximets at samsung.com>
>>>>>>>
>>>>>>> I haven't thought this through a lot, but the last big series we pushed
>>>>>>> on master went exactly in the opposite direction, i.e. created one txq
>>>>>>> for each thread in the datapath.
>>>>>>>
>>>>>>> I thought this was a good idea because:
>>>>>>>
>>>>>>> * On some systems with hyperthreading we can have a lot of cpus (we received
>>>>>>>    reports of systems with 72 cores). If you want to use only a couple of
>>>>>>>    cores, you're still forced to have a lot of unused txqs, which prevents
>>>>>>>    you from having lockless tx.
>>>>>>> * We thought that reconfiguring the number of pmds would not be a frequent
>>>>>>>    operation.
>>>>>>>
>>>>>>> Why do you want to reconfigure the threads that often?  Is it to be
>>>>>>> able to support more throughput quickly?
>>>>>>
>>>>>> Yes.
>>>>>>
>>>>>>> In this case I think we shouldn't use the number of cpus,
>>>>>>> but something else coming from the user, so that the administrator can
>>>>>>> balance how quickly pmd threads can be reconfigured vs. how many txqs
>>>>>>> should be on the system.
>>>>>>> I'm not sure how to make this user friendly though.
>>>>>>>
>>>>>>> What do you think?
>>>>>>
>>>>>> Right now I'm thinking about a combined solution which addresses
>>>>>> both issues (too many TXQs and frequent device reconfiguration).
>>>>>> I think we can implement an additional function to get the port's limitations.
>>>>>> For now we can request the maximum number of TX queues from the netdev and
>>>>>> use it while the number of cores is less than the number of queues.
>>>>>> Something like this:
>>>>>>
>>>>>> lib/netdev-dpdk.c:
>>>>>> /* Returns the maximum number of Tx queues reported by the HW. */
>>>>>> uint32_t netdev_dpdk_get_max_txqs(struct netdev *netdev)
>>>>>> {
>>>>>>     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>>>>>>     struct rte_eth_dev_info dev_info;
>>>>>>
>>>>>>     ovs_mutex_lock(&dev->mutex);
>>>>>>     rte_eth_dev_info_get(dev->port_id, &dev_info);
>>>>>>     ovs_mutex_unlock(&dev->mutex);
>>>>>>
>>>>>>     return dev_info.max_tx_queues;
>>>>>> }
>>>>>>
>>>>>> lib/dpif-netdev.c:reconfigure_datapath():
>>>>>>
>>>>>>     <----->
>>>>>>     max_tx_queues = netdev_get_max_txqs(port->netdev);
>>>>>>     number_of_threads = cmap_count(&dp->poll_threads);
>>>>>>     wanted_txqs = MAX(max_tx_queues, number_of_threads);
>>>>>>     <----->
>>>>>>
>>>>>> In this implementation there will be no additional locking if the number
>>>>>> of threads is less than or equal to the maximum possible number of tx
>>>>>> queues in HW.
>>>>>>
>>>>>> What do you think? Ian? Darrell?
>>>>>>
>>>>>
>>>>> I'm not sure about using rte_eth_dev_info_get() as IIRC there were
>>>>> problems previously with it reporting a number, but then when you went
>>>>> to configure the queues, not all of them were available, depending on
>>>>> the mode. That was why the trial-and-error queue configuration was put in.
>>>>>
>>>>> What about replacing 'max_tx_queues' above with a number like 16 that is
>>>>> likely to be supported by the ports and unlikely to be exceeded by
>>>>> number_of_threads?
>>>>>
>>>>> Kevin.
>>>>
>>>> Hi Kevin. Thank you for reminding me of this issue.
>>>>
>>>> But I think that magic numbers are not a good solution.
>>>>
>>>> One issue with my first implementation is that the desired number of queues
>>>> is not actually the same as the required number.
>>>>
>>>> How about something like this:
>>>> <----------------------------------------------------------------->
>>>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>>>> index 97f72d3..1a18e4f 100644
>>>> --- a/lib/dpif-netdev.c
>>>> +++ b/lib/dpif-netdev.c
>>>> @@ -3448,7 +3448,7 @@ reconfigure_datapath(struct dp_netdev *dp)
>>>>  {
>>>>      struct dp_netdev_pmd_thread *pmd;
>>>>      struct dp_netdev_port *port;
>>>> -    int wanted_txqs;
>>>> +    int needed_txqs, wanted_txqs;
>>>>  
>>>>      dp->last_reconfigure_seq = seq_read(dp->reconfigure_seq);
>>>>  
>>>> @@ -3456,7 +3456,15 @@ reconfigure_datapath(struct dp_netdev *dp)
>>>>       * on the system and the user configuration. */
>>>>      reconfigure_pmd_threads(dp);
>>>>  
>>>> -    wanted_txqs = cmap_count(&dp->poll_threads);
>>>> +    /* We need 1 Tx queue for each thread to avoid locking, but we will try
>>>> +     * to allocate the maximum possible value to minimize the number of port
>>>> +     * reconfigurations. */
>>>> +    needed_txqs = cmap_count(&dp->poll_threads);
>>>> +    /* (n_cores + 1) is the maximum that we might need to have.
>>>> +     * Additional queue is for non-PMD threads. */
>>>> +    wanted_txqs = ovs_numa_get_n_cores();
>>>> +    ovs_assert(wanted_txqs != OVS_CORE_UNSPEC);
>>>> +    wanted_txqs++;
>>>>  
>>>>      /* The number of pmd threads might have changed, or a port can be new:
>>>>       * adjust the txqs. */
>>>> @@ -3469,9 +3477,13 @@ reconfigure_datapath(struct dp_netdev *dp)
>>>>  
>>>>      /* Check for all the ports that need reconfiguration.  We cache this in
>>>>       * 'port->reconfigure', because netdev_is_reconf_required() can change at
>>>> -     * any time. */
>>>> +     * any time.  Also mark for reconfiguration all ports which will likely
>>>> +     * change their 'dynamic_txqs' parameter. It's required to stop using
>>>> +     * them before changing this setting. */
>>>>      HMAP_FOR_EACH (port, node, &dp->ports) {
>>>> -        if (netdev_is_reconf_required(port->netdev)) {
>>>> +        if (netdev_is_reconf_required(port->netdev)
>>>> +            || (port->dynamic_txqs
>>>> +                !=  netdev_n_txq(port->netdev) < needed_txqs)) {
>>>
>>> I'm not quite sure why there needs to be a reconfigure here. It will
>>> again try with the same wanted_txqs. Is it not enough to just update the
>>> dynamic_txqs flag as is done later?
>>
>> It's not enough because a reconfiguration will not be requested if we
>> try to set up the same number of queues a second time.
>>
> 
> My question is - why request a reconfigure at all based on needed_txqs?
> 
> We already configured requesting wanted_txqs (=ovs_numa_get_n_cores())
> and now we would configure again requesting the same num wanted_txqs,
> regardless of the new needed_txqs value.
> 
> Could we just update the dynamic_txqs flag (as you have below) to
> reflect any new difference between the configured txqs and needed_txqs?
> Maybe I missed some part?
> 
> Kevin.

Ok, I got it. The reason is that we need to remove these ports
from the PMD threads. It's simpler to mark the ports here and let
'pmd_remove_stale_ports' remove them from the threads.

After that there will be no actual reconfiguration, because 'port_reconfigure'
will just see that nothing changed (the /* Reconfiguration is unnecessary */ branch).

If we wanted to avoid calling 'port_reconfigure', we would have to remove the
ports from the threads in an additional loop and handle them separately
in the reconfiguration loop. I think my solution is simply simpler, and not
really slower, because there will be no actual reconfiguration.

What do you think?

> 
>> See lib/netdev-dpdk.c:netdev_dpdk_set_tx_multiq() for details:
>>
>>      if (dev->requested_n_txq == n_txq) {                                       
>>          goto out;                                                              
>>      }
>>
>>
>>  
>>>>              port->need_reconfigure = true;
>>>>          }
>>>>      }
>>>> @@ -3505,7 +3517,7 @@ reconfigure_datapath(struct dp_netdev *dp)
>>>>              seq_change(dp->port_seq);
>>>>              port_destroy(port);
>>>>          } else {
>>>> -            port->dynamic_txqs = netdev_n_txq(port->netdev) < wanted_txqs;
>>>> +            port->dynamic_txqs = netdev_n_txq(port->netdev) < needed_txqs;
>>>>          }
>>>>      }
>>>>  
>>>> <----------------------------------------------------------------->
>>>>
>>>> Best regards, Ilya Maximets.
>>>>
>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Daniele
>>>>>>>
>>>>>>>> ---
>>>>>>>>  lib/dpif-netdev.c | 6 +++++-
>>>>>>>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>>>>>>>> index 6e575ab..e2b4f39 100644
>>>>>>>> --- a/lib/dpif-netdev.c
>>>>>>>> +++ b/lib/dpif-netdev.c
>>>>>>>> @@ -3324,7 +3324,11 @@ reconfigure_datapath(struct dp_netdev *dp)
>>>>>>>>       * on the system and the user configuration. */
>>>>>>>>      reconfigure_pmd_threads(dp);
>>>>>>>>
>>>>>>>> -    wanted_txqs = cmap_count(&dp->poll_threads);
>>>>>>>> +    /* We need 1 Tx queue for each possible cpu core. */
>>>>>>>> +    wanted_txqs = ovs_numa_get_n_cores();
>>>>>>>> +    ovs_assert(wanted_txqs != OVS_CORE_UNSPEC);
>>>>>>>> +    /* And 1 Tx queue for non-PMD threads. */
>>>>>>>> +    wanted_txqs++;
>>>>>>>>
>>>>>>>>      /* The number of pmd threads might have changed, or a port can be new:
>>>>>>>>       * adjust the txqs. */
>>>>>>>> --
>>>>>>>> 2.7.4
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> _______________________________________________
>>>>>> dev mailing list
>>>>>> dev at openvswitch.org
>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>>>>

