[ovs-dev] [PATCH] dpif-netdev: Fix unsafe accesses to pmd polling lists.

Ilya Maximets i.maximets at samsung.com
Tue Feb 12 07:40:30 UTC 2019


On 11.02.2019 22:15, Aaron Conole wrote:
> Ilya Maximets <i.maximets at samsung.com> writes:
> 
>> All accesses to 'pmd->poll_list' should be guarded by
>> 'pmd->port_mutex'. Additionally, fixed inappropriate usage
>> of 'HMAP_FOR_EACH_SAFE' (the hmap doesn't change inside the loop)
>> and dropped the unneeded local variable 'proc_cycles'.
>>
>> CC: Nitin Katiyar <nitin.katiyar at ericsson.com>
>> Fixes: 5bf84282482a ("Adding support for PMD auto load balancing")
>> Signed-off-by: Ilya Maximets <i.maximets at samsung.com>
>> ---
>>  lib/dpif-netdev.c | 14 +++++++++-----
>>  1 file changed, 9 insertions(+), 5 deletions(-)
>>
>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>> index 4d5bc576a..914b2bb8b 100644
>> --- a/lib/dpif-netdev.c
>> +++ b/lib/dpif-netdev.c
>> @@ -3776,9 +3776,12 @@ set_pmd_auto_lb(struct dp_netdev *dp)
>>              continue;
>>          }
>>  
>> +        ovs_mutex_lock(&pmd->port_mutex);
>>          if (hmap_count(&pmd->poll_list) > 1) {
>>              multi_rxq = true;
>>          }
>> +        ovs_mutex_unlock(&pmd->port_mutex);
> 
> What does this mutex provide here?  I ask because as soon as we unlock,
> the result is no longer assured, and we've used it to inform the
> multi_rxq value.
> 
> Are you afraid that the read is non-atomic so you'll end up with the
> multi_rxq condition true, even when it should not be?  That can happen
> anyway - as soon as we unlock the count can change.
> 
> Maybe there's a better change to support that, where hmap_count does an
> atomic read, or there's a version that can guarantee a full load with
> no intervening writes.  But I don't see 
> 
> Maybe the pmd object lifetime goes away... but that wouldn't make sense,
> because then the port_mutex itself would be invalid.
> 
> I'm confused what it does to add a lock here for either the safety, or
> the logic.  I probably missed something.  Or maybe it's just trying to
> be safe in case the hmap implementation would change in the future?  I
> think if that's the case, there might be an alternate to implement?

The mutex here is mostly for code consistency. We mark 'poll_list' as
OVS_GUARDED, which means we should actually guard it. Unfortunately, unlike C++,
in C we can not use OVS_GUARDED_BY(port_mutex) inside the structure, so
clang cannot track which mutex should protect the variable. I agree that we
don't really need to protect this read: the only writer for 'poll_list'
is the main thread, and 'set_pmd_auto_lb' is also called only from the main thread,
so read accesses here need no protection today.
However, for safety, we should not rely on these details, so that we don't
provide code snippets that could be copied to other (less safe) places and
produce issues there, especially because clang can't protect us from this kind
of mistake. The code architecture could also change in the future.
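
To illustrate (a rough sketch, not the exact definition from dpif-netdev.c,
and the field comments are my paraphrase): OVS_GUARDED only documents that
*some* lock protects the member, while OVS_GUARDED_BY() would let clang's
-Wthread-safety analysis verify the pairing, which is what we lose in C:

    /* Simplified sketch of the PMD thread structure as it relates to this
     * patch.  OVS_GUARDED says 'poll_list' is protected by a lock, but not
     * which one, so clang cannot warn about unlocked accesses.
     * OVS_GUARDED_BY(port_mutex) would, but referencing a sibling member
     * that way is only supported for C++. */
    struct dp_netdev_pmd_thread {
        /* ... */
        struct ovs_mutex port_mutex;        /* Protects 'poll_list'. */
        struct hmap poll_list OVS_GUARDED;  /* Rx queues polled by this thread. */
        /* ... */
    };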
As a developer I'd like to always assume that any access to an hmap is not
thread-safe, regardless of its internal implementation, just as I assume that
a cmap is safe for readers.
This is affordable since we're not on a hot path.
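
With the structures touched by this patch, that rule of thumb looks roughly
like this (just a sketch of the convention, not a quote from the code):

    /* 'dp->poll_threads' is a cmap, so readers may iterate it without a
     * lock; 'pmd->poll_list' is an hmap, so it is read under
     * 'pmd->port_mutex' even though, today, only the main thread writes
     * to it. */
    struct dp_netdev_pmd_thread *pmd;
    struct rxq_poll *poll;

    CMAP_FOR_EACH (pmd, node, &dp->poll_threads) {   /* No lock needed. */
        ovs_mutex_lock(&pmd->port_mutex);
        HMAP_FOR_EACH (poll, node, &pmd->poll_list) {
            /* ... look at poll->rxq ... */
        }
        ovs_mutex_unlock(&pmd->port_mutex);
    }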

> 
>> +
>>          if (cnt && multi_rxq) {
>>                  enable_alb = true;
>>                  break;
>> @@ -5095,7 +5098,7 @@ pmd_rebalance_dry_run(struct dp_netdev *dp)
>>      uint64_t improvement = 0;
>>      uint32_t num_pmds;
>>      uint32_t *pmd_corelist;
>> -    struct rxq_poll *poll, *poll_next;
>> +    struct rxq_poll *poll;
>>      bool ret;
>>  
>>      num_pmds = cmap_count(&dp->poll_threads);
>> @@ -5121,13 +5124,14 @@ pmd_rebalance_dry_run(struct dp_netdev *dp)
>>          /* Estimate the cycles to cover all intervals. */
>>          total_cycles *= PMD_RXQ_INTERVAL_MAX;
>>  
>> -        HMAP_FOR_EACH_SAFE (poll, poll_next, node, &pmd->poll_list) {
>> -            uint64_t proc_cycles = 0;
>> +        ovs_mutex_lock(&pmd->port_mutex);
>> +        HMAP_FOR_EACH (poll, node, &pmd->poll_list) {
>>              for (unsigned i = 0; i < PMD_RXQ_INTERVAL_MAX; i++) {
>> -                proc_cycles += dp_netdev_rxq_get_intrvl_cycles(poll->rxq, i);
>> +                total_proc += dp_netdev_rxq_get_intrvl_cycles(poll->rxq, i);
>>              }
>> -            total_proc += proc_cycles;
>>          }
>> +        ovs_mutex_unlock(&pmd->port_mutex);
>> +
> 
> This lock, otoh, makes sense to me.
> 
>>          if (total_proc) {
>>              curr_pmd_usage[num_pmds] = (total_proc * 100) / total_cycles;
>>          }
> 
> 

