[ovs-dev] [PATCH] dpif-netdev: Fix improper use of CMAP_FOR_EACH.

Daniele Di Proietto diproiettod at vmware.com
Wed Feb 3 05:10:30 UTC 2016



On 02/02/2016 10:39, "Jarno Rajahalme" <jarno at ovn.org> wrote:

>With the small nits below:
>
>Acked-by: Jarno Rajahalme <jarno at ovn.org>

Thanks Jarno and Ilya for the reviews!

I applied this on master and branch-2.5

>
>> On Jan 26, 2016, at 9:13 PM, Daniele Di Proietto
>><diproiettod at vmware.com> wrote:
>> 
>> It is ok to iterate a cmap with CMAP_FOR_EACH and remove elements with
>> cmap_remove(), but having quiescent states inside the loop might create
>> problems, since some of the postponed cleanup done inside the cmap might
>> be executed, leaving the iterator in an inconsisted state.
>
>³inconsisted² -> ³inconsistent²
>
>It seems to me that the root of the problem is that when CMAP is rehashed
>by cmap_remove() and the thread quiesces during the iteration, the old
>cmap_impl that is still referred to by the iterator is freed and either
>trashed by the allocator or allocated for some other purpose. The rest of
>the iteration can then follow random pointers, which is not good. Hence
>the ³leaving the iterator in an inconsistent state², while correct, is a
>bit of an understatement.

I changed it to "freeing the memory that the iterator is using"

>
>I think we should add a comment about not quiescing while iterating to
>lib/cmap.h.

Good idea, I sent a patch:

http://openvswitch.org/pipermail/dev/2016-February/065560.html

>
>> 
>> We had several of these errors in dpif-netdev, because when we rearrange
>> ports or threads we ofter need to wait on a condition variable (which
>> implies a quiescent state).
>> 
>> This problem caused iterations to skip elements or to list them twice,
>> resulting in the main thread waiting on a condition without anyone else
>> to signal.
>> 
>> Fix these cases by moving the possible quiescent states outside
>> CMAP_FOR_EACH loops.
>> 
>> Signed-off-by: Daniele Di Proietto <diproiettod at vmware.com>
>> ---
>> lib/dpif-netdev.c | 32 +++++++++++++++++++++++++++++---
>> 1 file changed, 29 insertions(+), 3 deletions(-)
>> 
>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>> index ad6b202..1ec04d5 100644
>> --- a/lib/dpif-netdev.c
>> +++ b/lib/dpif-netdev.c
>> @@ -933,6 +933,7 @@ dp_netdev_free(struct dp_netdev *dp)
>> 
>>     ovs_mutex_lock(&dp->port_mutex);
>>     CMAP_FOR_EACH (port, node, &dp->ports) {
>> +        /* PMD threads are destroyed here. do_del_port() cannot
>>quiesce */
>>         do_del_port(dp, port);
>>     }
>>     ovs_mutex_unlock(&dp->port_mutex);
>> @@ -2914,10 +2915,24 @@ static void
>> dp_netdev_destroy_all_pmds(struct dp_netdev *dp)
>> {
>>     struct dp_netdev_pmd_thread *pmd;
>> +    struct dp_netdev_pmd_thread **pmd_list;
>> +    size_t i = 0, n_pmds;
>> +
>> +    n_pmds = cmap_count(&dp->poll_threads);
>> +    pmd_list = xcalloc(n_pmds, sizeof *pmd_list);
>> 
>>     CMAP_FOR_EACH (pmd, node, &dp->poll_threads) {
>> -        dp_netdev_del_pmd(dp, pmd);
>> +        /* We cannot call dp_netdev_del_pmd(), since it alters
>> +         * 'dp->poll_threads' (while we're iterating it) and it
>> +         * might quiesce. */
>> +        ovs_assert(i < n_pmds);
>> +        pmd_list[i++] = pmd;
>> +    }
>> +
>
>
>So ŒI¹ should be equal to ¹n_pmd¹ here? If you are asserting inside the
>loop above, maybe assert here, too, or then remove also the assert above?
>Or better yet, use another iteration variable for the second loop, like
>you do for dp_netdev_del_pmds_on_numa() below.

Makes sense, thanks.  I've introduced another variable
and made it like the loop below.

>
>> +    for (i = 0; i < n_pmds; i++) {
>> +        dp_netdev_del_pmd(dp, pmd_list[i]);
>>     }
>> +    free(pmd_list);
>> }
>> 
>> /* Deletes all pmd threads on numa node 'numa_id' and
>> @@ -2928,18 +2943,28 @@ dp_netdev_del_pmds_on_numa(struct dp_netdev
>>*dp, int numa_id)
>>     struct dp_netdev_pmd_thread *pmd;
>>     int n_pmds_on_numa, n_pmds;
>>     int *free_idx, k = 0;
>> +    struct dp_netdev_pmd_thread **pmd_list;
>> 
>>     n_pmds_on_numa = get_n_pmd_threads_on_numa(dp, numa_id);
>> -    free_idx = xmalloc(n_pmds_on_numa * sizeof *free_idx);
>> +    free_idx = xcalloc(n_pmds_on_numa, sizeof *free_idx);
>> +    pmd_list = xcalloc(n_pmds_on_numa, sizeof *pmd_list);
>> 
>>     CMAP_FOR_EACH (pmd, node, &dp->poll_threads) {
>> +        /* We cannot call dp_netdev_del_pmd(), since it alters
>> +         * 'dp->poll_threads' (while we're iterating it) and it
>> +         * might quiesce. */
>>         if (pmd->numa_id == numa_id) {
>>             atomic_read_relaxed(&pmd->tx_qid, &free_idx[k]);
>> +            pmd_list[k] = pmd;
>> +            ovs_assert(k < n_pmds_on_numa);
>>             k++;
>> -            dp_netdev_del_pmd(dp, pmd);
>>         }
>>     }
>> 
>> +    for (int i = 0; i < k; i++) {
>> +        dp_netdev_del_pmd(dp, pmd_list[i]);
>> +    }
>> +
>>     n_pmds = get_n_pmd_threads(dp);
>>     CMAP_FOR_EACH (pmd, node, &dp->poll_threads) {
>>         int old_tx_qid;
>> @@ -2953,6 +2978,7 @@ dp_netdev_del_pmds_on_numa(struct dp_netdev *dp,
>>int numa_id)
>>         }
>>     }
>> 
>> +    free(pmd_list);
>>     free(free_idx);
>> }
>> 
>> -- 
>> 2.1.4
>> 
>> _______________________________________________
>> dev mailing list
>> dev at openvswitch.org
>> http://openvswitch.org/mailman/listinfo/dev
>




More information about the dev mailing list