[ovs-dev] [PATCH] dpif-netdev: Fix improper use of CMAP_FOR_EACH.

Jarno Rajahalme jarno at ovn.org
Tue Feb 2 18:39:55 UTC 2016


With the small nits below:

Acked-by: Jarno Rajahalme <jarno at ovn.org>

> On Jan 26, 2016, at 9:13 PM, Daniele Di Proietto <diproiettod at vmware.com> wrote:
> 
> It is ok to iterate a cmap with CMAP_FOR_EACH and remove elements with
> cmap_remove(), but having quiescent states inside the loop might create
> problems, since some of the postponed cleanup done inside the cmap might
> be executed, leaving the iterator in an inconsisted state.

“inconsisted” -> “inconsistent”

It seems to me that the root of the problem is that when CMAP is rehashed by cmap_remove() and the thread quiesces during the iteration, the old cmap_impl that is still referred to by the iterator is freed and either trashed by the allocator or allocated for some other purpose. The rest of the iteration can then follow random pointers, which is not good. Hence the “leaving the iterator in an inconsistent state”, while correct, is a bit of an understatement.

I think we should add a comment about not quiescing while iterating to lib/cmap.h.

> 
> We had several of these errors in dpif-netdev, because when we rearrange
> ports or threads we ofter need to wait on a condition variable (which
> implies a quiescent state).
> 
> This problem caused iterations to skip elements or to list them twice,
> resulting in the main thread waiting on a condition without anyone else
> to signal.
> 
> Fix these cases by moving the possible quiescent states outside
> CMAP_FOR_EACH loops.
> 
> Signed-off-by: Daniele Di Proietto <diproiettod at vmware.com>
> ---
> lib/dpif-netdev.c | 32 +++++++++++++++++++++++++++++---
> 1 file changed, 29 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index ad6b202..1ec04d5 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -933,6 +933,7 @@ dp_netdev_free(struct dp_netdev *dp)
> 
>     ovs_mutex_lock(&dp->port_mutex);
>     CMAP_FOR_EACH (port, node, &dp->ports) {
> +        /* PMD threads are destroyed here. do_del_port() cannot quiesce */
>         do_del_port(dp, port);
>     }
>     ovs_mutex_unlock(&dp->port_mutex);
> @@ -2914,10 +2915,24 @@ static void
> dp_netdev_destroy_all_pmds(struct dp_netdev *dp)
> {
>     struct dp_netdev_pmd_thread *pmd;
> +    struct dp_netdev_pmd_thread **pmd_list;
> +    size_t i = 0, n_pmds;
> +
> +    n_pmds = cmap_count(&dp->poll_threads);
> +    pmd_list = xcalloc(n_pmds, sizeof *pmd_list);
> 
>     CMAP_FOR_EACH (pmd, node, &dp->poll_threads) {
> -        dp_netdev_del_pmd(dp, pmd);
> +        /* We cannot call dp_netdev_del_pmd(), since it alters
> +         * 'dp->poll_threads' (while we're iterating it) and it
> +         * might quiesce. */
> +        ovs_assert(i < n_pmds);
> +        pmd_list[i++] = pmd;
> +    }
> +


So ‘I’ should be equal to ’n_pmd’ here? If you are asserting inside the loop above, maybe assert here, too, or then remove also the assert above? Or better yet, use another iteration variable for the second loop, like you do for dp_netdev_del_pmds_on_numa() below.

> +    for (i = 0; i < n_pmds; i++) {
> +        dp_netdev_del_pmd(dp, pmd_list[i]);
>     }
> +    free(pmd_list);
> }
> 
> /* Deletes all pmd threads on numa node 'numa_id' and
> @@ -2928,18 +2943,28 @@ dp_netdev_del_pmds_on_numa(struct dp_netdev *dp, int numa_id)
>     struct dp_netdev_pmd_thread *pmd;
>     int n_pmds_on_numa, n_pmds;
>     int *free_idx, k = 0;
> +    struct dp_netdev_pmd_thread **pmd_list;
> 
>     n_pmds_on_numa = get_n_pmd_threads_on_numa(dp, numa_id);
> -    free_idx = xmalloc(n_pmds_on_numa * sizeof *free_idx);
> +    free_idx = xcalloc(n_pmds_on_numa, sizeof *free_idx);
> +    pmd_list = xcalloc(n_pmds_on_numa, sizeof *pmd_list);
> 
>     CMAP_FOR_EACH (pmd, node, &dp->poll_threads) {
> +        /* We cannot call dp_netdev_del_pmd(), since it alters
> +         * 'dp->poll_threads' (while we're iterating it) and it
> +         * might quiesce. */
>         if (pmd->numa_id == numa_id) {
>             atomic_read_relaxed(&pmd->tx_qid, &free_idx[k]);
> +            pmd_list[k] = pmd;
> +            ovs_assert(k < n_pmds_on_numa);
>             k++;
> -            dp_netdev_del_pmd(dp, pmd);
>         }
>     }
> 
> +    for (int i = 0; i < k; i++) {
> +        dp_netdev_del_pmd(dp, pmd_list[i]);
> +    }
> +
>     n_pmds = get_n_pmd_threads(dp);
>     CMAP_FOR_EACH (pmd, node, &dp->poll_threads) {
>         int old_tx_qid;
> @@ -2953,6 +2978,7 @@ dp_netdev_del_pmds_on_numa(struct dp_netdev *dp, int numa_id)
>         }
>     }
> 
> +    free(pmd_list);
>     free(free_idx);
> }
> 
> -- 
> 2.1.4
> 
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev




More information about the dev mailing list