[ovs-dev] [PATCH v10 1/1] Avoid dp_hash recirculation for balance-tcp bond selection mode

Vishal Deep Ajmera vishal.deep.ajmera at ericsson.com
Thu Jan 30 11:30:00 UTC 2020


>
> So, the root cause of all the issues in this patch, in my understanding,
> is the fact that you need to collect statistics for all the bond hashes
> in order to be able to rebalance traffic.  This forces you to have access
> to PMD local caches.
>
> The basic idea for overcoming this issue is to drop the PMD-local bond
> cache and use an RCU-protected data structure instead.
>
> Memory layout scheme in current patch consists of 3 layers:
>   1. Global hash map for all bonds. One for the whole datapath.
>      Protected by dp->bond_mutex.
>   2. Hash map of all the bonds. One for each PMD thread.
>      Protected by pmd->bond_mutex.
>   3. Hash map of all the bonds. Local copy that could be used
>      lockless, but only by the owning PMD thread.
>
> Suggested layout #1:
>   Single global concurrent hash map (cmap) for all bonds. One for the whole
>   datapath.  Bond addition/deletion protected by the dp->bond_mutex.
>   Reads are lockless since protected by RCU.  Statistics updates must be
>   fully atomic (i.e. atomic_add_relaxed()).
>
> Suggested layout #2:
>   One cmap for each PMD thread (no global one).  Bond addition/deletion
>   protected by the pmd->bond_mutex.  Reads are lockless since protected
>   by RCU.  Statistics updates should be atomic in terms of reads and writes.
>   (non_atomic_ullong_add() function could be used).
>   (This is similar to how we store per-PMD flow tables.)
>
> #1 will consume a lot less memory, but could scale worse in case of too many
> threads trying to send traffic to the same bond port.  #2 might be a bit
> faster and more scalable in terms of performance, but less efficient in
> memory consumption and might be slower in terms of response to slave updates
> since we have to update hash maps on all the threads.
> Neither solution requires any reconfiguration of running PMD threads.
>
> Note that to update a cmap entry, you will likely need to prepare a new cmap
> node and use cmap_replace().  Copying statistics in this case might be a bit
> tricky because you may lose some of the additions made while PMD threads
> are still using the old entry.  To avoid that, statistics copying
> should be RCU-postponed.  However, I'm not sure if we need highly accurate
> stats there.
>
> Any thoughts and suggestions (alternative solutions) are welcome.

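For reference, a minimal sketch of the two statistics-update styles named in
the suggested layouts above.  It uses plain C11 <stdatomic.h> as a stand-in
for the OVS atomics, and every name in it (struct bond_entry, BOND_BUCKETS,
port_map, bond_entry_select(), bond_entry_take_stats(),
counter_add_single_writer()) is hypothetical and not taken from the patch.
The cmap lookup and the RCU protection are left out; only the counter
handling is shown.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define BOND_BUCKETS 256        /* Assumed bucket count (hypothetical). */
static_assert((BOND_BUCKETS & (BOND_BUCKETS - 1)) == 0,
              "BOND_BUCKETS must be a power of two");

/* Hypothetical bond entry.  In layout #1 a single instance is shared by
 * all PMD threads; in layout #2 each PMD thread owns its own instance. */
struct bond_entry {
    uint32_t bond_id;
    uint32_t port_map[BOND_BUCKETS];     /* Hash bucket -> output port. */
    atomic_ullong n_bytes[BOND_BUCKETS]; /* Per-bucket byte counters. */
};

/* Layout #1 style: many threads may hit the same bucket, so the update
 * is a full atomic read-modify-write with relaxed ordering. */
static inline uint32_t
bond_entry_select(struct bond_entry *bond, uint32_t hash,
                  unsigned long long bytes)
{
    uint32_t bucket = hash & (BOND_BUCKETS - 1);

    atomic_fetch_add_explicit(&bond->n_bytes[bucket], bytes,
                              memory_order_relaxed);
    return bond->port_map[bucket];
}

/* Layout #2 style: only the owning thread writes, so a relaxed load
 * followed by a relaxed store is enough.  Readers never see a torn
 * value, but this is NOT safe with more than one writer. */
static inline void
counter_add_single_writer(atomic_ullong *counter, unsigned long long n)
{
    unsigned long long value;

    value = atomic_load_explicit(counter, memory_order_relaxed);
    atomic_store_explicit(counter, value + n, memory_order_relaxed);
}

/* Rebalancer side: fetch and reset a bucket counter in one atomic step,
 * so no concurrent addition is lost between the read and the reset. */
static inline unsigned long long
bond_entry_take_stats(struct bond_entry *bond, uint32_t bucket)
{
    return atomic_exchange_explicit(&bond->n_bytes[bucket], 0,
                                    memory_order_relaxed);
}
```

The exchange in bond_entry_take_stats() only covers a counter that stays in
place.  It does not address the cmap_replace() case from the review, where
the counters move to a new node while PMD threads may still be adding to the
old one.
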
Thanks Ilya. I will have a look at it and implement it in the next patch-set.

Warm Regards,
Vishal Ajmera


