[ovs-dev] [PATCH v2 1/1] ovs-thread: Fix thread id for threads not started with ovs_thread_create()
Ilya Maximets
i.maximets at samsung.com
Mon May 28 13:42:23 UTC 2018
On 28.05.2018 14:06, Eelco Chaudron wrote:
> When ping-pong'in a live VM migration between two machines running
> OVS-DPDK every now and then the ping misses would increase
> dramatically. For example:
>
> ===========Stream Rate: 3Mpps===========
> No Stream_Rate Downtime Totaltime Ping_Loss Moongen_Loss
> 0 3Mpps 128 13974 115 7168374
> 1 3Mpps 145 13620 17 1169770
> 2 3Mpps 140 14499 116 7141175
> 3 3Mpps 142 13358 16 1150606
> 4 3Mpps 136 14004 16 1124020
> 5 3Mpps 139 15494 214 13170452
> 6 3Mpps 136 15610 217 13282413
> 7 3Mpps 146 13194 17 1167512
> 8 3Mpps 148 12871 16 1162655
> 9 3Mpps 137 15615 214 13170656
>
> I identified this issue being introduced in OVS commit
> https://github.com/openvswitch/ovs/commit/f3e7ec254738364101eed8f04b1d954cb510615c
> and more specific due to DPDK commit
> http://dpdk.org/browse/dpdk/commit/?id=af14759181240120f76c82f894982e8f33f0ba2a
>
> The combined changes no longer have OVS start the vhost socket polling
> thread at startup, but DPDK will do it on its own when the first vhost
> client is started.
>
> Figuring out the reason why this happens kept me puzzled for quite some time...
> What happens is that the callbacks called from the vhost thread are
> calling ovsrcu_synchronize() as part of destroy_device(). This will
> end-up calling seq_wait__(), and the problem is with this part:
>
> 176static void
> 177seq_wait__(struct seq *seq, uint64_t value, const char *where)
> 178 OVS_REQUIRES(seq_mutex)
> 179{
>>> 180 unsigned int id = ovsthread_id_self();
> 181 uint32_t hash = hash_int(id, 0);
> 182 struct seq_waiter *waiter;
> 183
> 184 HMAP_FOR_EACH_IN_BUCKET (waiter, hmap_node, hash, &seq->waiters) {
>>> 185 if (waiter->ovsthread_id == id) {
> 186 if (waiter->value != value) {
> 187 /* The current value is different from the value we've already
> 188 * waited for, */
> 189 poll_immediate_wake_at(where);
> 190 } else {
> 191 /* Already waiting on 'value', nothing more to do. */
> 192 }
>>> 193 return;
> 194 }
> 195 }
> 196
>
> By default, all created threads outside of OVS will get thread id 0,
> which is equal to the main ovs thread. So for example in the function
> above if the main thread is waiting already we won't add ourselves as
> a waiter.
>
> The fix below assigns UINT_MAX to none OVS created threads, which will
> get updated to a valid ID on the first call to ovsthread_id_self().
>
> v1 -> v2:
> =========
> Included Ilya's suggestion to automatically assign an ID in
> ovsthread_id_self().
>
> Signed-off-by: Eelco Chaudron <echaudro at redhat.com>
> ---
> lib/ovs-thread.c | 22 +++++++++++++++++-----
> lib/ovs-thread.h | 12 +++++++++++-
> vswitchd/ovs-vswitchd.c | 2 ++
> 3 files changed, 30 insertions(+), 6 deletions(-)
>
> diff --git a/lib/ovs-thread.c b/lib/ovs-thread.c
> index f8bc06d38..9cc194b73 100644
> --- a/lib/ovs-thread.c
> +++ b/lib/ovs-thread.c
> @@ -315,7 +315,7 @@ ovs_barrier_block(struct ovs_barrier *barrier)
> }
> }
>
> -DEFINE_EXTERN_PER_THREAD_DATA(ovsthread_id, 0);
> +DEFINE_EXTERN_PER_THREAD_DATA(ovsthread_id, OVSTHREAD_ID_UNSET);
>
> struct ovsthread_aux {
> void *(*start)(void *);
> @@ -323,17 +323,29 @@ struct ovsthread_aux {
> char name[16];
> };
>
> +unsigned int
> +ovsthread_id_init(void)
> +{
> + static atomic_count next_id = ATOMIC_COUNT_INIT(0);
> +
> + unsigned int id = *ovsthread_id_get();
> +
> + if (id == OVSTHREAD_ID_UNSET) {
I'm not sure, if we need to check the id here. I think, 'init' function
should not be called twice. Also, you're checking the id before calling.
How about just adding the assert?
> + id = atomic_count_inc(&next_id);
> + *ovsthread_id_get() = id;
> + }
> +
> + return id;
> +}
> +
> static void *
> ovsthread_wrapper(void *aux_)
> {
> - static atomic_count next_id = ATOMIC_COUNT_INIT(1);
> -
> struct ovsthread_aux *auxp = aux_;
> struct ovsthread_aux aux;
> unsigned int id;
>
> - id = atomic_count_inc(&next_id);
> - *ovsthread_id_get() = id;
> + id = ovsthread_id_init();
>
> aux = *auxp;
> free(auxp);
> diff --git a/lib/ovs-thread.h b/lib/ovs-thread.h
> index 03fd80439..6b8c285a8 100644
> --- a/lib/ovs-thread.h
> +++ b/lib/ovs-thread.h
> @@ -465,14 +465,24 @@ void *ovsthread_getspecific(ovsthread_key_t);
> * This thread ID avoids the problem.
> */
>
> +#define OVSTHREAD_ID_UNSET UINT_MAX
> DECLARE_EXTERN_PER_THREAD_DATA(unsigned int, ovsthread_id);
>
> +/* Initializes the unique per thread identifier if not done already */
> +unsigned int ovsthread_id_init(void);
> +
> /* Returns a per-thread identifier unique within the lifetime of the
> * process. */
> static inline unsigned int
> ovsthread_id_self(void)
> {
> - return *ovsthread_id_get();
> + unsigned int id = *ovsthread_id_get();
> +
> + if (OVS_UNLIKELY(id == OVSTHREAD_ID_UNSET)) {
> + id = ovsthread_id_init();
> + }
> +
> + return id;
> }
>
> /* Simulated global counter.
> diff --git a/vswitchd/ovs-vswitchd.c b/vswitchd/ovs-vswitchd.c
> index 414b54780..46da45db9 100644
> --- a/vswitchd/ovs-vswitchd.c
> +++ b/vswitchd/ovs-vswitchd.c
> @@ -39,6 +39,7 @@
> #include "ovsdb-idl.h"
> #include "ovs-rcu.h"
> #include "ovs-router.h"
> +#include "ovs-thread.h"
> #include "openvswitch/poll-loop.h"
> #include "simap.h"
> #include "stream-ssl.h"
> @@ -78,6 +79,7 @@ main(int argc, char *argv[])
> int retval;
>
> set_program_name(argv[0]);
> + ovsthread_id_init();
>
> ovs_cmdl_proctitle_init(argc, argv);
> service_start(&argc, &argv);
>
More information about the dev
mailing list