[ovs-discuss] ovs-vswitchd 2.0 has high cpu usage

Zhou, Han hzhou8 at ebay.com
Wed Nov 27 06:37:09 UTC 2013


Hi,

Since there is only one dispatcher thread, it becomes the bottleneck when there are many miss_handler threads on a 32-core machine.

Chengyuan's test shows that most of the high CPU usage of the miss_handler threads is caused by ticket spin locks triggered by futex calls. This may be related to the fact that the miss_handlers are frequently woken up when there are only 1 or 2 upcalls queued for them; they consume those and then wait again, instead of handling upcalls in batch mode (i.e. handling FLOW_MISS_MAX_BATCH upcalls after one wait). We observed this by adding logs that check handler->n_upcalls right after ovs_mutex_cond_wait() returns in the miss_handler. The reason could be that the single dispatcher thread cannot supply upcalls fast enough for so many miss_handlers to work in batch mode.
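
To illustrate, the instrumentation was essentially just a log line right after the wait returns (simplified sketch; the mutex and condition variable field names are paraphrased, not a literal patch):

    ovs_mutex_lock(&handler->mutex);
    while (!handler->n_upcalls) {
        ovs_mutex_cond_wait(&handler->wake_cond, &handler->mutex);
    }
    /* Added for debugging: how many upcalls are pending at wakeup?
     * In our test this is almost always only 1 or 2, not 50. */
    VLOG_INFO("miss handler woke up with %d pending upcalls",
              (int) handler->n_upcalls);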

I suspect it is this frequent wait-and-wakeup pattern in a 32-core environment that results in high CPU, because of the futex implementation. To test this, I forced the ovs_mutex_cond_wait() into a loop that keeps waiting until handler->n_upcalls reaches FLOW_MISS_MAX_BATCH (=50), and observed that the total CPU dropped from ~330% to ~190%! In particular, the CPU of the thread named ovs-vswitchd dropped from ~90% to ~10%, each miss_handler's CPU dropped from ~6% to ~4%, and the dispatcher thread's CPU stayed unchanged at ~60%.

This result supports my speculation to some extent: by forcing the wait loop to accumulate 50 upcalls, the miss_handler then takes a certain amount of time to consume the batch, which increases the probability that the dispatcher schedules more upcalls to it, so the next time the miss_handler checks handler->n_upcalls it is non-zero and it doesn't need to wait. From my test logs, the wait:non-wait ratio for this handler->n_upcalls check decreased after the change. This could lead to fewer futex calls, and the perf profiling result shows a significant increase in flow handling functions such as flow_hash_in_minimask() and a decrease in kernel spin locks and mutex operations.
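
Roughly, the change I tested looks like the following (a simplified sketch, not the literal patch; the mutex/condvar field names are paraphrased):

    ovs_mutex_lock(&handler->mutex);
    /* Experimental: keep waiting until a full batch has been queued by
     * the dispatcher, instead of waking up on the first upcall. */
    while (handler->n_upcalls < FLOW_MISS_MAX_BATCH) {
        ovs_mutex_cond_wait(&handler->wake_cond, &handler->mutex);
    }
    /* ... then dequeue up to FLOW_MISS_MAX_BATCH upcalls and handle
     * them as before ... */
    ovs_mutex_unlock(&handler->mutex);

Of course this is only a test hack (a handler could block for a long time under light load); it just demonstrates the effect of batching on the futex behavior.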

However, there are still things unclear to me:
1. I see in the code that a miss_handler should be woken up only when it has 50 upcalls pending from the dispatcher, so why does my test show that they always wake up from ovs_mutex_cond_wait() with only 1 or 2 upcalls queued (and rarely wake up when there are no upcalls)? Is there another wakeup mechanism I missed?
2. Why did ovs-vswitchd occupy so much CPU in the short-lived-flow test before my change, and why does it drop so dramatically afterwards? What is the contention between ovs-vswitchd and the miss_handlers?

A better solution for this dispatcher bottleneck, in my opinion, could be to have each handler thread receive the upcalls assigned to it directly from the kernel, so that no conditional wait and signal is involved; this avoids unnecessary context switches and the futex scaling problem in a multi-core environment. The handler selection can be done by the kernel with the same kind of hash, but delivering into a separate queue per handler, so packet order is still preserved. Could this be a valid proposal?
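
Just to illustrate the idea with hypothetical names (this is not existing OVS or kernel code):

    /* Hypothetical sketch of kernel-side queue selection: all packets of
     * a given flow hash to the same per-handler queue, so per-flow packet
     * ordering is preserved without a user-space dispatcher thread. */
    static unsigned int
    select_handler_queue(uint32_t flow_hash, unsigned int n_handlers)
    {
        return flow_hash % n_handlers;
    }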

Best regards,
Han Zhou

-----Original Message-----
From: discuss-bounces at openvswitch.org [mailto:discuss-bounces at openvswitch.org] On Behalf Of Ben Pfaff
Sent: November 26, 2013 4:55
To: Ethan Jackson
Cc: discuss at openvswitch.org ML
Subject: Re: [ovs-discuss] ovs-vswitchd 2.0 has high cpu usage

On Sat, Nov 23, 2013 at 03:24:17PM +0800, Chengyuan Li wrote:
> Do you have suggested max threads number?

Ethan, how many threads do you suggest using?  Chengyuan has a 32-core machine and sees high CPU usage with 28 threads, much lower with 4 threads.
_______________________________________________
discuss mailing list
discuss at openvswitch.org
http://openvswitch.org/mailman/listinfo/discuss

