[ovs-dev] The bug of sflow and upcall
lnykww at gmail.com
Tue Apr 24 04:17:38 UTC 2018
We have hit a problem with sflow enabled: packet-miss upcalls to
vswitchd see a latency of 1 second.
The cause is that sflow fetches interface stats from the kernel via
rtnetlink, and the rtnetlink request has to wait for rtnl_lock. The
thread doing this also holds the global mutex of the sflow module. If
ethtool holds rtnl_lock at that moment, the sflow get_counters function
has to wait for ethtool to release it. But ethtool may sleep, so
get_counters can wait a long time while still holding the sflow mutex.
This blocks the upcall module, and miss-upcall packets see a large
latency.
The flow is:
ethtool -> hold rtnl_lock -> sleep
dpif_sflow_run -> hold sflow mutex -> get_counters -> wait rtnl_lock
udpif_upcall_handler -> dpif_sflow_received -> wait sflow mutex
So would unlocking the sflow mutex before calling get_counters, and
relocking it after the counters are fetched, resolve this problem?
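
To make the question concrete, here is a minimal sketch of the pattern
using plain pthreads. It is not OVS code: demo_mutex, slow_get_counters,
poller and handler_wait_ms are hypothetical stand-ins, and the 200 ms
sleep only simulates a counter query stuck behind rtnl_lock. It compares
how long a second thread waits for the mutex when the poller holds it
across the slow call versus when it drops it first.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for the global sflow mutex (not an OVS symbol). */
static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;

static void sleep_ms(long ms)
{
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

static double now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

/* Simulates a counter query that stalls for 200 ms behind rtnl_lock. */
static void slow_get_counters(void)
{
    sleep_ms(200);
}

/* Poller thread. arg points to a bool: true means drop the mutex
 * around the slow call, false means hold it the whole time. */
static void *poller(void *arg)
{
    bool drop = *(bool *) arg;

    pthread_mutex_lock(&demo_mutex);
    if (drop) {
        pthread_mutex_unlock(&demo_mutex);
        slow_get_counters();
        pthread_mutex_lock(&demo_mutex);
        /* Real code would have to revalidate any shared state here,
         * since other threads may have changed it while unlocked. */
    } else {
        slow_get_counters();
    }
    pthread_mutex_unlock(&demo_mutex);
    return NULL;
}

/* Returns how long (in ms) a handler-like caller waits for the mutex
 * while the poller is running. */
static double handler_wait_ms(bool drop)
{
    pthread_t t;
    double start, waited;
    int rc = pthread_create(&t, NULL, poller, &drop);

    assert(rc == 0);
    (void) rc;
    sleep_ms(50);               /* Let the poller take the mutex first. */
    start = now_ms();
    pthread_mutex_lock(&demo_mutex);
    waited = now_ms() - start;
    pthread_mutex_unlock(&demo_mutex);
    pthread_join(t, NULL);
    return waited;
}
```

Calling handler_wait_ms(false) and handler_wait_ms(true) shows the
difference in wait time. The sketch does not show whether the state
guarded by the sflow mutex can be left unprotected during get_counters;
that would have to be checked against the real code.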