[ovs-dev] The bug of sflow and upcall

wei wang lnykww at gmail.com
Tue Apr 24 04:17:38 UTC 2018


We hit a problem with sFlow enabled: packet-miss upcalls to vswitchd
are delayed by about 1 second.
The cause is that sFlow fetches interface stats from the kernel via
rtnetlink, and the rtnetlink request has to wait for rtnl_lock. The
thread doing this also holds the global mutex of the sFlow module. If
ethtool holds rtnl_lock at that moment, the sFlow get_counters function
has to wait for ethtool to release it. ethtool may sleep while holding
the lock, so get_counters can wait a long time, and it holds the sFlow
mutex for that whole time. This blocks the upcall handlers, so miss
upcalls see a large latency.

The flow is:
ethtool -> hold rtnl_lock -> sleep
dpif_sflow_run -> hold sflow mutex ->
    sflow_agent_get_counters -> rtnetlink -> wait rtnl_lock
udpif_upcall_handler -> dpif_sflow_received -> wait sflow mutex

So would unlocking the sFlow mutex before calling get_counters and
re-locking it afterwards resolve this problem?
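
To make the question concrete, here is a minimal pthread sketch of the
pattern. It is not OVS code: sflow_mutex, slow_get_counters,
poller_thread and measure_handler_wait_ms are hypothetical stand-ins,
and the 200 ms sleep only simulates a stats query stuck behind
rtnl_lock. It measures how long a simulated upcall handler waits for
the mutex when the poller keeps it across the slow call versus when it
drops it first.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for the sFlow module's global mutex. */
static pthread_mutex_t sflow_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Used only so the test knows the poller has entered the slow call. */
static pthread_mutex_t sync_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t sync_cond = PTHREAD_COND_INITIALIZER;
static bool in_slow_call;
static bool release_during_poll;

static double now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

/* Simulates a counter query that is stuck for 200 ms (assumed delay). */
static void slow_get_counters(void)
{
    struct timespec delay = { 0, 200 * 1000 * 1000 };

    pthread_mutex_lock(&sync_mutex);
    in_slow_call = true;
    pthread_cond_signal(&sync_cond);
    pthread_mutex_unlock(&sync_mutex);

    nanosleep(&delay, NULL);
}

/* Plays the role of the thread that polls counters. */
static void *poller_thread(void *arg)
{
    (void) arg;
    pthread_mutex_lock(&sflow_mutex);
    if (release_during_poll) {
        /* Proposed change: drop the mutex around the slow call. */
        pthread_mutex_unlock(&sflow_mutex);
        slow_get_counters();
        pthread_mutex_lock(&sflow_mutex);
        /* Real code would have to revalidate any state read earlier. */
    } else {
        /* Current behaviour: the mutex is held across the slow call. */
        slow_get_counters();
    }
    pthread_mutex_unlock(&sflow_mutex);
    return NULL;
}

/* Returns how long (ms) a simulated upcall handler waits for the mutex. */
static double measure_handler_wait_ms(bool release)
{
    pthread_t poller;
    double start, waited;
    int rc;

    release_during_poll = release;
    in_slow_call = false;
    rc = pthread_create(&poller, NULL, poller_thread, NULL);
    assert(rc == 0);

    /* Wait until the poller is inside the slow call. */
    pthread_mutex_lock(&sync_mutex);
    while (!in_slow_call) {
        pthread_cond_wait(&sync_cond, &sync_mutex);
    }
    pthread_mutex_unlock(&sync_mutex);

    /* This is the step where the upcall path needs the sFlow mutex. */
    start = now_ms();
    pthread_mutex_lock(&sflow_mutex);
    waited = now_ms() - start;
    pthread_mutex_unlock(&sflow_mutex);

    pthread_join(poller, NULL);
    return waited;
}
```

Calling measure_handler_wait_ms(false) should report a wait close to
the simulated delay, and measure_handler_wait_ms(true) a wait near
zero. The sketch says nothing about whether dropping the mutex is safe
in the real module; that depends on what state the mutex protects
across the get_counters call.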
-- 
Regards,
Wang Wei

