[ovs-discuss] Applying Helgrind thread debugger

William Tu u9012063 at gmail.com
Mon Feb 15 16:01:20 UTC 2016


Hi,

While finishing up the valgrind memory leak tests, I noticed another tool
that comes with valgrind, called helgrind, which can detect pthread
synchronization errors:
http://valgrind.org/docs/manual/hg-manual.html
I ran helgrind on the OVS testsuite and saw a couple of errors. Before digging
into them, I'm not sure whether they are worth fixing or are false positives.
Any feedback or experience with helgrind is welcome.

Here is an example error reported by helgrind for OVS.
There are two locks:
(1) 0x7dd0c0 inside data symbol "netdev_class_mutex"
(2) 0x7dab60 inside data symbol "route_table_mutex"
Helgrind detects an inconsistent lock acquisition order. In netdev_run() we
lock (1), then (2) via route_table_run(). Earlier, in route_table_init(), we
locked (2), then (1) via netdev_open(). If two threads took these two paths
at the same time, a deadlock could happen. Details are below:

==128325== Thread #1: lock order "0x7DAB60 before 0x7DD0C0" violated
==128325==
==128325== Observed (incorrect) order is: acquisition of lock at 0x7DD0C0
==128325==    at 0x4C2CDE7: mutex_lock_WRK (hg_intercepts.c:901)
==128325==    by 0x4C30C3B: pthread_mutex_lock (hg_intercepts.c:917)
==128325==    by 0x4BB757: ovs_mutex_lock_at (ovs-thread.c:76)
==128325==    by 0x47B2F4: netdev_run (netdev.c:180)
==128325==    by 0x406263: main (ovs-vswitchd.c:122)
==128325==
==128325==  followed by a later acquisition of lock at 0x7DAB60
==128325==    at 0x4C2CDE7: mutex_lock_WRK (hg_intercepts.c:901)
==128325==    by 0x4C30C3B: pthread_mutex_lock (hg_intercepts.c:917)
==128325==    by 0x4BB757: ovs_mutex_lock_at (ovs-thread.c:76)
==128325==    by 0x4FC282: route_table_run (route-table.c:124)
==128325==    by 0x47993E: netdev_vport_run (netdev-vport.c:375)
==128325==    by 0x47B379: netdev_run (netdev.c:183)
==128325==    by 0x406263: main (ovs-vswitchd.c:122)
==128325==
==128325== Required order was established by acquisition of lock at 0x7DAB60
==128325==    at 0x4C2CDE7: mutex_lock_WRK (hg_intercepts.c:901)
==128325==    by 0x4C30C3B: pthread_mutex_lock (hg_intercepts.c:917)
==128325==    by 0x4BB757: ovs_mutex_lock_at (ovs-thread.c:76)
==128325==    by 0x4FC142: route_table_init (route-table.c:94)
==128325==    by 0x45FC57: dp_initialize (dpif.c:126)
==128325==    by 0x45FE8D: dp_enumerate_types (dpif.c:246)
==128325==    by 0x418894: ofproto_enumerate_types (ofproto.c:470)
==128325==    by 0x409C84: bridge_run__ (bridge.c:2877)
==128325==    by 0x40F4E3: bridge_run (bridge.c:2940)
==128325==    by 0x406254: main (ovs-vswitchd.c:120)
==128325==
==128325==  followed by a later acquisition of lock at 0x7DD0C0
==128325==    at 0x4C2CDE7: mutex_lock_WRK (hg_intercepts.c:901)
==128325==    by 0x4C30C3B: pthread_mutex_lock (hg_intercepts.c:917)
==128325==    by 0x4BB757: ovs_mutex_lock_at (ovs-thread.c:76)
==128325==    by 0x47B77A: netdev_open (netdev.c:366)
==128325==    by 0x4DE8B9: insert_ipdev (tnl-ports.c:355)
==128325==    by 0x4DF0FA: tnl_port_map_insert_ipdev (tnl-ports.c:423)
==128325==    by 0x4BAD6C: ovs_router_insert__ (ovs-router.c:150)
==128325==    by 0x4FC0E4: route_table_handle_msg (route-table.c:301)
==128325==    by 0x4FC0E4: route_table_reset (route-table.c:186)
==128325==    by 0x4FC1F2: route_table_init (route-table.c:113)
==128325==    by 0x45FC57: dp_initialize (dpif.c:126)
==128325==    by 0x45FE8D: dp_enumerate_types (dpif.c:246)
==128325==    by 0x418894: ofproto_enumerate_types (ofproto.c:470)
==128325==    by 0x409C84: bridge_run__ (bridge.c:2877)
==128325==    by 0x40F4E3: bridge_run (bridge.c:2940)
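
To make the pattern in this report easier to see, here is a minimal sketch of
the same shape. It is not OVS code: class_mutex and route_mutex are
hypothetical stand-ins for netdev_class_mutex and route_table_mutex, and
init_path(), run_path(), and run_both_orders() are names I made up for
illustration. Both paths run in a single thread, as in the trace above, so
nothing actually deadlocks; the only issue is that the two locks are taken in
opposite orders.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for netdev_class_mutex and route_table_mutex. */
static pthread_mutex_t class_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t route_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Counts how many times the doubly-locked section was entered. */
static int shared_counter;

/* Lock and fail loudly on error. */
static void lock(pthread_mutex_t *m)
{
    int rc = pthread_mutex_lock(m);
    assert(rc == 0);
    (void) rc;
}

/* Unlock and fail loudly on error. */
static void unlock(pthread_mutex_t *m)
{
    int rc = pthread_mutex_unlock(m);
    assert(rc == 0);
    (void) rc;
}

/* Init-like path: route lock first, then class lock.
 * This is the order that gets recorded first. */
static void init_path(void)
{
    lock(&route_mutex);
    lock(&class_mutex);
    shared_counter++;
    unlock(&class_mutex);
    unlock(&route_mutex);
}

/* Run-like path: class lock first, then route lock.
 * This is the opposite order from init_path(). */
static void run_path(void)
{
    lock(&class_mutex);
    lock(&route_mutex);
    shared_counter++;
    unlock(&route_mutex);
    unlock(&class_mutex);
}

/* Takes both paths one after the other in the calling thread and
 * returns the number of times the doubly-locked section was entered. */
int run_both_orders(void)
{
    init_path();
    run_path();
    return shared_counter;
}
```

If run_both_orders() is called from a small main() and the binary is run
under "valgrind --tool=helgrind", I would expect a lock order report similar
in shape to the one above, although I have not checked the exact output.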