[ovs-dev] openvswitch-2.4 possible bug in hmap_remove

Richurov Kes kesri1234 at rediffmail.com
Sat Oct 3 06:04:04 UTC 2015


Hi,
I'm a student at Antwerp University D' Sciences and Technology, and we are trying to bring up openvswitch-2.4 in our project environment. We are running into a crash that is affecting our research work. Here are some details:
Stack trace:
(gdb) bt
#0  hmap_remove (xcfg=0x2289520, xport=0x23874a0) at lib/hmap.h:245
#1  xlate_xport_remove (xcfg=0x2289520, xport=0x23874a0)
    at ofproto/ofproto-dpif-xlate.c:1318
#2  0x000000000043b6fc in xlate_xbridge_remove (xcfg=0x2289520,
    xbridge=0x2386c90) at ofproto/ofproto-dpif-xlate.c:1142
#3  0x000000000043ba0a in xlate_xcfg_free ()
    at ofproto/ofproto-dpif-xlate.c:1091
#4  xlate_txn_commit () at ofproto/ofproto-dpif-xlate.c:1050
#5  0x0000000000426dbb in type_run (type=<value optimized out>)
    at ofproto/ofproto-dpif.c:731
#6  0x000000000041a322 in ofproto_type_run (datapath_type=0x2285f00 "system")
    at ofproto/ofproto.c:1740
#7  0x000000000040812d in bridge_run__ () at vswitchd/bridge.c:2928
#8  0x00000000004113a5 in bridge_run () at vswitchd/bridge.c:3000
#9  0x00000000004121ad in main (argc=10, argv=0x7fff37462dc8)
    at vswitchd/ovs-vswitchd.c:131
We inspected some frames. In frame 1, at line 1318, xcfg->xports looks valid; note the value of xport->hmap_node. All fields of xport->ofport also appear to hold valid data.
(gdb) frame 1
#1  xlate_xport_remove (xcfg=0x2289520, xport=0x23874a0)
    at ofproto/ofproto-dpif-xlate.c:1318
1318        hmap_remove(&xcfg->xports, &xport->hmap_node);
(gdb) list
1313        }
1314
1315        clear_skb_priorities(xport);
1316        hmap_destroy(&xport->skb_priorities);
1317
1318        hmap_remove(&xcfg->xports, &xport->hmap_node);      <<<<<
1319        hmap_remove(&xport->xbridge->xports, &xport->ofp_node);
1320
1321        netdev_close(xport->netdev);
(gdb) p &xport->hmap_node
$5 = (struct hmap_node *) 0x23874a0
(gdb) p xport->hmap_node
$6 = {hash = 7, next = 0x0}
(gdb) p xcfg->xports
$11 = {buckets = 0x2387620, one = 0x0, mask = 31, n = 45}
(gdb) p xport->ofport
$12 = (struct ofport_dpif *) 0x2232160
(gdb) p *xport->ofport
$13 = {odp_port_node = {hash = 0, next = 0x0}, up = {hmap_node = {
      hash = 186588107, next = 0x0}, ofproto = 0x21b6f20, netdev = 0x2231f90,
    pp = {port_no = 12, hw_addr = "\362\366\a\035\275D",
      name = "taf2fe2592c\000\000\000\000", config = 0,
      state = OFPUTIL_PS_STP_LISTEN, curr = 0, advertised = 0, supported = 0,
      peer = 0, curr_speed = 0, max_speed = 0}, ofp_port = 12, change_seq = 3,
    created = 1260028, mtu = 0, op_vendor_data = 0x22322a0, rte_node = {
      hash = 186588107, next = 0x0}}, ofpd_vendor_data = 0x21e03d0,
  odp_port = 9, bundle = 0x0, bundle_node = {prev = 0x0, next = 0x0},
  cfm = 0x0, bfd = 0x0, lldp = 0x0, may_enable = true, is_tunnel = true,
  is_layer3 = false, carrier_seq = 0, peer = 0x0, stp_port = 0x0,
  stp_state = STP_DISABLED, stp_state_entered = 0, rstp_port = 0x0,
  rstp_state = RSTP_DISABLED, qdscp = 0x0, n_qdscp = 0, realdev_ofp_port = 0,
  vlandev_vid = 0}
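However, in frame 0 we find xport->hmap_node in the wrong hash map bucket. Its hash selects bucket 7, which is empty, while the node is actually linked under bucket 6. As a result, the loop in hmap_remove() dereferences a null pointer and we crash. See below: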
(gdb) frame 0
#0  hmap_remove (xcfg=0x2289520, xport=0x23874a0) at lib/hmap.h:245
245         while (*bucket != node) {
(gdb) list
240      * hmap_shrink() directly if desired. */
241     static inline void
242     hmap_remove(struct hmap *hmap, struct hmap_node *node)
243     {
244         struct hmap_node **bucket = &hmap->buckets[node->hash & hmap->mask];
245         while (*bucket != node) {
246             bucket = &(*bucket)->next;
247         }
248         *bucket = node->next;
249         hmap->n--;
(gdb) p node->hash & hmap->mask
$7 = 7
(gdb) p hmap->buckets[7]
$8 = (struct hmap_node *) 0x0
(gdb) p hmap->buckets[6]
$15 = (struct hmap_node *) 0x23874a0
This makes us wonder whether there is a bug in the openvswitch-2.4 hash map library that computes the wrong bucket address during hashing. No particular sequence of events reproduces the bug, but it often occurs while flows are being downloaded from the controller, or when ports are added or deleted with ovs-vsctl.
Is this a known problem with openvswitch-2.4? If so, is a patch available, or is it fixed in a newer release or on the master branch? Please help.
Regards,
Richurov


