[ovs-discuss] Re: Re: about <the lldp bug cause the crash of the process "ovs-vswitchd" with the userspace ovs 2.5.0>

Ben Pfaff blp at ovn.org
Thu Jul 6 21:34:22 UTC 2017


I took a (belated) look at this just now.  There's an obvious bug fix
that I sent out:
        https://mail.openvswitch.org/pipermail/ovs-dev/2017-July/335021.html

Does it make any difference?

On Tue, Apr 25, 2017 at 01:19:26AM +0000, qintao (F) wrote:
> The valgrind report is as follows:
> 
> 2017-04-25T01:38:16Z|00025|coverage|INFO|bridge_reconfigure         0.0/sec     0.000/sec        0.0000/sec   total: 1
> 2017-04-25T01:38:16Z|00026|coverage|INFO|ofproto_flush              0.0/sec     0.000/sec        0.0000/sec   total: 1
> 2017-04-25T01:38:16Z|00027|coverage|INFO|ofproto_update_port        0.0/sec     0.000/sec        0.0000/sec   total: 2
> 2017-04-25T01:38:16Z|00028|coverage|INFO|cmap_expand                0.0/sec     0.000/sec        0.0000/sec   total: 3
> 2017-04-25T01:38:16Z|00029|coverage|INFO|dpif_port_add              0.0/sec     0.000/sec        0.0000/sec   total: 2
> 2017-04-25T01:38:16Z|00030|coverage|INFO|dpif_flow_flush            0.0/sec     0.000/sec        0.0000/sec   total: 2
> 2017-04-25T01:38:16Z|00031|coverage|INFO|dpif_flow_get              0.0/sec     0.000/sec        0.0000/sec   total: 5
> 2017-04-25T01:38:16Z|00032|coverage|INFO|dpif_flow_put              0.0/sec     0.000/sec        0.0000/sec   total: 9
> 2017-04-25T01:38:16Z|00033|coverage|INFO|dpif_flow_del              0.0/sec     0.000/sec        0.0000/sec   total: 5
> 2017-04-25T01:38:16Z|00034|coverage|INFO|dpif_execute               0.0/sec     0.000/sec        0.0000/sec   total: 2
> 2017-04-25T01:38:16Z|00035|coverage|INFO|flow_extract               0.0/sec     0.000/sec        0.0000/sec   total: 1
> 2017-04-25T01:38:16Z|00036|coverage|INFO|miniflow_malloc            0.0/sec     0.000/sec        0.0000/sec   total: 44
> 2017-04-25T01:38:16Z|00037|coverage|INFO|hmap_pathological          0.0/sec     0.000/sec        0.0000/sec   total: 2
> 2017-04-25T01:38:16Z|00038|coverage|INFO|hmap_expand                0.0/sec     0.000/sec        0.0000/sec   total: 412
> 2017-04-25T01:38:16Z|00039|coverage|INFO|netdev_get_stats           0.0/sec     0.000/sec        0.0000/sec   total: 2
> 2017-04-25T01:38:16Z|00040|coverage|INFO|poll_create_node           0.0/sec     0.000/sec        0.0000/sec   total: 9
> 2017-04-25T01:38:16Z|00041|coverage|INFO|seq_change                 0.0/sec     0.000/sec        0.0000/sec   total: 598
> 2017-04-25T01:38:16Z|00042|coverage|INFO|pstream_open               0.0/sec     0.000/sec        0.0000/sec   total: 3
> 2017-04-25T01:38:16Z|00043|coverage|INFO|stream_open                0.0/sec     0.000/sec        0.0000/sec   total: 1
> 2017-04-25T01:38:16Z|00044|coverage|INFO|util_xalloc                0.0/sec     0.000/sec        0.0000/sec   total: 10064
> 2017-04-25T01:38:16Z|00045|coverage|INFO|netdev_set_policing        0.0/sec     0.000/sec        0.0000/sec   total: 2
> 2017-04-25T01:38:16Z|00046|coverage|INFO|netdev_get_ifindex         0.0/sec     0.000/sec        0.0000/sec   total: 2
> 2017-04-25T01:38:16Z|00047|coverage|INFO|netdev_get_hwaddr          0.0/sec     0.000/sec        0.0000/sec   total: 11
> 2017-04-25T01:38:16Z|00048|coverage|INFO|netdev_set_hwaddr          0.0/sec     0.000/sec        0.0000/sec   total: 1
> 2017-04-25T01:38:16Z|00049|coverage|INFO|netdev_get_ethtool         0.0/sec     0.000/sec        0.0000/sec   total: 4
> 2017-04-25T01:38:16Z|00050|coverage|INFO|netlink_received           0.0/sec     0.000/sec        0.0000/sec   total: 16
> 2017-04-25T01:38:16Z|00051|coverage|INFO|netlink_recv_jumbo         0.0/sec     0.000/sec        0.0000/sec   total: 2
> 2017-04-25T01:38:16Z|00052|coverage|INFO|netlink_sent               0.0/sec     0.000/sec        0.0000/sec   total: 15
> 2017-04-25T01:38:16Z|00053|coverage|INFO|72 events never hit
> 2017-04-25T01:38:16Z|00054|bridge|INFO|ifname=enp1s0f0, vlan=4095, oper=1
> BRIDGE_AA_VLA:i:0,reconfigure:0
> ==22676== Conditional jump or move depends on uninitialised value(s)
> ==22676==    at 0x4E10DA: uuid_compare_3way (uuid.c:139)
> ==22676==    by 0x4BB8AD: ovsdb_datum_find_key (ovsdb-data.c:1633)
> ==22676==    by 0x40E922: bridge_configure_aa (bridge.c:3865)
> ==22676==    by 0x40E922: bridge_reconfigure (bridge.c:710)
> ==22676==    by 0x40FF67: bridge_run (bridge.c:2996)
> ==22676==    by 0x40690C: main (ovs-vswitchd.c:120)
> ==22676== 
> ==22676== Conditional jump or move depends on uninitialised value(s)
> ==22676==    at 0x4BB8B0: ovsdb_datum_find_key (ovsdb-data.c:1634)
> ==22676==    by 0x40E922: bridge_configure_aa (bridge.c:3865)
> ==22676==    by 0x40E922: bridge_reconfigure (bridge.c:710)
> ==22676==    by 0x40FF67: bridge_run (bridge.c:2996)
> ==22676==    by 0x40690C: main (ovs-vswitchd.c:120)
> ==22676== 
> ==22676== Conditional jump or move depends on uninitialised value(s)
> ==22676==    at 0x4BB888: ovsdb_datum_find_key (ovsdb-data.c:1636)
> ==22676==    by 0x40E922: bridge_configure_aa (bridge.c:3865)
> ==22676==    by 0x40E922: bridge_reconfigure (bridge.c:710)
> ==22676==    by 0x40FF67: bridge_run (bridge.c:2996)
> ==22676==    by 0x40690C: main (ovs-vswitchd.c:120)
> ==22676== 
> 2017-04-25T01:38:17Z|00055|bridge|INFO|Deleting isid=1, vlan=4095
> 2017-04-25T01:38:17Z|00056|ovs_lldp|INFO|Removing mapping aux=0x7b663d0
> 2017-04-25T01:38:17Z|00057|ovs_lldp|INFO|        Removing mapping ISID=1, VLAN=4095 (lldp->name=enp1s0f0)
> 2017-04-25T01:38:17Z|00058|ovs_lldp|INFO|                hardware->h_ifname=enp1s0f0
> 2017-04-25T01:38:17Z|00059|ovs_lldp|INFO|                Removing lport, isid=1, vlan=4095
> 2017-04-25T01:38:17Z|00060|bridge|INFO|Adding isid=1, vlan=4095
> 2017-04-25T01:38:17Z|00061|ovs_lldp|INFO|Adding mapping ISID=1, VLAN=4095, aux=0x7c0a9e0
> 2017-04-25T01:38:17Z|00062|ovs_lldp|INFO|        lldp->name=enp1s0f0
> 2017-04-25T01:38:17Z|00063|ovs_lldp|INFO|                hardware->h_ifname=enp1s0f0
> 2017-04-25T01:38:17Z|00064|bridge|INFO|ovs-vswitchd (Open vSwitch) 2.5.2
> 2017-04-25T01:38:18Z|00065|bridge|INFO|ifname=enp1s0f0, vlan=4095, oper=2
> ,BRIDGE_AA_VLAN_OPER_REMOVE:i:0,j:0,port->cfg->trunks[0]:4095,m->vlan:4095
> BRIDGE_AA_VLA:i:0,reconfigure:1
> 2017-04-25T01:38:18Z|00066|bridge|INFO|ifname=enp1s0f0, vlan=4095, oper=1
> ,BRIDGE_AA_VLA:i:1,reconfigure:1
> 2017-04-25T01:38:25Z|00067|memory|INFO|65776 kB peak resident set size after 10.0 seconds
> 2017-04-25T01:38:25Z|00068|memory|INFO|handlers:2 ports:2 revalidators:2 rules:5
> ==22676== Thread 7 monitor6:
> ==22676== Invalid read of size 8
> ==22676==    at 0x42975A: ofproto_dpif_send_packet (ofproto-dpif.c:4390)
> ==22676==    by 0x43082D: monitor_mport_run (ofproto-dpif-monitor.c:290)
> ==22676==    by 0x430B63: monitor_run (ofproto-dpif-monitor.c:227)
> ==22676==    by 0x430C04: monitor_main (ofproto-dpif-monitor.c:189)
> ==22676==    by 0x4B9263: ovsthread_wrapper (ovs-thread.c:340)
> ==22676==    by 0x5491DF4: start_thread (in /usr/lib64/libpthread-2.17.so)
> ==22676==    by 0x5CA61AC: clone (in /usr/lib64/libc-2.17.so)
> ==22676==  Address 0x7b1e6c0 is 32 bytes inside a block of size 280 free'd
> ==22676==    at 0x4C2AD17: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==22676==    by 0x41E7C7: ofproto_destroy (ofproto.c:1602)
> ==22676==    by 0x4097A3: bridge_destroy (bridge.c:3233)
> ==22676==    by 0x409B9D: add_del_bridges.isra.20 (bridge.c:1734)
> ==22676==    by 0x40C591: bridge_reconfigure (bridge.c:616)
> ==22676==    by 0x40FF67: bridge_run (bridge.c:2996)
> ==22676==    by 0x40690C: main (ovs-vswitchd.c:120)
> ==22676== 
> ==22676== Invalid read of size 8
> ==22676==    at 0x42975E: ofproto_dpif_cast (ofproto-dpif.c:360)
> ==22676==    by 0x42975E: ofproto_dpif_send_packet (ofproto-dpif.c:4390)
> ==22676==    by 0x43082D: monitor_mport_run (ofproto-dpif-monitor.c:290)
> ==22676==    by 0x430B63: monitor_run (ofproto-dpif-monitor.c:227)
> ==22676==    by 0x430C04: monitor_main (ofproto-dpif-monitor.c:189)
> ==22676==    by 0x4B9263: ovsthread_wrapper (ovs-thread.c:340)
> ==22676==    by 0x5491DF4: start_thread (in /usr/lib64/libpthread-2.17.so)
> ==22676==    by 0x5CA61AC: clone (in /usr/lib64/libc-2.17.so)
> ==22676==  Address 0x7aa6450 is 32 bytes inside a block of size 1,368 free'd
> ==22676==    at 0x4C2AD17: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==22676==    by 0x41BC02: ofproto_destroy__ (ofproto.c:1569)
> ==22676==    by 0x4B8345: ovsrcu_call_postponed (ovs-rcu.c:293)
> ==22676==    by 0x4B8513: ovsrcu_postpone_thread (ovs-rcu.c:308)
> ==22676==    by 0x4B9263: ovsthread_wrapper (ovs-thread.c:340)
> ==22676==    by 0x5491DF4: start_thread (in /usr/lib64/libpthread-2.17.so)
> ==22676==    by 0x5CA61AC: clone (in /usr/lib64/libc-2.17.so)
> ==22676== 
> ==22676== Invalid read of size 8
> ==22676==    at 0x4B927C: ovs_mutex_lock_at (ovs-thread.c:76)
> ==22676==    by 0x429786: ofproto_dpif_send_packet (ofproto-dpif.c:4395)
> ==22676==    by 0x43082D: monitor_mport_run (ofproto-dpif-monitor.c:290)
> ==22676==    by 0x430B63: monitor_run (ofproto-dpif-monitor.c:227)
> ==22676==    by 0x430C04: monitor_main (ofproto-dpif-monitor.c:189)
> ==22676==    by 0x4B9263: ovsthread_wrapper (ovs-thread.c:340)
> ==22676==    by 0x5491DF4: start_thread (in /usr/lib64/libpthread-2.17.so)
> ==22676==    by 0x5CA61AC: clone (in /usr/lib64/libc-2.17.so)
> ==22676==  Address 0x7aa6780 is 848 bytes inside a block of size 1,368 free'd
> ==22676==    at 0x4C2AD17: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==22676==    by 0x41BC02: ofproto_destroy__ (ofproto.c:1569)
> ==22676==    by 0x4B8345: ovsrcu_call_postponed (ovs-rcu.c:293)
> ==22676==    by 0x4B8513: ovsrcu_postpone_thread (ovs-rcu.c:308)
> ==22676==    by 0x4B9263: ovsthread_wrapper (ovs-thread.c:340)
> ==22676==    by 0x5491DF4: start_thread (in /usr/lib64/libpthread-2.17.so)
> ==22676==    by 0x5CA61AC: clone (in /usr/lib64/libc-2.17.so)
> ==22676== 
> ovs-vswitchd(monitor6): ofproto/ofproto-dpif.c:4395: ovs_mutex_lock_at() passed uninitialized ovs_mutex
> ,==22676== 
> ==22676== HEAP SUMMARY:
> ==22676==     in use at exit: 4,277,720 bytes in 1,039 blocks
> ==22676==   total heap usage: 34,041 allocs, 33,002 frees, 25,258,154 bytes allocated
> ==22676== 
> ==22676== LEAK SUMMARY:
> ==22676==    definitely lost: 89 bytes in 4 blocks
> ==22676==    indirectly lost: 0 bytes in 0 blocks
> ==22676==      possibly lost: 4,196,601 bytes in 21 blocks
> ==22676==    still reachable: 81,030 bytes in 1,014 blocks
> ==22676==         suppressed: 0 bytes in 0 blocks
> ==22676== Rerun with --leak-check=full to see details of leaked memory
> ==22676== 
> ==22676== For counts of detected and suppressed errors, rerun with: -v
> ==22676== Use --track-origins=yes to see where uninitialised values come from
> ==22676== ERROR SUMMARY: 7 errors from 7 contexts (suppressed: 2 from 2)
> Killed
> [root at localhost ~]#
> 
> ========================================================================
> 
> The reproduction steps are as follows:
> 
> 1. Create a bridge "br1" of type netdev.
> 2. Run the command "ovs-vsctl set int br1 lldp:enable=true" to enable LLDP on the interface br1.
> 3. Run the command "ovs-vsctl del-br br1" to delete br1.
> =====================================================================================
> -----Original Message-----
> From: Ben Pfaff [mailto:blp at ovn.org] 
> Sent: April 25, 2017 1:15
> To: qintao (F)
> Cc: Justin Pettit; Liuguifeng; ovs-discuss at openvswitch.org; wuhao (S); zhouyong (R); Lukai (Look); Guoyilong
> Subject: Re: [ovs-discuss] Re: about <the lldp bug cause the crash of the process "ovs-vswitchd" with the userspace ovs 2.5.0>
> 
> OK, let's figure out the problem.  Can you provide a backtrace?  Or run OVS under valgrind and provide valgrind's report?  Or can you provide reproduction information for us?
> 
> Thanks,
> 
> Ben.
> 
> On Thu, Apr 20, 2017 at 08:21:54AM +0000, qintao (F) wrote:
> > Hi Pettit,
> > 	I have reproduced the same issue with 2.5.2. The result is unchanged: the "ovs-vswitchd" process still crashes.
> > 
> > best regards,
> >  Tony tao
> > -----Original Message-----
> > From: Justin Pettit [mailto:jpettit at ovn.org]
> > Sent: April 20, 2017 5:05
> > To: qintao (F)
> > Cc: ovs-discuss at openvswitch.org; wuhao (S); Liuguifeng; Lukai (Look); 
> > Guoyilong
> > Subject: Re: [ovs-discuss] about <the lldp bug cause the crash of the 
> > process "ovs-vswitchd" with the userspace ovs 2.5.0>
> > 
> > 
> > > On Apr 18, 2017, at 6:59 PM, qintao (F) <qintao5 at huawei.com> wrote:
> > > 
> > >  
> > >  
> > > Dear all,
> > >      We created a bridge "br1" of type netdev. The OVS version is 2.5.0. Then we ran the command "ovs-vsctl set int br1 lldp:enable=true" to enable LLDP on the interface br1. After that, we deleted the bridge br1 and found that the "ovs-vswitchd" process was gone.
> > 
> > Thanks for the report.  There are a couple of releases in the 2.5.x series since 2.5.0.  Are you able to reproduce the same issue with 2.5.2?
> > 
> > --Justin
> > 
> > 
> > _______________________________________________
> > discuss mailing list
> > discuss at openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
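
For convenience, the three reproduction steps quoted above can be sketched as a small shell script.  This is only a sketch under assumptions, not the reporter's actual script: the report does not show how br1 was created, so the add-br command with datapath_type=netdev is a guess at step 1, and the run helper, the DRY_RUN variable, and the repro_lldp_crash function name are hypothetical.  By default the script only prints the commands; running them for real assumes ovsdb-server and ovs-vswitchd are already up.

```shell
#!/bin/sh
# Sketch of the reproduction steps from the report (assumptions noted below).
# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

repro_lldp_crash() {
    # Step 1 (assumed form): create br1 on the userspace (netdev) datapath.
    run ovs-vsctl add-br br1 -- set bridge br1 datapath_type=netdev
    # Step 2: enable LLDP on the interface br1 (same setting as in the report).
    run ovs-vsctl set interface br1 lldp:enable=true
    # Step 3: delete the bridge; the report says ovs-vswitchd dies after this.
    run ovs-vsctl del-br br1
}

repro_lldp_crash
```

With DRY_RUN=0 and ovs-vswitchd started under valgrind, the last step is where the invalid reads in the monitor thread would be expected to show up, if the behavior matches the report.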

