[ovs-discuss] Crash in openvswitch 2.0.2

Marco Kuendig marco at nuvula.ch
Tue Mar 31 21:43:24 UTC 2015


OK, I have tested it and I can reproduce it.

To reproduce the core dump:

I have a very small VM on KVM that I start and shut down every 30 seconds. With that I get:

Mar 31 16:10:13 nuv-vir-kvm-server-1 ovs-vswitchd: ovs|00010|daemon(monitor)|ERR|9 crashes: pid 10918 died, killed (Segmentation fault), core dumped, restarting
Mar 31 16:29:20 nuv-vir-kvm-server-1 ovs-vswitchd: ovs|00011|daemon(monitor)|ERR|10 crashes: pid 22038 died, killed (Segmentation fault), core dumped, restarting
Mar 31 23:33:04 nuv-vir-kvm-server-1 ovs-vswitchd: ovs|00012|daemon(monitor)|ERR|11 crashes: pid 25067 died, killed (Segmentation fault), core dumped, restarting
Mar 31 23:34:43 nuv-vir-kvm-server-1 ovs-vswitchd: ovs|00013|daemon(monitor)|ERR|12 crashes: pid 30254 died, killed (Segmentation fault), core dumped, restarting
Mar 31 23:38:49 nuv-vir-kvm-server-1 ovs-vswitchd: ovs|00014|daemon(monitor)|ERR|13 crashes: pid 31612 died, killed (Segmentation fault), core dumped, restarting
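
The crash records above can be tallied mechanically. The following is a minimal sketch, assuming the syslog format shown in those five lines and assuming the year 2015 (syslog timestamps carry no year). The names `CRASH_RE`, `parse_crashes` and `gaps_in_seconds` are made up for this illustration and are not part of Open vSwitch.

```python
import re
from datetime import datetime

# Matches the daemon(monitor) crash records quoted in this message.
CRASH_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) (?P<host>\S+) ovs-vswitchd: "
    r"ovs\|\d+\|daemon\(monitor\)\|ERR\|(?P<count>\d+) crashes: "
    r"pid (?P<pid>\d+) died, killed \((?P<reason>[^)]+)\)"
)


def parse_crashes(lines, year=2015):
    """Return (timestamp, crash_count, pid, reason) for each crash record."""
    records = []
    for line in lines:
        m = CRASH_RE.match(line)
        if m is None:
            continue  # not a monitor crash record
        # The year is an assumption; syslog does not log it.
        stamp = datetime.strptime(f"{year} {m['stamp']}", "%Y %b %d %H:%M:%S")
        records.append((stamp, int(m["count"]), int(m["pid"]), m["reason"]))
    return records


def gaps_in_seconds(records):
    """Seconds elapsed between consecutive crash records."""
    return [
        int((later[0] - earlier[0]).total_seconds())
        for earlier, later in zip(records, records[1:])
    ]


LOG = [
    "Mar 31 16:10:13 nuv-vir-kvm-server-1 ovs-vswitchd: ovs|00010|daemon(monitor)|ERR|9 crashes: pid 10918 died, killed (Segmentation fault), core dumped, restarting",
    "Mar 31 16:29:20 nuv-vir-kvm-server-1 ovs-vswitchd: ovs|00011|daemon(monitor)|ERR|10 crashes: pid 22038 died, killed (Segmentation fault), core dumped, restarting",
    "Mar 31 23:33:04 nuv-vir-kvm-server-1 ovs-vswitchd: ovs|00012|daemon(monitor)|ERR|11 crashes: pid 25067 died, killed (Segmentation fault), core dumped, restarting",
    "Mar 31 23:34:43 nuv-vir-kvm-server-1 ovs-vswitchd: ovs|00013|daemon(monitor)|ERR|12 crashes: pid 30254 died, killed (Segmentation fault), core dumped, restarting",
    "Mar 31 23:38:49 nuv-vir-kvm-server-1 ovs-vswitchd: ovs|00014|daemon(monitor)|ERR|13 crashes: pid 31612 died, killed (Segmentation fault), core dumped, restarting",
]

records = parse_crashes(LOG)
gaps = gaps_in_seconds(records)
print(gaps)
```

The last two gaps are only a few minutes long, which fits a crash that is triggered by some, but not all, of the 30-second VM cycles.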

ovs-vswitchd does not crash on every cycle, but it dumps core quite often, as the log above shows.
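
The start/shutdown cycle used for this test could be scripted roughly as follows. This is only a sketch under several assumptions: that the VM is a libvirt domain controlled with `virsh start` and `virsh shutdown`, that "every 30 seconds" means a 30-second pause after each step, and that the domain is called `test-vm` (a hypothetical name). The function `cycle_vm` is invented for this illustration.

```python
import subprocess
import time


def cycle_vm(domain, cycles, interval=30.0, run=subprocess.run, sleep=time.sleep):
    """Alternately start and shut down a libvirt domain.

    `run` and `sleep` are injectable so the loop can be exercised
    without touching a real hypervisor.
    """
    issued = []
    for _ in range(cycles):
        for action in ("start", "shutdown"):
            cmd = ["virsh", action, domain]
            # check=False: a failed start or shutdown should not stop the loop.
            run(cmd, check=False)
            issued.append(cmd)
            sleep(interval)
    return issued


# Example call (hypothetical domain name, would drive a real hypervisor):
# cycle_vm("test-vm", cycles=100)
```

Running this on one host while watching syslog on all hosts in the mesh should show whether the crashes line up with the boot or the shutdown half of the cycle.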



 <http://www.nuvula.ch/>
Marco Kuendig / CEO / Founder 
marco at nuvula.ch <mailto:marco at nuvula.ch> / +41 78 751 99 71

Marco's Google Hangout <https://plus.google.com/hangouts/_/nuvula.ch/marco>
Nuvula AG - Hybrid Clouds 
Weierbachstrasse 7b 8193 Eglisau Switzerland 
http://www.nuvula.ch <http://www.nuvula.ch/>
> On 31 Mar 2015, at 23:15, Marco Kuendig <marco at nuvula.ch> wrote:
> 
> I think I can reproduce that bug. I am not 100% sure, but I think I hit it several times today.
> 
> Interestingly, it happens when a VM on KVM boots. What is certainly strange is that several hosts crash simultaneously.
> 
> My setup is a lab setup and can be accessed if that helps with troubleshooting.
> 
>> On 31 Mar 2015, at 23:12, Joe Stringer <joestringer at nicira.com> wrote:
>> 
>> James, I believe you were involved last time this bug came up, I wonder if you ever got to the bottom of this?
>> 
>> ---
>> 
>> This looks the same as a bug reported in October:
>> 
>> http://openvswitch.org/pipermail/discuss/2014-October/015429.html
>> 
>> Ben's assessment was that there is no logical issue in the code, so perhaps there was weird code generation caused by GCC.
>> 
>> 
>> On 31 March 2015 at 13:05, Marco Kuendig <marco at nuvula.ch> wrote:
>> Reading symbols from /usr/sbin/ovs-vswitchd...Reading symbols from /usr/lib/debug//usr/sbin/ovs-vswitchd...done.
>> done.
>> [New LWP 32725]
>> [New LWP 32732]
>> [New LWP 32726]
>> [New LWP 32730]
>> [New LWP 32727]
>> [New LWP 32728]
>> [New LWP 32729]
>> [New LWP 32731]
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
>> Core was generated by `ovs-vswitchd unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfi'.
>> Program terminated with signal SIGSEGV, Segmentation fault.
>> #0  nl_attr_get_size (nla=nla@entry=0x0) at ../lib/netlink.c:506
>> 506	../lib/netlink.c: No such file or directory.
>> (gdb) bt
>> #0  nl_attr_get_size (nla=nla@entry=0x0) at ../lib/netlink.c:506
>> #1  0x0000000000460473 in format_generic_odp_key (a=a@entry=0x0, ds=ds@entry=0x7fff0408f3b0) at ../lib/odp-util.c:767
>> #2  0x0000000000460cd2 in format_odp_key_attr (a=a@entry=0xc485a4, ma=ma@entry=0x0, ds=ds@entry=0x7fff0408f3b0, verbose=verbose@entry=true)
>>     at ../lib/odp-util.c:1332
>> #3  0x00000000004609d7 in odp_flow_format (key=<optimized out>, key_len=40, mask=0x0, mask_len=0, ds=0x7fff0408f3b0, verbose=true) at ../lib/odp-util.c:1402
>> #4  0x0000000000460fc4 in format_odp_key_attr (a=a@entry=0xc48580, ma=ma@entry=0x0, ds=ds@entry=0x7fff0408f3b0, verbose=verbose@entry=true) at ../lib/odp-util.c:987
>> #5  0x00000000004609d7 in odp_flow_format (key=key@entry=0xc48520, key_len=key_len@entry=140, mask=mask@entry=0x0, mask_len=mask_len@entry=0,
>>     ds=ds@entry=0x7fff0408f3b0, verbose=verbose@entry=true) at ../lib/odp-util.c:1402
>> #6  0x00000000004450f3 in log_flow_message (error=error@entry=2, operation=operation@entry=0x4d0e73 "flow_del", key=0xc48520, key_len=140, mask=mask@entry=0x0,
>>     mask_len=mask_len@entry=0, stats=0x0, actions=actions@entry=0x0, actions_len=actions_len@entry=0, dpif=<optimized out>) at ../lib/dpif.c:1354
>> #7  0x00000000004453c9 in log_flow_del_message (dpif=dpif@entry=0xc489c0, del=del@entry=0x7fff0408f460, error=error@entry=2) at ../lib/dpif.c:1397
>> #8  0x0000000000445433 in log_flow_del_message (error=2, del=0x7fff0408f460, dpif=0xc489c0) at ../lib/dpif.c:1396
>> #9  dpif_flow_del__ (dpif=0xc489c0, del=del@entry=0x7fff0408f460) at ../lib/dpif.c:945
>> #10 0x00000000004455ca in dpif_flow_del (dpif=<optimized out>, key=<optimized out>, key_len=<optimized out>, stats=stats@entry=0x7fff0408f490) at ../lib/dpif.c:965
>> #11 0x000000000041b423 in subfacet_uninstall (subfacet=0xbd76a0) at ../ofproto/ofproto-dpif.c:4686
>> #12 0x0000000000420f18 in facet_remove (facet=facet@entry=0xbd72a0) at ../ofproto/ofproto-dpif.c:4014
>> #13 0x0000000000422f52 in facet_revalidate (facet=facet@entry=0xbd72a0) at ../ofproto/ofproto-dpif.c:4321
>> #14 0x0000000000424b5a in facet_lookup_valid (flow=0x7f3e700020a8, ofproto=0xc52600) at ../ofproto/ofproto-dpif.c:4203
>> #15 handle_flow_miss (n_ops=<synthetic pointer>, ops=0x7fff0408fb60, miss=0x7f3e70002090) at ../ofproto/ofproto-dpif.c:3339
>> #16 handle_flow_misses (fmb=fmb@entry=0x7f3e700008e0, backer=<optimized out>) at ../ofproto/ofproto-dpif.c:3410
>> #17 0x0000000000425196 in handle_upcalls (backer=<optimized out>) at ../ofproto/ofproto-dpif.c:3565
>> #18 dpif_backer_run_fast (backer=<optimized out>) at ../ofproto/ofproto-dpif.c:1007
>> #19 type_run_fast (type=<optimized out>) at ../ofproto/ofproto-dpif.c:1024
>> #20 0x00000000004122cf in ofproto_type_run_fast (datapath_type=<optimized out>, datapath_type@entry=0xc4ef20 "system") at ../ofproto/ofproto.c:1326
>> #21 0x00000000004081a5 in bridge_run_fast () at ../vswitchd/bridge.c:2318
>> #22 0x00000000004059c5 in main (argc=<optimized out>, argv=<optimized out>) at ../vswitchd/ovs-vswitchd.c:119
>> (gdb)
>> 
>> 
>> 
>>> On 31 Mar 2015, at 22:04, Joe Stringer <joestringer at nicira.com> wrote:
>>> 
>>> Great, we're moving. It looks like gdb is working in the output below. Do you get the gdb prompt from there? The command 'bt' should provide the backtrace we're after.
>>> 
>>> On 31 March 2015 at 12:52, Marco Kuendig <marco at nuvula.ch> wrote:
>>> That brought us a step forward. Thanks, Sab.
>>> 
>>> Important to know:
>>> 
>>> I have 4 KVM servers, meshed with Open vSwitch. I use VXLAN for tunnelling.
>>> 
>>> Sometimes when I restart a domain in KVM, 3 or 4 hosts crash at the same time.
>>> 
>>> I have STP enabled to avoid loops.
>>> 
>>> 
>>> This is the output now:
>>> 
>>> root@nuv-vir-kvm-server-1 ~ # gdb /usr/sbin/ovs-vswitchd /var/crash/ovs/CoreDump
>>> GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
>>> Copyright (C) 2014 Free Software Foundation, Inc.
>>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>>> This is free software: you are free to change and redistribute it.
>>> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
>>> and "show warranty" for details.
>>> This GDB was configured as "x86_64-linux-gnu".
>>> Type "show configuration" for configuration details.
>>> For bug reporting instructions, please see:
>>> <http://www.gnu.org/software/gdb/bugs/>.
>>> Find the GDB manual and other documentation resources online at:
>>> <http://www.gnu.org/software/gdb/documentation/>.
>>> For help, type "help".
>>> Type "apropos word" to search for commands related to "word"...
>>> Reading symbols from /usr/sbin/ovs-vswitchd...Reading symbols from /usr/lib/debug//usr/sbin/ovs-vswitchd...done.
>>> done.
>>> [New LWP 32725]
>>> [New LWP 32732]
>>> [New LWP 32726]
>>> [New LWP 32730]
>>> [New LWP 32727]
>>> [New LWP 32728]
>>> [New LWP 32729]
>>> [New LWP 32731]
>>> [Thread debugging using libthread_db enabled]
>>> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
>>> Core was generated by `ovs-vswitchd unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfi'.
>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>> #0  nl_attr_get_size (nla=nla@entry=0x0) at ../lib/netlink.c:506
>>> 506	../lib/netlink.c: No such file or directory.
>>> 
>>> 
>>> root@nuv-vir-kvm-server-1 ~ # crash /usr/sbin/ovs-vswitchd /var/crash/ovs/CoreDump
>>> 
>>> crash 7.0.3
>>> Copyright (C) 2002-2013  Red Hat, Inc.
>>> Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
>>> Copyright (C) 1999-2006  Hewlett-Packard Co
>>> Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
>>> Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
>>> Copyright (C) 2005, 2011  NEC Corporation
>>> Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
>>> Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
>>> This program is free software, covered by the GNU General Public License,
>>> and you are welcome to change it and/or distribute copies of it under
>>> certain conditions.  Enter "help copying" to see the conditions.
>>> This program has absolutely no warranty.  Enter "help warranty" for details.
>>> 
>>> 
>>> crash: /usr/sbin/ovs-vswitchd: no debugging data available
>>> 
>>> root@nuv-vir-kvm-server-1 ~ # ll /var/crash/ovs/
>>> Architecture         Date                 ExecutableTimestamp  ProcCwd              ProcStatus           UserGroups
>>> CoreDump             DistroRelease        ProblemType          ProcEnviron          Signal
>>> CrashCounter         ExecutablePath       ProcCmdline          ProcMaps             Uname
>>> 
>>> 
>>>> On 31 Mar 2015, at 21:45, Sabyasachi Sengupta <Sabyasachi.Sengupta at alcatel-lucent.com> wrote:
>>>> 
>>>> 
>>>> Ubuntu typically does not unpack crash reports. Can you try apport-unpack?
>>>> # apport-unpack /var/crash/<name> <crash-dir>
>>>> 
>>>> On Tue, 31 Mar 2015, Marco Kuendig wrote:
>>>> 
>>>>> Thanks Joe and Ben.
>>>>> I have done the following:
>>>>> 1. Installed debug symbols for the kernel; that doesn't help.
>>>>> 2. Installed debug symbols for openvswitch.
>>>>> No change: gdb and crash still don't work for me. I'm not a dev and need
>>>>> more help to get that backtrace done.
>>>>> Here is some output:
>>>>> root@nuv-vir-kvm-server-1 ~ # crash /usr/lib/debug/boot/vmlinux-3.13.0-48-generic /var/crash/_usr_sbin_ovs-vswitchd.0.crash
>>>>> crash 7.0.3
>>>>> Copyright (C) 2002-2013  Red Hat, Inc.
>>>>> Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
>>>>> Copyright (C) 1999-2006  Hewlett-Packard Co
>>>>> Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
>>>>> Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
>>>>> Copyright (C) 2005, 2011  NEC Corporation
>>>>> Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
>>>>> Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
>>>>> This program is free software, covered by the GNU General Public License,
>>>>> and you are welcome to change it and/or distribute copies of it under
>>>>> certain conditions.  Enter "help copying" to see the conditions.
>>>>> This program has absolutely no warranty.  Enter "help warranty" for details.
>>>>> crash: /var/crash/_usr_sbin_ovs-vswitchd.0.crash: not a supported file format
>>>>> Usage:
>>>>> 
>>>>>   crash [OPTION]... NAMELIST MEMORY-IMAGE  (dumpfile form)
>>>>>   crash [OPTION]... [NAMELIST]             (live system form)
>>>>> Enter "crash -h" for details.
>>>>> root@nuv-vir-kvm-server-1 ~ # gdb /usr/sbin/ovs-vswitchd /var/crash/_usr_sbin_ovs-vswitchd.0.crash
>>>>> GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
>>>>> Copyright (C) 2014 Free Software Foundation, Inc.
>>>>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>>>>> This is free software: you are free to change and redistribute it.
>>>>> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
>>>>> and "show warranty" for details.
>>>>> This GDB was configured as "x86_64-linux-gnu".
>>>>> Type "show configuration" for configuration details.
>>>>> For bug reporting instructions, please see:
>>>>> <http://www.gnu.org/software/gdb/bugs/>.
>>>>> Find the GDB manual and other documentation resources online at:
>>>>> <http://www.gnu.org/software/gdb/documentation/>.
>>>>> For help, type "help".
>>>>> Type "apropos word" to search for commands related to "word"...
>>>>> Reading symbols from /usr/sbin/ovs-vswitchd...Reading symbols from /usr/lib/debug//usr/sbin/ovs-vswitchd...done.
>>>>> done.
>>>>> "/var/crash/_usr_sbin_ovs-vswitchd.0.crash" is not a core dump: File format not recognized
>>>>> (gdb) q
>>>>> root@nuv-vir-kvm-server-1 ~ #
>>>>> 
>>>>>      On 31 Mar 2015, at 19:00, Joe Stringer <joestringer at nicira.com> wrote:
>>>>> For the 'File format not recognized' problem, you might have better
>>>>> luck with the 'crash' utility.
>>>>> $ crash <binary> <crashdump>
>>>>> On 31 March 2015 at 08:16, Marco Kuendig <marco at nuvula.ch> wrote:
>>>>>      I have tried this:
>>>>> http://openvswitch.org/pipermail/discuss/2015-February/016582.html
>>>>> This is the output, which doesn't look right:
>>>>> root@nuv-vir-kvm-server-2 ~ # gdb /usr/sbin/ovs-vswitchd /var/crash/_usr_sbin_ovs-vswitchd.0.crash
>>>>> GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
>>>>> Copyright (C) 2014 Free Software Foundation, Inc.
>>>>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>>>>> This is free software: you are free to change and redistribute it.
>>>>> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
>>>>> and "show warranty" for details.
>>>>> This GDB was configured as "x86_64-linux-gnu".
>>>>> Type "show configuration" for configuration details.
>>>>> For bug reporting instructions, please see:
>>>>> <http://www.gnu.org/software/gdb/bugs/>.
>>>>> Find the GDB manual and other documentation resources online at:
>>>>> <http://www.gnu.org/software/gdb/documentation/>.
>>>>> For help, type "help".
>>>>> Type "apropos word" to search for commands related to "word"...
>>>>> Reading symbols from /usr/sbin/ovs-vswitchd...(no debugging symbols found)...done.
>>>>> "/var/crash/_usr_sbin_ovs-vswitchd.0.crash" is not a core dump: File format not recognized
>>>>> (gdb) bt
>>>>> No stack.
>>>>> (gdb) quit
>>>>> Any more hints, please?
>>>>> thanks
>>>>> marco
>>>>> 
>>>>>      On 31 Mar 2015, at 17:00, Ben Pfaff <blp at nicira.com> wrote:
>>>>> Can you get a backtrace for these?
>>>>> On Tue, Mar 31, 2015 at 7:09 AM, Marco Kuendig <marco at nuvula.ch> wrote:
>>>>>      Folks,
>>>>> Any chance of having somebody look at these crash files?
>>>>> I have several servers that are losing network connectivity because of this.
>>>>> Downloads:
>>>>> https://drive.google.com/file/d/0Bx_w1Tf2B5VSRU9yUmRpTDJLVEU/view?usp=sharing
>>>>> Thanks for any hint or fix
>>>>> marco
>>>>> _______________________________________________
>>>>> discuss mailing list
>>>>> discuss at openvswitch.org
>>>>> http://openvswitch.org/mailman/listinfo/discuss
>>>>> --
>>>>> "I don't normally do acked-by's.  I think it's my way
>>>>> of avoiding
>>>>> getting blamed when it all blows up." Andrew Morton
>>> 
>> 
>> 
> 
