[ovs-discuss] ovs-vswitchd crashing lib/rconn.c:568

Luiz Henrique Ozaki luiz.ozaki at gmail.com
Thu Jan 22 13:47:57 UTC 2015


Hi all,

I have some instances running on OVS with XenServer, and the
ovs-vswitchd process has been crashing for me:
ovs-vswitchd: monitoring pid 19196 (10 crashes: pid 11819 died, killed
(Aborted))

I found this in the log:
23159|util|EMER|lib/rconn.c:568: assertion version >= 0 && version <= 0xff
failed in run_ACTIVE()
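
As a reading aid, the condition quoted in that log line can be sketched in
isolation. This is only an illustration of the range test, not the OVS source:
the helper names are hypothetical, and reading a negative value as "no version
available yet" is an assumption rather than something the log confirms.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper (not an OVS function): mirrors the condition text from
 * the log line, "version >= 0 && version <= 0xff". */
static bool version_in_range(int version)
{
    return version >= 0 && version <= 0xff;
}

/* Hypothetical self-check listing which values pass the condition. */
static void demo_version_checks(void)
{
    assert(version_in_range(0x00));   /* lower bound is accepted */
    assert(version_in_range(0x04));   /* e.g. the OpenFlow 1.3 wire version */
    assert(version_in_range(0xff));   /* upper bound is accepted */
    assert(!version_in_range(-1));    /* assumed "no version" sentinel is rejected */
    assert(!version_in_range(0x100)); /* anything wider than one byte is rejected */
}
```

So the abort means run_ACTIVE() saw a version value that was either negative
or larger than one byte; the log line alone does not say which.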

I took a backtrace from the core dump, but I couldn't figure out why this
assertion is failing.

Here is a backtrace from the 2.3.0 version:
(gdb) thread apply all bt

Thread 11 (Thread 0xb7725b90 (LWP 20243)):
#0  0xb7728424 in __kernel_vsyscall ()
#1  0xb73d9ff2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from
/lib/libpthread.so.0
#2  0xb74cbbb4 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libc.so.6
#3  0xb7571199 in handle_fildes_io () from /lib/librt.so.1
#4  0xb73d5912 in start_thread () from /lib/libpthread.so.0
#5  0xb74bf4ae in clone () from /lib/libc.so.6

Thread 10 (Thread 0xb48feb90 (LWP 12169)):
#0  0xb7728424 in __kernel_vsyscall ()
#1  0xb74b56c3 in poll () from /lib/libc.so.6
#2  0x08102166 in time_poll (pollfds=0xb6704538, n_pollfds=3, handles=0x0,
timeout_when=9223372036854775807, elapsed=0xb48e6eb8) at lib/timeval.c:301
#3  0x080f3f5a in poll_block () at lib/poll-loop.c:314
#4  0x08075aa1 in udpif_upcall_handler (arg=0x95f6490) at
ofproto/ofproto-dpif-upcall.c:529
#5  0x080e7058 in ovsthread_wrapper (aux_=0x95a65e0) at lib/ovs-thread.c:322
#6  0xb73d5912 in start_thread () from /lib/libpthread.so.0
#7  0xb74bf4ae in clone () from /lib/libc.so.6

Thread 9 (Thread 0xb725eb90 (LWP 12168)):
#0  0xb7728424 in __kernel_vsyscall ()
#1  0xb74b56c3 in poll () from /lib/libc.so.6
#2  0x08102166 in time_poll (pollfds=0xb673b0a8, n_pollfds=3, handles=0x0,
timeout_when=9223372036854775807, elapsed=0xb7246eb8) at lib/timeval.c:301
#3  0x080f3f5a in poll_block () at lib/poll-loop.c:314
#4  0x08075aa1 in udpif_upcall_handler (arg=0x95f6484) at
ofproto/ofproto-dpif-upcall.c:529
#5  0x080e7058 in ovsthread_wrapper (aux_=0x9600d20) at lib/ovs-thread.c:322
#6  0xb73d5912 in start_thread () from /lib/libpthread.so.0
#7  0xb74bf4ae in clone () from /lib/libc.so.6

Thread 8 (Thread 0xb5a5bb90 (LWP 12167)):
#0  0xb7728424 in __kernel_vsyscall ()
#1  0xb74b56c3 in poll () from /lib/libc.so.6
#2  0x08102166 in time_poll (pollfds=0x9612c60, n_pollfds=3, handles=0x0,
timeout_when=9223372036854775807, elapsed=0xb5a43eb8) at lib/timeval.c:301
#3  0x080f3f5a in poll_block () at lib/poll-loop.c:314
#4  0x08075aa1 in udpif_upcall_handler (arg=0x95f6478) at
ofproto/ofproto-dpif-upcall.c:529
#5  0x080e7058 in ovsthread_wrapper (aux_=0x9611220) at lib/ovs-thread.c:322
#6  0xb73d5912 in start_thread () from /lib/libpthread.so.0
#7  0xb74bf4ae in clone () from /lib/libc.so.6

Thread 7 (Thread 0xb38fcb90 (LWP 12174)):
#0  0xb7728424 in __kernel_vsyscall ()
#1  0xb74b56c3 in poll () from /lib/libc.so.6
#2  0x08102166 in time_poll (pollfds=0xb65281c0, n_pollfds=2, handles=0x0,
timeout_when=9223372036854775807, elapsed=0xb38fad88) at lib/timeval.c:301
#3  0x080f3f5a in poll_block () at lib/poll-loop.c:314
#4  0x080e70cb in ovs_barrier_block (barrier=0x95a7bac) at
lib/ovs-thread.c:290
#5  0x08076d6e in udpif_revalidator (arg=0x95f3a18) at
ofproto/ofproto-dpif-upcall.c:588
#6  0x080e7058 in ovsthread_wrapper (aux_=0x96110e0) at lib/ovs-thread.c:322
#7  0xb73d5912 in start_thread () from /lib/libpthread.so.0
#8  0xb74bf4ae in clone () from /lib/libc.so.6

Thread 6 (Thread 0xb30fbb90 (LWP 12178)):
#0  0xb7728424 in __kernel_vsyscall ()
#1  0xb74b56c3 in poll () from /lib/libc.so.6
#2  0x08102166 in time_poll (pollfds=0xb6526308, n_pollfds=2, handles=0x0,
timeout_when=9223372036854775807, elapsed=0xb30f9d88) at lib/timeval.c:301
#3  0x080f3f5a in poll_block () at lib/poll-loop.c:314
#4  0x080e70cb in ovs_barrier_block (barrier=0x95a7bac) at
lib/ovs-thread.c:290
#5  0x08076d6e in udpif_revalidator (arg=0x95f3a28) at
ofproto/ofproto-dpif-upcall.c:588
#6  0x080e7058 in ovsthread_wrapper (aux_=0x9620820) at lib/ovs-thread.c:322
#7  0xb73d5912 in start_thread () from /lib/libpthread.so.0
#8  0xb74bf4ae in clone () from /lib/libc.so.6

Thread 5 (Thread 0xb40fdb90 (LWP 12173)):
#0  0xb7728424 in __kernel_vsyscall ()
#1  0xb74b56c3 in poll () from /lib/libc.so.6
#2  0x08102166 in time_poll (pollfds=0xb67119b0, n_pollfds=3, handles=0x0,
timeout_when=10068193, elapsed=0xb40fbdb8) at lib/timeval.c:301
#3  0x080f3f5a in poll_block () at lib/poll-loop.c:314
#4  0x080773d8 in udpif_revalidator (arg=0x95f3a08) at
ofproto/ofproto-dpif-upcall.c:629
#5  0x080e7058 in ovsthread_wrapper (aux_=0x9620038) at lib/ovs-thread.c:322
#6  0xb73d5912 in start_thread () from /lib/libpthread.so.0
#7  0xb74bf4ae in clone () from /lib/libc.so.6

Thread 4 (Thread 0xb50ffb90 (LWP 12166)):
#0  0xb7728424 in __kernel_vsyscall ()
#1  0xb74b56c3 in poll () from /lib/libc.so.6
#2  0x08102166 in time_poll (pollfds=0xb6713a20, n_pollfds=3, handles=0x0,
timeout_when=9223372036854775807, elapsed=0xb50e7eb8) at lib/timeval.c:301
#3  0x080f3f5a in poll_block () at lib/poll-loop.c:314
#4  0x08075aa1 in udpif_upcall_handler (arg=0x95f646c) at
ofproto/ofproto-dpif-upcall.c:529
#5  0x080e7058 in ovsthread_wrapper (aux_=0x96110e0) at lib/ovs-thread.c:322
#6  0xb73d5912 in start_thread () from /lib/libpthread.so.0
#7  0xb74bf4ae in clone () from /lib/libc.so.6

Thread 3 (Thread 0xb28fab90 (LWP 12165)):
#0  0xb7728424 in __kernel_vsyscall ()
#1  0xb73dc839 in __lll_lock_wait () from /lib/libpthread.so.0
#2  0xb73d7e9f in _L_lock_885 () from /lib/libpthread.so.0
#3  0xb73d7d66 in pthread_mutex_lock () from /lib/libpthread.so.0
#4  0xb74cbcd6 in pthread_mutex_lock () from /lib/libc.so.6
#5  0x080e7731 in ovs_mutex_lock_at (l_=0x962d7b0, where=0x815b306
"lib/rconn.c:960") at lib/ovs-thread.c:70
#6  0x080f4a6f in rconn_get_version (rconn=0x962d7b0) at lib/rconn.c:960
#7  0x0807ffba in ofconn_get_protocol (ofconn=0x962f490) at
ofproto/connmgr.c:992
#8  0x08080018 in connmgr_wants_packet_in_on_miss (mgr=0x9600e88) at
ofproto/connmgr.c:1577
#9  0x0806753a in rule_dpif_lookup (ofproto=0x9601598, flow=0xb28f9ebc,
wc=0xb28e31dc, rule=0xb28e2eb4, take_ref=false) at
ofproto/ofproto-dpif.c:3238
#10 0x0807c568 in xlate_actions__ (xin=0xb28f9eb8, xout=0xb28e31dc) at
ofproto/ofproto-dpif-xlate.c:3278
#11 xlate_actions (xin=0xb28f9eb8, xout=0xb28e31dc) at
ofproto/ofproto-dpif-xlate.c:3182
#12 0x08076265 in handle_upcalls (arg=0x95f6460) at
ofproto/ofproto-dpif-upcall.c:931
#13 udpif_upcall_handler (arg=0x95f6460) at
ofproto/ofproto-dpif-upcall.c:531
#14 0x080e7058 in ovsthread_wrapper (aux_=0x95dc718) at lib/ovs-thread.c:322
#15 0xb73d5912 in start_thread () from /lib/libpthread.so.0
#16 0xb74bf4ae in clone () from /lib/libc.so.6

Thread 2 (Thread 0xb625cb90 (LWP 6925)):
#0  0xb7728424 in __kernel_vsyscall ()
#1  0xb74b56c3 in poll () from /lib/libc.so.6
#2  0x08102166 in time_poll (pollfds=0xb6998de0, n_pollfds=2, handles=0x0,
timeout_when=9223372036854775807, elapsed=0xb625c128) at lib/timeval.c:301
#3  0x080f3f5a in poll_block () at lib/poll-loop.c:314
#4  0x080e6c79 in ovsrcu_postpone_thread (arg=0x0) at lib/ovs-rcu.c:267
#5  0x080e7058 in ovsthread_wrapper (aux_=0x95c2870) at lib/ovs-thread.c:322
#6  0xb73d5912 in start_thread () from /lib/libpthread.so.0
#7  0xb74bf4ae in clone () from /lib/libc.so.6

Thread 1 (Thread 0xb725f8f0 (LWP 6615)):
#0  0xb7728424 in __kernel_vsyscall ()
#1  0xb7412b10 in raise () from /lib/libc.so.6
#2  0xb7414421 in abort () from /lib/libc.so.6
#3  0x081044c4 in ovs_abort_valist (err_no=0, format=0x815d864 "%s:
assertion %s failed in %s()", args=0xbfef7658
"\005\265\025\b\210\271\025\b\366\267\025\b\260\327b\txv\357\277JJ\017\b\b")
at lib/util.c:322
#4  0x08109858 in vlog_abort_valist (module_=0x81847e0, message=0x815d864
"%s: assertion %s failed in %s()", args=0xbfef7658
"\005\265\025\b\210\271\025\b\366\267\025\b\260\327b\txv\357\277JJ\017\b\b")
    at lib/vlog.c:992
#5  0x08109882 in vlog_abort (module=0x81847e0, message=0x815d864 "%s:
assertion %s failed in %s()") at lib/vlog.c:1006
#6  0x08104711 in ovs_assert_failure (where=0x815b505 "lib/rconn.c:568",
function=0x19d7 <Address 0x19d7 out of bounds>, condition=0x815b988
"version >= 0 && version <= 0xff") at lib/util.c:71
#7  0x080f6003 in run_ACTIVE (rc=0x962d7b0) at lib/rconn.c:568
#8  rconn_run (rc=0x962d7b0) at lib/rconn.c:659
#9  0x08081e45 in ofconn_run (mgr=0x9600e88, handle_openflow=0x80658a0
<handle_openflow>) at ofproto/connmgr.c:1390
#10 connmgr_run (mgr=0x9600e88, handle_openflow=0x80658a0
<handle_openflow>) at ofproto/connmgr.c:339
#11 0x08062597 in ofproto_run (p=0x96015a0) at ofproto/ofproto.c:1543
#12 0x0804bef3 in bridge_run__ () at vswitchd/bridge.c:2255
#13 0x080537ca in bridge_run () at vswitchd/bridge.c:2307
#14 0x08054e56 in main (argc=-1257216160, argv=0x0) at
vswitchd/ovs-vswitchd.c:116

I'm running OVS 1.9.3 and 2.3.0; both crash at the same assertion.
XenServer 6.2

It seems that this only happens when ovs-vswitchd is under heavy load and I
run ovs-ofctl add-flow.

Does anyone know what could be triggering this?

Regards,


-- 
[]'s
Luiz Henrique Ozaki