[ovs-dev] '27: test atomic operations' unit test hanging

Finucane, Stephen stephen.finucane at intel.com
Wed Jan 7 15:57:58 UTC 2015


> UPDATE:  With a fresh "apt-get dist-upgrade" on Ubuntu 14.10 and a fresh
> "git pull" on the openvswitch repo, the clang error and test #27 hangup are
> not reproducible on that platform.
> 
> The test #27 hangup is still reproducible on CentOS 7 after a fresh "yum
> upgrade" and "git pull" on the openvswitch repo, with the backtrace looking
> the same as it does below.  The writer thread and reader thread are both
> intact, but it looks like the test is running incredibly slowly as Jarno
> described.

This is also the case on Fedora 20. I can't say whether it's running slowly or not running at all, though. I'll leave it running tonight to see.

> > I see this same error, Stephen -- the test gets caught in an
> > infinite while loop where the atomic reader waits for an atomic
> > writer that is not running (maybe terminated or failed to spawn).  I
> > see the error when building on Ubuntu 14.10 and CentOS 7.  I do not
> > see it on CentOS 6.6.
> 
> Can you give a backtrace?
> 

My backtrace appears to be much the same. Nonetheless, for posterity's sake:

	$ gdb --args tests/ovstest test-atomic
	...
	(gdb) r
	Starting program: /home/sfinucan/development/ovs/ovs/tests/ovstest test-atomic
	[Thread debugging using libthread_db enabled]
	Using host libthread_db library "/lib64/libthread_db.so.1".
	[New Thread 0x7ffff7fe0700 (LWP 107540)]
	[New Thread 0x7ffff77df700 (LWP 107541)]
	^C
	Program received signal SIGINT, Interrupt.
	0x00000039ace09237 in pthread_join (threadid=140737354008320, thread_return=0x0) at pthread_join.c:92
	92          lll_wait_tid (pd->tid);

	(gdb) bt
	#0  0x00000039ace09237 in pthread_join (threadid=140737354008320, thread_return=0x0) at pthread_join.c:92
	#1  0x000000000047af36 in xpthread_join (arg1=140737354009040, arg2=0x0) at lib/ovs-thread.c:175
	#2  0x0000000000405801 in test_acq_rel () at tests/test-atomic.c:313
	#3  test_atomic_main (argc=<optimized out>, argv=<optimized out>) at tests/test-atomic.c:393
	#4  0x00000000004324fc in run_command (argc=1, argv=0x7fffffffe2d0, commands=<optimized out>) at lib/command-line.c:115
	#5  0x00000000004053b3 in main (argc=<optimized out>, argv=<optimized out>) at tests/ovstest.c:128

	(gdb) info threads
	  Id   Target Id         Frame
	  3    Thread 0x7ffff77df700 (LWP 107541) "writer2" atomic_writer (aux_=0x792010) at tests/test-atomic.c:284
	  2    Thread 0x7ffff7fe0700 (LWP 107540) "reader1" atomic_reader (aux_=0x792010) at tests/test-atomic.c:258
	* 1    Thread 0x7ffff7fe1a00 (LWP 107522) "ovstest" 0x00000039ace09237 in pthread_join (threadid=140737354008320, thread_return=0x0) at pthread_join.c:92

	(gdb) thread 2
	[Switching to thread 2 (Thread 0x7ffff7fe0700 (LWP 107540))]
	#0  atomic_reader (aux_=0x792010) at tests/test-atomic.c:258
	258                 atomic_read_explicit(&aux->data64, &data, memory_order_acquire);
	(gdb) bt
	#0  atomic_reader (aux_=0x792010) at tests/test-atomic.c:258
	#1  0x000000000047b499 in ovsthread_wrapper (aux_=<optimized out>) at lib/ovs-thread.c:339
	#2  0x00000039ace07ee5 in start_thread (arg=0x7ffff7fe0700) at pthread_create.c:309
	#3  0x00000039ac6f4b8d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

	(gdb) thread 3
	[Switching to thread 3 (Thread 0x7ffff77df700 (LWP 107541))]
	#0  atomic_writer (aux_=0x792010) at tests/test-atomic.c:284
	284                 atomic_read_explicit(&aux->data64, &data, memory_order_acquire);
	(gdb) bt
	#0  atomic_writer (aux_=0x792010) at tests/test-atomic.c:284
	#1  0x000000000047b499 in ovsthread_wrapper (aux_=<optimized out>) at lib/ovs-thread.c:339
	#2  0x00000039ace07ee5 in start_thread (arg=0x7ffff77df700) at pthread_create.c:309
	#3  0x00000039ac6f4b8d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

	(gdb) thread 4
	Thread ID 4 not known.

> > I also see the OVS atomic source code cause an internal compiler
> > error when using clang as delivered in the Ubuntu 14.10 distro.
> > (Compiler says it is unable to determine whether atomic boolean base
> > type should be "bool" or "u8").  Some forum threads have claimed
> > that recent Ubuntu clang crashes can be a result of a bug where
> > clang builds improperly under GCC 4.8.  I tried rebuilding clang
> > using GCC 4.7 on Ubuntu but it didn't eliminate the internal error
> > when the resultant clang binary compiled OVS atomic code.
> 
> By "internal compiler error", do you mean that compiling OVS triggers
> a bug in Clang?  I haven't seen this--what version of Clang are you
> using?  But you can work around the problem by compiling with GCC
> instead.

I don't see any serious issues when building with clang (though I only tried building with DPDK linked in).

