[ovs-dev] '27: test atomic operations' unit test hanging
Finucane, Stephen
stephen.finucane at intel.com
Wed Jan 7 15:57:58 UTC 2015
> UPDATE: With a fresh "apt-get dist-upgrade" on Ubuntu 14.10 and a fresh
> "git pull" on the openvswitch repo, the clang error and test #27 hangup are
> not reproducible on that platform.
>
> The test #27 hangup is still reproducible on CentOS 7 after a fresh "yum
> upgrade" and "git pull" on the openvswitch repo, with the backtrace looking
> the same as it does below. The writer thread and reader thread are both
> intact, but it looks like the test is running incredibly slowly as Jarno
> described.
This is also the case on Fedora 20. I can't say whether it's running slowly or not running at all, though. I'll leave it running tonight to see.
> > I see this same error, Stephen -- the test gets caught in an
> > infinite while loop where the atomic reader waits for an atomic
> > writer that is not running (maybe terminated or failed to spawn). I
> > see the error when building on Ubuntu 14.10 and CentOS 7. I do not
> > see it on CentOS 6.6.
>
> Can you give a backtrace?
>
My backtrace appears to be much the same. Nonetheless, for posterity's sake:
$ gdb --args tests/ovstest test-atomic
...
(gdb) r
Starting program: /home/sfinucan/development/ovs/ovs/tests/ovstest test-atomic
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7ffff7fe0700 (LWP 107540)]
[New Thread 0x7ffff77df700 (LWP 107541)]
^C
Program received signal SIGINT, Interrupt.
0x00000039ace09237 in pthread_join (threadid=140737354008320, thread_return=0x0) at pthread_join.c:92
92 lll_wait_tid (pd->tid);
(gdb) bt
#0 0x00000039ace09237 in pthread_join (threadid=140737354008320, thread_return=0x0) at pthread_join.c:92
#1 0x000000000047af36 in xpthread_join (arg1=140737354009040, arg2=0x0) at lib/ovs-thread.c:175
#2 0x0000000000405801 in test_acq_rel () at tests/test-atomic.c:313
#3 test_atomic_main (argc=<optimized out>, argv=<optimized out>) at tests/test-atomic.c:393
#4 0x00000000004324fc in run_command (argc=1, argv=0x7fffffffe2d0, commands=<optimized out>) at lib/command-line.c:115
#5 0x00000000004053b3 in main (argc=<optimized out>, argv=<optimized out>) at tests/ovstest.c:128
(gdb) info threads
Id Target Id Frame
3 Thread 0x7ffff77df700 (LWP 107541) "writer2" atomic_writer (aux_=0x792010) at tests/test-atomic.c:284
2 Thread 0x7ffff7fe0700 (LWP 107540) "reader1" atomic_reader (aux_=0x792010) at tests/test-atomic.c:258
* 1 Thread 0x7ffff7fe1a00 (LWP 107522) "ovstest" 0x00000039ace09237 in pthread_join (threadid=140737354008320, thread_return=0x0) at pthread_join.c:92
(gdb) thread 2
[Switching to thread 2 (Thread 0x7ffff7fe0700 (LWP 107540))]
#0 atomic_reader (aux_=0x792010) at tests/test-atomic.c:258
258 atomic_read_explicit(&aux->data64, &data, memory_order_acquire);
(gdb) bt
#0 atomic_reader (aux_=0x792010) at tests/test-atomic.c:258
#1 0x000000000047b499 in ovsthread_wrapper (aux_=<optimized out>) at lib/ovs-thread.c:339
#2 0x00000039ace07ee5 in start_thread (arg=0x7ffff7fe0700) at pthread_create.c:309
#3 0x00000039ac6f4b8d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) thread 3
[Switching to thread 3 (Thread 0x7ffff77df700 (LWP 107541))]
#0 atomic_writer (aux_=0x792010) at tests/test-atomic.c:284
284 atomic_read_explicit(&aux->data64, &data, memory_order_acquire);
(gdb) bt
#0 atomic_writer (aux_=0x792010) at tests/test-atomic.c:284
#1 0x000000000047b499 in ovsthread_wrapper (aux_=<optimized out>) at lib/ovs-thread.c:339
#2 0x00000039ace07ee5 in start_thread (arg=0x7ffff77df700) at pthread_create.c:309
#3 0x00000039ac6f4b8d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) thread 4
Thread ID 4 not known.
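For anyone reading the backtrace without the source to hand: both worker threads are sitting on an acquire read of aux->data64, which is what a two-way handoff through one atomic word looks like. Below is a rough sketch of that kind of handoff. It is an illustration only, not the code in tests/test-atomic.c. It uses C11 <stdatomic.h> rather than the OVS atomic wrappers, and the struct layout, the function names, the HANDOFFS count and the sched_yield() calls are all my own assumptions.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define HANDOFFS 10000  /* hypothetical count, kept small so the sketch ends quickly */

/* Hypothetical shared state; only the name data64 echoes the backtrace. */
struct handoff_aux {
    _Atomic uint64_t data64;  /* nonzero: a value is waiting for the reader */
    bool b;                   /* plain (non-atomic) field published via data64 */
    uint64_t stale;           /* times the reader saw b == false after acquiring */
};

void *sketch_reader(void *aux_)
{
    struct handoff_aux *aux = aux_;

    for (uint64_t i = 0; i < HANDOFFS; i++) {
        /* Wait until the writer has published something. */
        while (atomic_load_explicit(&aux->data64, memory_order_acquire) == 0) {
            sched_yield();  /* assumption: yield so the sketch stays cheap on few cores */
        }
        /* The acquire load pairs with the writer's release store,
         * so the writer's plain write to b must be visible here. */
        if (!aux->b) {
            aux->stale++;
        }
        aux->b = false;
        /* Hand the slot back to the writer. */
        atomic_store_explicit(&aux->data64, 0, memory_order_release);
    }
    return NULL;
}

void *sketch_writer(void *aux_)
{
    struct handoff_aux *aux = aux_;

    for (uint64_t i = 1; i <= HANDOFFS; i++) {
        /* Wait until the previous value has been consumed. */
        while (atomic_load_explicit(&aux->data64, memory_order_acquire) != 0) {
            sched_yield();
        }
        aux->b = true;
        /* Publish: the release store makes the write to b visible to the reader. */
        atomic_store_explicit(&aux->data64, i, memory_order_release);
    }
    return NULL;
}

/* Runs one reader and one writer to completion.
 * Returns the number of stale observations (0 if the ordering held). */
long run_handoff_sketch(void)
{
    struct handoff_aux aux;
    pthread_t reader, writer;

    atomic_init(&aux.data64, 0);
    aux.b = false;
    aux.stale = 0;

    if (pthread_create(&reader, NULL, sketch_reader, &aux) != 0) {
        abort();
    }
    if (pthread_create(&writer, NULL, sketch_writer, &aux) != 0) {
        abort();
    }
    /* The main thread blocks in pthread_join, as thread 1 does in the backtrace. */
    pthread_join(reader, NULL);
    pthread_join(writer, NULL);
    return (long) aux.stale;
}
```

Calling run_handoff_sketch() from a main() and building with "-std=c11 -pthread" should be enough to try it. With a handoff like this, a stack showing main in pthread_join and both workers on the acquire read is what you would see whether the threads are making slow progress or none at all, so the backtrace alone does not distinguish the two.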
> > I also see the OVS atomic source code cause an internal compiler
> > error when using clang as delivered in the Ubuntu 14.10 distro.
> > (Compiler says it is unable to determine whether atomic boolean base
> > type should be "bool" or "u8"). Some forum threads have claimed
> > that recent Ubuntu clang crashes can be a result of a bug where
> > clang builds improperly under GCC 4.8. I tried rebuilding clang
> > using GCC 4.7 on Ubuntu but it didn't eliminate the internal error
> > when the resultant clang binary compiled OVS atomic code.
>
> By "internal compiler error", do you mean that compiling OVS triggers
> a bug in Clang? I haven't seen this--what version of Clang are you
> using? But you can work around the problem by compiling with GCC
> instead.
I don't see any serious issues when building with clang (though I only tried building with DPDK linked in).