[ovs-dev] '27: test atomic operations' unit test hanging

Sharo, Randall A CIV SPAWARSYSCEN-ATLANTIC, 55200 randall.sharo at navy.mil
Fri Jan 2 16:59:05 UTC 2015


UPDATE:  With a fresh "apt-get dist-upgrade" on Ubuntu 14.10 and a fresh "git pull" on the openvswitch repo, the clang error and test #27 hangup are not reproducible on that platform.



The test #27 hangup is still reproducible on CentOS 7 after a fresh "yum upgrade" and "git pull" on the openvswitch repo, with the backtrace looking the same as it does below.  The writer thread and reader thread are both intact, but it looks like the test is running incredibly slowly as Jarno described.



-Randy



________________________________
From: Sharo, Randall A CIV SPAWARSYSCEN-ATLANTIC, 55200
Sent: Monday, December 29, 2014 6:02 PM
Subject: RE: [ovs-dev] '27: test atomic operations' unit test hanging


Please see the following backtraces for test 27:



(gdb) attach 128246
Attaching to program: ovs/tests/ovstest, process 128246

[...]

(gdb) bt
#0  0x00007ffebd820f37 in pthread_join () from /lib64/libpthread.so.0
#1  0x0000000000471ed9 in xpthread_join (arg1=arg1@entry=140732040972032, arg2=arg2@entry=0x0) at lib/ovs-thread.c:174
#2  0x00000000004121a3 in test_acq_rel () at tests/test-atomic.c:313
#3  test_atomic_main (argc=<optimized out>, argv=<optimized out>) at tests/test-atomic.c:393
#4  0x00000000004307ba in run_command (argc=argc@entry=1, argv=argv@entry=0x7fff9f4f0620, commands=<optimized out>)
    at lib/command-line.c:115
#5  0x00000000004063cd in main (argc=2, argv=0x7fff9f4f0618) at tests/ovstest.c:128

[...]

(gdb) thread 2
[Switching to thread 2 (Thread 0x7ffebacf7700 (LWP 128248))]
#0  atomic_writer (aux_=0x1bd0030) at tests/test-atomic.c:284
284             atomic_read_explicit(&aux->data64, &data, memory_order_acquire);
(gdb) bt
#0  atomic_writer (aux_=0x1bd0030) at tests/test-atomic.c:284
#1  0x00000000004716e1 in ovsthread_wrapper (aux_=<optimized out>) at lib/ovs-thread.c:338
#2  0x00007ffebd81fdf3 in start_thread () from /lib64/libpthread.so.0
#3  0x00007ffebd04301d in clone () from /lib64/libc.so.6

[...]

(gdb) thread 3
[Switching to thread 3 (Thread 0x7ffebb4f8700 (LWP 128247))]
#0  atomic_reader (aux_=0x1bd0030) at tests/test-atomic.c:258
258             atomic_read_explicit(&aux->data64, &data, memory_order_acquire);
(gdb) bt
#0  atomic_reader (aux_=0x1bd0030) at tests/test-atomic.c:258
#1  0x00000000004716e1 in ovsthread_wrapper (aux_=<optimized out>) at lib/ovs-thread.c:338
#2  0x00007ffebd81fdf3 in start_thread () from /lib64/libpthread.so.0
#3  0x00007ffebd04301d in clone () from /lib64/libc.so.6

[...]

(gdb) thread 4
Thread ID 4 not known.
(gdb) quit

[...]

Script done on Mon 29 Dec 2014 05:50:30 PM EST
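
For anyone reading the backtrace without the test source at hand, the acquire/release pattern that a reader/writer pair like this exercises can be sketched roughly as below. This is only an illustration using C11 <stdatomic.h> and pthreads, not the code in tests/test-atomic.c; the names struct shared, writer, reader and run_acq_rel_demo are hypothetical. The reader's spin loop is the kind of place where a thread sits while waiting for the other side to publish.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared state: plain data guarded by an atomic flag. */
struct shared {
    int payload;          /* written before the flag is published */
    atomic_bool ready;    /* stored with release, loaded with acquire */
};

static void *writer(void *arg)
{
    struct shared *s = arg;

    s->payload = 42;
    /* Release store: the payload write above becomes visible to any
     * thread that later observes ready == true with an acquire load. */
    atomic_store_explicit(&s->ready, true, memory_order_release);
    return NULL;
}

static void *reader(void *arg)
{
    struct shared *s = arg;

    /* Spin until the writer publishes.  If the writer never runs, or
     * runs very slowly, the reader stays in this loop. */
    while (!atomic_load_explicit(&s->ready, memory_order_acquire)) {
        /* busy-wait */
    }
    return (void *) (intptr_t) s->payload;
}

/* Returns the payload seen by the reader, or -1 if a thread could not
 * be created. */
int run_acq_rel_demo(void)
{
    struct shared s;
    pthread_t r, w;
    void *result;

    s.payload = 0;
    atomic_init(&s.ready, false);

    if (pthread_create(&r, NULL, reader, &s) != 0) {
        return -1;
    }
    if (pthread_create(&w, NULL, writer, &s) != 0) {
        /* Unblock the reader so it can be joined before returning. */
        atomic_store_explicit(&s.ready, true, memory_order_release);
        pthread_join(r, NULL);
        return -1;
    }
    pthread_join(w, NULL);
    pthread_join(r, &result);
    return (int) (intptr_t) result;
}
```

Built with -pthread and called from a main(), run_acq_rel_demo() should return 42 once both threads have been joined. The main thread blocks in pthread_join() while the workers run, much as frame #0 of the first backtrace shows.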



----



As for clang, yes, the compiler itself crashes ("internal compiler error") with an error message stating that it can't resolve a data type. I'll have to look at that another time -- I don't have access to that machine right now. What I can say for certain at this point is that it is the latest binary package from the Ubuntu 14.10 repos (no PPAs added, no build from source, just "apt-get install clang"). It is probably version 3.5. I was only using clang to check for warnings not produced by gcc, though. I normally use gcc.



Regards,

   Randy Sharo



________________________________
From: Ben Pfaff [blp at nicira.com]
Sent: Monday, December 29, 2014 3:39 PM
To: Sharo, Randall A CIV SPAWARSYSCEN-ATLANTIC, 55200
Cc: dev at openvswitch.org
Subject: Re: [ovs-dev] '27: test atomic operations' unit test hanging

On Tue, Dec 23, 2014 at 11:56:40PM +0000, Sharo, Randall A CIV SPAWARSYSCEN-ATLANTIC, 55200 wrote:
> I see this same error, Stephen -- the test gets caught in an
> infinite while loop where the atomic reader waits for an atomic
> writer that is not running (maybe terminated or failed to spawn).  I
> see the error when building on Ubuntu 14.10 and CentOS 7.  I do not
> see it on CentOS 6.6.

Can you give a backtrace?

> I also see the OVS atomic source code cause an internal compiler
> error when using clang as delivered in the Ubuntu 14.10 distro.
> (Compiler says it is unable to determine whether atomic boolean base
> type should be "bool" or "u8").  Some forum threads have claimed
> that recent Ubuntu clang crashes can be a result of a bug where
> clang builds improperly under GCC 4.8.  I tried rebuilding clang
> using GCC 4.7 on Ubuntu but it didn't eliminate the internal error
> when the resultant clang binary compiled OVS atomic code.

By "internal compiler error", do you mean that compiling OVS triggers
a bug in Clang?  I haven't seen this--what version of Clang are you
using?  But you can work around the problem by compiling with GCC
instead.


