Bug 1222286

Summary: arm: all programs that link to librados2 hang forever on startup
Product: [Fedora] Fedora Reporter: Richard W.M. Jones <rjones>
Component: cephAssignee: Boris Ranto <branto>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: rawhideCC: amit.shah, berrange, branto, cfergeau, crobinso, david, dwmw2, fedora, itamar, kdreyer, ktdreyer, loic, pbonzini, rjones, scottt.tw, steve, tmraz, virt-maint
Target Milestone: ---   
Target Release: ---   
Hardware: armv7hl   
OS: Unspecified   
Whiteboard:
Fixed In Version: ceph-0.94.1-3.fc22 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1226806 (view as bug list) Environment:
Last Closed: 2015-06-04 20:19:45 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 910269, 1226806    
Attachments:
Description Flags
fix-cycles-arch.patch
none
strace qemu-system-arm -help none

Description Richard W.M. Jones 2015-05-17 09:33:29 UTC
Description of problem:

My ARMv7 server is broken at the moment so I can't try this, but
it seems as if the simple command:

/usr/bin/qemu-system-arm -help

hangs (on ARMv7 hardware).

See this build log for an example:

https://kojipkgs.fedoraproject.org//work/tasks/3698/9763698/build.log

Version-Release number of selected component (if applicable):

qemu 2:2.3.0-3.fc23.armv7hl

How reproducible:

Unknown.

Steps to Reproduce:
1. Run `qemu-system-arm -help'

Actual results:

Hangs.

Expected results:

Should print help output.

Comment 1 Richard W.M. Jones 2015-05-17 09:55:27 UTC
Another build and again it's hanging in the same place, but again
only on armv7hl:
http://koji.fedoraproject.org/koji/taskinfo?taskID=9767065
https://kojipkgs.fedoraproject.org//work/tasks/7067/9767067/build.log

This works fine on x86-64 & i686.

The configure.ac fragment is very simple so it's not likely to
be a problem with autoconf, although I guess anything is possible:

    AC_MSG_CHECKING([that $QEMU -help works])
    if $QEMU -help >&AS_MESSAGE_LOG_FD 2>&1; then
        AC_MSG_RESULT([yes])
    else
        AC_MSG_RESULT([no])
        AC_MSG_FAILURE(<error message omitted>)
    fi

Comment 2 Richard W.M. Jones 2015-05-17 13:52:22 UTC
strace output is here:
https://kojipkgs.fedoraproject.org//work/tasks/7470/9767470/build.log

It's frankly bizarre.  The last system calls are:

open("/dev/urandom", O_RDONLY|O_LARGEFILE) = 3
fcntl64(3, F_GETFD)                     = 0
fcntl64(3, F_SETFD, FD_CLOEXEC)         = 0
fstat64(3, {st_mode=S_IFCHR|0444, st_rdev=makedev(1, 9), ...}) = 0
clock_gettime(CLOCK_REALTIME, {1431870340, 192190777}) = 0
getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 330000}, ru_stime={0, 55000}, ...}
) = 0
read(3, "*\222:\253\26\211\313\245\6KF\256Bm\242\363s\202>`u\223\6*1\351\234\324
E\362\347j", 32) = 32
read(3, "83\313y\206\10\316^\226\342&\266;R3\215\375\350\227\363+\16?\210LTy\345
v.\251\243", 32) = 32
read(3, "\231\302\270,\311\271z\324", 8) = 8
gettimeofday({1431870340, 198569}, NULL) = 0
gettimeofday({1431870340, 198647}, NULL) = 0

... and gettimeofday repeats in a loop forever.  The context looks as
if it is a bug in libselinux.

Comment 3 Richard W.M. Jones 2015-05-17 15:41:44 UTC
According to gdb, this is the backtrace:

warning: Probes-based dynamic linker interface failed.
Reverting to original interface.
Cannot parse expression `.L1046 4@r4'.
#0  0xb6fd10b0 in dl_main () from /lib/ld-linux-armhf.so.3
#1  0xb6fe6670 in _dl_sysdep_start () from /lib/ld-linux-armhf.so.3
#2  0xb6fd23a8 in _dl_start () from /lib/ld-linux-armhf.so.3
#3  0xb6fcdb50 in _start () from /lib/ld-linux-armhf.so.3
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Comment 4 Richard W.M. Jones 2015-05-17 16:01:56 UTC
OK I believe the code is from libnettle (gnutls):

lib/nettle/rnd-common.c:_rnd_system_entropy_init contains

        _gnutls_urandom_fd = open("/dev/urandom", O_RDONLY);
        if (_gnutls_urandom_fd < 0) {
                _gnutls_debug_log("Cannot open urandom!\n");
                goto fallback;
        }

        old = fcntl(_gnutls_urandom_fd, F_GETFD);
        if (old != -1)
                fcntl(_gnutls_urandom_fd, F_SETFD, old | FD_CLOEXEC);

        if (fstat(_gnutls_urandom_fd, &st) >= 0) {
                _gnutls_urandom_fd_mode = st.st_mode;
        }

lib/nettle/rnd-common.c:_rnd_get_event contains

        gettime(&e->now);

#ifdef HAVE_GETRUSAGE
        if (getrusage(ARG_RUSAGE, &e->rusage) < 0) {
                _gnutls_debug_log("getrusage failed: %s\n",
                          strerror(errno));
        }
#endif

which looks identical to the sequence of syscalls being made.  It's
still a mystery where the infinite loop happens.

Comment 5 Richard W.M. Jones 2015-05-17 16:14:34 UTC
I'm still pretty sure this stack trace is wrong, but I got gdb
to print out the register contents:

warning: Probes-based dynamic linker interface failed.
Reverting to original interface.
Cannot parse expression `.L1046 4@r4'.
r0             0xb6fff8f4	3070228724
r1             0x0	0
r2             0xb6fff908	3070228744
r3             0xb6fff8f4	3070228724
r4             0xb6fff8f4	3070228724
r5             0x0	0
r6             0xfffffffc	4294967292
r7             0xa3	163
r8             0xa3	163
r9             0x0	0
r10            0xb6fff908	3070228744
r11            0xbefff95c	3204446556
r12            0xb6ffefac	3070226348
sp             0xbefff8c0	0xbefff8c0
lr             0xb6fd10b0	-1224929104
pc             0xb6fd10b0	0xb6fd10b0 <dl_main+11624>
cpsr           0x60070010	1611071504
Thread 1 (process 804):
#0  0xb6fd10b0 in dl_main () from /lib/ld-linux-armhf.so.3
#1  0xb6fe6670 in _dl_sysdep_start () from /lib/ld-linux-armhf.so.3
#2  0xb6fd23a8 in _dl_start () from /lib/ld-linux-armhf.so.3
#3  0xb6fcdb50 in _start () from /lib/ld-linux-armhf.so.3
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Comment 6 Richard W.M. Jones 2015-05-17 16:48:47 UTC
*If* the stack trace is to be believed, then it points to
the following code in glibc.  PC is marked with an arrow.

    40a0:       eb00376c        bl      11e58 <_dl_debug_initialize>
    40a4:       e580500c        str     r5, [r0, #12]
    40a8:       e1a04000        mov     r4, r0
    40ac:       eb003768        bl      11e54 <_dl_debug_state>
--> 40b0:       e320f000        nop     {0}
    40b4:       eb0052c7        bl      18bd8 <_dl_unload_cache>
    40b8:       e24bd020        sub     sp, fp, #32
    40bc:       e8bd8ff0        pop     {r4, r5, r6, r7, r8, r9, sl, fp, pc}

Comment 7 Nikos Mavrogiannopoulos 2015-05-17 22:25:30 UTC
The gnutls stacktrace being repeated is run within the library constructor. Could it be that one of these calls causes a dlopen in that platform which in turn causes the infinite loop?

Comment 8 Richard W.M. Jones 2015-05-18 15:22:51 UTC
Finally I am able to reproduce this locally, so I'll attach a
debugger and find out what's really going on here.

Comment 9 Richard W.M. Jones 2015-05-18 17:41:20 UTC
First observation is that the stack trace above *is* bogus.  qemu-system-arm
for some reason behaves differently under gdb, triggering some kind
of (gdb?) bug.  Please ignore the stack traces in comments 3, 5 & 6.
And my apologies for blaming gnutls.

The REAL stack trace points to ceph:

#0  0xb284846c in gettimeofday () from /lib/libc.so.6
#1  0xb405b868 in Cycles::init () at common/Cycles.cc:75
#2  0xb6b28500 in call_init.part () from /lib/ld-linux-armhf.so.3
#3  0xb6b28664 in _dl_init () from /lib/ld-linux-armhf.so.3
#4  0xb6b17b84 in _dl_start_user () from /lib/ld-linux-armhf.so.3

Comment 10 Richard W.M. Jones 2015-05-18 18:01:46 UTC
A very trivial reproducer for this is:

$ echo 'main(){}' > test.c
$ gcc test.c -lrados -o test
$ ./test

The final command will hang forever on 32 bit ARM.

Comment 11 Richard W.M. Jones 2015-05-18 18:13:07 UTC
Created attachment 1026798 [details]
fix-cycles-arch.patch

BLEAH UPSTREAM FIRST *!#$

Turns out that Ubuntu fixed this and are carrying a downstream patch
for it, which I have attached.

Ubuntu bug: https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1432786

Comment 12 Boris Ranto 2015-05-18 18:32:07 UTC
Actually, that patch got half-merged upstream (the first half) and that is why you were seeing the hang-ups. I'll apply the second half of the patch and rebuild the packages.

Comment 13 Fedora Update System 2015-05-18 22:01:38 UTC
ceph-0.94.1-1.fc22 has been submitted as an update for Fedora 22.
https://admin.fedoraproject.org/updates/FEDORA-2015-8189/ceph-0.94.1-1.fc22

Comment 14 Boris Ranto 2015-05-18 22:32:05 UTC
btw: Note that it is not fixed in ceph-0.94.1-1.fc22, it is (should be) fixed in ceph-0.94.1-2.fc22 and ceph-0.94.1-2.fc23 that were just recently built in koji:

http://koji.fedoraproject.org/koji/taskinfo?taskID=9777655
http://koji.fedoraproject.org/koji/taskinfo?taskID=9777671

Comment 15 Richard W.M. Jones 2015-05-19 07:29:19 UTC
Can confirm that this is fixed in ceph-0.94.1-2.fc23 on armv7hl.

Comment 16 Fedora Update System 2015-05-20 02:54:10 UTC
Package ceph-0.94.1-2.fc22:
* should fix your issue,
* was pushed to the Fedora 22 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing ceph-0.94.1-2.fc22'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2015-8189/ceph-0.94.1-2.fc22
then log in and leave karma (feedback).

Comment 17 Cole Robinson 2015-05-20 14:24:49 UTC
FWIW qemu built against this package still seems to hang:

http://koji.fedoraproject.org/koji/taskinfo?taskID=9780467

But I'm trying another build now

Comment 18 Fedora Update System 2015-05-22 02:31:20 UTC
Package ceph-0.94.1-3.fc22:
* should fix your issue,
* was pushed to the Fedora 22 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing ceph-0.94.1-3.fc22'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2015-8189/ceph-0.94.1-3.fc22
then log in and leave karma (feedback).

Comment 19 Richard W.M. Jones 2015-05-29 20:32:28 UTC
This has regressed somehow, so it's started to happen again.

https://kojipkgs.fedoraproject.org//work/tasks/1496/9871496/build.log

The build log shows that it is using:

DEBUG util.py:388:   librados2                   armv7hl 1:0.94.1-3.fc23                build 1.5 M

https://kojipkgs.fedoraproject.org//work/tasks/1496/9871496/root.log

I don't understand exactly why, since the patch exists and
is still being applied.  Perhaps the patch is not a complete
fix for the problem.

Comment 20 Richard W.M. Jones 2015-05-29 21:15:11 UTC
I can reproduce this on baremetal armv7hl with
librados2-0.94.1-3.fc23.armv7hl
qemu-system-arm-2.3.0-5.fc23.armv7hl
and (as before) simply running 'qemu-system-arm -help'.

Unfortunately gdb doesn't let me get a stack trace.  It simply
ignores ^C.

I was able to use strace, and the process is doing this over and
over again:

gettimeofday({1432934064, 725311}, NULL) = 0
nanosleep({0, 2000001}, NULL)           = 0
gettimeofday({1432934064, 728002}, NULL) = 0
nanosleep({0, 2000001}, NULL)           = 0

Comment 21 Richard W.M. Jones 2015-05-29 21:35:13 UTC
Created attachment 1032285 [details]
strace qemu-system-arm -help

Comment 22 Boris Ranto 2015-06-01 02:45:40 UTC
Just few thoughts:

The ceph itself should not run the loop, now. Is there any chance that qemu is trying to do that instead?

Can you still reproduce on bare metal with the simple reproducer that you mentioned in Comment 10?

What about the older librados2-0.94.1-2.fc23.armv7hl? The only change between -2 and -3 should be that lttng was disabled but one never knows.

I'm trying to get my hands on an armhfp machine to debug further myself but that may take some time.

Comment 23 Boris Ranto 2015-06-01 07:25:04 UTC
I'll answer the questions myself as I've finally got my hands on an armhfp machine where I could reproduce.

The steps in Comment 10 do not reproduce the issue so that bug is indeed fixed. I've tapped into the running process (gdb -p $(pidof qemu-system-arm)) and I didn't see anything ceph (rbd/rados) related in the backtrace.

The backtrace leading to nanosleep was

#0  0xffffffff in nanosleep () at /lib/libpthread.so.0
#1  0xffffffff in base::internal::SpinLockDelay(int volatile*, int, int) () at /lib/libtcmalloc.so.4
#2  0xffffffff in SpinLock::SlowLock() () at /lib/libtcmalloc.so.4
#3  0xffffffff in tcmalloc::ThreadCache::InitModule() () at /lib/libtcmalloc.so.4
#4  0xffffffff in tc_malloc () at /lib/libtcmalloc.so.4
#5  0xffffffff in __fopen_internal () at /lib/libc.so.6
#6  0xffffffff in load_debug_frame () at /lib/libunwind.so.8
#7  0xffffffff in locate_debug_info () at /lib/libunwind.so.8
#8  0xffffffff in _ULarm_dwarf_find_debug_frame () at /lib/libunwind.so.8
#9  0xffffffff in _ULarm_dwarf_callback () at /lib/libunwind.so.8
#10 0xffffffff in dl_iterate_phdr () at /lib/libc.so.6
#11 0xffffffff in _ULarm_find_proc_info () at /lib/libunwind.so.8
#12 0xffffffff in fetch_proc_info () at /lib/libunwind.so.8
#13 0xffffffff in _ULarm_dwarf_find_save_locs () at /lib/libunwind.so.8
#14 0xffffffff in _ULarm_dwarf_step () at /lib/libunwind.so.8
#15 0xffffffff in _ULarm_step () at /lib/libunwind.so.8
#16 0xffffffff in GetStackTrace_libunwind(void**, int, int) () at /lib/libtcmalloc.so.4
#17 0xffffffff in GetStackTrace(void**, int, int) () at /lib/libtcmalloc.so.4
#18 0xffffffff in tcmalloc::PageHeap::GrowHeap(unsigned int) () at /lib/libtcmalloc.so.4
#19 0xffffffff in tcmalloc::PageHeap::New(unsigned int) () at /lib/libtcmalloc.so.4
#20 0xffffffff in tcmalloc::CentralFreeList::Populate() () at /lib/libtcmalloc.so.4
#21 0xffffffff in tcmalloc::CentralFreeList::FetchFromOneSpansSafe(int, void**, void**) () at /lib/libtcmalloc.so.4
#22 0xffffffff in tcmalloc::CentralFreeList::RemoveRange(void**, void**, int) () at /lib/libtcmalloc.so.4
#23 0xffffffff in tcmalloc::ThreadCache::FetchFromCentralCache(unsigned int, unsigned int) () at /lib/libtcmalloc.so.4
#24 0xffffffff in tc_malloc () at /lib/libtcmalloc.so.4
#25 0xffffffff in strdup () at /lib/libc.so.6
#26 0xffffffff in verify_selinuxmnt () at /lib/libselinux.so.1
#27 0xffffffff in init_lib () at /lib/libselinux.so.1
#28 0xffffffff in call_init.part () at /lib/ld-linux-armhf.so.3
#29 0xffffffff in _dl_init_internal () at /lib/ld-linux-armhf.so.3
#30 0xffffffff in _dl_start_user () at /lib/ld-linux-armhf.so.3

The suspects seemed to be tcmalloc (gperftools) or libunwind. While both of these are used by ceph, neither of them is used by the client libraries (rbd/rados) that qemu links to.

I've done some more tests and it seems that tcmalloc (gperftools) is actually the one causing the hang-ups, i.e. similar to comment 10, I did

$ echo 'main(){}' > test.c
$ gcc test.c -ltcmalloc -o test
$ ./test

and the program hanged indefinitely.

I'm moving this bz back to ON_QA (as there actually was a bug in ceph code that was fixed in the -2 release) and I'll clone this bz for the tcmalloc issue.

Comment 24 Richard W.M. Jones 2015-06-01 08:40:12 UTC
Thanks - yes I believe you are correct.  Quite annoying that two
difficult bugs look coincidentally so similar :-(

Comment 25 Ken Dreyer (Red Hat) 2015-06-04 18:14:26 UTC
The patch was merged to Ceph master upstream in 35c5fd0091fc4d63b21207fb94e46b343519fd56 (https://github.com/ceph/ceph/pull/4839)

Comment 26 Fedora Update System 2015-06-04 20:19:45 UTC
ceph-0.94.1-3.fc22 has been pushed to the Fedora 22 stable repository.  If problems still persist, please make note of it in this bug report.