Bug 1226806
| Summary: | arm: all programs that link to tcmalloc hang forever on startup | ||
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Boris Ranto <branto> |
| Component: | libunwind | Assignee: | Kyle McMartin <kmcmartin> |
| Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 22 | CC: | amit.shah, berrange, branto, cfergeau, crobinso, david, dwmw2, extras-qa, fedora, itamar, kmcmartin, ktdreyer, loic, pbonzini, pbrobinson, peterm, redhat-bugzilla, rjones, scottt.tw, steve, tcallawa, tmraz, virt-maint |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | armv7hl | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | libunwind-1.1-8.fc22 | Doc Type: | Bug Fix |
| Doc Text: | Story Points: | --- | |
| Clone Of: | 1222286 | Environment: | |
| Last Closed: | 2015-06-18 13:20:13 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1222286 | ||
| Bug Blocks: | 245418, 910269 | ||
Description
Boris Ranto
2015-06-01 07:28:21 UTC
Moving to tcmalloc as I forgot to change that when I was cloning the bz...
I'm attaching some more (concise) info:
gperftools version: gperftools-libs-2.4-2.fc23.armv7hl
reproducer steps:
$ echo 'main(){}' > test.c
$ gcc test.c -ltcmalloc -o test
$ ./test
result: ./test hangs indefinitely -- gettimeofday and nanosleep syscalls occur in the (probably) infinite loop
./test backtrace:
#0 0xb6e521e8 in nanosleep () from /lib/libc.so.6
#1 0xb6f2b230 in base::internal::SpinLockDelay(int volatile*, int, int) () from /lib/libtcmalloc.so.4
#2 0xb6f2b0a0 in SpinLock::SlowLock() () from /lib/libtcmalloc.so.4
#3 0xb6f1ea08 in tcmalloc::ThreadCache::InitModule() () from /lib/libtcmalloc.so.4
#4 0xb6f2e3f0 in tc_malloc () from /lib/libtcmalloc.so.4
#5 0xb6e0fbb4 in __fopen_internal () from /lib/libc.so.6
#6 0xb6d590a0 in load_debug_frame () from /lib/libunwind.so.8
#7 0xb6d59bb8 in locate_debug_info () from /lib/libunwind.so.8
#8 0xb6d59d24 in _ULarm_dwarf_find_debug_frame () from /lib/libunwind.so.8
#9 0xb6d5a304 in _ULarm_dwarf_callback () from /lib/libunwind.so.8
#10 0xb6ecbe34 in dl_iterate_phdr () from /lib/libc.so.6
#11 0xb6d563c8 in _ULarm_find_proc_info () from /lib/libunwind.so.8
#12 0xb6d579d4 in fetch_proc_info () from /lib/libunwind.so.8
#13 0xb6d589dc in _ULarm_dwarf_find_save_locs () from /lib/libunwind.so.8
#14 0xb6d59014 in _ULarm_dwarf_step () from /lib/libunwind.so.8
#15 0xb6d55628 in _ULarm_step () from /lib/libunwind.so.8
#16 0xb6f2b684 in GetStackTrace_libunwind(void**, int, int) () from /lib/libtcmalloc.so.4
#17 0xb6f2be90 in GetStackTrace(void**, int, int) () from /lib/libtcmalloc.so.4
#18 0xb6f1bb34 in tcmalloc::PageHeap::GrowHeap(unsigned int) () from /lib/libtcmalloc.so.4
#19 0xb6f1bea0 in tcmalloc::PageHeap::New(unsigned int) () from /lib/libtcmalloc.so.4
#20 0xb6f1a7b4 in tcmalloc::CentralFreeList::Populate() () from /lib/libtcmalloc.so.4
#21 0xb6f1aa04 in tcmalloc::CentralFreeList::FetchFromOneSpansSafe(int, void**, void**) () from /lib/libtcmalloc.so.4
#22 0xb6f1aab4 in tcmalloc::CentralFreeList::RemoveRange(void**, void**, int) () from /lib/libtcmalloc.so.4
#23 0xb6f1dd28 in tcmalloc::ThreadCache::FetchFromCentralCache(unsigned int, unsigned int) () from /lib/libtcmalloc.so.4
#24 0xb6f2e05c in tc_malloc () from /lib/libtcmalloc.so.4
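The backtrace above shows tc_malloc entered twice (frames #24 and #4), with the inner call waiting in SpinLock::SlowLock(). A minimal sketch of that pattern follows, assuming the inner call is waiting on a non-recursive lock already held by the outer call. It is not tcmalloc's code: init_lock, try_lock, unlock, nested_alloc_would_hang and simulate_first_malloc are hypothetical names, and a single try-lock stands in for the real spin loop so the sketch reports the hang rather than reproducing it.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a non-recursive spinlock guarding allocator state. */
static atomic_flag init_lock = ATOMIC_FLAG_INIT;

/* Attempt to take the lock exactly once; true means we now own it. */
static bool try_lock(void)
{
    return !atomic_flag_test_and_set(&init_lock);
}

static void unlock(void)
{
    atomic_flag_clear(&init_lock);
}

/* Stand-in for the unwinder calling fopen(), which calls malloc() again.
 * A real spinlock would loop here; we only report whether it would. */
static bool nested_alloc_would_hang(void)
{
    if (try_lock()) {
        unlock();
        return false;   /* lock was free: the nested call completes */
    }
    return true;        /* lock is held by our own caller: spins forever */
}

/* Stand-in for the first malloc(): take the lock, then capture a stack
 * trace, which (in this report) ends up allocating memory again. */
static bool simulate_first_malloc(void)
{
    bool hang;

    if (!try_lock())
        return true;
    hang = nested_alloc_would_hang();
    unlock();
    return hang;
}
```

Calling simulate_first_malloc() from a main() returns true, since the nested call finds the lock already taken by the same thread. No other thread exists to release it, which would match a test program that never gets past startup.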
I am able to reproduce this bug as well.
Test:
$ echo 'main(){}' > test.c
$ gcc test.c -ltcmalloc -o test
test.c:1:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
main(){}
$ ./test
<--- hangs here forever
gperftools-libs-2.4-1.fc22.armv7hl
kernel-4.0.0-0.rc4.git0.1.fc22.armv7hl+lpae
#ifdef __arm__
  // ARM linux doesn't support sys_futex1(void*, int, int, struct timespec*);
  have_futex = 0;
#else
  have_futex = (sizeof (Atomic32) == sizeof (int) &&
                sys_futex(&x, FUTEX_WAKE, 1, 0) >= 0);
#endif
Despite the ugliness of that code that Paolo found, the hang is in libunwind. On ARM, the default unwind method uses DWARF tables, which causes an fopen() (and therefore a malloc) if debug symbols are not part of the executable itself.
Forcing the EXIDX table method with export UNW_ARM_UNWIND_METHOD=4 makes the libunwind method work. You can reproduce that with the test case in Comment 2.
Some docs here:
https://wiki.linaro.org/KenWerner/Sandbox/libunwind#UNW_ARM_METHOD_EXIDX_0x04
This is in the upstream gperftools issue tracker here:
https://code.google.com/p/gperftools/issues/detail?id=629
I don't think the libunwind upstream considers this a bug. That said, it makes sense to change the default unwinding method on ARM to Method 4 (at least in Fedora), so that this just works (tm). If someone wants the old broken behavior for some reason, the UNW_ARM_UNWIND_METHOD env variable will let them go back to it.
I've confirmed that changing the default also resolves this issue, so I'll push a libunwind update for that.
libunwind-1.1-8.fc22 has been submitted as an update for Fedora 22.
https://admin.fedoraproject.org/updates/libunwind-1.1-8.fc22
Package libunwind-1.1-8.fc22:
* should fix your issue,
* was pushed to the Fedora 22 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing libunwind-1.1-8.fc22'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2015-9378/libunwind-1.1-8.fc22
then log in and leave karma (feedback).
libunwind-1.1-8.fc22 has been pushed to the Fedora 22 stable repository. If problems still persist, please make note of it in this bug report.
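For systems that still have the older libunwind, the UNW_ARM_UNWIND_METHOD workaround described above could also be applied from a small launcher instead of a shell export. The following is only a sketch: force_exidx_unwinding and run_with_exidx are hypothetical helper names, and it assumes libunwind reads the variable in the launched program, which is why the variable is set before exec rather than inside the affected process.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Environment variable named in this report; "4" selects the EXIDX method. */
#define UNW_METHOD_VAR "UNW_ARM_UNWIND_METHOD"

/* Set the variable for this process and anything it execs.
 * Returns 0 on success, -1 on failure (as setenv does). */
static int force_exidx_unwinding(void)
{
    return setenv(UNW_METHOD_VAR, "4", 1 /* overwrite existing value */);
}

/* Hypothetical launcher: set the variable, then replace this process
 * with the target program. argv must be NULL-terminated.
 * Only returns if something failed. */
static int run_with_exidx(char *const argv[])
{
    if (force_exidx_unwinding() != 0) {
        perror("setenv");
        return -1;
    }
    execvp(argv[0], argv);
    perror("execvp");   /* reached only when execvp fails */
    return -1;
}
```

A main() that passes argv + 1 to run_with_exidx() would make this behave like running the target with UNW_ARM_UNWIND_METHOD=4 exported in the shell.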