Bug 1226806
Summary: | arm: all programs that link to tcmalloc hang forever on startup | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Boris Ranto <branto> |
Component: | libunwind | Assignee: | Kyle McMartin <kmcmartin> |
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 22 | CC: | amit.shah, berrange, branto, cfergeau, crobinso, david, dwmw2, extras-qa, fedora, itamar, kmcmartin, ktdreyer, loic, pbonzini, pbrobinson, peterm, redhat-bugzilla, rjones, scottt.tw, steve, tcallawa, tmraz, virt-maint |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | armv7hl | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | libunwind-1.1-8.fc22 | Doc Type: | Bug Fix |
Doc Text: | Story Points: | --- | |
Clone Of: | 1222286 | Environment: | |
Last Closed: | 2015-06-18 13:20:13 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1222286 | ||
Bug Blocks: | 245418, 910269 |
Description
Boris Ranto
2015-06-01 07:28:21 UTC
Moving to tcmalloc as I forgot to change that when I was cloning the bz... I'm attaching some more (concise) info:

gperftools version: gperftools-libs-2.4-2.fc23.armv7hl

reproducer steps:

    $ echo 'main(){}' > test.c
    $ gcc test.c -ltcmalloc -o test
    $ ./test

result: ./test hangs indefinitely -- gettimeofday and nanosleep syscalls occur in the (probably) infinite loop

./test backtrace:

    #0  0xb6e521e8 in nanosleep () from /lib/libc.so.6
    #1  0xb6f2b230 in base::internal::SpinLockDelay(int volatile*, int, int) () from /lib/libtcmalloc.so.4
    #2  0xb6f2b0a0 in SpinLock::SlowLock() () from /lib/libtcmalloc.so.4
    #3  0xb6f1ea08 in tcmalloc::ThreadCache::InitModule() () from /lib/libtcmalloc.so.4
    #4  0xb6f2e3f0 in tc_malloc () from /lib/libtcmalloc.so.4
    #5  0xb6e0fbb4 in __fopen_internal () from /lib/libc.so.6
    #6  0xb6d590a0 in load_debug_frame () from /lib/libunwind.so.8
    #7  0xb6d59bb8 in locate_debug_info () from /lib/libunwind.so.8
    #8  0xb6d59d24 in _ULarm_dwarf_find_debug_frame () from /lib/libunwind.so.8
    #9  0xb6d5a304 in _ULarm_dwarf_callback () from /lib/libunwind.so.8
    #10 0xb6ecbe34 in dl_iterate_phdr () from /lib/libc.so.6
    #11 0xb6d563c8 in _ULarm_find_proc_info () from /lib/libunwind.so.8
    #12 0xb6d579d4 in fetch_proc_info () from /lib/libunwind.so.8
    #13 0xb6d589dc in _ULarm_dwarf_find_save_locs () from /lib/libunwind.so.8
    #14 0xb6d59014 in _ULarm_dwarf_step () from /lib/libunwind.so.8
    #15 0xb6d55628 in _ULarm_step () from /lib/libunwind.so.8
    #16 0xb6f2b684 in GetStackTrace_libunwind(void**, int, int) () from /lib/libtcmalloc.so.4
    #17 0xb6f2be90 in GetStackTrace(void**, int, int) () from /lib/libtcmalloc.so.4
    #18 0xb6f1bb34 in tcmalloc::PageHeap::GrowHeap(unsigned int) () from /lib/libtcmalloc.so.4
    #19 0xb6f1bea0 in tcmalloc::PageHeap::New(unsigned int) () from /lib/libtcmalloc.so.4
    #20 0xb6f1a7b4 in tcmalloc::CentralFreeList::Populate() () from /lib/libtcmalloc.so.4
    #21 0xb6f1aa04 in tcmalloc::CentralFreeList::FetchFromOneSpansSafe(int, void**, void**) () from /lib/libtcmalloc.so.4
    #22 0xb6f1aab4 in tcmalloc::CentralFreeList::RemoveRange(void**, void**, int) () from /lib/libtcmalloc.so.4
    #23 0xb6f1dd28 in tcmalloc::ThreadCache::FetchFromCentralCache(unsigned int, unsigned int) () from /lib/libtcmalloc.so.4
    #24 0xb6f2e05c in tc_malloc () from /lib/libtcmalloc.so.4

I am able to reproduce this bug as well.

Test:

    $ echo 'main(){}' > test.c
    $ gcc test.c -ltcmalloc -o test
    test.c:1:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
     main(){}
    $ ./test      <--- hangs here forever

gperftools-libs-2.4-1.fc22.armv7hl
kernel-4.0.0-0.rc4.git0.1.fc22.armv7hl+lpae

    #ifdef __arm__
      // ARM linux doesn't support sys_futex1(void*, int, int, struct timespec*);
      have_futex = 0;
    #else
      have_futex = (sizeof (Atomic32) == sizeof (int) &&
                    sys_futex(&x, FUTEX_WAKE, 1, 0) >= 0);
    #endif

Despite the ugliness of that code that Paolo found, the hang is in libunwind. On ARM, the default unwind method uses the DWARF tables, which causes an fopen() (and therefore a malloc) if debug symbols are not part of the executable itself.

Forcing the EXIDX table method with

    export UNW_ARM_UNWIND_METHOD=4

makes the libunwind method work. You can reproduce that with the test case in Comment 2.

Some docs here: https://wiki.linaro.org/KenWerner/Sandbox/libunwind#UNW_ARM_METHOD_EXIDX_0x04

This is in the upstream gperftools issue tracker here: https://code.google.com/p/gperftools/issues/detail?id=629

I don't think the libunwind upstream considers this a bug. That said, it makes sense to change the unwinding method default on ARM to be Method 4 (at least in Fedora), so that this just works (tm). If someone wants the old broken behavior for some reason, the UNW_ARM_UNWIND_METHOD env variable will let them go back to it.

I've confirmed changing the default also resolves this issue, so I'll push a libunwind update for that.

libunwind-1.1-8.fc22 has been submitted as an update for Fedora 22.
https://admin.fedoraproject.org/updates/libunwind-1.1-8.fc22

Package libunwind-1.1-8.fc22:
* should fix your issue,
* was pushed to the Fedora 22 testing repository,
* should be available at your local mirror within two days.
Update it with:

    # su -c 'yum update --enablerepo=updates-testing libunwind-1.1-8.fc22'

as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2015-9378/libunwind-1.1-8.fc22
then log in and leave karma (feedback).

libunwind-1.1-8.fc22 has been pushed to the Fedora 22 stable repository. If problems still persist, please make note of it in this bug report.
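
For anyone reading this later, the backtrace above suggests a re-entrancy problem: tc_malloc calls the unwinder while (presumably) holding a non-recursive lock, and the DWARF path's fopen() calls tc_malloc again. Below is a minimal sketch of that pattern, not the actual tcmalloc or libunwind code. All names (heap_try_lock, toy_unwind, toy_malloc, ALLOC_WOULD_HANG) are hypothetical, and the lock is a try-lock so the sketch reports the re-entry instead of spinning forever the way SpinLock::SlowLock() does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a non-recursive allocator lock. */
static bool heap_lock_held = false;

/* Returns false instead of spinning, so the sketch terminates at the
   point where a real spinlock would loop forever. */
static bool heap_try_lock(void) {
    if (heap_lock_held) {
        return false;
    }
    heap_lock_held = true;
    return true;
}

static void heap_unlock(void) {
    assert(heap_lock_held);
    heap_lock_held = false;
}

typedef enum { ALLOC_OK = 0, ALLOC_WOULD_HANG = -1 } alloc_status;

static alloc_status toy_malloc(bool unwinder_allocates);

/* Stand-in for the unwinder. A DWARF-like path (unwinder_allocates = true)
   allocates, as fopen() does in the backtrace; an EXIDX-like path
   (unwinder_allocates = false) is assumed not to allocate. */
static alloc_status toy_unwind(bool unwinder_allocates) {
    if (unwinder_allocates) {
        return toy_malloc(false); /* re-enters the allocator */
    }
    return ALLOC_OK;
}

/* Stand-in for the allocator: takes the lock, then records a stack trace. */
static alloc_status toy_malloc(bool unwinder_allocates) {
    if (!heap_try_lock()) {
        return ALLOC_WOULD_HANG; /* lock already held by our own caller */
    }
    alloc_status status = toy_unwind(unwinder_allocates);
    heap_unlock();
    return status;
}
```

Calling toy_malloc(true) from a main() models the default DWARF method and returns ALLOC_WOULD_HANG; toy_malloc(false) models the UNW_ARM_UNWIND_METHOD=4 case and returns ALLOC_OK. That is consistent with why switching the default unwind method avoids the hang without touching tcmalloc.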