Bug 737011
Summary: | SIGSEGV when select() received async signal (cancel signal) | |
---|---|---|---
Product: | Red Hat Enterprise Linux 5 | Reporter: | Marek Polacek <mpolacek>
Component: | glibc | Assignee: | Carlos O'Donell <codonell>
Status: | CLOSED WONTFIX | QA Contact: | qe-baseos-tools-bugs
Severity: | medium | Priority: | unspecified
Version: | 5.7 | CC: | ashankar, fweimer, law, pfrankli, spoyarek
Target Milestone: | rc | Hardware: | ppc64
OS: | Linux | Doc Type: | Bug Fix
Last Closed: | 2013-11-26 22:52:39 UTC | |
Reassigning to glibc -- this looks a whole lot like some of the problems we had in RHEL 6 with incorrect memory fencing in the glibc unwind-forcedunwind code on modern power hardware. We haven't completely resolved those in RHEL 6, so backporting anything to RHEL 5 seems premature at this point.

This looks exactly like the memory fencing issues in the glibc forced unwind code. I'm hesitant to commit to fixing this because it's more than just glibc; it also requires thread-safe PLT stubs from binutils to catch the rest of the niggling issues. I'll have to look into whether binutils for RHEL 5 has those fixes.

I doubt binutils has those fixes... I don't recall backporting them to RHEL5. My gut says this shouldn't make the cut for RHEL 5.11, but wanted to get it reassigned to the proper component so you could chime in as well.

(In reply to Jeff Law from comment #3)
> I doubt binutils has those fixes... I don't recall backporting them to
> RHEL5.

It doesn't have them; bfd/elf64-ppc.c (build_plt_stub) lacks all of the thread-safe PLT stub code. Therefore fixing this is going to be much, much harder.

> My gut says this shouldn't make the cut for RHEL 5.11, but wanted to get it
> reassigned to the proper component so you could chime in as well.

I agree: the risk of backporting binutils fixes that impact all binaries by using alternate PLT stubs is just too high to fix this kind of problem. I could fix the glibc bug, but it wouldn't fix all of these kinds of crashes. What do I do with this bug? Shall we close this as CLOSED/WONTFIX?

It gets even worse: Alan's original code was too optimistic in when it decided to use thread-safe stubs. We'd need to turn them on for any shared link, which implies that to be effective we need to relink (which implies rebuild) a large number of libraries on the system (I'd nearly forgotten about this part of the mess).

Given there's no customer case, I say CLOSE/WONTFIX; too risky & invasive at this stage in the RHEL 5 lifecycle.
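For readers unfamiliar with the class of bug the comments call "incorrect memory fencing", here is a minimal sketch of the general pattern: one thread lazily resolves a function pointer and publishes it through a flag, while other threads test the flag before using the pointer. This is not glibc's unwind-forcedunwind code; the names `publish_unwinder`, `lookup_unwinder`, and `stand_in_unwind` are hypothetical, and it uses C11 atomics, which the RHEL 5 toolchain does not provide. On a weakly ordered CPU such as Power, dropping the release/acquire pair would let a reader see the flag set while still reading a stale pointer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef int (*unwind_fn)(int);

/* Hypothetical stand-in for a function resolved at run time. */
static int stand_in_unwind(int x) { return x + 1; }

static unwind_fn resolved_fn;     /* plain data, published via the flag below */
static atomic_int resolved_ready; /* 0 = not yet resolved, 1 = resolved */

/* Writer side. Assumed to be called by exactly one thread
   (for example under pthread_once). */
void publish_unwinder(void)
{
    resolved_fn = stand_in_unwind;
    /* Release: the pointer store above becomes visible before the flag. */
    atomic_store_explicit(&resolved_ready, 1, memory_order_release);
}

/* Reader side: returns NULL until the pointer has been published. */
unwind_fn lookup_unwinder(void)
{
    /* Acquire: pairs with the release store in publish_unwinder(). */
    if (atomic_load_explicit(&resolved_ready, memory_order_acquire) == 0)
        return NULL;
    return resolved_fn;
}
```

A caller would check `lookup_unwinder()` for NULL before calling through it. As the comments above note, fencing in glibc alone would not cover the analogous race in lazily bound PLT stubs, which is why binutils is also involved.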
Created attachment 522305: Reproducer

Description of problem:
Program crashes if it is run long enough. This applies only to ppc64.

Version-Release number of selected component (if applicable):
gcc-4.1.2-51.el5

How reproducible:
Sometimes.

Steps to Reproduce:
1. # gcc -O2 -g rep.c -lpthread
2. # ./a.out 180
   Segmentation fault

Actual results:
Segfault.

Expected results:
No segfault.

Additional info:
Seems like we need to backport some change into libgcc/config/rs6000/linux-unwind.h. The corefile says:

    Core was generated by `./a.out 180'.
    Program terminated with signal 11, Segmentation fault.
    #0  0x0fbcc838 in get_regs (context=0xf7fede5c, fs=0xf7fed950) at ../../gcc/config/rs6000/linux-unwind.h:159
    159       if (*(unsigned int *) (pc + 4) != 0x44000002)
    (gdb) bt
    #0  0x0fbcc838 in get_regs (context=0xf7fede5c, fs=0xf7fed950) at ../../gcc/config/rs6000/linux-unwind.h:159
    #1  ppc_fallback_frame_state (context=0xf7fede5c, fs=0xf7fed950) at ../../gcc/config/rs6000/linux-unwind.h:227
    #2  uw_frame_state_for (context=0xf7fede5c, fs=0xf7fed950) at ../../gcc/unwind-dw2.c:1127
    #3  0x0fbce12c in _Unwind_ForcedUnwind_Phase2 (exc=0xf7fef6f0, context=0xf7fede5c) at ../../gcc/unwind.inc:159
    #4  0x0fbce73c in _Unwind_ForcedUnwind (exc=0xf7fef6f0, stop=0xfdaf260 <unwind_stop>, stop_argument=0xf7feee40) at ../../gcc/unwind.inc:211
    #5  0x0fdb24a0 in _Unwind_ForcedUnwind (exc=0xf7fef6f0, stop=0xfdaf260 <unwind_stop>, stop_argument=0xf7feee40) at ../nptl/sysdeps/pthread/unwind-forcedunwind.c:100
    #6  0x0fdaf21c in __pthread_unwind (buf=<value optimized out>) at unwind.c:130
    #7  0x0fda47cc in __do_cancel (sig=<value optimized out>, si=<value optimized out>, ctx=<value optimized out>) at ../nptl/pthreadP.h:259
    #8  sigcancel_handler (sig=<value optimized out>, si=<value optimized out>, ctx=<value optimized out>) at init.c:199
    #9  <signal handler called>
    #10 0x0ff1153c in __libc_enable_asynccancel () at libc-cancellation.c:76
    #11 0x00000400 in ?? ()
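The attached rep.c is not reproduced in this report. As a rough idea of the kind of program that exercises the path in the backtrace (a cancel signal arriving while a thread enters select()), here is a hypothetical stress test. It is an assumption-laden sketch, not the reporter's code: the names `worker` and `run_cancel_stress` are invented, and treating the command-line argument as a duration in seconds is a guess.

```c
/* Hypothetical stress test; NOT the attached rep.c. */
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/select.h>
#include <time.h>

/* Worker: switch to asynchronous cancellation, then loop in select(). */
static void *worker(void *arg)
{
    int old_type;
    (void)arg;
    pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, &old_type);
    for (;;) {
        struct timeval tv = { 0, 1000 }; /* 1 ms timeout */
        select(0, NULL, NULL, NULL, &tv);
    }
    return NULL; /* not reached */
}

/* Create, cancel, and join a thread repeatedly for roughly `seconds`
   seconds. Returns the number of completed cycles. */
long run_cancel_stress(int seconds)
{
    long cycles = 0;
    time_t end = time(NULL) + seconds;

    while (time(NULL) < end) {
        pthread_t t;
        void *res = NULL;
        int rc;

        rc = pthread_create(&t, NULL, worker, NULL);
        assert(rc == 0);
        rc = pthread_cancel(t);
        assert(rc == 0);
        rc = pthread_join(t, &res);
        assert(rc == 0);
        assert(res == PTHREAD_CANCELED);
        cycles++;
    }
    return cycles;
}

#ifdef CANCEL_STRESS_MAIN
/* Build with -DCANCEL_STRESS_MAIN to get a standalone program. */
int main(int argc, char **argv)
{
    int seconds = argc > 1 ? atoi(argv[1]) : 10; /* assumed: seconds */
    printf("%ld cycles completed\n", run_cancel_stress(seconds));
    return 0;
}
#endif
```

Something like `gcc -O2 -g -DCANCEL_STRESS_MAIN sketch.c -lpthread` followed by `./a.out 180` would mirror the steps above. Whether this sketch triggers the same crash on an affected ppc64 system is untested.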