Bug 752122
Summary: | Segmentation fault in dynamic loader on AVX enabled CPU | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Jaroslav Drzik <jaroslav.drzik>
Component: | glibc | Assignee: | Jeff Law <law>
Status: | CLOSED ERRATA | QA Contact: | qe-baseos-tools-bugs
Severity: | medium | Priority: | low
Version: | 6.1 | Target Milestone: | rc
Hardware: | x86_64 | OS: | Linux
Fixed In Version: | glibc-2.12-1.48.el6 | Doc Type: | Bug Fix
Last Closed: | 2012-06-20 12:08:43 UTC | CC: | bart, fweimer, i.kay, law, mfranc, mishu, netwiz, pasteur, pmuller, toracat, yury
Description
Jaroslav Drzik
2011-11-08 15:53:17 UTC
info frame
info all-registers

(gdb) bt
#0 0x00007f4255180b70 in _dl_x86_64_save_sse () from /lib64/ld-linux-x86-64.so.2
#1 0x00007f4255176a98 in _dl_lookup_symbol_x () from /lib64/ld-linux-x86-64.so.2
#2 0x00007f4255179ee0 in _dl_fixup () from /lib64/ld-linux-x86-64.so.2
#3 0x00007f42551805b5 in _dl_runtime_resolve () from /lib64/ld-linux-x86-64.so.2
#4 0x00007f42502923a7 in clse_get_crs_home (ctx=0x7fffffb15140, buf=0x7fffffb1b06c "", bufsiz=0x7fffffb156c8) at clse.c:211
#5 0x00007f4250252d0b in clscrs_get_crshome (ctx=0x18f2150, crshome=0x7fffffb1b06c "", bufsize=0x7fffffb156c8) at clscrs0.c:423
#6 0x00000000004312d9 in nsglcrsnfy ()
#7 0x0000000000407ff2 in nsglma ()
#8 0x0000000000406137 in main ()

(gdb) info all-registers
rax 0x1 1
rbx 0x18f08c0 26151104
rcx 0x159ae3bf 362472383
rdx 0xbfebfbff 3219913727
rsi 0x0 0
rdi 0x58 88
rbp 0x7fffffb13fd0 0x7fffffb13fd0
rsp 0x7fffffb13e88 0x7fffffb13e88
r8 0x7f425538c908 139922874353928
r9 0x0 0
r10 0x7fffffb13e00 140737483193856
r11 0x18f08c0 26151104
r12 0x7fffffb13ff8 140737483194360
r13 0x18f1460 26154080
r14 0x0 0
r15 0x0 0
rip 0x7f4255180b70 0x7f4255180b70 <_dl_x86_64_save_sse+48>
eflags 0x10202 [ IF RF ]
cs 0xe033 57395
ss 0xe02b 57387
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
st0 0 (raw 0x00000000000000000000)
st1 0 (raw 0x00000000000000000000)
st2 0 (raw 0x00000000000000000000)
st3 0 (raw 0x00000000000000000000)
st4 0 (raw 0x00000000000000000000)
st5 0 (raw 0x00000000000000000000)
st6 4300441390 (raw 0x401f8029c39700000000)
st7 0 (raw 0x00000000000000000000)
fctrl 0x37f 895
fstat 0x0 0
ftag 0xffff 65535
fiseg 0x7f42 32578
fioff 0x5193cf04 1368641284
foseg 0x7fff 32767
fooff 0xffb0bc70 -5194640
fop 0x0 0
xmm0 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
xmm1 {v4_float = {0x0, 0x0, 0xc01b0000, 0xce5}, v2_double = {0x8000000000000000, 0x8000000000000000}, v16_int8 = {0x6e, 0x2f, 0x6c, 0x73, 0x6e, 0x72, 0x63, 0x74, 0x6c, 0x0, 0x4f, 0x52, 0x41, 0x5f, 0x4e, 0x45}, v8_int16 = {0x2f6e, 0x736c, 0x726e, 0x7463, 0x6c, 0x524f, 0x5f41, 0x454e}, v4_int32 = {0x736c2f6e, 0x7463726e, 0x524f006c, 0x454e5f41}, v2_int64 = {0x7463726e736c2f6e, 0x454e5f41524f006c}, uint128 = 0x454e5f41524f006c7463726e736c2f6e}
xmm2 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xff, 0xff, 0xff, 0xff, 0x0, 0x0}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0xffff, 0xffff, 0x0}, v4_int32 = {0x0, 0x0, 0xffff0000, 0xffff}, v2_int64 = {0x0, 0xffffffff0000}, uint128 = 0x0000ffffffff00000000000000000000}
xmm3 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x70, 0x3e, 0x90, 0x1, 0x0 <repeats 12 times>}, v8_int16 = {0x3e70, 0x190, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {0x1903e70, 0x0, 0x0, 0x0}, v2_int64 = {0x1903e70, 0x0}, uint128 = 0x00000000000000000000000001903e70}
xmm4 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0xc0, 0x98, 0x90, 0x1, 0x0 <repeats 12 times>}, v8_int16 = {0x98c0, 0x190, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {0x19098c0, 0x0, 0x0, 0x0}, v2_int64 = {0x19098c0, 0x0}, uint128 = 0x000000000000000000000000019098c0}
xmm5 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0xa0, 0xaa, 0x90, 0x1, 0x0 <repeats 12 times>}, v8_int16 = {0xaaa0, 0x190, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {0x190aaa0, 0x0, 0x0, 0x0}, v2_int64 = {0x190aaa0, 0x0}, uint128 = 0x0000000000000000000000000190aaa0}
xmm6 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x30, 0x7f, 0x90, 0x1, 0x0 <repeats 12 times>}, v8_int16 = {0x7f30, 0x190, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {0x1907f30, 0x0, 0x0, 0x0}, v2_int64 = {0x1907f30, 0x0}, uint128 = 0x00000000000000000000000001907f30}
xmm7 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0xd0, 0xac, 0x90, 0x1, 0x0 <repeats 12 times>}, v8_int16 = {0xacd0, 0x190, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {0x190acd0, 0x0, 0x0, 0x0}, v2_int64 = {0x190acd0, 0x0}, uint128 = 0x0000000000000000000000000190acd0}
xmm8 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0, 0x1, 0x90, 0x1, 0x0 <repeats 12 times>}, v8_int16 = {0x100, 0x190, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {0x1900100, 0x0, 0x0, 0x0}, v2_int64 = {0x1900100, 0x0}, uint128 = 0x00000000000000000000000001900100}
xmm9 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x40, 0xd8, 0x92, 0x1, 0x0 <repeats 12 times>}, v8_int16 = {0xd840, 0x192, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {0x192d840, 0x0, 0x0, 0x0}, v2_int64 = {0x192d840, 0x0}, uint128 = 0x0000000000000000000000000192d840}
xmm10 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0xa0, 0xa9, 0x90, 0x1, 0x0 <repeats 12 times>}, v8_int16 = {0xa9a0, 0x190, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {0x190a9a0, 0x0, 0x0, 0x0}, v2_int64 = {0x190a9a0, 0x0}, uint128 = 0x0000000000000000000000000190a9a0}
xmm11 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
xmm12 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
xmm13 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x1, 0xad, 0x90, 0x1, 0x0 <repeats 12 times>}, v8_int16 = {0xad01, 0x190, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {0x190ad01, 0x0, 0x0, 0x0}, v2_int64 = {0x190ad01, 0x0}, uint128 = 0x0000000000000000000000000190ad01}
xmm14 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
xmm15 {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
mxcsr 0x9fe0 [ PE DAZ IM DM ZM OM UM PM FZ ]

Dump of assembler code for function _dl_x86_64_save_sse:
0x00007f4255180b40 <+0>: cmpl $0x0,0x20c431(%rip) # 0x7f425538cf78
0x00007f4255180b47 <+7>: jne 0x7f4255180b6e <_dl_x86_64_save_sse+46>
0x00007f4255180b49 <+9>: mov %rbx,%r11
0x00007f4255180b4c <+12>: mov $0x1,%eax
0x00007f4255180b51 <+17>: cpuid
0x00007f4255180b53 <+19>: mov %r11,%rbx
0x00007f4255180b56 <+22>: mov $0x1,%eax
0x00007f4255180b5b <+27>: test $0x10000000,%ecx
0x00007f4255180b61 <+33>: jne 0x7f4255180b65 <_dl_x86_64_save_sse+37>
0x00007f4255180b63 <+35>: neg %eax
0x00007f4255180b65 <+37>: mov %eax,0x20c40d(%rip) # 0x7f425538cf78
0x00007f4255180b6b <+43>: cmp $0x0,%eax
0x00007f4255180b6e <+46>: js 0x7f4255180bc1 <_dl_x86_64_save_sse+129>
=> 0x00007f4255180b70 <+48>: vmovdqa %ymm0,%fs:0x80
0x00007f4255180b7a <+58>: vmovdqa %ymm1,%fs:0xa0
0x00007f4255180b84 <+68>: vmovdqa %ymm2,%fs:0xc0
0x00007f4255180b8e <+78>: vmovdqa %ymm3,%fs:0xe0
0x00007f4255180b98 <+88>: vmovdqa %ymm4,%fs:0x100
0x00007f4255180ba2 <+98>: vmovdqa %ymm5,%fs:0x120
0x00007f4255180bac <+108>: vmovdqa %ymm6,%fs:0x140
0x00007f4255180bb6 <+118>: vmovdqa %ymm7,%fs:0x160
0x00007f4255180bc0 <+128>: retq
0x00007f4255180bc1 <+129>: movdqa %xmm0,%fs:0x80
0x00007f4255180bcb <+139>: movdqa %xmm1,%fs:0x90
0x00007f4255180bd5 <+149>: movdqa %xmm2,%fs:0xa0
0x00007f4255180bdf <+159>: movdqa %xmm3,%fs:0xb0
0x00007f4255180be9 <+169>: movdqa %xmm4,%fs:0xc0
0x00007f4255180bf3 <+179>: movdqa %xmm5,%fs:0xd0
0x00007f4255180bfd <+189>: movdqa %xmm6,%fs:0xe0
0x00007f4255180c07 <+199>: movdqa %xmm7,%fs:0xf0
0x00007f4255180c11 <+209>: retq
End of assembler dump.

info frame 4

*** Bug 751331 has been marked as a duplicate of this bug. ***

#0 0x00007f8c7a164b70 in _dl_x86_64_save_sse () from /lib64/ld-linux-x86-64.so.2
#1 0x00007f8c7a15aa98 in _dl_lookup_symbol_x () from /lib64/ld-linux-x86-64.so.2
#2 0x00007f8c7a15dee0 in _dl_fixup () from /lib64/ld-linux-x86-64.so.2
#3 0x00007f8c7a1645b5 in _dl_runtime_resolve () from /lib64/ld-linux-x86-64.so.2
#4 0x00007f8c752763a7 in clse_get_crs_home (ctx=0x7fffa1d465d0, buf=0x7fffa1d4c4fc "", bufsiz=0x7fffa1d46b58) at clse.c:211
#5 0x00007f8c75236d0b in clscrs_get_crshome (ctx=0x147e150, crshome=0x7fffa1d4c4fc "", bufsize=0x7fffa1d46b58) at clscrs0.c:423
#6 0x00000000004312d9 in nsglcrsnfy ()
#7 0x0000000000407ff2 in nsglma ()
#8 0x0000000000406137 in main ()

(gdb) info frame 4
Stack frame at 0x7fffa1d465d0:
rip = 0x7f8c752763a7 in clse_get_crs_home (clse.c:211); saved rip 0x7f8c75236d0b
called by frame at 0x7fffa1d46a50, caller of frame at 0x7fffa1d45510
source language c.
Arglist at 0x7fffa1d465c0, args: ctx=0x7fffa1d465d0, buf=0x7fffa1d4c4fc "", bufsiz=0x7fffa1d46b58
Locals at 0x7fffa1d465c0, Previous frame's sp is 0x7fffa1d465d0
Saved registers:
rbx at 0x7fffa1d46598, rbp at 0x7fffa1d465c0, rip at 0x7fffa1d465c8

(gdb) info frame
Stack level 0, frame at 0x7fffa1d45320:
rip = 0x7f8c7a164b70 in _dl_x86_64_save_sse; saved rip 0x7f8c7a15aa98
called by frame at 0x7fffa1d45470
Arglist at 0x7fffa1d45310, args:
Locals at 0x7fffa1d45310, Previous frame's sp is 0x7fffa1d45320
Saved registers:
rip at 0x7fffa1d45318
(gdb)

Since RHEL 6.2 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.

*** Bug 790399 has been marked as a duplicate of this bug. ***

Created attachment 562109 [details]
Possible path to fix this issue.
Attached a possible patch for this. It should apply cleanly to glibc-2.12-1.47
Not sure what the rest of the comments on this BZ are - as they seem to be hidden.
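
For readers following the disassembly in the description: `_dl_x86_64_save_sse` executes `cpuid` with `eax = 1` and then only tests bit 28 of `ecx` (the AVX feature bit, mask `0x10000000`) before using `vmovdqa` on `%ymm` registers. The sketch below contrasts that check with the stricter one generally recommended for AVX detection, which also requires the OSXSAVE bit (bit 27) and the XMM and YMM state bits in XCR0. It is not taken from the attached patch or from the erratum. The function names `avx_bit_only` and `avx_usable` are hypothetical, and it assumes that `rcx` in the register dump (`0x159ae3bf`) still holds the `ecx` output of that `cpuid` call, which the matching `rax` value of 1 suggests but does not prove.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 1, ECX feature bits. */
#define CPUID1_ECX_OSXSAVE (UINT32_C(1) << 27) /* OS has enabled XSAVE/XGETBV */
#define CPUID1_ECX_AVX     (UINT32_C(1) << 28) /* CPU implements AVX */

/* XCR0 bits, as returned by XGETBV with ECX = 0. */
#define XCR0_SSE_STATE (UINT64_C(1) << 1) /* OS saves/restores XMM state */
#define XCR0_AVX_STATE (UINT64_C(1) << 2) /* OS saves/restores YMM state */

/* Mirrors `test $0x10000000,%ecx` from the disassembly:
 * looks at CPU capability only, not at what the OS has enabled. */
static bool avx_bit_only(uint32_t cpuid1_ecx)
{
    return (cpuid1_ecx & CPUID1_ECX_AVX) != 0;
}

/* Hypothetical stricter check (an assumption, not the shipped fix):
 * AVX and OSXSAVE must both be set, and XCR0 must show that the OS
 * manages both XMM and YMM state. The xcr0 argument is only
 * meaningful when OSXSAVE is set, so it is ignored otherwise. */
static bool avx_usable(uint32_t cpuid1_ecx, uint64_t xcr0)
{
    const uint32_t need_cpuid = CPUID1_ECX_AVX | CPUID1_ECX_OSXSAVE;
    const uint64_t need_xcr0 = XCR0_SSE_STATE | XCR0_AVX_STATE;

    if ((cpuid1_ecx & need_cpuid) != need_cpuid)
        return false;
    return (xcr0 & need_xcr0) == need_xcr0;
}
```

With the reported value `0x159ae3bf`, bit 28 is set but bit 27 is clear, so `avx_bit_only` returns true while `avx_usable` returns false for any `xcr0`. Under the assumption above, that would be consistent with the fault at the first `vmovdqa %ymm0` instruction.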
Thanks. I'd already identified the fixes upstream and ported them to our internal source tree for our upcoming Red Hat Enterprise Linux 6.3 release. In general, if you see a bug in the ON_QA state, that means we've already identified and backported the change and it's in our QA process prior to the next release. FYI, the hidden comments in this BZ are not related to fixing or analysis of the BZ. They're just part of the internal process/procedures we use when planning releases. If you're really looking to dive into a problem, 768300 may be related to this bug. To date neither Petr nor I have been able to reproduce that particular failure, which makes it much more difficult to analyze.

Given that this bug affects quite a wide range of systems (in my organisation, 73 to be exact), wouldn't it be worth considering pushing it out earlier than the next minor release? I mean, this forces us to backport the patch ourselves instead of relying on upstream to do it.

This is what I was hoping, as it basically kills any program using dynamic loading on any new Xeon or i[357] CPUs. From what I gather so far, this includes:

- mod_perl
- Oracle (what this bug is about)
- OpenVZ
- Xen DomUs
- possibly more...

I tried to look at doing this myself for my own installs (using the patch I attached); however, I don't have an i686 build environment and sadly, for compatibility, I also need to redo the i686 glibc builds for programs that are 32-bit only (i.e. IBM Tivoli Storage Manager). To be honest, I was surprised to see a 'not until 6.3' reply.

Please note that Bugzilla is not a Red Hat support tool: it's just for defect tracking and serves mainly as a base for the technical aspects of the defect. If you have any requests about the release schedule of this bugfix in Red Hat Enterprise Linux, please direct them through your appropriate support channel. Support channels have the means to (possibly) accelerate the process.
For the sake of completeness, and to hopefully make this bug easier to find for people whose apache does strange things on new hardware, I've added a backtrace of apache and mod_perl causing the same Illegal Instruction:

Starting program: /usr/sbin/httpd -X
[Thread debugging using libthread_db enabled]
Program received signal SIGILL, Illegal instruction.
0x00007ffff7d9ebe0 in _dl_x86_64_save_sse () from /lib64/ld-linux-x86-64.so.2
(gdb) bt
#0 0x00007ffff7d9ebe0 in _dl_x86_64_save_sse () from /lib64/ld-linux-x86-64.so.2
#1 0x00007ffff7d94ad8 in _dl_lookup_symbol_x () from /lib64/ld-linux-x86-64.so.2
#2 0x00007ffff7d97f40 in _dl_fixup () from /lib64/ld-linux-x86-64.so.2
#3 0x00007ffff7d9e625 in _dl_runtime_resolve () from /lib64/ld-linux-x86-64.so.2
#4 0x00007fffe250b1d6 in boot_Apache2__ServerUtil () from /usr/lib64/perl5/auto/Apache2/ServerUtil/ServerUtil.so
#5 0x00007fffed593695 in Perl_pp_entersub () from /usr/lib64/perl5/CORE/libperl.so

For further reference, it shows in the apache error_log as:

[notice] child pid XXXX exit signal Illegal instruction (4)

Bart, Steven, As Petr stated, there are processes by which we can make fixes available earlier. The decision as to whether or not a particular fix is made available via the accelerated process is largely driven by customer needs expressed through their appropriate support channel. So my suggestion would be to get in touch with your support contacts and indicate to them the importance of this fix to your respective organizations.

One more note: in c#15, I was referring to Petr Machata, who is looking at BZ 768300. The comments in c#18 are from Petr Muller, who doesn't have any responsibility for BZ 768300.

Hi Jeff, Can we please have test packages for the AVX thing? Right now I'm trying to apply the patch from Debian to the latest RHEL 6.2 glibc source released by RH and rebuild the package, but it's taking ages and I'm still not sure of the outcome... Thanks!
Yury, that kind of request really needs to go through your support channel; they have various mechanisms to request fixes be made available under accelerated schedules.

Hi Jeff, Thanks, I see your point, but these interactions are always soooo sloooow that I appreciate the occasional test packages from bugzilla a lot, e.g. SELinux updates, kernel bugfixes etc. Anyway, after rebuilding glibc with Steven's patch I realized that it addresses a different problem. I'm instead facing segfaults inside an x86_64 Xen domU running on top of an x86_64 Xen dom0 when software makes extensive use of TLS (Postfix smtp client), and they are, unfortunately, totally unrelated to this SIGILL ticket. Steven, if you want my rpms with your patch (i386 / x86_64) you can have them here: http://rpm.zaytsev.net/test/glibc-xen/ . Of course, no warranties, explicit or implicit whatsoever. Z.

Yury, you'll probably want to look at 801650 for your segfault in the dom0. Especially if you're on a Sandy Bridge processor.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2012-0763.html