Bug 523255

Summary: 2.6.9-89.0.9.ELsmp 64-bit kernel hanging on Sun Fire X2200-M2 servers with 'noapic'
Product: Red Hat Enterprise Linux 4
Reporter: Issue Tracker <tao>
Component: kernel
Assignee: Prarit Bhargava <prarit>
Status: CLOSED WONTFIX
QA Contact: Evan McNabb <emcnabb>
Severity: medium
Docs Contact:
Priority: urgent
Version: 4.8
CC: cww, emcnabb, jwest, peterm, tao, vgoyal
Target Milestone: rc
Keywords: Regression
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Some x86 64-bit systems may hang when booting with the 'noapic' debug kernel parameter.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-06-14 16:57:58 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Bug Depends On:    
Bug Blocks: 583726    

Description Issue Tracker 2009-09-14 11:26:31 EDT
Escalated to Bugzilla from IssueTracker
Comment 1 Issue Tracker 2009-09-14 11:26:33 EDT
Event posted on 08-25-2009 04:09pm EDT by ljm

We are seeing hangs on at least two of our
Sun Fire X2200-M2 servers when we try to boot
up the new 2.6.9-89.0.9.ELsmp 64-bit kernel.
(The same thing happens with the "hot fix" kernel,
2.6.9-89.0.8.ELsmp).

The last thing we see at the console before the hang
is:

...
11:58:27 Linux agpgart interface v0.100 (c) Dave Jones
11:58:27 serio: i8042 AUX port at 0x60,0x64 irq 12
11:58:27 serio: i8042 KBD port at 0x60,0x64 irq 1
11:58:27 Serial: 8250/16550 driver $Revision: 1.90 $ 68 ports, IRQ sharing enabled

If we boot into the 2.6.9-78.0.17.ELsmp kernel, the
next few lines we see are:

11:51:39 ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
11:51:39 ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
11:51:40 RAMDISK driver initialized: 16 RAM disks of 16384K size 1024 blocksize
...

I've checked one of these same model servers running
the 32-bit kernel and it seems to be fine.

Len
This event sent from IssueTracker by kbaxley  [SLAC]
 issue 334663
Comment 2 Issue Tracker 2009-09-14 11:26:35 EDT
Event posted on 08-25-2009 04:51pm EDT by kbaxley

Len,

Also, would it be possible for you guys to try and test one or any of the
4.8 kernels prior to the 2.6.9-89.0.9 version to ensure that we didn't
introduce some sort of new issue?

The versions of the kernels that we released for 4.8 are:

kernel-2.6.9-89
kernel-2.6.9-89.0.3
kernel-2.6.9-89.0.7

I'll keep digging around on my end.

Thanks.


Comment 3 Issue Tracker 2009-09-14 11:26:36 EDT
Event posted on 08-25-2009 09:41pm EDT by ljm

I have tried kernel 2.6.9-89.ELsmp on the problem
machine, atl-prod07: it hangs at the same point.
So it does seem like it's probably something in the
89 series kernel and not either of the latest security
patches.

It also seems there must be something different between
atl-prod07, which hangs, and glastlnx, which seems to
be fine (at least with 2.6.9-89.0.8.ELsmp).


Comment 4 Issue Tracker 2009-09-14 11:26:38 EDT
Event posted on 08-26-2009 11:31am EDT by kbaxley

Hi Len,

Looking over the sysreports, I saw these differences in the grub.conf
between the two machines:


Here is the grub config from atl-prod07, which is hanging with the -89
kernels.  Note the "noapic" entry:

title Red Hat Enterprise Linux WS (2.6.9-89.0.9.ELsmp)
	root (hd0,0)
	kernel /boot/vmlinuz-2.6.9-89.0.9.ELsmp ro root=LABEL=/ console=tty0 console=ttyS0,9600 selinux=0 nmi_watchdog=1 noapic
	initrd /boot/initrd-2.6.9-89.0.9.ELsmp.img


Here is the grub config from glastlnx16, which seems to boot up without
issue:

title Red Hat Enterprise Linux WS (2.6.9-89.0.9.ELsmp)
	root (hd0,0)
	kernel /boot/vmlinuz-2.6.9-89.0.9.ELsmp ro root=LABEL=/1 console=tty0 console=ttyS0,9600 nmi_watchdog=1
	initrd /boot/initrd-2.6.9-89.0.9.ELsmp.img


When you get the opportunity, can you try booting up atl-prod07 without
the "noapic" on the kernel line?
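A comparison like the one above can be sketched in a few lines of Python. This is only an illustration: the two kernel lines are copied from the grub.conf excerpts in this comment, and the helper name kernel_options is hypothetical, not part of any Red Hat tool.

```python
def kernel_options(kernel_line):
    """Return the set of boot options on a grub 'kernel' line.

    The first two tokens (the 'kernel' keyword and the image path)
    are skipped; everything after them is treated as an option.
    """
    tokens = kernel_line.split()
    return set(tokens[2:])


# Kernel lines as quoted from the two sysreports (wrapped lines rejoined).
hanging = ("kernel /boot/vmlinuz-2.6.9-89.0.9.ELsmp ro root=LABEL=/ console=tty0 "
           "console=ttyS0,9600 selinux=0 nmi_watchdog=1 noapic")
booting = ("kernel /boot/vmlinuz-2.6.9-89.0.9.ELsmp ro root=LABEL=/1 console=tty0 "
           "console=ttyS0,9600 nmi_watchdog=1")

# Options present on one machine but not on the other.
only_hanging = sorted(kernel_options(hanging) - kernel_options(booting))
only_booting = sorted(kernel_options(booting) - kernel_options(hanging))

print("only on atl-prod07:", only_hanging)
print("only on glastlnx16:", only_booting)
```

Besides the differing root label, the options unique to atl-prod07 are selinux=0 and noapic, which is why noapic is the first candidate to test.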


Internal Status set to 'Waiting on Customer'
Status set to: Waiting on Client

Comment 5 Issue Tracker 2009-09-14 11:26:39 EDT
Event posted on 08-26-2009 12:57pm EDT by ljm

I just booted atl-prod07 without the "noapic" on the kernel line
and it came up just fine.  I don't know why that option is
present on this machine -- I'll try to find out.


Ticket type changed from 'Problem' to ''

Comment 7 Issue Tracker 2009-09-14 11:26:43 EDT
Event posted on 09-14-2009 11:23am EDT by kbaxley

Problem: 2.6.9-89.0.9.ELsmp 64-bit kernel hanging on bootup on Sun Fire
X2200-M2 servers with "noapic" in the grub kernel command line.

We also tested the original 2.6.9-89 kernel for RHEL 4.8 and found that
bootup hangs in the same place, so this appears to have been introduced
in the 4.8 kernel.

The systems that exhibited this problem are Sun Fire X2200-M2 systems:

Two Dual-Core AMD Opteron(tm) Processor 2218 (For a total of 4 cores)
stepping	: 3
cpu MHz		: 2613.436
cache size	: 1024 KB

8GB of memory


How reproducible: Always

Steps to reproduce:  Install the 2.6.9-89.0.9.ELsmp 64-bit kernel and
ensure grub.conf has the following on the kernel command line:

kernel /boot/vmlinuz-2.6.9-89.0.9.ELsmp ro root=LABEL=/ console=tty0 console=ttyS0,9600 selinux=0 nmi_watchdog=1 noapic

Actual results:
System "hangs" at bootup.  The last thing we see at the console before
the hang is:

...
11:58:27 Linux agpgart interface v0.100 (c) Dave Jones
11:58:27 serio: i8042 AUX port at 0x60,0x64 irq 12
11:58:27 serio: i8042 KBD port at 0x60,0x64 irq 1
11:58:27 Serial: 8250/16550 driver $Revision: 1.90 $ 68 ports, IRQ sharing enabled

Expected results:  
System shouldn't hang at bootup with the 'noapic' option

Workaround:
The customer can boot the system by either:
1) Booting the previously installed 2.6.9-78.0.17.ELsmp kernel
with the 'noapic' option

OR

2) Booting the 2.6.9-89.* kernels after removing 'noapic'
from the grub.conf
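
For the second workaround, removing the option by hand is straightforward. The following Python sketch shows one possible way to strip it from every kernel line of a grub.conf. It is an assumption-laden illustration: the function name drop_option is hypothetical, the sample text is the atl-prod07 entry quoted earlier in this bug, and the code works on a string only and does not touch any file on disk.

```python
def drop_option(grub_text, option="noapic"):
    """Return grub.conf text with `option` removed from every 'kernel' line.

    Other lines (title, root, initrd, comments) are passed through unchanged.
    """
    out = []
    for line in grub_text.splitlines():
        stripped = line.lstrip()
        if stripped.startswith("kernel "):
            # Preserve the original indentation of the kernel line.
            indent = line[:len(line) - len(stripped)]
            tokens = [t for t in stripped.split() if t != option]
            line = indent + " ".join(tokens)
        out.append(line)
    return "\n".join(out)


# Sample entry taken from the atl-prod07 grub.conf quoted in this bug.
sample = (
    "title Red Hat Enterprise Linux WS (2.6.9-89.0.9.ELsmp)\n"
    "\troot (hd0,0)\n"
    "\tkernel /boot/vmlinuz-2.6.9-89.0.9.ELsmp ro root=LABEL=/ console=tty0 "
    "console=ttyS0,9600 selinux=0 nmi_watchdog=1 noapic\n"
    "\tinitrd /boot/initrd-2.6.9-89.0.9.ELsmp.img"
)

fixed = drop_option(sample)
print(fixed)
```

Anyone adapting this to a real system should keep a backup copy of grub.conf before writing any changes back.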

Priority set to: 5
Ticket type set to: 'Problem'

Comment 10 Issue Tracker 2010-04-15 11:39:25 EDT
Event posted on 04-15-2010 11:39am EDT by kbaxley

Fine with me, as long as we can ensure that it's documented.  Any chance
this will eventually get fixed in a post-4.9 kernel update?


Comment 12 Peter Martuccelli 2010-04-19 11:35:39 EDT
Devel ACK set only for adding a release note, no code changes.
Comment 13 RHEL Product and Program Management 2010-04-20 09:18:39 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 16 Linda Wang 2010-11-19 13:53:42 EST
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Some x86 64-bit systems may hang when booting with the 'noapic' debug kernel parameter.