Bug 177585

Summary: SMP Opteron box panics on boot with 2.6.14* kernels
Product: [Fedora] Fedora
Reporter: Jeff Kuehn <kuehn>
Component: kernel
Assignee: Dave Jones <davej>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Brian Brock <bbrock>
Severity: high
Docs Contact:
Priority: medium
Version: 5
CC: pfrields, wtogami
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2006-11-24 23:01:46 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description                                  Flags
example of successful single processor boot  none
example of successful SMP boot               none
2.6.14-1.1637_FC4 panic example              none
2.6.14-1.1644_FC4 panic example              none
2.6.14-1.1653_FC4 panic example              none
2.6.14-1.1656_FC4 panic example              none
2.6.14-1.1637_FC4smp panic example 1         none
2.6.14-1.1637_FC4smp panic example 2         none
2.6.14-1.1644_FC4smp panic example 1         none
2.6.14-1.1644_FC4smp panic example 2         none
2.6.14-1.1653_FC4smp panic example 1         none
2.6.14-1.1653_FC4smp panic example 2         none
2.6.14-1.1656_FC4smp panic example 1         none
2.6.14-1.1656_FC4smp panic example 2         none
2.6.15 boot logs for 13 attempts (10 panics, 3 survive to login prompt) (UP and SMP)  none
2.6.15 boot logs for 13 attempts (10 panics, 3 survive to login prompt) (UP and SMP)  none

Description Jeff Kuehn 2006-01-11 22:10:47 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050922 Fedora/1.0.7-1.1.fc3 Firefox/1.0.7

Description of problem:
2.6.14 series of FC4 kernels (1637, 1644, 1653, 1656, single processor and SMP) all panic during the boot process. Previous FC4 kernels (2.6.13 and earlier) boot successfully and run stably.

The single processor kernels all panic in a similar way. The first lines of the panic message are: Kernel BUG at arch/x86_64/kernel/apic.c:333, invalid operand: 0000 [1].

The SMP panics don't show as clear a pattern.

I have (and will attach) the following bootlogs collected via a serial console:
-rw-r--r--  1 root root  11664 Jan 11 16:02 2.6.13-1.1532_FC4-1
-rw-r--r--  1 root root  19226 Jan 11 15:47 2.6.13-1.1532_FC4smp-1
-rw-r--r--  1 root root   6436 Jan 11 15:59 2.6.14-1.1637_FC4-1
-rw-r--r--  1 root root  41864 Jan 11 15:39 2.6.14-1.1637_FC4smp-1
-rw-r--r--  1 root root  40951 Jan 11 15:42 2.6.14-1.1637_FC4smp-2
-rw-r--r--  1 root root   6437 Jan 11 15:57 2.6.14-1.1644_FC4-1
-rw-r--r--  1 root root  43195 Jan 11 15:29 2.6.14-1.1644_FC4smp-1
-rw-r--r--  1 root root  18973 Jan 11 15:34 2.6.14-1.1644_FC4smp-2
-rw-r--r--  1 root root   6432 Jan 11 15:51 2.6.14-1.1653_FC4-1
-rw-r--r--  1 root root  18925 Jan 11 15:22 2.6.14-1.1653_FC4smp-1
-rw-r--r--  1 root root  37450 Jan 11 15:26 2.6.14-1.1653_FC4smp-2
-rw-r--r--  1 root root   6435 Jan 11 15:49 2.6.14-1.1656_FC4-1
-rw-r--r--  1 root root  39689 Jan 11 15:08 2.6.14-1.1656_FC4smp-1
-rw-r--r--  1 root root  41837 Jan 11 15:15 2.6.14-1.1656_FC4smp-2

The "-1" and "-2" refer to the first and second bootlogs for that kernel (to demonstrate inconsistency in the SMP crashes). Note that I'll include 2.6.13 logs which also show a successful boot in single cpu and smp mode.
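
For anyone comparing the attached bootlogs, the following is a minimal sketch of how the panic signatures could be pulled out and counted per log. It assumes the panic line has the shape quoted above ("Kernel BUG at <file>:<line>"). The function names and the sample log text are hypothetical and are not taken from the attachments.

```python
import re
from collections import Counter

# Assumed message shape, based on the panic line quoted in this report:
#   "Kernel BUG at arch/x86_64/kernel/apic.c:333"
BUG_RE = re.compile(r"Kernel BUG at (\S+):(\d+)")


def panic_signatures(log_text):
    """Return (source_file, line_number) for each BUG line found in a boot log."""
    return [(m.group(1), int(m.group(2))) for m in BUG_RE.finditer(log_text)]


def summarize(logs):
    """Count how many logs contain each distinct signature.

    `logs` maps a log name (for example "2.6.14-1.1637_FC4-1") to its text.
    """
    counts = Counter()
    for text in logs.values():
        # Use a set so a signature repeated within one log counts only once.
        for signature in set(panic_signatures(text)):
            counts[signature] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical excerpts; the real text would be read from the serial console logs.
    sample_logs = {
        "2.6.14-1.1637_FC4-1": "Kernel BUG at arch/x86_64/kernel/apic.c:333\n"
                               "invalid operand: 0000 [1]\n",
        "2.6.13-1.1532_FC4-1": "boot completed without a BUG line\n",
    }
    for signature, count in summarize(sample_logs).items():
        print(signature, count)
```

A shared signature across the single processor logs would support the "similar behavior" observation, while a spread of different signatures in the SMP logs would match the less clear pattern reported there.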




Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Boot a 2.6.14-1.*_FC4* kernel

Actual Results:  The kernel panics.

Expected Results:  No panic.

Additional info:

Not seeing the "attach file" option yet... I have 14 log files; write to me if you don't see them attached here.

Comment 1 Jeff Kuehn 2006-01-11 22:13:40 UTC
Created attachment 123082 [details]
example of successful single processor boot

Comment 2 Jeff Kuehn 2006-01-11 22:14:28 UTC
Created attachment 123083 [details]
example of successful SMP boot

Comment 3 Jeff Kuehn 2006-01-11 22:15:20 UTC
Created attachment 123084 [details]
2.6.14-1.1637_FC4 panic example

Comment 4 Jeff Kuehn 2006-01-11 22:15:59 UTC
Created attachment 123085 [details]
2.6.14-1.1644_FC4 panic example

Comment 5 Jeff Kuehn 2006-01-11 22:16:49 UTC
Created attachment 123086 [details]
2.6.14-1.1653_FC4 panic example

Comment 6 Jeff Kuehn 2006-01-11 22:17:37 UTC
Created attachment 123087 [details]
2.6.14-1.1656_FC4 panic example

Comment 7 Jeff Kuehn 2006-01-11 22:18:27 UTC
Created attachment 123088 [details]
2.6.14-1.1637_FC4smp panic example 1

Comment 8 Jeff Kuehn 2006-01-11 22:19:29 UTC
Created attachment 123089 [details]
2.6.14-1.1637_FC4smp panic example 2

Comment 9 Jeff Kuehn 2006-01-11 22:20:13 UTC
Created attachment 123090 [details]
2.6.14-1.1644_FC4smp panic example 1

Comment 10 Jeff Kuehn 2006-01-11 22:21:04 UTC
Created attachment 123091 [details]
2.6.14-1.1644_FC4smp panic example 2

Comment 11 Jeff Kuehn 2006-01-11 22:21:57 UTC
Created attachment 123092 [details]
2.6.14-1.1653_FC4smp panic example 1

Comment 12 Jeff Kuehn 2006-01-11 22:22:47 UTC
Created attachment 123093 [details]
2.6.14-1.1653_FC4smp panic example 2

Comment 13 Jeff Kuehn 2006-01-11 22:24:03 UTC
Created attachment 123094 [details]
2.6.14-1.1656_FC4smp panic example 1

Comment 14 Jeff Kuehn 2006-01-11 22:24:52 UTC
Created attachment 123095 [details]
2.6.14-1.1656_FC4smp panic example 2

Comment 15 Dave Jones 2006-01-12 04:35:39 UTC
Does the 2.6.15-based test kernel in updates-testing work?

Comment 16 Jeff Kuehn 2006-01-12 21:20:51 UTC
Several attempts to boot the 2.6.15 kernel from updates-testing resulted in the
following scoreboard:

2.6.15 single processor kernel:
  2 panics before reaching login prompt

2.6.15 SMP kernel:
  8 panics before reaching login prompt
  3 boot attempts made it all the way to a login prompt
    (one of these was left standing to check stability overnight)

Serial console captured all 13 boot attempts. Console logs will be added to the
attachment list as a tarball of 2.6.15 logs.
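
As a quick check of the scoreboard arithmetic, here is a small sketch that tallies the counts above. The zero in the single processor "login" slot is inferred from the total of 13 attempts rather than stated directly, and the variable and function names are illustrative only.

```python
# Counts taken from the scoreboard in this comment.
# UP "login" = 0 is inferred: 2 + 8 + 3 already accounts for all 13 attempts.
attempts = {
    "UP": {"panic": 2, "login": 0},
    "SMP": {"panic": 8, "login": 3},
}


def panic_rate(result):
    """Fraction of boot attempts that ended in a panic."""
    total = result["panic"] + result["login"]
    return result["panic"] / total


total_attempts = sum(r["panic"] + r["login"] for r in attempts.values())
total_panics = sum(r["panic"] for r in attempts.values())

if __name__ == "__main__":
    print("attempts:", total_attempts, "panics:", total_panics)
    for kernel, result in attempts.items():
        print(kernel, "panic rate: {:.0%}".format(panic_rate(result)))
```

With so few attempts these rates are only indicative, but they agree with the "10 panics, 3 survive to login prompt" description on the attached tarball.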

Comment 17 Jeff Kuehn 2006-01-12 21:25:08 UTC
Created attachment 123140 [details]
2.6.15 boot logs for 13 attempts (10 panics, 3 survive to login prompt) (UP and SMP)

Comment 18 Jeff Kuehn 2006-01-12 21:29:08 UTC
Created attachment 123141 [details]
2.6.15 boot logs for 13 attempts (10 panics, 3 survive to login prompt) (UP and SMP)

Comment 19 Dave Jones 2006-02-03 07:25:29 UTC
This is a mass-update to all currently open kernel bugs.

A new kernel update has been released (Version: 2.6.15-1.1830_FC4)
based upon a new upstream kernel release.

Please retest against this new kernel, as a large number of patches
go into each upstream release, possibly including changes that
may address this problem.

This bug has been placed in NEEDINFO_REPORTER state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks' time, it will be closed.

Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.

If this bug is a problem preventing you from installing the
release this version is filed against, please see bug 169613.

Thank you.


Comment 20 Jeff Kuehn 2006-02-03 15:16:03 UTC
Installed and tested 2.6.15-1.1830_FC4 on SMP Opteron box. This kernel still
crashes as demonstrated in the previous traces. Of the kernels installed on this
system:

vmlinuz-2.6.11-1.1369_FC4
vmlinuz-2.6.11-1.1369_FC4smp
vmlinuz-2.6.13-1.1526_FC4
vmlinuz-2.6.13-1.1526_FC4smp
vmlinuz-2.6.13-1.1532_FC4
vmlinuz-2.6.13-1.1532_FC4smp
vmlinuz-2.6.14-1.1637_FC4
vmlinuz-2.6.14-1.1637_FC4smp
vmlinuz-2.6.14-1.1644_FC4
vmlinuz-2.6.14-1.1644_FC4smp
vmlinuz-2.6.14-1.1653_FC4
vmlinuz-2.6.14-1.1653_FC4smp
vmlinuz-2.6.14-1.1656_FC4
vmlinuz-2.6.14-1.1656_FC4smp
vmlinuz-2.6.15-1.1823_FC4
vmlinuz-2.6.15-1.1823_FC4smp
vmlinuz-2.6.15-1.1824_FC4
vmlinuz-2.6.15-1.1824_FC4smp
vmlinuz-2.6.15-1.1830_FC4
vmlinuz-2.6.15-1.1830_FC4smp

the most recent working kernel still appears to be 2.6.13-1.1532_FC4smp.

Comment 21 Dave Jones 2006-09-17 03:17:18 UTC
[This comment added as part of a mass-update to all open FC4 kernel bugs]

FC4 has now transitioned to the Fedora legacy project, which will continue to
release security related updates for the kernel.  As this bug is not security
related, it is unlikely to be fixed in an update for FC4, and has been migrated
to FC5.

Please retest with Fedora Core 5.

Thank you.


Comment 22 Dave Jones 2006-10-17 00:29:53 UTC
A new kernel update has been released (Version: 2.6.18-1.2200.fc5)
based upon a new upstream kernel release.

Please retest against this new kernel, as a large number of patches
go into each upstream release, possibly including changes that
may address this problem.

This bug has been placed in NEEDINFO state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks' time, it will be closed.

Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.

In the last few updates, some users upgrading from FC4->FC5
have reported that installing a kernel update has left their
systems unbootable. If you have been affected by this problem
please check you only have one version of device-mapper & lvm2
installed.  See bug 207474 for further details.

If this bug is a problem preventing you from installing the
release this version is filed against, please see bug 169613.

If this bug has been fixed, but you are now experiencing a different
problem, please file a separate bug for the new problem.

Thank you.

Comment 23 Dave Jones 2006-11-24 23:01:46 UTC
This bug has been mass-closed along with all other bugs that
have been in NEEDINFO state for several months.

Due to the large volume of inactive bugs in bugzilla, this
is the only method we have of cleaning out stale bug reports
where the reporter has disappeared.

If you can reproduce this bug after installing all the
current updates, please reopen this bug.

If you are not the reporter, you can add a comment requesting
it be reopened, and someone will get to it asap.

Thank you.