Bug 693578

Summary: NMI error message stops install and the host
Product: [Fedora] Fedora
Reporter: Mike <rhb>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: unspecified
Version: 14
CC: dzickus, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-08-29 15:14:41 UTC
Type: ---

Description Mike 2011-04-05 00:12:43 UTC
Description of problem:
Got the following error shortly after starting the install from DVD:

Uhhuh. NMI received for unknown reason a1 on CPU 0.
You have some hardware problem, likely on the PCI bus.
Dazed and confused, but trying to continue


Version-Release number of selected component (if applicable):
Fedora 14
Also occurred when I tried Fedora 13

How reproducible:

Steps to Reproduce:
1. Config hardware as listed
2. Boot to install
  
Actual results:
Install and host lock up

Expected results:
Fedora gets installed

Additional info:
Hardware is Dell PowerEdge SC420 with new Adaptec 2405

The Dell has been a functioning server for years, from about Fedora 8 onward. The Adaptec was added for the Fedora 14 rebuild.

The array and card worked fine while building the array. The problem occurred every time during the install, mostly very early on (before the switch to graphics mode), but it did once get to the first graphics screen.

The Adaptec was eventually removed and the install went fine.

I cannot capture any debugging output since it is a DVD install and the host locks up.

There are a few mentions on the web about similar errors, including some in bugzilla here. But I did not find any real solutions other than a few mentioning a kernel upgrade eventually worked.

Comment 1 Don Zickus 2011-04-07 14:16:19 UTC
This machine is a Pentium 4, and it is a known issue that it will produce unknown NMIs. Booting with nmi_watchdog=0 on the kernel command line will make those messages disappear until we properly fix them.

Cheers,
Don
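
A minimal sketch for confirming that the workaround from Comment 1 actually reached the kernel. It assumes a Linux system where `/proc/cmdline` exposes the boot parameters; the function names `nmi_watchdog_disabled` and `read_cmdline` are hypothetical and chosen for illustration only.

```python
def nmi_watchdog_disabled(cmdline: str) -> bool:
    """Return True if 'nmi_watchdog=0' appears as a whole token in cmdline."""
    return "nmi_watchdog=0" in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the parameters the running kernel was booted with (Linux only)."""
    with open(path) as f:
        return f.read()


# Example with a made-up command line; on a real host, pass read_cmdline().
sample = "ro root=/dev/sda1 nmi_watchdog=0 quiet"
print(nmi_watchdog_disabled(sample))
```

The check only looks for the exact token, so a line containing `nmi_watchdog=1` or no `nmi_watchdog` option at all is reported as not disabled.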

Comment 2 Mike 2011-04-07 17:30:38 UTC
I can offer some debugging time if there is something you need to find out from this. Reproducing is as easy as putting the card in and running setup.

As mentioned, I am doing an install from a DVD. I assume there is a way to configure a USB flash drive to do the install (I think I saw a page on the Fedora docs site; I'll try to find it). There used to be a way to get the CLI from the install, but I did not see it with Fedora 14. Without one of those, I am not sure how I would set that option when the first thing the installer does is lock up.

After the server is installed, is that setting automatically copied to the installed server? As I recall, there is also a GRUB settings install page, perhaps it goes there? Otherwise, I have doubts that it will be able to boot.
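
On the question of the installed system: this thread does not say whether the installer copies the option over. If it does not, one possible approach is to append it to the kernel lines of the boot loader configuration by hand. The sketch below assumes a GRUB legacy style config in which each boot entry has a line beginning with `kernel`; the function name `add_kernel_option` and the sample entry are hypothetical, and it works on text only, so nothing on disk is changed.

```python
def add_kernel_option(config_text: str, option: str) -> str:
    """Append option to every 'kernel' line that does not already carry it."""
    out = []
    for line in config_text.splitlines():
        words = line.split()
        # Only touch lines whose first word is 'kernel'; skip comments etc.
        if words[:1] == ["kernel"] and option not in words:
            line = line.rstrip() + " " + option
        out.append(line)
    return "\n".join(out) + "\n"


# Made-up boot entry for illustration; real file names will differ.
sample = (
    "title Fedora\n"
    "\troot (hd0,0)\n"
    "\tkernel /vmlinuz-example ro root=/dev/sda1\n"
    "\tinitrd /initrd-example.img\n"
)
print(add_kernel_option(sample, "nmi_watchdog=0"))
```

Running the function a second time on its own output leaves the text unchanged, so it is safe to apply more than once.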

Comment 3 Don Zickus 2011-04-07 18:33:29 UTC
Don't worry about reproducing it; a bunch of us can reproduce it. We are just trying to understand why the hardware is sending duplicate NMIs, which cause this warning.

I haven't installed Fedora in a while, but I thought that when it booted it gave the option of either installing or booting in rescue mode. In that menu, I thought you could press the <Tab> key to get a prompt and type in extra kernel options like 'nmi_watchdog=0'. We use that method when people are having trouble installing on newer hardware and we have to work around limitations of the newer network or storage cards.

Is that something you see when you boot the DVD?

Cheers,
Don

Comment 4 Mike 2011-04-08 00:26:23 UTC
I will check it out and see. There used to be an actual option in the menu to drop to the CLI and then do "install" or something. I never used it and then at some point it disappeared. That's evolution for you, I guess.

I'll see what I can figure out. Thanks for the help.

Comment 5 Josh Boyer 2011-08-29 15:14:41 UTC
The F14 install ISOs are never respun, so unfortunately there isn't much we can do about this bug. If this problem still exists on the F16 ISOs, please open a bug against that version.