Bug 596517
| Summary: | RHEL6 Install on ibm-x3950m2-0[12].ovirt.rhts.eng.bos.redhat.com fail for no apparent reason around 8 seconds after anaconda gets network up | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Barry Marson <bmarson> |
| Component: | kernel | Assignee: | James Takahashi (IBM) <nobody+PNT0273897> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Red Hat Kernel QE team <kernel-qe> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 6.0 | CC: | borgan, jburke, jolsa, knoel, mjenner, nhorman, peterm, tburke, tgraf |
| Target Milestone: | rc | Keywords: | Reopened |
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2010-07-28 20:43:20 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Barry Marson
2010-05-26 21:04:07 UTC
Beta 1 fails as well. Barry

My guess is something is occurring during storage detection that's causing the system to bail. Based on the messages you included in the initial comment, it sounds like you are able to get into stage 2 of anaconda. Before clicking next (or advancing past the welcome screen), can you do the following:
1) ssh in to the system and tail -f /tmp/storage.log
2) ssh in to the system and tail -f /tmp/program.log
3) Advance the installer to the next screen
Hopefully we'll see some error output in the storage.log and/or program.log that points us to the root cause.

I don't think I get to stage 2. If I do, there isn't enough time to do anything. I have about 4-8 seconds to get into the system after a ping starts responding. After that the system resets. In other words, not possible. Btw, I blacklisted the FC HBA driver lpfc to no avail. In fact, whenever I blacklisted anything, the time to reset was quicker (closer to 4 sec). The only other storage would be the local LSI storage, and that's what I need for the OS. Barry

Latest attempts through beaker with Kickstart metadata = "manual" and vnc added to the Kernel Options install line have shown that when I get the language option at the console, waiting a minute shows it resets all by itself. So something is going on asynchronously .. module probing ?? Barry

You are definitely getting to stage 2. When you see this message:
Running anaconda 13.21.45, the Red Hat Enterprise Linux system installer - please wait.
You have entered stage 2. Module loading occurs during stage 1. Are you running the text mode or graphical interface for the installer?

Well, stage two causes a machine reset or whatever in seconds ... See comment #5 for the args. This is text mode with a request for vnc once it can get started. But I never even get to select language in manual mode. It's already in a hardware reset phase. Again, comment #5 says what has been tried.
There's nothing more I can provide that you can't do yourself on these boxes. console -M console.lab.bos.redhat.com HOSTNAME Barry

I wonder if netconsole (http://lxr.linux.no/#linux+v2.6.34/Documentation/networking/netconsole.txt) might be useful for debugging this? It doesn't look like the module gets loaded automatically if you pass the parameter. We'll have to do that early on in anaconda if we want to make use of it. Stand by.

Without any additional debugging information, it's hard to determine what is happening. Given that the failure happens very early, our guess is a kernel failure of some variety (module loading problem, etc).

*** This bug has been marked as a duplicate of bug 607650 ***

The bug log does not mention kvm anywhere. Is this in fact a guest install? In fact, comment #7 means it isn't kvm for sure. kvm consoles are through the host, not console.something.

This never ever had anything to do with virt or kvm. The problem is the attached FC storage. Something about it makes stage 1 install fail. It was disconnected and the machine works now, albeit without the storage needed for certain virt testing. We are trying to get that storage reconnected to find out if there is still an issue. Barry

Right, so this isn't a dup of the infamous #607650 as comment #13 suggests.

I have verified that with the FC storage attached, RHEL6.0-20100722.0 installs. So my issue is resolved. Barry

Based on Barry's comment I am going to close this bug. |
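
The log-capture steps requested earlier in this bug (tail -f on /tmp/storage.log and /tmp/program.log) are hard to do by hand when the machine resets 4-8 seconds after the network comes up. The following is only a minimal sketch of how that could be automated from another host. It assumes sshd is reachable inside the installer and that root login works without a prompt, neither of which is confirmed in this bug. The function names, the output file name, and the polling interval are hypothetical.

```python
import socket
import subprocess
import time


def wait_for_port(host, port=22, timeout=120.0, interval=0.2):
    """Poll until a TCP connection to host:port succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Port not open yet (or host still down); retry shortly.
            time.sleep(interval)
    return False


def build_tail_command(host, user="root",
                       logs=("/tmp/storage.log", "/tmp/program.log")):
    """Build an ssh command that follows the installer logs named in this bug."""
    return ["ssh",
            "-o", "ConnectTimeout=2",
            "-o", "StrictHostKeyChecking=no",
            f"{user}@{host}",
            "tail", "-f", *logs]


def capture_logs(host, outfile="anaconda-logs.txt"):
    """Wait for sshd (assumed to be running), then stream the logs to a
    local file until the connection drops, e.g. when the machine resets."""
    if not wait_for_port(host):
        return False
    with open(outfile, "wb") as out:
        subprocess.run(build_tail_command(host),
                       stdout=out, stderr=subprocess.STDOUT)
    return True
```

Calling capture_logs("HOSTNAME") just before starting the install would keep whatever the two logs contained at the moment of the reset, if the ssh connection can be established in time.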