Bug 199793 - hang at or near nash startup at boot
Summary: hang at or near nash startup at boot
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 5
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Brian Brock
URL:
Whiteboard: MassClosed
Depends On:
Blocks:
Reported: 2006-07-22 02:51 UTC by Tom Horsley
Modified: 2008-01-20 04:40 UTC
CC List: 2 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-01-20 04:40:55 UTC
Type: ---
Embargoed:


Description Tom Horsley 2006-07-22 02:51:00 UTC
Description of problem:
The original 2.6.15-1.2054_FC5smp kernel from the FC5 DVD works
fine on this machine I just genned up, but with the kernel I got when
I downloaded updates, the boot hangs after it prints the
"starting redhat nash" line, followed by a line that says
"ata3: disabling port".

The 2054 kernel prints those same lines, but then goes on to
INIT, booting system, etc...

Version-Release number of selected component (if applicable):
kernel-smp 2.6.17-1.2157_FC5 i686

How reproducible:

Every time I try to boot 2157

Steps to Reproduce:
1. boot system
  
Actual results:

hangs as described above.

Expected results:

continue to boot and come all the way up.

Additional info:

This isn't a hard hang, I can type and echo characters on the console
(but they don't do anything other than echo). I can also reboot
via Ctrl-Alt-Del.

The machine details may be of interest here since this is a bit of a
peculiar setup. The machine I was using died, so I moved the disks
to this machine temporarily so I can get to my data. The setup I currently
have is a hyperthreaded 2.8GHz Pentium 4 in an ASUS P4C800-E motherboard
with 4 SATA ports in use. The two SATA ports on the Intel controller
hold a Windows XP RAID-0, which I try to ignore as much as possible
when running Linux. Linux lives on the two disks attached to the onboard
Promise RAID controller, which are not configured as a RAID, just as
two independent physical disks.

I don't know if the nash stuff is desperately attempting to configure
the Windows disks as a RAID or what (if there was a way to turn those ports
off in the BIOS, I'd do it, but I can't seem to find one).

For now, I have simply uninstalled the 2157 kernel and am sticking with
the base 2054 kernel.

This didn't look exactly like any of the other nash problems I could
find by searching Bugzilla, since they all seem to be hangs where folks
have RAID or LVM set up that they want to be recognized, whereas I
just want the two ordinary Linux disks I have to be recognized and the
Windows NTFS partitions to be ignored (I don't mention them in the fstab
at all).

Comment 1 Tom Horsley 2006-07-22 11:36:43 UTC
I noticed that update offered me kernel 2139 today instead of 2157, so
I gave it a whirl, and it spews a lot more stuff on my screen, but still
hangs after it says "device-mapper initialized".

Just prior to that there are messages about it recognizing the sdc and sdd
disks, which are the Windows XP RAID disks I'd just as soon it didn't know
about.

Is there any way to tell the boot code: "No, sdc and sdd aren't there - pretend
you didn't see them"?


Comment 2 Tom Horsley 2006-07-23 00:31:26 UTC
I guess 2139 was from an out of date mirror. I tried some more experiments
today, and I was back to kernel 2157 again, and finally got it to boot
with a few errors printed from some of the init scripts later in the boot,
but everything seems to be working. What I did was modify the initrd
image init script. The original script that was installed along with the
kernel contained these lines that appear to be related to the Windows
RAID disks:

rmparts sdd
rmparts sdc
dm create isw_deichhibbi_RAID_Volume1 0 312602112 striped 2 128 8:32 0 8:48 0
dm partadd isw_deichhibbi_RAID_Volume1

Adding some additional echo commands revealed that the hang happens on the
partadd command, so I removed just the partadd command, and no more hang.
It gets all the way through the boot and everything seems to be working the
way I want it to.
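
For readers following along, the "dm create" line quoted above can be
decoded by hand. The sketch below is only an illustration, not part of
nash or mkinitrd: the helper names are made up, and it assumes the usual
conventions that device-mapper tables count 512-byte sectors, that the
striped target takes a stripe count and a chunk size followed by
device/offset pairs, and that block major 8 carries sd disks with 16
minors each. Under those assumptions the two devices in the table come
out as the whole disks sdc and sdd.

```python
# Illustrative decoder for the striped device-mapper table in the init script.
# All helper names here are hypothetical; nothing below is nash code.

SECTOR_BYTES = 512      # assumption: dm tables are expressed in 512-byte sectors
SD_MAJOR = 8            # assumption: block major 8 holds the first 16 sd disks
MINORS_PER_DISK = 16    # assumption: each sd disk owns 16 consecutive minors


def sd_name(major, minor):
    """Map a major:minor pair on major 8 to a disk or partition name."""
    if major != SD_MAJOR:
        raise ValueError("not an sd device: %d:%d" % (major, minor))
    disk, part = divmod(minor, MINORS_PER_DISK)
    name = "sd" + chr(ord("a") + disk)
    return name if part == 0 else name + str(part)


def parse_striped(table):
    """Parse '<start> <length> striped <count> <chunk> <dev> <offset> ...'."""
    fields = table.split()
    start, length = int(fields[0]), int(fields[1])
    if fields[2] != "striped":
        raise ValueError("expected a striped target, got %r" % fields[2])
    count, chunk = int(fields[3]), int(fields[4])
    pairs = fields[5:]
    if len(pairs) != 2 * count:
        raise ValueError("expected %d device/offset pairs" % count)
    devices = []
    for i in range(count):
        major, minor = (int(x) for x in pairs[2 * i].split(":"))
        devices.append((sd_name(major, minor), int(pairs[2 * i + 1])))
    return {
        "start_sector": start,
        "size_bytes": length * SECTOR_BYTES,
        "chunk_bytes": chunk * SECTOR_BYTES,
        "devices": devices,
    }


info = parse_striped("0 312602112 striped 2 128 8:32 0 8:48 0")
print(info)
```

This matches the earlier observation that the kernel recognizes sdc and
sdd just before the hang: the table stripes across exactly those two
disks.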

I had previously tried removing more stuff and always got kernel panics
(but that may also have been because I wasn't building the initrd image
exactly correctly). Anyway, the minimal change seems to be to remove
partadd and ignore any errors that happen later in the boot related
to device mapper stuff.
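
The edit itself is small enough to sketch. The function below is a
hypothetical helper (the name remove_partadd is invented here) that drops
only the "dm partadd" lines from the text of an init script and reports
what it removed. Unpacking the initrd image and repacking it afterwards
are not shown, and the sample text is just the four lines quoted in this
comment.

```python
# Hypothetical helper: strip "dm partadd" lines from an initrd init script.
# This mirrors the manual edit described above; it is not a mkinitrd feature.


def remove_partadd(init_text):
    """Return (new_text, removed_lines) with every 'dm partadd' line dropped."""
    kept = []
    removed = []
    for line in init_text.splitlines(keepends=True):
        words = line.split()
        if words[:2] == ["dm", "partadd"]:
            removed.append(line.strip())
        else:
            kept.append(line)
    return "".join(kept), removed


sample = (
    "rmparts sdd\n"
    "rmparts sdc\n"
    "dm create isw_deichhibbi_RAID_Volume1 0 312602112 striped 2 128 "
    "8:32 0 8:48 0\n"
    "dm partadd isw_deichhibbi_RAID_Volume1\n"
)

new_text, removed = remove_partadd(sample)
print(new_text, end="")
print("removed:", removed)
```

Leaving the rmparts and "dm create" lines in place keeps the change as
small as the one described here; removing more than that is what led to
the kernel panics mentioned above.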


Comment 3 Roland Roberts 2006-09-18 18:31:14 UTC
I am having a similar problem with FC5, 2.6.17-1.2187_FC5smp but the point of
commonality seems to be an ASUS motherboard, the ASUS P4P800S with 2.8GHz P4. 
If I enable hyperthreading, the machine locks up.  The exact point seems to be
more-or-less random, but with previous kernels, it was typically after the boot
when I've already had X running.  With the most recent kernel (above), it locks
when switching to runlevel 5.

I can try to repeat this, booting only to runlevel 3 to see if I can actually
get there.  It is unclear to me whether this is X related or not, but running
with hyperthreading disabled definitely gets me running.

Comment 4 Dave Jones 2006-10-16 19:12:44 UTC
A new kernel update has been released (Version: 2.6.18-1.2200.fc5)
based upon a new upstream kernel release.

Please retest against this new kernel, as a large number of patches
go into each upstream release, possibly including changes that
may address this problem.

This bug has been placed in NEEDINFO state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks' time, it will be closed.

Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.

In the last few updates, some users upgrading from FC4->FC5
have reported that installing a kernel update has left their
systems unbootable. If you have been affected by this problem
please check you only have one version of device-mapper & lvm2
installed.  See bug 207474 for further details.

If this bug is a problem preventing you from installing the
release this version is filed against, please see bug 169613.

If this bug has been fixed, but you are now experiencing a different
problem, please file a separate bug for the new problem.

Thank you.

Comment 5 Jon Stanley 2008-01-20 04:40:55 UTC
(this is a mass-close to kernel bugs in NEEDINFO state)

As indicated previously there has been no update on the progress of this bug
therefore I am closing it as INSUFFICIENT_DATA. Please re-open if the issue
still occurs for you and I will try to assist in its resolution. Thank you for
taking the time to report the initial bug.

If you believe that this bug was closed in error, please feel free to reopen
this bug.

