Bug 76699 - Workstation upgraded (7.2->8.0) hangs during boot
Status: CLOSED CURRENTRELEASE
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 8.0
Hardware: athlon Linux
Priority: high  Severity: high
Assigned To: Arjan van de Ven
QA Contact: Brock Organ
Reported: 2002-10-24 20:19 EDT by John
Modified: 2008-08-01 12:22 EDT

Last Closed: 2004-09-30 11:40:07 EDT

Attachments
Output of /usr/sbin/dmidecode using vmlinuz-2.4.18-17.8.0smp (3.35 KB, text/plain)
2002-10-25 16:05 EDT, John

Description John 2002-10-24 20:19:36 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux 2.4.9-31 i686; en-US; 0.7) Gecko/20010316

Description of problem:
A system running Red Hat 7.2 was upgraded to Red Hat 8.0.
We took the simplest path during the upgrade, and chose
default options.  No errors were reported.  However, after
it loads the kernel it hangs soon after the root filesystem
is mounted.  We can find no problems with the filesystems or
the partitions.

The system is a dual Athlon MP 1900+ workstation with a Tyan
Tiger S2466 motherboard.  It has 3.5 GB of the appropriate
registered RAM.  The video card is a Radeon VE.  The computer
has run fine under Red Hat 7.2 for a little over half a year.

The version of Red Hat 7.2 on the system was stock, w/o hand-
replaced packages.  It had all of the errata RPMs applied,
except for one or two of the kernel updates.  (The latest
kernel was the Athlon SMP 2.4.9-31, we think.)

We decided to perform the update after a successful upgrade
from 7.3 to 8.0 on a very similar machine.  The other machine
differs in that it has slightly slower CPUs, 1GB of ram, and
an S2460 motherboard rather than an S2466.  That machine has
encountered no problems since the upgrade.

The install on the problematic machine also appeared to go
smoothly, and no errors were reported.  We took the simplest
path during the upgrade, and chose default options.

When we restarted the machine, though, it locked up during
boot.  Grub started the kernel, and it progressed until the
start of the init scripts.  It froze immediately after the
message "INIT: version 2.84 booting."

When booting from a rescue floppy, the boot proceeds to the
same spot, except that after the "INIT" message, it gives
three error messages:

attempt to access beyond end of device
03:01: rw=0, want=918552944, limit=8191984
attempt to access beyond end of device
03:01: rw=0, want=738197508, limit=8191984
attempt to access beyond end of device
03:01: rw=0, want=8913264, limit=8191984

(From what we can tell, the 03:01 indicates the major and
minor numbers of /dev/hda1, the "/" partition.)
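
To make the three messages easier to compare, here is a minimal Python sketch
that splits each one into its fields and computes how far past the end of the
device each request falls.  It assumes the "03:01" prefix is major:minor in
two-digit hexadecimal, and that "want" and "limit" are in the same block units;
the limit of 8191984 matches the block count fdisk shows for /dev/hda1.  The
function name parse_beyond_end is made up for this illustration.

```python
import re

# Matches lines of the form "03:01: rw=0, want=918552944, limit=8191984".
# Assumption: the device prefix is major:minor in two-digit hexadecimal.
LINE_RE = re.compile(
    r"(?P<major>[0-9a-f]{2}):(?P<minor>[0-9a-f]{2}): "
    r"rw=(?P<rw>\d+), want=(?P<want>\d+), limit=(?P<limit>\d+)"
)


def parse_beyond_end(line):
    """Return the fields of one message as a dict, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    want = int(m.group("want"))
    limit = int(m.group("limit"))
    return {
        "major": int(m.group("major"), 16),
        "minor": int(m.group("minor"), 16),
        "want": want,
        "limit": limit,
        "excess": want - limit,  # how far past the end of the device
    }


# The three messages exactly as transcribed in the report.
MESSAGES = [
    "03:01: rw=0, want=918552944, limit=8191984",
    "03:01: rw=0, want=738197508, limit=8191984",
    "03:01: rw=0, want=8913264, limit=8191984",
]

for msg in MESSAGES:
    info = parse_beyond_end(msg)
    print("major=%(major)d minor=%(minor)d want=%(want)d excess=%(excess)d" % info)
```

The first two requests ask for blocks roughly a hundred times larger than the
partition itself, which looks more like garbage block numbers than an
off-by-a-few sizing error.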

Just for completeness, here are a few lines of the boot
screen from just prior to the "INIT: version 2.84
booting" line:

Mounting /proc filesystem
Creating block devices
Creating root device
Mounting root filesystem
kjournald starting.  Commit interval 5 seconds
EXT3-fs:  mounted filesystem with ordered data mode.
Freeing unused kernel memory: 212k freed


We've been able to check the filesystems by starting the
system using "linux rescue" from the install cds.  We have
run "fsck -f" on the "/" partition (which contains /boot),
which didn't report any errors.  We also ran "fsck -cfn"
on that partition, which didn't report any errors but did
inform us that the filesystem on the disk was changed.
(Does it always say that when searching for bad blocks even
if there isn't an error?)

We've used df and fdisk to double check the size of the
partitions, and they all look fine.  We're currently
looking for a way to test the integrity of the partition
tables.

Here's the layout of the drive, from fdisk:

Disk /dev/hda: 16 heads, 63 sectors, 193821 cylinders
Units = cylinders of 1008 * 512 bytes

   Device  Boot   Start      End     Blocks   Id  System
/dev/hda1   *         1    16254    8191984+  83  Linux
/dev/hda2         16255    24381    4096008   83  Linux
/dev/hda3         24382    26413   10240128   82  Linux swap
/dev/hda4         26414   193821   84373632    f  Win95 Ext'd (LBA)
/dev/hda5         26414   193821   84373600+  83  Linux
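
One way to sanity-check this listing is to recompute each "Blocks" value from
the cylinder range and the geometry in the fdisk header (16 heads, 63 sectors,
512-byte sectors, so 1008 sectors per cylinder).  The sketch below does that in
Python.  It assumes that hda1 and hda5 begin one track (63 sectors) into their
first cylinder, which is what the listed numbers imply, and that the trailing
"+" marks a truncated half block.  The function and variable names are
hypothetical.

```python
SECTOR_BYTES = 512
HEADS = 16
SECTORS_PER_TRACK = 63
SECTORS_PER_CYL = HEADS * SECTORS_PER_TRACK  # 1008, as in the fdisk header


def expected_blocks(start_cyl, end_cyl, skip_first_track=False):
    """Size in 1 KiB blocks implied by an inclusive cylinder range."""
    sectors = (end_cyl - start_cyl + 1) * SECTORS_PER_CYL
    if skip_first_track:
        # Assumption: the partition starts one track into its first cylinder.
        sectors -= SECTORS_PER_TRACK
    return sectors * SECTOR_BYTES / 1024


# (name, start, end, blocks as listed, starts one track in?)
PARTITIONS = [
    ("hda1", 1, 16254, 8191984, True),
    ("hda2", 16255, 24381, 4096008, False),
    ("hda3", 24382, 26413, 10240128, False),
    ("hda4", 26414, 193821, 84373632, False),
    ("hda5", 26414, 193821, 84373600, True),
]

mismatches = []
for name, start, end, listed, skip in PARTITIONS:
    computed = expected_blocks(start, end, skip)
    ok = int(computed) == listed
    if not ok:
        mismatches.append(name)
    print(name, "listed", listed, "computed", computed, "ok" if ok else "MISMATCH")
```

Under these assumptions every entry agrees with its cylinder range except
hda3, where the range implies 1024128 blocks rather than the 10240128 listed.
Since the listing was copied by hand, that may simply be a transcription slip,
but it is worth rechecking against the live fdisk output.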

I don't know if this is relevant at all, but when we hit
"v" (verify) in fdisk it returns the line, "124 unallocated
sectors."

(As with everything here, this was transcribed by hand.)

Any help fixing this problem would be greatly appreciated!

Thank you,
John



Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Reboot
2.
3.
	

Actual Results:  The machine hangs after printing "INIT: version 2.84 booting."

Expected Results:  It should boot. :)

Additional info:
Comment 1 John 2002-10-25 10:43:35 EDT
Bug #74749 (https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=74749)
describes a system which also hangs at "INIT: version 2.84 booting."  Setting
"apm=idle_threshold=100" as a kernel parameter at boot worked around the problem
for several people there.  However, it did not work for us.

It looks like the effects are similar but the cause is different.

(By the way, we aren't using a RAID array, and do not have a Promise IDE
controller card installed.  I'm trying to find the make and model of the drive now.)
Comment 2 John 2002-10-25 11:21:07 EDT
In case it helps, the hard drive is a Western Digital WD1000BB.  It's
connected to the S2466's onboard IDE controller.
Comment 3 John 2002-10-25 14:26:16 EDT
Ok, setting "apm=idle_threshold=100" doesn't work, but setting "apm=off" _does_
work around the problem and allow the workstation to boot.

What else should I test to help locate the specific bug?

Also, Arjan, should this be assigned to you instead?

Thanks!
John
Comment 4 Arjan van de Ven 2002-10-25 14:29:58 EDT
Can you get me the dmidecode output (dmidecode is part of the kernel-utils
package)?  I only need the top 30 lines or so.
Comment 5 John 2002-10-25 16:03:03 EDT
Unfortunately, setting "apm=off" appears to have succeeded by chance.  We find
that booting without any apm switches works about once out of every six
attempts.  Most freeze immediately after the "INIT:" statement.  Occasionally,
though, it proceeds slightly beyond the "INIT:" statement, but many error
messages follow.  It then freezes, and the errors are not reported in the log.

While we initially believed that "apm=off" worked around the problem, we no
longer do.  The rate of success is no greater with "apm=off."

We have tried both the original (-14) and updated (-18) kernels, and the problem
exists with both.

Just in case you still want it, we will attach the output of dmidecode.
Comment 6 John 2002-10-25 16:05:06 EDT
Created attachment 82126 [details]
Output of /usr/sbin/dmidecode using vmlinuz-2.4.18-17.8.0smp
Comment 7 John 2002-10-28 21:54:48 EST
The problem can be avoided by specifying mem=XXX on the boot line.

In our case, the startup BIOS reports (640+3668992)KB memory. Specifying
mem=3669632K always leads to a normal boot. In contrast, without the mem
parameter, about 10 of 12 boots fail completely (freezing after printing "INIT:
version 2.84 booting"), 1 of 12 fails very shortly afterwards (with a torrent of
error messages involving missing programs on "/"). Only 1 of 12 appears to boot
successfully.
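
For anyone repeating this on another machine, the mem= value above is just the
sum of the two figures the BIOS prints.  A minimal sketch of that arithmetic in
Python follows; the helper name mem_param is made up for illustration, and the
two inputs are the numbers from our BIOS screen.

```python
def mem_param(base_kb, extended_kb):
    """Build a mem= boot parameter from the BIOS base and extended memory in KB."""
    total_kb = base_kb + extended_kb
    return "mem=%dK" % total_kb


# Figures from the BIOS screen quoted above: (640+3668992)KB.
param = mem_param(640, 3668992)
print(param)

# The same total expressed in MB, for comparison with /proc/meminfo.
print((640 + 3668992) / 1024.0, "MB")
```

The resulting string is what gets appended to the kernel line at the boot
prompt.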

The information in /proc/meminfo differs between the rare, successful boot
without a mem parameter and the case when mem is specified by hand. In
particular, the kernel reports that HIGHMEM is 1 MB larger for the rare,
successful boot. Perhaps this will be useful in pinpointing the error.
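
A comparison like this can be automated by saving /proc/meminfo from each boot
and diffing the fields.  The sketch below is one way to do it in Python.  It
assumes the "Name:  value kB" line format and skips any other lines.  The two
sample texts are placeholders with made-up numbers, not our real output, and
the use of HighTotal as the field that differs is also an assumption.

```python
import re

# Matches lines such as "HighTotal:    2048 kB"; other lines are ignored.
FIELD_RE = re.compile(r"^(\w+):\s+(\d+)\s+kB$")


def parse_meminfo(text):
    """Return a dict mapping field name to its value in kB."""
    fields = {}
    for line in text.splitlines():
        m = FIELD_RE.match(line.strip())
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


def diff_meminfo(text_a, text_b):
    """Return {field: (value_a, value_b)} for fields whose values differ."""
    a, b = parse_meminfo(text_a), parse_meminfo(text_b)
    return {k: (a[k], b[k]) for k in sorted(a.keys() & b.keys()) if a[k] != b[k]}


# Placeholder samples (made-up values) standing in for two saved copies of
# /proc/meminfo: one from a boot without mem=, one from a boot with it.
BOOT_WITHOUT_MEM = "MemTotal: 5000 kB\nHighTotal: 3072 kB\nLowTotal: 1928 kB\n"
BOOT_WITH_MEM = "MemTotal: 5000 kB\nHighTotal: 2048 kB\nLowTotal: 1928 kB\n"

for field, (a, b) in diff_meminfo(BOOT_WITHOUT_MEM, BOOT_WITH_MEM).items():
    print(field, a, b, "difference", a - b, "kB")
```

With real captures in place of the placeholders, any field that changes
between the two boots is listed with both values.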
Comment 8 Bugzilla owner 2004-09-30 11:40:07 EDT
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/
