Bug 61340 - kernel 2.4.9-31 does not boot anymore
Status: CLOSED CURRENTRELEASE
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.2
Hardware: i386 Linux
Priority: medium, Severity: medium
Assigned To: Arjan van de Ven
Reported: 2002-03-17 23:22 EST by Michal Jaegermann
Modified: 2008-08-01 12:22 EDT (History)
Last Closed: 2004-09-30 11:39:26 EDT
Description Michal Jaegermann 2002-03-17 23:22:43 EST
Description of Problem:

I have here an Athlon machine, with the following /proc/cpuinfo:

processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 6
model		: 2
model name	: AMD Athlon(tm) Processor
stepping	: 2
cpu MHz		: 751.728
cache size	: 512 KB
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 1
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr syscall mmxext 3dnowext 3dnow
bogomips	: 1497.49

which stopped booting once the 2.4.9-31 kernel was installed.  It did boot
2.4.9-13 and 2.4.9-21 (athlon versions) without any problems whatsoever.
It does not seem to be a processor problem as this machine does not boot
_any_ of uniprocessor 2.4.9-31 kernels.  I tried all x86 variants available.
It prints:

Uncompressing Linux ... Ok, booting the kernel.

and it sits there doing nothing with all kernels from 2.4.9-31 series.

Unfortunately I seem to be unable to find 2.4.9-21 now, either sources or binaries.
Comment 1 Alexei Podtelezhnikov 2002-03-18 18:06:04 EST
2.4.9-31 boots absolutely FINE here on exactly the same processor.
Hint-hint: you didn't provide enough information about your motherboard, 
chipset, etc. Can you redownload-reinstall this kernel in case of 
file corruption? .. and test on the other similar system to rule out
hardware failure?
Comment 2 Michal Jaegermann 2002-03-18 20:07:02 EST
> 2.4.9-31 boots absolutely FINE here on exactly the same processor.

Well, it is fine here also on something, for example, like this:

processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 6
model		: 4
model name	: AMD Athlon(tm) Processor
stepping	: 4
cpu MHz		: 1399.627
cache size	: 256 KB
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 1
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr syscall mmxext 3dnowext 3dnow
bogomips	: 2791.83
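
For anyone comparing the two /proc/cpuinfo dumps in this report, the fields that differ can be listed mechanically. What follows is only a minimal Python sketch: the helper names `parse_cpuinfo` and `diff_cpuinfo` are made up for illustration, and the inputs are abbreviated copies of the dumps quoted here.

```python
def parse_cpuinfo(text):
    """Parse /proc/cpuinfo-style 'key : value' lines into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # ignore lines without a colon (e.g. a wrapped flags list)
            info[key.strip()] = value.strip()
    return info


def diff_cpuinfo(a, b):
    """Return {key: (value_a, value_b)} for every key whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Abbreviated copies of the two dumps quoted in this report.
failing = parse_cpuinfo("""\
cpu family : 6
model      : 2
stepping   : 2
cache size : 512 KB
""")
working = parse_cpuinfo("""\
cpu family : 6
model      : 4
stepping   : 4
cache size : 256 KB
""")

for key, (bad, good) in diff_cpuinfo(failing, working).items():
    print(f"{key}: {bad} (failing) vs {good} (working)")
```

A difference in model or stepping does not by itself explain a boot failure; the diff only narrows down what to look at.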

> Hint-hint: you didn't provide enough information about your motherboard, 
> chipset, etc.

Here is the output from 'lspci -tv' and 'lspci -tvn'.  Good enough?

# lspci -tv
-[00]-+-00.0  VIA Technologies, Inc. VT8371 [KX133]
      +-01.0-[01]----00.0  Matrox Graphics, Inc. MGA G400 AGP
      +-07.0  VIA Technologies, Inc. VT82C686 [Apollo Super South]
      +-07.1  VIA Technologies, Inc. Bus Master IDE
      +-07.4  VIA Technologies, Inc. VT82C686 [Apollo Super ACPI]
      +-07.5  VIA Technologies, Inc. AC97 Audio Controller
      \-09.0-[02]--+-04.0  Symbios Logic Inc. (formerly NCR) 53c895
                   \-05.0  Digital Equipment Corporation DECchip 21142/43

# lspci -tvn
-[00]-+-00.0  1106:0391
      +-01.0-[01]----00.0  102b:0525
      +-07.0  1106:0686
      +-07.1  1106:0571
      +-07.4  1106:3057
      +-07.5  1106:3058
      \-09.0-[02]--+-04.0  1000:000c
                   \-05.0  1011:0019
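
To compare chipsets across machines, the vendor:device pairs can be pulled out of the numeric tree. This is a minimal Python sketch only: `pci_ids` is a made-up helper name, and the regular expression assumes the lowercase hex IDs that lspci prints here.

```python
import re

# A vendor:device pair is two 4-digit hex numbers joined by a colon.
ID_RE = re.compile(r"\b([0-9a-f]{4}):([0-9a-f]{4})\b")


def pci_ids(tree_text):
    """Return the set of (vendor, device) pairs found in `lspci -tvn` output."""
    return set(ID_RE.findall(tree_text))


# Numeric tree of the non-booting machine, copied from this comment.
failing_tree = r"""
-[00]-+-00.0  1106:0391
      +-01.0-[01]----00.0  102b:0525
      +-07.0  1106:0686
      +-07.1  1106:0571
      +-07.4  1106:3057
      +-07.5  1106:3058
      \-09.0-[02]--+-04.0  1000:000c
                   \-05.0  1011:0019
"""

# 1106 is the vendor ID that the tree above pairs with the VIA devices.
via_devices = sorted(dev for vendor, dev in pci_ids(failing_tree) if vendor == "1106")
print("VIA device IDs:", via_devices)
```

Running the same function over the tree from a working machine and taking the set difference would show which devices are unique to the failing one.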

The other, working, machine also happens to have a VIA board but
"VIA Technologies, Inc. VT8363/8365 [KT133/KM133]", 1106:0305.

> Can you redownload-reinstall this kernel in case of 
> file corruption?

Not sure if I understand.

> .. and test on the other similar system to rule out
> hardware failure?

Hard to talk about a hardware failure if the system is happily
running right now 2.4.9-21 which I found in my backups.  Like I wrote
it also does not have the slightest whiff of problems if I use 2.4.9-13.
I had to get back somehow to reverse the damage. :-)  You are reading
a message typed on a keyboard connected to that system.

Once again - I tried _four_ different uniprocessor kernels from 2.4.9-31
(athlon, i686, i586, i386).  None of these even starts to boot.
Comment 3 Rex Dieter 2002-03-26 11:46:02 EST
For the record, I have a Pentium-II 400MHz box that exhibits the same
non-booting behavior when using kernel-2.4.9-31.i686.rpm.  (I tried the i586
kernel also, without success.)  I'm using lilo with manually created initrd
images.  (I've re-created these images several times without any change in
boot behavior.)
Comment 4 wdc 2002-03-26 15:04:36 EST
FWIW, I have an identical non-booting problem with the following hardware, which
works fine with 2.4.9-12:

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 8
model name	: Pentium III (Coppermine)
stepping	: 3
cpu MHz		: 601.374
cache size	: 256 KB
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 2
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips	: 1199.30

The actual machine is an HP Vectra VEi8.  The cool thing about the HP
line is that the motherboard does NOT change within the line, so the
following information, from the original purchase should be definitive:

HP Part #: D9788T  With 192M of memory as a 64M DIMM and a 128M DIMM
(Lilo specifies the memory explicitly because when we first installed the
systems, the 2.2 kernel didn't properly detect all of it.)

HP VECTRA VEI8 DT Pentium 600E 13.5G hard drive and CD

I believe it is the Intel i815 chip set, but could not read any numbers
off the chips on the motherboard to confirm this ancient memory.

For what it's worth, it has an on-board Matrox Millennium G200 video chip
and a 3Com 3c905B Ethernet card in the PCI Bus.
Comment 5 wdc 2002-04-04 13:43:08 EST
To try and be more helpful, I am enclosing lspci -tv and -tvn output as well.

m10-423-6.mit.edu% lspci -tv
-[00]-+-00.0  Intel Corporation 440BX/ZX - 82443BX/ZX Host bridge
      +-01.0-[01]----00.0  Matrox Graphics, Inc. MGA G200 AGP
      +-04.0  Cirrus Logic CS 4614/22/24 [CrystalClear SoundFusion Audio Accelerator]
      +-07.0  Intel Corporation 82371AB PIIX4 ISA
      +-07.1  Intel Corporation 82371AB PIIX4 IDE
      +-07.2  Intel Corporation 82371AB PIIX4 USB
      +-07.3  Intel Corporation 82371AB PIIX4 ACPI
      \-10.0  3Com Corporation 3c905C-TX [Fast Etherlink]
m10-423-6.mit.edu% lspci -tvn
-[00]-+-00.0  8086:7190
      +-01.0-[01]----00.0  102b:0521
      +-04.0  1013:6003
      +-07.0  8086:7110
      +-07.1  8086:7111
      +-07.2  8086:7112
      +-07.3  8086:7113
      \-10.0  10b7:9200
Comment 6 Michal Jaegermann 2002-04-08 21:22:11 EDT
The mystery is solved at least in my case (original report about Athlon).
Turns out that I had there a pretty old version of 'grub' which was not
updated for quite a long while.  It was good enough to boot 2.4.9-21 but
for some reason 2.4.9-31 was too much for it.  Replacing grub with a newer
one (boot sector and "stage" files) solved the issue.  The same trouble
happened with "skipjack" kernels too and the same solution applies.

I cannot comment on other reports.
Comment 7 wdc 2002-04-08 23:35:45 EDT
Alas, that remedy would not help us.
We are using lilo.

Can someone more clueful about what changed with the recent grub
offer some test cases that could be run to further isolate this?

Might this be due to the lack of a
    linear
or 
    lba32
directive in lilo.conf?
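
Whether a given lilo.conf already carries one of these directives can be checked mechanically. This is a minimal Python sketch, assuming both are written as bare flag lines (`linear` or `lba32` alone on a line); the function name `addressing_options` and the sample file contents are hypothetical, not taken from the affected system.

```python
def addressing_options(conf_text):
    """Return which of 'linear' / 'lba32' appear as bare flags in lilo.conf text."""
    found = set()
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if line in ("linear", "lba32"):
            found.add(line)
    return found


# Hypothetical lilo.conf for illustration only; note the commented-out lba32.
sample = """\
boot=/dev/hda
map=/boot/map
prompt
timeout=50
# lba32
image=/boot/vmlinuz-2.4.9-31
    label=linux
    root=/dev/hda1
    read-only
"""

print(addressing_options(sample) or "neither linear nor lba32 is set")
```

If either flag is added to the real file, /sbin/lilo has to be rerun before the change takes effect at boot.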

To restate:  We had an HP Vectra VEi8 system that works PERFECTLY WELL
with the 2.4.9-21 kernel but hangs in the way described above with the
2.4.9-31 kernel.  The lilo config is identical for the two kernels.
We are currently working around the problem by using the older kernel,
but our site-wide system image is based on the -31 kernel so we DO need
a fix at some point.  No other Dell, IBM, or HP machine at our site has
demonstrated this problem.  They're all successfully at -31 after a site-wide
update.
Comment 8 bob 2004-03-12 20:00:42 EST
I am a total noob with Linux, just forewarning all, but

I am getting this exact same error (though I'm attempting to install
RH9 fresh) and have not been able to find a resolution.  I've been
googling around and found many cases of this same "hanging" after
the "Ok, booting the kernel." message.  One case stated that it would work
once a PS/2 mouse was installed.  I am trying to install this on a relatively
old system with an AT keyboard and serial mouse.  I have also seen
reports of USB devices not working.  Can anyone verify the theory that
the lack of a PS/2 mouse is causing this?
Comment 9 wdc 2004-03-13 13:16:20 EST
Mine was comment #4 two years ago.
That system DID have a PS/2 mouse.

I think the ultimate cause was some kind of lilo bug interacting with where disks landed on
an LBA disk, but the fix is to upgrade to the kernel rev that came out after 2.4.9-31.

In the intervening two years, many many MANY kernels have come and gone.  The USB
support has improved to the point where it almost works.  (Actually it works most of the
time, but I have a few obscure bugs that have been too difficult to pin down and report.)

Good luck with your odyssey, bob.

-wdc
Comment 10 Bugzilla owner 2004-09-30 11:39:26 EDT
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/
