Red Hat Bugzilla – Bug 39191
hex dump during boot, and then system hangs
Last modified: 2007-04-18 12:33:04 EDT
From Bugzilla Helper:
User-Agent: Mozilla/4.61 [en] (Win98; U)
Description of problem:
After installation of 7.1, my e-machines e-tower 333cs dumps hex and hangs the system on each subsequent boot.
Steps to Reproduce:
1. Install Red Hat Linux 7.1 on (at least my) e-machines e-tower 333cs machine
2. Boot up the computer
Actual Results: After the graphical LILO screen and a second or two of normal-looking boot output, the computer dumps
a few pages of hex and then just stops. (It happens so fast, with the hex scrolling past the last line, that I do not know exactly where it
failed or which line was the last to run successfully.)
Expected Results: successfully boot up linux
The computer has been running Red Hat Linux 6.2 for months with no problems.
I have reproduced the bug in the following ways:
- Trying to do a straight *upgrade* from 6.2 to 7.1
- Doing a straight workstation fresh install to 7.1
- Doing a custom install to 7.1
Also tried booting with the following commands (as suggested by Red Hat tech support):
linux nodma noapm
linux nodma noapm nousb
Also went into rescue boot mode and the file system at least looked normal.
Please note that I then reinstalled 6.2, and that loads and seems to be working fine.
If you could copy down the oops text (including all the magic numbers) it
could be helpful.
Assigning to kernel.
I have the same problem. Originally this happened upon attempting an update,
then after doing a clean install. I also have an etower 333cs with the only
modifications being a bigger HD, more memory, an ISA modem, and an ethernet
card. A few extra details: I managed to see the message "calibrating delay
loop" before it crashes, so it's shortly after that. The same problem happens
whether booting from the HD or the boot floppy I made during installation, but
booting from the Seawolf install CD in rescue mode is successful. The same thing
happens with the original 2.4.2 kernel, the newest 2.4.3-12 kernel, and the
Mandrake 2.4.3 kernel from a month or so back. The call trace numbers for both
the 2.4.2 and 2.4.3-12 kernels follow a dual arithmetic progression, for example:
<fc70c14f> <fc70c14f> <fc78c14f> <fc78c14f> <fc80c14f> <fc80c14f> ...
where each number is repeated twice, and the difference between successive
numbers alternates between 80000 and 20000 (hex). The only difference between
the two versions of the kernel is an overall additive constant; the differences
of 80000 and 20000 are the same. There is more than one screenful of these
numbers so I can't tell how they start, but they repeat about once every 5
seconds. The keyboard is completely disabled, so Ctrl-Alt-Del doesn't work, and
neither does the power-off button; the only way to turn the machine off is by
cutting power to it. I salvaged the Seawolf installation by downgrading the
kernel to 2.2.19 together with a few packages that depended on 2.4. But there
are minor problems on shutting down including a fail message for NFS lockd and
for halting at the very end, after the disks are unmounted. These are probably
undocumented dependencies on the 2.4 kernel so should go away if the 2.4 kernel
We just released a 2.4.3-12 updated kernel last Friday.
Also, how much memory do you have, and does this go away when you
type "mem=xxxM" (with "xxx" being the amount of RAM in megabytes minus 2)?
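The arithmetic behind that suggestion can be sketched in Python. This is only an illustration: the function name and the 2 MB default are assumptions made for the example, and the `mem=` syntax is the one quoted above.

```python
def mem_boot_option(total_ram_mb: int, reserve_mb: int = 2) -> str:
    """Build a mem= boot option that leaves reserve_mb megabytes unused."""
    if total_ram_mb <= reserve_mb:
        raise ValueError("total RAM must exceed the reserved amount")
    return f"mem={total_ram_mb - reserve_mb}M"


# For a machine with 256 MB of RAM:
print(mem_boot_option(256))  # mem=254M
```

The resulting string would be typed at the LILO prompt after the kernel label, in the same way as the other options tried in this report (for example `linux mem=254M`).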
I have 256M of RAM. I tried booting the new RH 2.4.3-12 kernel with the
"mem=254M" boot option. No luck. The behavior is subtly different - before, it
looked like it was in an infinite loop, with the same numbers appearing over and
over about every 5 seconds. Now, it just shows the numbers and appears to
hang. The numbers form the same pattern as before, i.e. each repeated twice,
and differences of 80000 and 20000 hex.
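A rough way to check the pattern is to drop the immediate repeats and subtract successive addresses. The Python sketch below does that for the short excerpt quoted earlier in this report; the function name is made up for the example. That excerpt is only three distinct values long and shows only the 80000 step, so confirming the 20000 step would need a longer capture.

```python
import re


def trace_steps(trace_text: str) -> list[int]:
    """Return differences between successive distinct call-trace addresses."""
    addrs = [int(m, 16) for m in re.findall(r"<([0-9a-fA-F]+)>", trace_text)]
    # Drop immediate repeats, since each address is reported twice in a row.
    distinct = [a for i, a in enumerate(addrs) if i == 0 or a != addrs[i - 1]]
    return [b - a for a, b in zip(distinct, distinct[1:])]


sample = "<fc70c14f> <fc70c14f> <fc78c14f> <fc78c14f> <fc80c14f> <fc80c14f>"
print([hex(step) for step in trace_steps(sample)])  # ['0x80000', '0x80000']
```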
If you have a general idea of what's wrong, I could experiment with different
boot options. Any suggestions?
"no_hlt" or "no-hlt" would be an interesting boot option to try.
I tried no-hlt. No change. By the way, this machine uses a VIA chipset and a
Cyrix M2 CPU. The more detailed info that used to be at http://www.e4me.com
seems to be gone. The maximum memory for this machine was originally advertised
as 256M. Would that be a motherboard limitation or just the maximum you could
pop into the 2 memory slots given the size of DIMMs at that time?
Created attachment 22006
capture of boot messages via serial console
I hope this capture file will help. I also forgot to mention that even though
this machine is an i686, I also tried the i386 version of the newest kernel with
the same problem, so it's not those optimizations.
I'm running 2.4.x fine on a VIA chipset cyrix MII quite similar to yours. The
trace is a pretty clear jump to nowhere.
You say "I have the same problem. Originally this happened upon attempting an
update, then after doing a clean install."
Does that mean the installation kernel ran correctly, and it was only after the
install that it died?
First I upgraded from 7.0 to 7.1. The upgrade itself went fine, but as soon
as I tried to boot the new kernel it crashed. After a while I gave up and
decided to try a clean install of 7.1. The same problem occurred - the install
itself goes fine, but the new kernel crashed when I tried to boot it. This
happens if I boot either from the HD or the boot floppy containing the 2.4
kernel made during installation. However, I can boot fine from the 7.1 install
CD in rescue mode.
Ok that is very important info.
The BOOT kernel is built for absolute maximal compatibility with anything the
world can throw at it and also with as few feature sets as possible to keep the
size down so it fits on the boot disk.
The first obvious thing it lacks is APM. Can you try booting with the additional
option to disable APM? (I forget what it is, unfortunately. Arjan?)
By using the boot option "apm=off", it now works! I upgraded the packages I
had downgraded previously due to their dependence on the 2.4 kernel, and
everything seems to work fine, except that when I halt the machine, not only
does it not power down automatically, but the power button doesn't work either,
so I have to cut power to the machine to turn it off. I had to do the same
thing anyway while running the 2.2.19 kernel with Seawolf. Is this normal
behavior with APM disabled?
That is the normal behaviour with APM disabled in many cases, yes.
Ok can you grab
compile it, run it as root and attach the output to the bug. That will let me
add the box to the various internal tables so we know APM is to be avoided on
You might also, by the way, look for BIOS updates; you may find a BIOS update fixes this.
Created attachment 22186
output of dmidecode
Thanks. I've added that entry to my codebase so that at some future point
kernels will automatically avoid APM on that box. Do let me know if you ever
find a BIOS upgrade exists and if it cures it.
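The idea of such a table can be sketched in Python. This is not the kernel's actual C code or data: the table layout, field names, and function name are hypothetical. Only the board name "Delhi3" and the BIOS version 1.11 are taken from later comments in this report.

```python
# Hypothetical blocklist: each entry maps DMI field names to substrings that
# must all be present. Only the board name "Delhi3" comes from this report.
APM_BLOCKLIST = [
    {"board_name": "Delhi3"},
]


def should_disable_apm(dmi: dict[str, str], blocklist=APM_BLOCKLIST) -> bool:
    """Return True if every field of some blocklist entry appears in the DMI data."""
    return any(
        all(wanted in dmi.get(field, "") for field, wanted in entry.items())
        for entry in blocklist
    )


print(should_disable_apm({"board_name": "Delhi3", "bios_version": "1.11"}))  # True
print(should_disable_apm({"board_name": "SomeOtherBoard"}))  # False
```

Matching on the board name alone means the entry keeps applying after a BIOS update, which is the cautious choice when it is not known whether a newer BIOS cures the problem.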
Marked NOTABUG because it is a BIOS bug. I'll see about getting the block entry
I updated the BIOS from version 1.11 to 1.20 using the file E120.exe on the
eMachines Help Site (<http://www.e4all.info>), and then built a custom kernel,
editing out the part in dmi_scan.c that disables the APM for the Delhi3
motherboard. No luck, the problem still exists.
My machine is a dual boot RH 7.3/Win98. Although there were APM problems with
Win98 when I bought the machine (3 years ago), after a year or so they went
away, and APM works fine now in Win98 (both versions 1.11 and 1.20 of the BIOS).
Apparently Microsoft did a software workaround. Maybe the BIOS uses an
outdated version of APM. Is it possible to make a guess as to what MS did, and
is there any utility I could run in Win98 to get info on the state of the APM?
Device Manager indicates that it is normal.
After installing Fedora 2 on this machine, it is able to use ACPI
successfully instead of APM, though there are warning messages:
ACPI: IRQ9 SCI: Level Trigger.
ACPI-0179: *** Warning: The ACPI AML in your computer contains
errors, please nag the manufacturer to correct it.
ACPI-0182: *** Warning: Allowing relaxed access to fields; turn on
CONFIG_ACPI_DEBUG for details.
To use ACPI it has to be enabled in the BIOS (set "ACPI aware OS" to
"yes"). The machine appears to run normally and even powers off on