Bug 101160
Summary: | RH9 install aborts during package installation | ||
---|---|---|---|
Product: | [Retired] Red Hat Linux | Reporter: | Bob Hockney <bhockney> |
Component: | kernel | Assignee: | Dave Jones <davej> |
Status: | CLOSED WONTFIX | QA Contact: | Brian Brock <bbrock> |
Severity: | high | Docs Contact: | |
Priority: | medium | ||
Version: | 9 | CC: | jhmail, pfrields, russell.c |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | i386 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2004-08-25 16:11:35 UTC | Type: | --- |
Description
Bob Hockney
2003-07-29 18:49:27 UTC
If you switch to VC4 (Ctrl-Alt-F4) when the error occurs, check whether you see a lot of read/write errors to one of your hard drives.

I have had the same behavior. Athlon 2800+, Giga-Byte GA-7N400 Pro motherboard (nVidia nForce 2 chipset, latest BIOS version F11), both new. I have made about 100 attempts and have not yet successfully installed Linux (I've done it many times on other systems). For this box, I have tried several different hard drives, some new and some used in other machines for a long time. I have checked all install media, both during the install and by installing other machines with the same media. I have used two different sets of media. I have tried installing from a new Sony USB DVD writer and an older IDE DVD-ROM drive.

I could not boot from the USB CD directly (it worked perhaps 1 in 100 attempts). I was able to boot it from a floppy and install until disk 2, when I got a nice error suggesting a disk full or failure. Another attempt crashed in the middle of disk 1. All other attempts have crashed in disk 1 as well. Either I get an MD5 error and a nice box saying to start over, or a kernel panic. For the panic, going to C-A-F2 reveals that the boot prompt is there and responds to returns, but then hangs as soon as a command is typed.
The C-A-F4 messages (last screen) of the most recent failure:

Code: 8a 18 74 24 8b 42 04 89 43 04 89 18 c7 42 00 00 00 00 c7
<1>Unable to handle kernel paging request at virtual address a3e6c1a4
<4> printing eip:
<4>f8893636
<1>*pde = 00000000
<4>Oops: 0000
<4>CPU: 0
<4>EIP: 0060:[<f8893636>] Not tainted
<4>EFLAGS: 00010287
<4>
<4>EIP is at (2.4.20-8BOOT)
<4>eax: a3e6c1a4 ebx: e1268e40 ecx: f63f5870 edx: a3e6c1a4
<4>esi: a3e6c1a4 edi: 00000000 ebp: 00000001 esp: f5241e90
<4>ds: 0068 es: 0068 ss: 0068
<4>Process kjournald (pid: 143, stackpage=f5241000)
<4>Stack: f63f58e4 00000000 00000f44 f378d0bc 00000000 e3e6c150 dd796720 0000099d
<4> e71571e0 e1268e40 e1268ea0 e1268de0 e1268d80 e1268d20 e1268cc0 e1268420
<4> e12683c0 e1268360 e1268300 e12682a0 e1268240 e12681e0 e1268180 e1268120
<4>Call Trace: [<f8895d9c>] (0xf5241fc0))
<4>[<f8895c6c>] (0xf5241fd8))
<4>[<f8895c7c>] (0xf5241fe8))
<4>[<c0106fc1>] (0xf5241ff0))
<4>
<4>
<4>Code: 8a 18 74 24 8b 42 04 89 43 04 89 18 c7 42 00 00 00 00 c7
<4>

Note also that when I did this from the USB CD, one of the other screens included messages about the USB filesystem being unstable. I didn't understand why this might be the case. In searching for this bug, I noticed similar complaints in prior RH releases, all from Athlon people. This is my first AMD CPU, and I haven't seen similar problems before. --jh--

In my particular case there were no error messages on VC4. With some experimentation I first found that I could install if I formatted the drive as ext2 instead of ext3. However, after only a short period of use (30 minutes) I had a partition corrupted beyond meaningful use or repair. There were no hardware-related messages in /var/log/messages. I then found I could install if I slowed down the HDD at the first opportunity during install (on VC2, hdparm -X68) from ATA 100 (mode 5, its maximum) to ATA 66 (mode 4). Even though I gave the nodma parameter on the boot line, DMA was apparently re-enabled.
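For anyone trying the same workaround: the number passed to hdparm -X encodes the transfer mode as described in the hdparm man page (8 plus the mode number for PIO, 32 plus the mode number for multiword DMA, 64 plus the mode number for UltraDMA). Below is a minimal Python sketch of that arithmetic. The function name and the speed table are illustrative only and are not part of hdparm; the rates are the nominal UltraDMA burst rates.

```python
# Nominal UltraDMA burst rates in MB/s, indexed by UDMA mode number.
UDMA_MBPS = {0: 16.7, 1: 25.0, 2: 33.3, 3: 44.4, 4: 66.7, 5: 100.0, 6: 133.0}


def describe_xfermode(value):
    """Decode an hdparm -X argument into a readable transfer mode.

    Only the three families documented in the hdparm man page are handled;
    anything else is reported as unhandled rather than guessed at.
    """
    if value >= 64:
        mode = value - 64
        rate = UDMA_MBPS.get(mode)
        suffix = f" (~{rate:g} MB/s)" if rate is not None else ""
        return f"UltraDMA mode {mode}{suffix}"
    if 32 <= value < 64:
        return f"multiword DMA mode {value - 32}"
    if 8 <= value < 16:
        return f"PIO mode {value - 8}"
    return f"unhandled value {value}"


# -X68 is the setting used in the workaround above; -X69 is the drive's maximum.
for x in (68, 69):
    print(f"-X{x}: {describe_xfermode(x)}")
```

So -X68 selects UltraDMA mode 4 (ATA 66), one step below the UltraDMA mode 5 (ATA 100) that hda negotiates by default.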
Anyway, at this speed I am able to install, and I do not experience corruption during use if I slow down the drive during startup. I have two Maxtor drives in the system; hda is capable of ATA 100 and hdb is capable of ATA 133. I can install and use Red Hat on hdb at ATA 133 (and ATA 100) without problem, but I cannot use hda at ATA 100 under Red Hat. I never saw any log messages associated with the problems. I am able to use hda at ATA 100 with another OS on the same machine without problem. -Bob

I just tried reducing the UDMA mode with hdparm as suggested, to no avail: it still fails, most recently with Oops: 0002, process anaconda. I am suspecting hardware, since this install is so generic that others should have seen a software problem. --jh--

This appears to be related to the kernel drivers. In the meantime, I sent back that motherboard and tried another one, same problem. The only common hardware component is the CPU, but it runs for days on Morphix. The new board is an Asus A7V600, with the VIA KT600 chipset. The old board had an nVidia nForce 2 chipset. The crashes were variously blamed (in the VC messages) on anaconda, mini-wm, and kupdated. Arjan, are we likely to see any alternative boot images any time soon? I can't load Red Hat, and I've been trying since early August! Since this problem is keeping people from even loading Red Hat, I suggest raising the priority. --jh--

...also, some (not all) of the crash messages said something like "Kernel BUG in buffer.c" --jh--

I have also been having crashes during Disk 1 install. In Anaconda, the Anaconda traceback was hiding any panic message, but when I switched to text mode I got similar results to what's shown here.
<1>Unable to handle kernel paging request at virtual address 236ece21
<4> printing eip:
<4>cc892254
<1>*pde = 00000000
<4>Oops: 0000
<4>CPU: 0
<4>EIP: 0060:[<cc892254>] Not tainted
<4>EFLAGS: 00010202
<4>
<4>EIP is at (2.4.20-8BOOT)
<4>eax: 00000001 ebx: 236ece09 ecx: c53ef550 edx: c53ef550
<4>esi: c0009150 edi: c0009150 ebp: c1af63f0 esp: c7551e90
<4>ds: 0068 es: 0068 ss: 0068
<4>Process kjournald (pid: 132, stackpage=c7551000)
<4>Stack: c17f7ed4 00000000 00000000 00000000 00000000 cac2e820 c1af6240 0000022d
<4> c539d150 c539d1b0 c539d210 c539d270 c539d2d0 c534b490 c534b4f0 c53ef610
<4> c53ef670 c53ef6d0 c53ef730 c53ef790 c53ef7f0 c53ef850 c53ef8b0 c539dab0
<4>Call Trace: [<cc894d9c>] (0xc7551fc0))
<4>[<cc894c6c>] (0xc7551fd8))
<4>[<cc894c7c>] (0xc7551fe8))
<4>[<c0106fc1>] (0xc7551ff0))
<4>
<4>
<4>Code: 8b 43 18 a9 04 00 00 00 8b 7f 1c 75 19 83 e0 02 0f 84 3a 0d
<4>

My machine is quite different from the others here, being a Celeron 333 with an Intel BX chipset. Disks are SCSI via a MegaRAID card and the CD-ROM is SCSI via an AHA1542. Their drivers are loaded from the driver disk. The hardware is capable of passing many test cycles with no problems.
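For comparing the traces in this report, it may help to split the console dump on its printk level markers (`<1>` is KERN_ALERT, `<4>` is KERN_WARNING) and to decode the number after "Oops:". On i386 that number is the page-fault error code: bit 0 distinguishes a protection fault from a missing page, bit 1 a write from a read, and bit 2 user mode from kernel mode. The following Python sketch does both. The function names are made up for illustration, and it assumes the dump has been pasted as a single string, as in the comments above.

```python
import re

# A printk marker is a single digit in angle brackets, e.g. <1> or <4>.
# Addresses such as [<cc892254>] do not match because they are longer.
RECORD_RE = re.compile(r"<(\d)>")


def split_records(flat_text):
    """Split a flattened console dump into (level, message) pairs."""
    parts = RECORD_RE.split(flat_text)
    # parts[0] is any text before the first marker; after that the list
    # alternates level, message, level, message, ...
    return [(int(parts[i]), parts[i + 1].strip())
            for i in range(1, len(parts) - 1, 2)]


def decode_oops_code(code_hex):
    """Decode the low three bits of an i386 page-fault error code."""
    code = int(code_hex, 16)
    return {
        "cause": "protection fault" if code & 1 else "page not present",
        "access": "write" if code & 2 else "read",
        "mode": "user" if code & 4 else "kernel",
    }


# First few records of the trace above, pasted as one string.
sample = ("<1>Unable to handle kernel paging request at virtual address 236ece21 "
          "<4> printing eip: <4>cc892254 <1>*pde = 00000000 <4>Oops: 0000 <4>CPU: 0")

for level, message in split_records(sample):
    print(level, message)
    if message.startswith("Oops:"):
        print("  ->", decode_oops_code(message.split()[1]))
```

Read this way, the two kjournald traces (Oops: 0000) are kernel-mode reads of an unmapped address, while the later anaconda failure (Oops: 0002) is a kernel-mode write to an unmapped address.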