Bug 72387 - Radeon IGP chipset is not supported (eg Compaq Presario 900, HP z4115)
Status: CLOSED RAWHIDE
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 9
Hardware: athlon Linux
Priority: high, Severity: high
Assigned To: Arjan van de Ven
QA Contact: Brian Brock
Duplicates: 78123 80352 81587
Reported: 2002-08-23 08:28 EDT by Warren Togami
Modified: 2005-10-31 17:00 EST
Last Closed: 2003-06-05 07:48:57 EDT
Attachments
lspci -vv and lspci -vvn, system booted with "pci=off" (14.73 KB, text/plain)
2002-08-26 10:00 EDT, Warren Togami
X log file with the VESA errors (87.38 KB, text/plain)
2003-01-03 13:26 EST, Diego
the errors I get after X fails (828 bytes, text/plain)
2003-01-03 13:27 EST, Diego

Description Warren Togami 2002-08-23 08:28:40 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20020809

Description of problem:
Null completely fails to install on the Compaq Presario 900US laptop.  This is a
very new Athlon laptop with an ALi chipset.  While things appear okay during
kernel bootup, Anaconda is unable to access either the IDE hard drive or CD-ROM
due to failures that you can see in ALT-F4.  Google found quite a few recent
posts in LKML about ALI15X3 problems.

I bought myself a laptop IDE to desktop IDE converter, so I will be able to
install Null onto this hard drive and test various kernels on this machine.
Tomorrow I plan on testing 2.4.20-pre, -ac and 2.5.x.  Anything else I should test?

Note that there also seems to be a problem with the 8139too kernel module. 
ALT-F4 debug messages for this are also included below.  I will post this into a
separate Bugzilla report.

Version-Release number of selected component (if applicable):
Red Hat beta Null

How reproducible:
Always

Additional info:
The following are excerpts from the kernel bootup (hit Scroll Lock before it ran
Anaconda).  Unfortunately I am unable to reach any shell in the installer due to
its inability to access IDE devices, so I am unable to get the dmesg output from
during Anaconda runtime.

Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ALI15X3: IDE controller on PCI bus 00 dev 80
PCI: No IRQ known for interrupt pin A of device 00:10.0. Please try using
pci=biosirq.
ALI15X3: chipset revision 196
ALI15X3: not 100% native mode: will probe irqs later
	ide0: BM-DMA at 0x8080-0x8087, BIOS settings: hda:DMA, hdb:pio
	ide1: BM-DMA at 0x8088-0x808f, BIOS settings: hdc:pio, hdd:pio
hda: TOSHIBA MK3018GAP, ATA DISK drive
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
hdc: SD-R2102, ATAPI CD/DVD-ROM drive


The following can be viewed in ALT-F4 when Anaconda attempts and fails to access
the IDE CD-ROM.  It has similar problems if I attempt a "Hard Drive" install.

<6>8139too Fast Ethernet driver 0.9.25
<6>8139too: pci dev 00:0b.0 (id 10ec:8139 rev ff) is an enhanced 8139C+ chip
<6>8139too: Use the "8139cp" driver for improved performance and stability.
<6>PCI: Assigned IRQ 11 for device 00:0b.0
<3>8139too: 00:0b.0: Chip not responding, ignoring board
<6>8139too Fast Ethernet driver 0.9.25
<6>8139too: pci dev 00:0b.0 (id 10ec:8139 rev ff) is an enhanced 8139C+ chip
<6>8139too: Use the "8139cp" driver for improved performance and stability.
<6>PCI: Assigned IRQ 11 for device 00:0b.0
<3>8139too: 00:0b.0: Chip not responding, ignoring board
<4>hdc: status timeout: status=0xd0 { Busy }
<4>hdc: drive not ready for command
<4>hdc: ATAPI reset timed-out, status=0xd0
<4>ide1: reset timed-out, status 0xd0
<4>hdc: status timeout: status=0xd0 { Busy }
<4>hdc: drive not ready for command
<4>hdc: ATAPI reset timed-out, status=0xd0
<4>ide1: reset timed-out, status=0xd0
<4>end_request: I/O error, dev 16:00 (hdc), sector 0
<4>end_request: I/O error, dev 16:00 (hdc), sector 0
<4>end_request: I/O error, dev 16:00 (hdc), sector 0
<4>end_request: I/O error, dev 16:00 (hdc), sector 0
<4>end_request: I/O error, dev 16:00 (hdc), sector 0
<4>end_request: I/O error, dev 16:00 (hdc), sector 0
<4>end_request: I/O error, dev 16:00 (hdc), sector 0
<4>end_request: I/O error, dev 16:00 (hdc), sector 0
<4>end_request: I/O error, dev 16:00 (hdc), sector 0
<4>end_request: I/O error, dev 16:00 (hdc), sector 0
<4>end_request: I/O error, dev 16:00 (hdc), sector 0
<4>end_request: I/O error, dev 16:00 (hdc), sector 0
<4>end_request: I/O error, dev 16:00 (hdc), sector 0
<6>attempt to access beyond end of device
<6>16:00: rw=0, want=2147483381, limit=557735282
<4>isofs_read_super: bread failed, dev=16:00, iso_blknum=-134, block=-268
<4>end_request: I/O error, dev 16:00 (hdc), sector 0

Please let me know what other information is needed.

Thanks,
Warren
Comment 1 Warren Togami 2002-08-23 08:34:57 EDT
Bug 72388 filed: (Compaq Presario 900) 8139too module fails
Comment 2 Warren Togami 2002-08-25 06:05:11 EDT
I swapped the hard drive into a working laptop and installed Null, compiled
2.4.20-pre4-ac1 and swapped the hard drives back.  Let me know if you want the
.config file from the -ac kernel build.

The Null kernel always fails with a Machine Check exception, similar to a crash
I see in Mandrake 9.0 beta's installer.  I compiled MCE into the -ac kernel, and
it doesn't panic at that point.  I managed to boot into the system with
"pci=conf2" or "pci=off", although I can't use any devices I could at least get
to a shell.  What lspci and other information would you need?

------------------------
2.4.20-pre4-ac1
------------------------
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
hda: status timeout: status=0x80 { Busy }
hda: drive not ready for command
ide0: reset timed-out, status=0x80
end_request: I/O error, dev 03:03 (hda), sector 2
EXT3-fs: unable to read superblock
end_request: I/O error, dev 03:03 (hda), sector 2
EXT2-fs: unable to read superblock
end_request: I/O error, dev 03:03 (hda), sector 64
isofs_read_super: bread failed, dev=03:03, iso_blknum=16, block=32
Kernel panic: VFS: Unable to mount root fs on 03:03

-----------------
kernel-2.4.18-11 (Null default)
-----------------
mtrr: detected mtrr type: Intel
PCI: PCI BIOS revision 2.10 entry at 0xfd87e, last bus=1
PCI: Using configuration type 1
PCI: Probing PCI hardware
PCI: Using IRQ router ALI [10b9/1533] at 00:07.0
isapnp: Scanning for PnP cards...
CPU 0: Machine Check Exception:  0000000000000007
Bank 3: b40000000000083b at 00000001fc0003b3
Kernel panic: Unable to continue
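
For readers following the logs above: the "dev 16:00 (hdc)" and "dev 03:03 (hda)" fields are hexadecimal major:minor device numbers. The following is a minimal sketch of how they map to drive names, assuming the conventional Linux IDE numbering (major 3 for ide0, major 22 for ide1, 64 minors per drive). The helper name decode_ide_dev is made up for illustration and is not part of any kernel tool.

```python
# Sketch: decode the "major:minor" pair printed by 2.4-era kernels,
# e.g. "dev 16:00 (hdc)". Both fields are hexadecimal.
# Only the two IDE majors seen in this report are covered;
# decode_ide_dev is a hypothetical helper name.

IDE_MAJORS = {3: ("hda", "hdb"), 22: ("hdc", "hdd")}  # ide0, ide1


def decode_ide_dev(dev: str) -> str:
    major_hex, minor_hex = dev.split(":")
    major, minor = int(major_hex, 16), int(minor_hex, 16)
    if major not in IDE_MAJORS or minor > 127:
        raise ValueError(f"not an IDE device handled here: {dev}")
    # Each drive owns 64 minors: 0 is the whole disk, 1..63 are partitions.
    drive = IDE_MAJORS[major][minor // 64]
    partition = minor % 64
    return drive if partition == 0 else f"{drive}{partition}"


print(decode_ide_dev("16:00"))  # hdc
print(decode_ide_dev("03:03"))  # hda3
```

Read this way, the installer failure is on the whole CD-ROM device (hdc), while the 2.4.20-pre4-ac1 panic is on the third partition of the hard disk (hda3).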
Comment 3 Warren Togami 2002-08-25 07:19:54 EDT
Unfortunately I have this laptop only for another 36 hours.  I would appreciate
it if I could have a few more things to test before I have to return it.
Comment 4 Alan Cox 2002-08-25 07:48:30 EDT
CPU 0: Machine Check Exception:  0000000000000007
Bank 3: b40000000000083b at 00000001fc0003b3

That's a fault report from the processor

Error enabled, uncorrected error, valid

Bus/Interconnect Error on locally originated processor request
Data read request for I/O
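
The decode above can be reproduced from the "Bank 3" status word. This is a minimal sketch, assuming the architectural x86 machine-check status layout (flag bits 57 to 63 and the bus/interconnect error-code pattern 0000 1PPT RRRR IILL). The function name decode_mc_status and the wording of the labels are illustrative, not taken from the kernel.

```python
# Sketch: decode an x86 machine-check bank status word.
# Bit positions follow the architectural MCA layout (an assumption for
# this CPU); decode_mc_status is a hypothetical helper name.

FLAGS = {63: "valid", 62: "overflow", 61: "uncorrected error",
         60: "error enabled", 59: "misc register valid",
         58: "address register valid", 57: "processor context corrupt"}

PARTICIPATION = ["locally originated request", "responded to request",
                 "observed as third party", "generic"]
REQUEST = {0: "generic error", 1: "generic read", 2: "generic write",
           3: "data read", 4: "data write", 5: "instruction fetch",
           6: "prefetch", 7: "eviction", 8: "snoop"}
MEM_IO = ["memory", "reserved", "I/O", "other"]


def decode_mc_status(status: int) -> dict:
    flags = [name for bit, name in FLAGS.items() if (status >> bit) & 1]
    code = status & 0xFFFF
    result = {"flags": flags, "error_code": hex(code)}
    # Bus/interconnect errors match the pattern 0000 1PPT RRRR IILL.
    if (code >> 11) == 1:
        result["class"] = "bus/interconnect"
        result["participation"] = PARTICIPATION[(code >> 9) & 0b11]
        result["timeout"] = bool((code >> 8) & 1)
        result["request"] = REQUEST.get((code >> 4) & 0xF, "unknown")
        result["target"] = MEM_IO[(code >> 2) & 0b11]
    return result


for key, value in decode_mc_status(0xB40000000000083B).items():
    print(f"{key}: {value}")
```

For the value in this report the sketch yields the same reading: valid, uncorrected, error enabled, a bus/interconnect error on a locally originated data read targeting I/O.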
Comment 5 Warren Togami 2002-08-25 22:24:52 EDT
My friend bought the same laptop, so we swapped hard drives and tested several
of the kernels that I had built.  All of the following testing was identical on
both machines. 

Further testing from today:
---------------------------
2.4.20-pre4-ac1 (compiled with MCE enabled)
* does NOT crash with MC error
* crashes when unable to use IDE device
* pci=off boots, PIO mode working fine, albeit system useless without devices
2.4.18-12.4 (Rawhide) and 2.4.18-11 (Null)
* always crashes with MCE error, regardless of pci=off
2.5.31 (compiled with MCE enabled)
* always crashes with MCE error, regardless of pci=off
2.5.13 (compiled without MCE)
* crashes with IDE problems similar to 2.4.20-pre4-ac1.

Is there anything notably different in Andre Hedrick's ide patch against
2.4.19-ac4?  I'll try that next.
Comment 6 Warren Togami 2002-08-26 03:54:14 EDT
Tested 2.4.19-ac4 + Hedrick's ide-2.4.19-ac4.11.patch; it fails in the same way.
Comment 7 Alan Cox 2002-08-26 05:25:12 EDT
An MCE is generated by a hardware problem.

Comment 8 Warren Togami 2002-08-26 09:17:35 EDT
Are you saying that an MCE always indicates a hardware defect?  Everything
reported above is identical on two brand new Presario 900US laptops, which are
rock stable in the factory shipped Windows XP.

Perhaps this is a new motherboard chipset variation, poorly tested with anything
other than Windows. =(
Comment 9 Warren Togami 2002-08-26 10:00:09 EDT
Created attachment 73004 [details]
lspci -vv and lspci -vvn, system booted with "pci=off"
Comment 10 Warren Togami 2002-08-26 10:02:06 EDT
Many different problems with this hardware...

[root@compaq root]# lspci -vv
pcilib: Cannot open /proc/bus/pci
pcilib: Bus 00 seen twice (firmware bug). Ignored.
...
... 
(continued in attachment above)
Comment 11 Alan Cox 2002-08-26 13:01:44 EDT
MCE indicates the processor found itself in what it considers to be an illegal
state. I can think of two things, other than plain faulty components, that might
trip the CPU into believing that:

#1  - If the box relies on ACPI for setup, in which case for the moment it's a
Windows-only system (probably for another 3 months)

#2  - See if booting with the option "mem=nopentium" helps. That turns off the
4MB paging feature, just in case it's some weird interaction with the system.

Other options that make the box as conservative as possible are "ide=nodma noapm".

Comment 12 Warren Togami 2002-08-26 19:59:09 EDT
#1 What happens in 3 months regarding ACPI?  Something I haven't tested here was
a kernel with ACPI instead of APM.  How different is the current acpi4linux
patch from the acpi in 2.4.x or 2.5.x?

#2 Tested all combinations of mem=nopentium, ide=nodma, and noapm with the Red
Hat kernel.  No luck.
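
For anyone repeating this test, "all combinations" of three boot options means eight boot attempts, including the run with none of them. A small sketch that enumerates them follows. The option strings come from the comments above, while the function name boot_option_sets is made up for illustration.

```python
# Sketch: list every subset of the suggested boot options so that no
# combination is skipped. boot_option_sets is a hypothetical helper.
from itertools import combinations

OPTIONS = ["mem=nopentium", "ide=nodma", "noapm"]


def boot_option_sets(options):
    """Yield each subset of options as a kernel command-line fragment."""
    for size in range(len(options) + 1):
        for subset in combinations(options, size):
            yield " ".join(subset)


for line in boot_option_sets(OPTIONS):
    print(repr(line))
```

Each printed string would be appended to the kernel line at the boot prompt for one test run.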
Comment 13 Alan Cox 2002-08-27 08:27:12 EDT
In about three months the current Intel Linux ACPI patch should be solid enough
that it's usable. Red Hat's shipping plans for ACPI are a separate matter, but
roll-your-own kernels with ACPI ought to be usable by then.
Comment 14 Warren Togami 2002-10-26 07:03:32 EDT
http://videl.ics.hawaii.edu/mailman/listinfo/linuxpresario900

I created this discussion mailing list for owners of Compaq Presario 900 series
laptops to share their findings and work toward a solution to this problem.
Comment 15 Warren Togami 2002-10-26 07:13:13 EDT
Bill McLean was able to successfully install onto this laptop with "nousb" and
"nopcmcia" but ran into data corruption later.  Is there any indication of what
is breaking the IDE part of this problem?
Comment 16 Alan Cox 2002-10-26 10:14:12 EDT
The "nousb" may be significant. Especially if he got the corruption later on
having not always used "nousb". The reason for that is that there are
connections deep down in the hardware setup between the USB and IDE clocking.

I've also had one strange report that 2.2 boots on it but I've not been able to
verify that
Comment 17 Warren Togami 2002-11-03 03:40:33 EST
http://videl.ics.hawaii.edu/pipermail/linuxpresario900/2002-November/000007.html

Mark Pavlidis wrote this complete report on the effects of the
various "no" kernel options.  Please let us know what we can test to
begin work on these problems.
Comment 18 Alan Cox 2002-11-14 17:54:37 EST
Can someone get me an

lspci -vxx -M -H1

(that will do the lspci directly even if booted with "nopci").

That may let me at least blacklist the PCI IDE so people can get a booting
system with PCI enabled to do further research.
Comment 19 Alan Cox 2002-11-14 20:21:16 EST
Ok the MCE seems to be from the pcmcia package - exclude 0x380-0x3ff according
to Chris Cheney
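
One way to express that exclusion is to carve the range out of whatever port window the PCMCIA configuration currently includes. The sketch below does the arithmetic. The 0x100-0x4ff include window is an assumption chosen for illustration (check the actual include lines in /etc/pcmcia/config.opts), and subtract_range is a hypothetical helper name.

```python
# Sketch: remove an excluded I/O port range from an included one.
# The include window 0x100-0x4ff is assumed for illustration only.

def subtract_range(include, exclude):
    """Return the parts of inclusive range `include` not covered by `exclude`."""
    lo, hi = include
    ex_lo, ex_hi = exclude
    if ex_hi < lo or ex_lo > hi:
        return [include]  # no overlap, nothing to remove
    parts = []
    if ex_lo > lo:
        parts.append((lo, ex_lo - 1))
    if ex_hi < hi:
        parts.append((ex_hi + 1, hi))
    return parts


remaining = subtract_range((0x100, 0x4FF), (0x380, 0x3FF))
print(", ".join(f"port {a:#x}-{b:#x}" for a, b in remaining))
```

Under that assumption the include window splits into two pieces on either side of 0x380-0x3ff, which leaves the card services probe no reason to touch the excluded ports.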

Comment 20 Alan Cox 2002-11-19 06:55:56 EST
*** Bug 78123 has been marked as a duplicate of this bug. ***
Comment 21 Arjan van de Ven 2002-12-26 12:08:03 EST
*** Bug 80352 has been marked as a duplicate of this bug. ***
Comment 22 Fred T. Hamster 2002-12-27 19:52:18 EST
*** Bug 80352 has been marked as a duplicate of this bug. ***
Comment 23 Diego 2003-01-02 17:50:55 EST
I have the same problem with a Presario 920CA. I can also help debug or gather
information if necessary. I followed the link posted in comment #14 and could
install RH8, but I could not get X to work, and after rebooting I could not log
into Linux; the kernel would panic.
Comment 24 Warren Togami 2003-01-03 05:21:07 EST
http://videl.ics.hawaii.edu/pipermail/linuxpresario900/2002-December/000120.html
> My goal is that the next Red Hat runs on the IGP chipset boxes.
> Please file bugs, please add me to the cc of any IGP related bugs.
>

Alan Cox mentioned this on the LinuxPresario900 mailing list.
Comment 25 Fred T. Hamster 2003-01-03 11:39:41 EST
i had a problem with X windows also.  i eventually realized that the initially
detected sound card on my presario 915 was causing X windows to freeze the box
also, when it tried to play a sound during startup.  after disabling the sound
card entirely (removing it from modules.conf), X windows will run and not crash
the machine.  dunno if that's related to the X problem Diego is having or not.
Comment 26 Diego 2003-01-03 13:26:32 EST
Created attachment 89101 [details]
X log file with the VESA errors
Comment 27 Diego 2003-01-03 13:27:14 EST
Created attachment 89102 [details]
the errors I get after X fails
Comment 28 Diego 2003-01-03 13:39:00 EST
Actually I reinstalled RH8 with just pci=off and everything went OK. After
installation I had to pass nomce or the kernel would panic. I removed the
0x380-0x3ff port range from /etc/pcmcia/config.opts and removed the nomce flag,
but again the kernel would panic, so I had to pass noisapnp and then the system
would boot OK.

On the first boot kudzu detected the network card, the audio device
controller (ALi M5451 PCI AC-Link), the ALi USB 1.1 controller and the sound
card, but the next time I rebooted the system would hang while loading the OHCI
controller, so I had to add the nousb option to get to the login prompt.

But then again, when I tried starting X I would get errors from hdc and I
would not be able to log in to any other VT. It basically gets stuck at the
bluecurve-RH logo (where it loads the different components). I have attached my
X log and another file with the errors I get after X fails.

                       
Comment 29 Fred T. Hamster 2003-01-03 13:44:41 EST
have you tried disabling the sound card already?
or here's an experiment you can try without disabling it:
at a shell prompt, enter the command:
   play /usr/share/sounds/error.wav
if you don't have that file, try any other sound file.
if it's the sound card causing problems, then you should see the same IDE errors
and the system will act like it does with your X problem.
this is what i did in order to demonstrate to myself that it wasn't X, but was
the sound card instead.
Comment 30 Diego 2003-01-03 14:58:53 EST
Yes it is the sound card. I did the WAV experiment and I got the same error. I
will compile a new kernel to try to get the card to work. I found some help on
the ALi website about the sound card on 2.4.x kernels. Maybe that will solve the
problem. I just don't want to disable the sound, I like to listen to some tunes
while I work ;)

I'll do this on the weekend and post my results here.

The link is:

http://www.ali.com.tw/eng/support/support_driver.htm

Search by OS. There is a FIR driver and an audio driver for the 2.4.x kernels.
Comment 31 Diego 2003-01-06 10:48:32 EST
I got sound, USB and ACPI working after following the instructions at:

http://www.wsu.edu/~ice124/

I patched and recompiled kernel 2.4.20, although I had to tweak the volume
control before I could get any sound out of the box. But when it comes
to video I could not use the Radeon driver; X would die on me saying "no screens
found".

Finally, I am using the 8139cp network driver, as the system suggests using
it instead, but the only way of selecting it was to edit the .config file
because the option is not inside menuconfig. So I have to make my changes
inside menuconfig, exit and save, and then edit the .config file, because
menuconfig unselects the 8139CP driver on exit.

BTW is there any problem if I compile the kernel with both ACPI and APM? Or
should I just stick with ACPI?
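
A quick way to confirm whether a hand-edited option survived a menuconfig save is to parse the .config file and look the symbol up. The sketch below assumes the usual .config line formats ("CONFIG_X=y" and "# CONFIG_X is not set"). The sample text is invented for illustration, and CONFIG_EXPERIMENTAL is included only because options marked experimental are typically hidden in menuconfig unless it is enabled; whether that explains the behaviour described here is an assumption.

```python
# Sketch: read kernel .config text and report selected symbols.
# SAMPLE is made-up input; parse_kernel_config is a hypothetical helper.
import re


def parse_kernel_config(text):
    """Map CONFIG_* symbols to their value, or 'n' when explicitly unset."""
    symbols = {}
    for line in text.splitlines():
        line = line.strip()
        unset = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if unset:
            symbols[unset.group(1)] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            symbols[name] = value
    return symbols


SAMPLE = """\
CONFIG_EXPERIMENTAL=y
CONFIG_8139CP=m
# CONFIG_8139TOO is not set
"""

config = parse_kernel_config(SAMPLE)
for name in ("CONFIG_EXPERIMENTAL", "CONFIG_8139CP", "CONFIG_8139TOO"):
    print(name, config.get(name, "missing"))
```

To check a real tree, the text would come from the .config file in the kernel source directory instead of SAMPLE.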
Comment 32 Arjan van de Ven 2003-01-30 14:43:03 EST
*** Bug 81587 has been marked as a duplicate of this bug. ***
