|Summary:||kernel 22.214.171.124-12.fc8 fails to recognize disks so it does not boot|
|Product:||[Fedora] Fedora||Reporter:||Michal Jaegermann <michal>|
|Component:||kernel||Assignee:||Jeff Garzik <jgarzik>|
|Status:||CLOSED NOTABUG||QA Contact:||Fedora Extras Quality Assurance <extras-qa>|
|Fixed In Version:||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2008-03-10 22:11:38 UTC||Type:||---|
Description Michal Jaegermann 2008-03-08 06:28:53 UTC
Description of problem:
Booting after a kernel update on an ASUSTeK M2R32-MVP board fails because the disks are no longer recognized. Something like the following shows up on the screen (this comes from hasty scribbles taken from a scrolling screen, so some pieces are missing or may be garbled):

ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: qc timeout (error 0xec)
ata1.00: failed to IDENTIFY (I/O error, err .. =0x)
ata1.00: failed to recognize some devices, retrying in 5 sec

That is repeated a few times, I think three, with long timeouts, and is followed by a similar sequence for the second disk. The boot then fails, as expected with no disks.

126.96.36.199-137.fc8 and earlier F8 kernels booted without problems. A dmesg from a boot with 188.8.131.52-137.fc8 is attached.

Disk recognition problems were reported on the same hardware in the past, but those were later fixed. The reports are bugzilla 232490 and 235787. The layout of the PCI bus is in https://bugzilla.redhat.com/attachment.cgi?id=152106

The machine is "remote", so trying other kernels or options is a somewhat protracted affair.

Version-Release number of selected component (if applicable):
184.108.40.206-12.fc8

How reproducible:
always

Additional info:
The same kernel booted on two other x86_64 machines with a different hardware configuration (although on one of them something in the update totally deconfigured both network interfaces; luckily that machine was local).
Comment 1 Michal Jaegermann 2008-03-08 06:28:53 UTC
Created attachment 297268: dmesg from a booting 220.127.116.11-137.fc8 kernel
Comment 2 Michal Jaegermann 2008-03-10 07:41:15 UTC
Hm, this looks like the problem discussed in:
http://kerneltrap.org/mailarchive/linux-kernel/2008/2/9/795354
http://kerneltrap.org/mailarchive/linux-kernel/2008/2/17/885364

Unfortunately those posts do not say which BIOS version was used. I will try to update the BIOS when I have a chance and report back.
Comment 3 Michal Jaegermann 2008-03-10 21:43:18 UTC
I had a chance to try a BIOS update this morning. In the end I used version 1009, which is the latest "non-beta" available from ASUS. After a protracted fight with the BIOS I managed to boot the machine, and that worked with 18.104.22.168-12.fc8. The dmesg from that boot shows what look like essential differences, so it is attached below.

Should this be closed then? The issue is that the older BIOS was good enough for 22.214.171.124-137.fc8 but not for 126.96.36.199-12.fc8. Luckily a suitable update did exist (but I would not dare to tell my wife to apply it :-).
Comment 4 Michal Jaegermann 2008-03-10 21:45:32 UTC
Created attachment 297519: dmesg from 188.8.131.52-12.fc8 (on the same machine after a BIOS update)
Comment 5 Chuck Ebbert 2008-03-10 22:11:38 UTC
Looks like the BIOS now disables PMP and 64-bit DMA.
Comment 6 Michal Jaegermann 2008-03-10 22:58:33 UTC
> Looks like the BIOS now disables PMP and 64-bit DMA.

That could be because of wrong settings after the BIOS update. I had a really hard time getting that BIOS into a state where it would boot anything at all ("safe defaults" were not suitable for this). Once I managed that feat I had to hurry somewhere else, with no time to rescan the settings and fish out what else could or should be changed without making the machine unbootable again. I cannot even tell you whether there are options to enable the above.
Comment 7 Michal Jaegermann 2008-03-11 05:23:58 UTC
> Looks like the BIOS now disables PMP and 64-bit DMA.
> (Chuck Ebbert in comment #5)

I looked again at the dmesg output for 184.108.40.206-137.fc8 with the old BIOS (0804) and for 220.127.116.11-12.fc8 booting with 1009. If the comment is about these two lines:

ahci 0000:00:12.0: controller can't do 64bit DMA, forcing 32bit
ahci 0000:00:12.0: controller can't do PMP, turning off CAP_PMP

then the first one is the same in both cases and only the second one showed up after the updates. I also do not see any other DMA-related messages that changed in any essential way. (Not that I have a clue what PMP is and Google is not of much help :-).
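
A comparison like the one described above can be done mechanically. The following is only a minimal sketch in Python: the helper names `normalize` and `new_lines` are made up for illustration, and the two sample logs contain only the `ahci` lines quoted in this comment, not the full attachments. It strips the optional kernel timestamp and lists the lines mentioning a keyword that appear in the newer log but not in the older one.

```python
import re

# Matches an optional leading "[   12.345678] " kernel timestamp.
_TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def normalize(line):
    """Strip the timestamp and surrounding whitespace from one dmesg line."""
    return _TIMESTAMP.sub("", line.strip())


def new_lines(old_log, new_log, keyword="ahci"):
    """Return keyword lines present in new_log but absent from old_log."""
    old = {normalize(line) for line in old_log.splitlines()}
    result = []
    for line in new_log.splitlines():
        text = normalize(line)
        # Keep first occurrences only, in the order they appear.
        if keyword in text and text not in old and text not in result:
            result.append(text)
    return result


# Sample data: only the two lines quoted in this comment (an excerpt,
# not the real attachments).
old_log = """\
ahci 0000:00:12.0: controller can't do 64bit DMA, forcing 32bit
"""
new_log = """\
ahci 0000:00:12.0: controller can't do 64bit DMA, forcing 32bit
ahci 0000:00:12.0: controller can't do PMP, turning off CAP_PMP
"""

for line in new_lines(old_log, new_log):
    print(line)
```

With the excerpt above, the only line reported is the `CAP_PMP` one, which matches the observation that the 64-bit DMA message is common to both boots.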
Comment 8 Michal Jaegermann 2008-03-11 16:26:30 UTC
I am not convinced that closing this as NOTABUG was not a bit too hasty. True, I found a way to boot that particular hardware, but see also http://lkml.org/lkml/2008/3/9/136 . It looks strangely familiar. The cause may be buggy BIOSes, but those are nothing new, so what changed here? http://lkml.org/lkml/2008/3/9/171 and its follow-ups also seem to have some relevance.
Comment 9 Chuck Ebbert 2008-03-12 18:48:09 UTC
(In reply to comment #7)
> (Not that I have a clue what PMP is and Google is not of much help :-).

PMP = port multiplier port
http://www.sata-io.org/portmultiplier.asp
Comment 10 George 2008-03-14 17:10:25 UTC
I have the issues described in the original post too. However, I have found a kernel parameter that fixes it: pci=nomsi

Without that parameter I get:

ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: qc timeout (error 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata1.00: failed to recognize some devices, retrying in 5 sec

With pci=nomsi it works fine and recognizes my Western Digital HDD (JMicron SATA controller, kernel 2.6.24-2.fc9 on FC9 Alpha; I had this in the regular FC8 too, after the latest 2.6.24 kernel update).
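
To confirm that a given boot really used the workaround, the kernel command line can be inspected after booting. The following is a rough sketch in Python: the function names are hypothetical, and it assumes a Linux system where the running kernel's command line is readable from `/proc/cmdline`.

```python
def has_boot_param(cmdline, param="pci=nomsi"):
    """Return True if param appears as a whole token on the kernel command line."""
    return param in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (Linux only)."""
    with open(path) as f:
        return f.read()
```

For example, `has_boot_param(read_cmdline())` returns True on a machine booted with pci=nomsi. The check compares whole whitespace-separated tokens, so a different parameter that merely starts with the same characters does not match.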