Bug 249853 - [ata_piix SRST] Failure to boot on 2.6.22.1-27.fc7
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 7
Hardware: i686
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Alan Cox
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2007-07-27 13:59 UTC by Simon Andrews
Modified: 2008-06-17 01:59 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-06-17 01:59:15 UTC
Type: ---
Embargoed:


Attachments
dmesg output (13.68 KB, text/plain)
2007-07-27 15:07 UTC, Simon Andrews
lspci output (776 bytes, text/plain)
2007-07-27 15:08 UTC, Simon Andrews
/etc/fstab (616 bytes, text/plain)
2007-08-03 07:47 UTC, Simon Andrews
/boot/grub/grub.conf (856 bytes, text/plain)
2007-08-03 07:47 UTC, Simon Andrews
image of boot hanging (1.10 MB, image/jpeg)
2007-08-08 06:25 UTC, Bennett Feitell
Boot log from failing 2.6.22 kernel (4.97 KB, text/plain)
2007-08-08 15:37 UTC, Simon Andrews
output of lspci -vvxxx (20.06 KB, text/plain)
2007-09-21 15:43 UTC, Bennett Feitell
lspci -vvxxx output (7.34 KB, text/plain)
2007-09-24 07:58 UTC, Simon Andrews
lspci -vvxxx output of similarly afflicted HP desktop system (17.77 KB, text/plain)
2007-11-30 12:19 UTC, Nigel Metheringham

Description Simon Andrews 2007-07-27 13:59:55 UTC
Description of problem:
One of our servers fails to boot on 2.6.22.1-27.fc7.  It boots fine using the
previous kernel (2.6.21-1.3228.fc7)


How reproducible:
Always


Steps to Reproduce:
1. Boot from 2.6.22, hangs

Additional info:

The last messages seen before the kernel panic are (transcribed so may be
slightly approximate):

Trying to resume from /dev/sda3
Unable to access resume device (/dev/sda3)
Creating root device
Mounting root filesystem
mount: could not find filesystem '/dev/root'
Setting up other filesystems
Setting up new root fs
setuproot: moving /dev failed: No such file or directory
no fstab.sys, mounting internal defaults
setuproot: error mounting /proc: No such file or directory
setuproot: error mounting /sys: No such file or directory
unmounting old /dev
unmounting old /proc
unmounting old /sys
switchroot: mount failed: No such file or directory
Booting has failed
Kernel panic - not syncing: Attempted to kill init!


From the messages before this I couldn't see any obvious errors and nothing
which was different from the previous kernel.

This box has a SCSI card but no attached scsi devices.  The disk is IDE and
shows up in the old kernel as /dev/sda.

Comment 1 Simon Andrews 2007-07-27 14:04:23 UTC
I just saw that 2.6.22.1-33 was out.  Tried this new kernel and it still fails
to boot with the same errors as before.

Comment 2 Chuck Ebbert 2007-07-27 15:02:48 UTC
Every kernel that fails to find its root disk fails with these same messages.
We need to know what kind of controller the root disk is on, so we know what
driver failed.

Comment 3 Simon Andrews 2007-07-27 15:07:37 UTC
Created attachment 160120
dmesg output

I'm not exactly sure what information you're after about the controller, but
dmesg and lspci seem to have some stuff which looks relevant.  If what you need
isn't there then let me know exactly what you want and I'll go and get it.

Comment 4 Simon Andrews 2007-07-27 15:08:27 UTC
Created attachment 160121
lspci output

Comment 5 Simon Andrews 2007-07-30 08:44:13 UTC
Just to try to complete the hardware picture I've added this machine to the
smolt database.  You can view its hardware profile at:

http://smolt.fedoraproject.org/show?UUID=f661f397-a5b0-44c9-8311-bf9f55145e8c

Comment 6 Simon Andrews 2007-08-02 11:05:23 UTC
Have just tried kernel.i686 2.6.22.1-41.fc7.  Doesn't boot.  Same symptoms as
previously.

Comment 7 Felix Miata 2007-08-02 16:26:34 UTC
I use the i440BX and sym53c875 on a Mandriva Cooker system with similar kernels.
I solved the problem by using labels instead of device names for fstab and
Grub's menu.lst. The 2.6.22 kernels result in device name aliases for real SCSI
ahead of those for ATA, which in your case probably means producing no sda
devices and sdb* for your ATA partitions. See
http://qa.mandriva.com/show_bug.cgi?id=31954 for my exact situation.
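
A minimal sketch of how one might check an fstab for the kind of hard-coded device names described above. The function name find_hardcoded_devices is hypothetical, and the pattern only covers /dev/sdX and /dev/hdX names, so treat it as an illustration rather than a complete audit.

```shell
# Print the first field of every fstab entry that names a raw
# /dev/sdX or /dev/hdX device instead of LABEL=... or UUID=...
# Comment lines and blank lines never match the anchored pattern.
# (Hypothetical helper, not a standard tool.)
find_hardcoded_devices() {
    awk '$1 ~ /^\/dev\/(sd|hd)[a-z]+[0-9]*$/ { print $1 }' "$1"
}
```

Running find_hardcoded_devices /etc/fstab prints nothing when every entry is mounted by label. The root= and resume= arguments on the kernel line of the boot loader configuration would need the same check by hand.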

Comment 8 Chuck Ebbert 2007-08-02 21:56:52 UTC
Using hardcoded device names will not work.
This should probably be closed as NOTABUG?


Comment 9 Simon Andrews 2007-08-03 07:45:17 UTC
I'm not sure that that's what's happening here.  All of the partitions are
accessed via labels both in fstab and grub.conf.  The only exception is the swap
partition (can you even label a swap partition?).

I have several other servers with identical drive layouts on different hardware
and they're fine.

I'll attach my fstab and grub files in case there is something screwy going on
with them.

Comment 10 Simon Andrews 2007-08-03 07:47:05 UTC
Created attachment 160587
/etc/fstab

Comment 11 Simon Andrews 2007-08-03 07:47:46 UTC
Created attachment 160588
/boot/grub/grub.conf

Comment 12 Felix Miata 2007-08-03 12:22:56 UTC
(In reply to comment #9)
> can you even label a swap partition?

mkswap -L label device


Comment 13 Simon Andrews 2007-08-03 12:46:03 UTC
Well just for completeness I've moved this machine over to use a labelled swap
(everything else was accessed via label already).  It still won't boot on
2.6.22.x.  Same errors as before except that now it says:

Trying to resume from LABEL=SWAP
Unable to access resume device (LABEL=SWAP)

I'm guessing that the root of this is some change to the ATA driver since
nothing on the disk can be accessed?  The one entry in the changelog which looks
relevant is:

ata_piix: fix pio/mwdma programming

..but I'm clutching at straws here...


Comment 14 Chuck Ebbert 2007-08-03 22:24:54 UTC
Is there any way to capture the boot messages from that machine?
Serial console or netconsole would be required.
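
For reference, netconsole is configured through one kernel parameter of the form netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]. The sketch below only assembles such a parameter; the helper name is hypothetical and every address in the usage line is a placeholder, not a value from this machine. A serial console is requested instead with parameters such as console=ttyS0,115200 console=tty0.

```shell
# Build a netconsole= kernel parameter from its six parts.
# Arguments: src_port src_ip device tgt_port tgt_ip tgt_mac
# (Hypothetical helper; it only formats the string, it configures nothing.)
netconsole_param() {
    printf 'netconsole=%s@%s/%s,%s@%s/%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Example with placeholder addresses:
#   netconsole_param 6665 192.168.0.10 eth0 6666 192.168.0.20 00:11:22:33:44:55
```

netconsole can only send messages once the network driver has initialised, so for a failure this early in boot a serial console may be the safer choice.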


Comment 15 Bennett Feitell 2007-08-08 06:25:31 UTC
Created attachment 160883
image of boot hanging

It is better for all concerned that I not attempt to retype this error
accurately.

Comment 16 Bennett Feitell 2007-08-08 06:38:04 UTC
2.6.22.1-41.fc7 is the problem for me, and I would up the severity if I owned
the initial bug report.


Oh well, I seem to have lost all the text on that one.  The box is a VIA based
Socket A (mobile Athlon/Geode) board with root/boot on the onboard PATA
controller.  

This bug bit me on a yum upgrade from kernel-2.6.21-1.3228.fc7 (works fine) to
kernel 2.6.22.1-41.fc7 (broken, hangs on boot).

I am leery of putting my smolt UUID on here, but here is the IDE controller line:

VIA_IDE  	IDE  	pci  	VIA Technologies, Inc.  
VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE  	VIA Technologies,
Inc.  	VT82C586/B/VT82C686/A/B/VT8233/A/C/VT8235 PIPC Bus Master IDE

and here is the info other than my UUID:

OS:	Fedora release 7 (Moonshine)
platform:	i686
bogomips:	2524.47
CPU Speed:	1394.0
systemMemory:	979
CPUVendor:	AuthenticAMD
numCPUs:	1
language:	en_US.UTF-8
defaultRunlevel:	3
System Vendor:	VIA Technologies, Inc.
System Model:	KM266A-8237
Kernel	2.6.21-1.3228.fc7
Formfactor	desktop
Last Modified	2007-08-07 23:17:27

Comment 18 Simon Andrews 2007-08-08 15:37:11 UTC
Created attachment 160913
Boot log from failing 2.6.22 kernel

This is a transcription of the messages from lots of photos taken through the
boot process.  I think I've got pretty much all of the messages.

The bit which looks most suspicious is:

ata1: port is slow to respond, please be patient (Status 0x80)
ata1: SRST failed (errno=-16)

..but there may be other bad things which have passed me by.
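
For readers decoding these two lines: errno 16 on Linux is EBUSY, and bit 7 (0x80) of the ATA status register is BSY, which suggests the device was still reporting busy when the soft reset (SRST) gave up. A minimal sketch of a status decoder, assuming the classic ATA status bit names; the function name is hypothetical.

```shell
# Decode an ATA status register value (decimal or 0x-prefixed hex)
# into the names of the bits that are set, highest bit first.
# Bit names are the classic ATA ones; newer specs mark some as obsolete.
# (Hypothetical helper for reading log lines such as "Status 0x80".)
decode_ata_status() {
    status=$(( $1 ))
    out=""
    for pair in 128:BSY 64:DRDY 32:DF 16:DSC 8:DRQ 4:CORR 2:IDX 1:ERR; do
        mask=${pair%%:*}    # numeric bit mask before the colon
        name=${pair#*:}     # bit name after the colon
        if [ $(( $status & $mask )) -ne 0 ]; then
            out="${out:+$out }$name"
        fi
    done
    printf '%s\n' "$out"
}
```

With this, decode_ata_status 0x80 reports only BSY, which matches the "slow to respond" message.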

Comment 19 Chuck Ebbert 2007-08-08 17:08:55 UTC
Does the kernel option "pci=nomsi" make a difference?

Comment 20 Simon Andrews 2007-08-09 07:59:07 UTC
(In reply to comment #19)
> Does the kernel option "pci=nomsi" make a difference?

No.  Same errors and failure.

Comment 21 tkunze71216 2007-08-20 15:59:20 UTC
My system doesn't boot with 2.6.22.1-41.fc7 either. It hangs for a while and then
I get this:
Unable to access resume device (LABEL=SWAP-sda9)
mount: could not find filesystem '/dev/root'
setuproot: moving /dev failed
(some more lines similar to the one above)
Kernel panic: not syncing: Attempted to kill init!

Before that I also get this:
PCI: BIOS Bug: MCFG area at e0000000 is not E820-reserved
PCI: Not using MMCONFIG
But I have been told this probably isn't related.

My system:
OS: Fedora release 7 (Moonshine)
Platform: x86_64
CPU: AMD Athlon64 X2 4200+
Chipset: AMD RD580 (on a MSI K9A Platinum)
HDD: attached to SATA
BIOS setting for SATA: Native IDE
Cool'n'Quiet is disabled
Kernel: 2.6.22.1-41.fc7

Comment 22 Simon Andrews 2007-08-21 14:12:11 UTC
(In reply to comment #18)
> The bit which looks most suspicious is:
> 
> ata1: port is slow to respond, please be patient (Status 0x80)
> ata1: SRST failed (errno=-16)

Looking back at the dmesg output for the 2.6.21 kernel which boots on this
machine, the quoted portion above is where the messages I get differ between the
two kernels.  On 2.6.21 it says:

ata1: port is slow to respond, please be patient (Status 0x80)
ATA: abnormal status 0x3F on port 0x000101f7

..but the second line appears to just be a warning as it then goes on with the
ATA startup, which doesn't happen in the 2.6.22 kernels:

ata1.00: ata_hpa_resize 1: sectors = 488397168, hpa_sectors = 488397168
ata1.00: ATA-6: WDC WD2500BB-00KEA0, 08.05J08, max UDMA/100
ata1.00: 488397168 sectors, multi 16: LBA48
ata1.00: ata_hpa_resize 1: sectors = 488397168, hpa_sectors = 488397168
ata1.00: configured for UDMA/33

Maybe the abnormal status warning is a better clue to what's going wrong on this
machine?
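
The comparison described above can be sketched as a small script that pulls the libata lines out of two saved boot logs and diffs them. This assumes each log is a plain-text file without printk timestamps; the function names and the file names in the usage note are hypothetical.

```shell
# Print the libata lines from a log: those starting with "ataN:",
# "ataN.NN:" or "ATA:" (matched case-insensitively).
ata_lines() {
    grep -Ei '^ata[0-9.]*: ' "$1"
}

# Show where the ATA messages of two boot logs diverge.
# Like diff, this returns non-zero when the logs differ.
compare_ata_logs() {
    good=$(mktemp) && bad=$(mktemp) || return 2
    ata_lines "$1" > "$good"
    ata_lines "$2" > "$bad"
    diff -u "$good" "$bad"
    rc=$?
    rm -f "$good" "$bad"
    return $rc
}
```

Calling compare_ata_logs dmesg-2.6.21.txt bootlog-2.6.22.txt would then mark the lines present in only one of the two logs with - and +.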




Comment 23 Simon Andrews 2007-08-28 11:10:24 UTC
I've just tried the new 2.6.22.4-65.fc7 kernel.  Still won't boot.  Same errors.

Comment 24 Christopher Brown 2007-09-21 11:24:33 UTC
Hello Simon,

I'm reviewing this bug as part of the kernel bug triage project, an attempt to
isolate current bugs in the fedora kernel.

http://fedoraproject.org/wiki/KernelBugTriage

I'm re-assigning this to the relevant maintainer who may be able to shed some
more light on the issue. In the meantime, please could you attach the output of
lspci -vvxxx.

Cheers
Chris

Comment 25 Bennett Feitell 2007-09-21 15:37:32 UTC
All of the following kernels fail to boot for me on VIA hardware.
kernel-2.6.22.1-41.fc7
kernel-2.6.22.4-65.fc7
kernel-2.6.22.5-76.fc7

As per the request I received via email, I will attach the output of lspci -vvxxx

Comment 26 Bennett Feitell 2007-09-21 15:43:02 UTC
Created attachment 202431
output of lspci -vvxxx

Here is the output of lspci -vvxxx on the machine that will not boot.  The
machine is based on VIA hardware.  It is a Biostar M7-VIG400.  The only odd
bit of hardware in the machine is a Realtek 4 port switch pci card, and boot
fails way too early for that to be an issue.

Comment 27 Alan Cox 2007-09-21 15:45:46 UTC
Please can you open a separate bug for the VIA stuff you are seeing, as it's
actually a different bug (no need to re-add the attachments, just mention that
this bug has them).


Comment 28 Simon Andrews 2007-09-24 07:58:20 UTC
Created attachment 203951
lspci -vvxxx output

This is the lspci output from the machine on which this bug was originally
reported.

Comment 29 Nigel Metheringham 2007-11-30 12:19:10 UTC
Created attachment 273731
lspci -vvxxx output of similarly afflicted HP desktop system

Comment 30 Bug Zapper 2008-05-14 13:42:18 UTC
This message is a reminder that Fedora 7 is nearing the end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 7. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '7'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 7's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 7 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug. If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. If possible, it is recommended that you try the newest available Fedora distribution to see if your bug still exists.

Please read the Release Notes for the newest Fedora distribution to make sure it will meet your needs:
http://docs.fedoraproject.org/release-notes/

The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 31 Bug Zapper 2008-06-17 01:59:14 UTC
Fedora 7 changed to end-of-life (EOL) status on June 13, 2008. 
Fedora 7 is no longer maintained, which means that it will not 
receive any further security or bug fix updates. As a result we 
are closing this bug. 

If you can reproduce this bug against a currently maintained version 
of Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

