Bug 249853 - [ata_piix SRST] Failure to boot on 2.6.22.1-27.fc7
Status: CLOSED WONTFIX
Product: Fedora
Classification: Fedora
Component: kernel
Version: 7
Hardware: i686 Linux
Priority: low
Severity: medium
Assigned To: Alan Cox
QA Contact: Fedora Extras Quality Assurance
Reported: 2007-07-27 09:59 EDT by Simon Andrews
Modified: 2008-06-16 21:59 EDT
Last Closed: 2008-06-16 21:59:15 EDT


Attachments:
dmesg output (13.68 KB, text/plain), 2007-07-27 11:07 EDT, Simon Andrews
lspci output (776 bytes, text/plain), 2007-07-27 11:08 EDT, Simon Andrews
/etc/fstab (616 bytes, text/plain), 2007-08-03 03:47 EDT, Simon Andrews
/boot/grub/grub.conf (856 bytes, text/plain), 2007-08-03 03:47 EDT, Simon Andrews
image of boot hanging (1.10 MB, image/jpeg), 2007-08-08 02:25 EDT, Bennett Feitell
Boot log from failing 2.6.22 kernel (4.97 KB, text/plain), 2007-08-08 11:37 EDT, Simon Andrews
output of lspci -vvxxx (20.06 KB, text/plain), 2007-09-21 11:43 EDT, Bennett Feitell
lspci -vvxxx output (7.34 KB, text/plain), 2007-09-24 03:58 EDT, Simon Andrews
lspci -vvxxx output of similarly afflicted HP desktop system (17.77 KB, text/plain), 2007-11-30 07:19 EST, Nigel Metheringham

Description Simon Andrews 2007-07-27 09:59:55 EDT
Description of problem:
One of our servers fails to boot on 2.6.22.1-27.fc7.  It boots fine using the
previous kernel (2.6.21-1.3228.fc7)


How reproducible:
Always


Steps to Reproduce:
1. Boot from 2.6.22, hangs

Additional info:

The last messages seen before the kernel panic are (transcribed so may be
slightly approximate):

Trying to resume from /dev/sda3
Unable to access resume device (/dev/sda3)
Creating root device
Mounting root filesystem
mount: could not find filesystem '/dev/root'
Setting up other filesystems
Setting up new root fs
setuproot: moving /dev failed: No such file or directory
no fstab.sys, mounting internal defaults
setuproot: error mounting /proc: No such file or directory
setuproot: error mounting /sys: No such file or directory
unmounting old /dev
unmounting old /proc
unmounting old /sys
switchroot: mount failed: No such file or directory
Booting has failed
Kernel panic not syncing: Attempted to kill init!


From the messages before this I couldn't see any obvious errors and nothing
which was different from the previous kernel.

This box has a SCSI card but no attached SCSI devices.  The disk is IDE and
shows up under the old kernel as /dev/sda.
Comment 1 Simon Andrews 2007-07-27 10:04:23 EDT
I just saw that 2.6.22.1-33 was out.  Tried this new kernel and it still fails
to boot with the same errors as before.
Comment 2 Chuck Ebbert 2007-07-27 11:02:48 EDT
Every kernel that fails to find its root disk fails with these same messages.
We need to know what kind of controller the root disk is on, so we know what
driver failed.
Comment 3 Simon Andrews 2007-07-27 11:07:37 EDT
Created attachment 160120 [details]
dmesg output

I'm not exactly sure what information you're after about the controller, but
dmesg and lspci seem to have some stuff which looks relevant.  If what you need
isn't there then let me know exactly what you want and I'll go and get it.
Comment 4 Simon Andrews 2007-07-27 11:08:27 EDT
Created attachment 160121 [details]
lspci output
Comment 5 Simon Andrews 2007-07-30 04:44:13 EDT
Just to try to complete the hardware picture I've added this machine to the
smolt database.  You can view its hardware profile at:

http://smolt.fedoraproject.org/show?UUID=f661f397-a5b0-44c9-8311-bf9f55145e8c
Comment 6 Simon Andrews 2007-08-02 07:05:23 EDT
Have just tried kernel.i686 2.6.22.1-41.fc7.  Doesn't boot.  Same symptoms as
previously.
Comment 7 Felix Miata 2007-08-02 12:26:34 EDT
I use the i440BX and sym53c875 on a Mandriva Cooker system with similar kernels.
I solved the problem by using labels instead of device names for fstab and
Grub's menu.lst. The 2.6.22 kernels result in device name aliases for real SCSI
ahead of those for ATA, which in your case probably means producing no sda
devices and sdb* for your ATA partitions. See
http://qa.mandriva.com/show_bug.cgi?id=31954 for my exact situation.
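
The label-based approach described in this comment can be checked mechanically. The following is a minimal sketch, not taken from the report: it scans fstab-style text and flags entries whose first field is a raw /dev/sdX or /dev/hdX name rather than LABEL= or UUID=. The function name and the sample text are hypothetical, and the sample is not the fstab attached to this bug.

```python
import re

# Matches raw SCSI/libata (sd) and legacy IDE (hd) device nodes, e.g. /dev/sda3.
RAW_DEVICE = re.compile(r"^/dev/(sd|hd)[a-z]+\d*$")


def find_hardcoded_devices(fstab_text):
    """Return (line_number, device) pairs for entries that name a raw device."""
    hits = []
    for number, line in enumerate(fstab_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        device = stripped.split()[0]
        if RAW_DEVICE.match(device):
            hits.append((number, device))
    return hits


# Hypothetical sample for illustration only.
SAMPLE = """\
LABEL=/      /      ext3  defaults  1 1
LABEL=/boot  /boot  ext3  defaults  1 2
/dev/sda3    swap   swap  defaults  0 0
"""

if __name__ == "__main__":
    for number, device in find_hardcoded_devices(SAMPLE):
        print(f"line {number}: {device} is a raw device name")
```

An entry flagged this way is the kind that breaks when the kernel enumerates controllers in a different order; the same check applies to the root= and resume= arguments in the GRUB configuration.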
Comment 8 Chuck Ebbert 2007-08-02 17:56:52 EDT
Using hardcoded device names will not work.
This should probably be closed as NOTABUG?
Comment 9 Simon Andrews 2007-08-03 03:45:17 EDT
I'm not sure that that's what's happening here.  All of the partitions are
accessed via labels both in fstab and grub.conf.  The only exception is the swap
partition (can you even label a swap partition?).

I have several other servers with identical drive layouts on different hardware
and they're fine.

I'll attach my fstab and grub files in case there is something screwy going on
with it.
Comment 10 Simon Andrews 2007-08-03 03:47:05 EDT
Created attachment 160587 [details]
/etc/fstab
Comment 11 Simon Andrews 2007-08-03 03:47:46 EDT
Created attachment 160588 [details]
/boot/grub/grub.conf
Comment 12 Felix Miata 2007-08-03 08:22:56 EDT
(In reply to comment #9)
> can you even label a swap partition?

mkswap -L label device
Comment 13 Simon Andrews 2007-08-03 08:46:03 EDT
Well just for completeness I've moved this machine over to use a labelled swap
(everything else was accessed via label already).  It still won't boot on
2.6.22.x.  Same errors as before except that now it says:

Trying to resume from LABEL=SWAP
Unable to access resume device (LABEL=SWAP)

I'm guessing that the root of this is some change to the ATA driver since
nothing on the disk can be accessed?  The one entry in the changelog which looks
relevant is:

ata_piix: fix pio/mwdma programming

..but I'm clutching at straws here...
Comment 14 Chuck Ebbert 2007-08-03 18:24:54 EDT
Is there any way to capture the boot messages from that machine?
Serial console or netconsole would be required.
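
For the netconsole route, the receiving side only needs to collect UDP datagrams. The sketch below is an assumption-laden illustration, not a tested recipe for this machine: it listens on UDP port 6666, which is netconsole's documented default target port, and returns whatever text arrives. The function names are hypothetical, and the failing machine would still need a suitable netconsole= kernel parameter, which is not shown here.

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Bind a UDP socket on the port the sending kernel is configured to target."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def read_lines(sock, max_packets, timeout=5.0):
    """Collect up to max_packets datagrams, stopping early if none arrive in time."""
    sock.settimeout(timeout)
    lines = []
    for _ in range(max_packets):
        try:
            data, _addr = sock.recvfrom(4096)
        except socket.timeout:
            break
        lines.append(data.decode("utf-8", errors="replace").rstrip("\n"))
    return lines


if __name__ == "__main__":
    listener = open_listener()
    for line in read_lines(listener, max_packets=1000, timeout=60.0):
        print(line)
```

Netconsole can only send messages once the network driver is up, so very early boot output may still be missing from a capture made this way.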
Comment 15 Bennett Feitell 2007-08-08 02:25:31 EDT
Created attachment 160883 [details]
image of boot hanging

It is better for all concerned that I not attempt to retype this error
accurately.
Comment 16 Bennett Feitell 2007-08-08 02:38:04 EDT
2.6.22.1-41.fc7 is the problem for me, and I would up the severity if I owned
the initial bug report.


Oh well, I seem to have lost all the text on that one.  The box is a VIA based
Socket A (mobile Athlon/Geode) board with root/boot on the onboard PATA
controller.  

This bug bit me on a yum upgrade from kernel-2.6.21-1.3228.fc7 (works fine) to
kernel 2.6.22.1-41.fc7 (broken, hangs on boot).

I am leery of putting my smolt UUID on here, but here is the IDE controller line:

VIA_IDE	IDE	pci	VIA Technologies, Inc.	VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE	VIA Technologies, Inc.	VT82C586/B/VT82C686/A/B/VT8233/A/C/VT8235 PIPC Bus Master IDE

and here is the info other than my UUID:

OS:	Fedora release 7 (Moonshine)
platform:	i686
bogomips:	2524.47
CPU Speed:	1394.0
systemMemory:	979
CPUVendor:	AuthenticAMD
numCPUs:	1
language:	en_US.UTF-8
defaultRunlevel:	3
System Vendor:	VIA Technologies, Inc.
System Model:	KM266A-8237
Kernel:	2.6.21-1.3228.fc7
Formfactor:	desktop
Last Modified:	2007-08-07 23:17:27
Comment 18 Simon Andrews 2007-08-08 11:37:11 EDT
Created attachment 160913 [details]
Boot log from failing 2.6.22 kernel

This is a transcription of the messages from lots of photos taken through the
boot process.  I think I've got pretty much all of the messages.

The bit which looks most suspicious is:

ata1: port is slow to respond, please be patient (Status 0x80)
ata1: SRST failed (errno=-16)

..but there may be other bad things which have passed me by.
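
The two quoted values can be read directly: Status 0x80 has only the busy bit set in the ATA status register, and errno=-16 is EBUSY on Linux, which suggests the soft reset (SRST) gave up while the port still reported itself busy. The sketch below decodes both values. The bit names follow the traditional ATA register layout (later revisions of the standard mark some low bits obsolete), and the helper names are hypothetical.

```python
import errno

# ATA status register bits, traditional naming, most significant bit first.
STATUS_BITS = [
    (0x80, "BSY"),   # device busy
    (0x40, "DRDY"),  # device ready
    (0x20, "DF"),    # device fault
    (0x10, "DSC"),   # seek complete (command dependent in later standards)
    (0x08, "DRQ"),   # data request
    (0x04, "CORR"),  # corrected data (obsolete in later standards)
    (0x02, "IDX"),   # index (obsolete in later standards)
    (0x01, "ERR"),   # error
]


def decode_status(value):
    """Return the names of the status bits set in an 8-bit ATA status value."""
    return [name for mask, name in STATUS_BITS if value & mask]


def describe_errno(negative_errno):
    """Map a kernel-style negative errno (e.g. -16) to its symbolic name."""
    return errno.errorcode.get(-negative_errno, "unknown")


if __name__ == "__main__":
    print(decode_status(0x80))
    print(describe_errno(-16))
```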
Comment 19 Chuck Ebbert 2007-08-08 13:08:55 EDT
Does the kernel option "pci=nomsi" make a difference?
Comment 20 Simon Andrews 2007-08-09 03:59:07 EDT
(In reply to comment #19)
> Does the kernel option "pci=nomsi" make a difference?

No.  Same errors and failure.
Comment 21 tkunze71216 2007-08-20 11:59:20 EDT
My system doesn't boot with 2.6.22.1-41.fc7 either. It hangs for a while and then
I get this:
Unable to access resume device (LABEL=SWAP-sda9)
mount: could not find filesystem '/dev/root'
setuproot: moving /dev failed
(some more lines similar to the one above)
Kernel panic: not syncing: Attempted to kill init!

Before that I also get this:
PCI: BIOS Bug: MCFG area at e0000000 is not E820-reserved
PCI: Not using MMCONFIG
But I have been told this probably isn't related.

My system:
OS: Fedora release 7 (Moonshine)
Platform: x86_64
CPU: AMD Athlon64 X2 4200+
Chipset: AMD RD580 (on a MSI K9A Platinum)
HDD: attached to SATA
BIOS setting for SATA: Native IDE
Cool'n'Quiet is disabled
Kernel: 2.6.22.1-41.fc7
Comment 22 Simon Andrews 2007-08-21 10:12:11 EDT
(In reply to comment #18)
> The bit which looks most suspicious is:
> 
> ata1: port is slow to respond, please be patient (Status 0x80)
> ata1: SRST failed (errno=-16)

Looking back at the dmesg output for the 2.6.21 kernel which boots on this
machine, the quoted portion above is where the messages I get differ between the
two kernels.  On 2.6.21 it says:

ata1: port is slow to respond, please be patient (Status 0x80)
ATA: abnormal status 0x3F on port 0x000101f7

..but the second line appears to just be a warning as it then goes on with the
ATA startup, which doesn't happen in the 2.6.22 kernels:

ata1.00: ata_hpa_resize 1: sectors = 488397168, hpa_sectors = 488397168
ata1.00: ATA-6: WDC WD2500BB-00KEA0, 08.05J08, max UDMA/100
ata1.00: 488397168 sectors, multi 16: LBA48
ata1.00: ata_hpa_resize 1: sectors = 488397168, hpa_sectors = 488397168
ata1.00: configured for UDMA/33

Maybe the abnormal status warning is a better clue to what's going wrong on this
machine?
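
The comparison described in this comment (finding where the libata messages of the working and failing kernels first differ) can be automated. The following is a minimal sketch under stated assumptions: it keeps only lines beginning with an ataN:, ataN.NN: or ATA: prefix, strips any leading printk timestamp, and reports the first position at which the two logs disagree. The function names are hypothetical; the sample lines are the ones quoted in this bug.

```python
import re

# Optional leading printk timestamp such as "[   12.345678] ".
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")
# libata port lines (ata1:), device lines (ata1.00:) and generic "ATA:" lines.
ATA_PREFIX = re.compile(r"^(ata\d+(\.\d+)?|ATA):")


def ata_lines(log_text):
    """Return the libata-related lines of a boot log with timestamps removed."""
    kept = []
    for line in log_text.splitlines():
        line = TIMESTAMP.sub("", line).strip()
        if ATA_PREFIX.match(line):
            kept.append(line)
    return kept


def first_divergence(good_log, bad_log):
    """Return (index, good_line, bad_line) at the first difference, or None."""
    good, bad = ata_lines(good_log), ata_lines(bad_log)
    for index, (g, b) in enumerate(zip(good, bad)):
        if g != b:
            return index, g, b
    if len(good) != len(bad):
        index = min(len(good), len(bad))
        return (index,
                good[index] if index < len(good) else None,
                bad[index] if index < len(bad) else None)
    return None


GOOD = """\
ata1: port is slow to respond, please be patient (Status 0x80)
ATA: abnormal status 0x3F on port 0x000101f7
ata1.00: configured for UDMA/33
"""

BAD = """\
ata1: port is slow to respond, please be patient (Status 0x80)
ata1: SRST failed (errno=-16)
"""

if __name__ == "__main__":
    print(first_divergence(GOOD, BAD))
```

Run against full captures from both kernels, the first reported pair narrows down which step of the probe sequence changed between 2.6.21 and 2.6.22.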


Comment 23 Simon Andrews 2007-08-28 07:10:24 EDT
I've just tried the new 2.6.22.4-65.fc7 kernel.  Still won't boot.  Same errors.
Comment 24 Christopher Brown 2007-09-21 07:24:33 EDT
Hello Simon,

I'm reviewing this bug as part of the kernel bug triage project, an attempt to
isolate current bugs in the Fedora kernel.

http://fedoraproject.org/wiki/KernelBugTriage

I'm re-assigning this to the relevant maintainer, who may be able to shed some
more light on the issue. In the meantime, could you please attach the output of
lspci -vvxxx?

Cheers
Chris
Comment 25 Bennett Feitell 2007-09-21 11:37:32 EDT
All of the following kernels fail to boot for me on VIA hardware.
kernel-2.6.22.1-41.fc7
kernel-2.6.22.4-65.fc7
kernel-2.6.22.5-76.fc7

As per the request I received via email, I will attach the output of lspci -vvxxx
Comment 26 Bennett Feitell 2007-09-21 11:43:02 EDT
Created attachment 202431 [details]
output of lspci -vvxxx

Here is the output of lspci -vvxxx on the machine that will not boot.  The
machine is based on VIA hardware.  It is a Biostar M7-VIG400.  The only odd
bit of hardware in the machine is a Realtek 4-port switch PCI card, and boot
fails way too early for that to be an issue.
Comment 27 Alan Cox 2007-09-21 11:45:46 EDT
Please can you open a separate bug for the VIA stuff you are seeing, as it's
actually a different bug (no need to re-add the attachments, just mention that
this bug has them).
Comment 28 Simon Andrews 2007-09-24 03:58:20 EDT
Created attachment 203951 [details]
lspci -vvxxx output

This is the lspci output from the machine on which this bug was originally
reported.
Comment 29 Nigel Metheringham 2007-11-30 07:19:10 EST
Created attachment 273731 [details]
lspci -vvxxx output of similarly afflicted HP desktop system
Comment 30 Bug Zapper 2008-05-14 09:42:18 EDT
This message is a reminder that Fedora 7 is nearing the end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 7. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '7'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 7's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 7 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug. If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. If possible, it is recommended that you try the newest available Fedora distribution to see if your bug still exists.

Please read the Release Notes for the newest Fedora distribution to make sure it will meet your needs:
http://docs.fedoraproject.org/release-notes/

The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 31 Bug Zapper 2008-06-16 21:59:14 EDT
Fedora 7 changed to end-of-life (EOL) status on June 13, 2008. 
Fedora 7 is no longer maintained, which means that it will not 
receive any further security or bug fix updates. As a result we 
are closing this bug. 

If you can reproduce this bug against a currently maintained version 
of Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.
