Bug 300701 - Wrong initrd ? Failure to boot 2.6.22 series kernels.
Status: CLOSED DUPLICATE of bug 237415
Product: Fedora
Classification: Fedora
Component: mkinitrd
Version: 7
Hardware: i686 Linux
Priority: low
Severity: medium
Assigned To: Peter Jones
QA Contact: Fedora Extras Quality Assurance
Reported: 2007-09-21 12:07 EDT by Bennett Feitell
Modified: 2007-12-13 11:42 EST (History)
Last Closed: 2007-12-13 11:42:41 EST
Description Bennett Feitell 2007-09-21 12:07:26 EDT
+++ This bug was initially created as a clone of Bug #249853 +++

Description of problem:
One of my machines fails to boot any of the released 2.6.22 kernels. This includes:
kernel-2.6.22.1-41.fc7
kernel-2.6.22.4-65.fc7
kernel-2.6.22.5-76.fc7

It boots fine using the previous kernel (2.6.21-1.3228.fc7)


How reproducible:
Always


Steps to Reproduce:
1. Boot from 2.6.22, hangs

Additional info:

The last messages seen before the kernel panic are (transcribed so may be
slightly approximate):

Mounting root filesystem
mount: could not find filesystem '/dev/root'
Setting up other filesystems
Setting up new root fs
setuproot: moving /dev failed: No such file or directory
no fstab.sys, mounting internal defaults
setuproot: error mounting /proc: No such file or directory
setuproot: error mounting /sys: No such file or directory
unmounting old /dev
unmounting old /proc
unmounting old /sys
switchroot: mount failed: No such file or directory
Booting has failed
Kernel panic not syncing: Attempted to kill init!

-----  material below is from original bug from which this bug has been cloned
-----  the new bug is very similar in behavior, but occurs on VIA hardware.

From the messages before this I couldn't see any obvious errors and nothing
which was different from the previous kernel.

This box has a SCSI card but no attached scsi devices.  The disk is IDE and
shows up in the old kernel as /dev/sda.

-- Additional comment from simon.andrews@bbsrc.ac.uk on 2007-07-27 10:04 EST --
I just saw that 2.6.22.1-33 was out.  Tried this new kernel and it still fails
to boot with the same errors as before.

-- Additional comment from cebbert@redhat.com on 2007-07-27 11:02 EST --
Every kernel that fails to find its root disk fails with these same messages.
We need to know what kind of controller the root disk is on, so we know what
driver failed.

-- Additional comment from simon.andrews@bbsrc.ac.uk on 2007-07-27 11:07 EST --
Created an attachment (id=160120)
dmesg output

I'm not exactly sure what information you're after about the controller, but
dmesg and lspci seem to have some stuff which looks relevant.  If what you need
isn't there then let me know exactly what you want and I'll go and get it.

-- Additional comment from simon.andrews@bbsrc.ac.uk on 2007-07-27 11:08 EST --
Created an attachment (id=160121)
lspci output


-- Additional comment from simon.andrews@bbsrc.ac.uk on 2007-07-30 04:44 EST --
Just to try to complete the hardware picture I've added this machine to the
smolt database.  You can view its hardware profile at:

http://smolt.fedoraproject.org/show?UUID=f661f397-a5b0-44c9-8311-bf9f55145e8c

-- Additional comment from simon.andrews@bbsrc.ac.uk on 2007-08-02 07:05 EST --
Have just tried kernel.i686 2.6.22.1-41.fc7.  Doesn't boot.  Same symptoms as
previously.

-- Additional comment from mrmazda@ij.net on 2007-08-02 12:26 EST --
I use the i440BX and sym53c875 on a Mandriva Cooker system with similar kernels.
I solved the problem by using labels instead of device names for fstab and
Grub's menu.lst. The 2.6.22 kernels result in device name aliases for real SCSI
ahead of those for ATA, which in your case probably means producing no sda
devices and sdb* for your ATA partitions. See
http://qa.mandriva.com/show_bug.cgi?id=31954 for my exact situation.
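
To make the point about device names concrete, here is a minimal sketch (not part of the original report) that flags fstab entries still using kernel-assigned names such as /dev/sda3. The sample fstab contents and the function name are hypothetical, chosen only for illustration; the regular expression covers just the sd*/hd* naming discussed in this thread.

```python
import re

# Matches kernel-assigned disk names such as /dev/sda1 or /dev/hdb2,
# which can change when driver probe order changes between kernels.
UNSTABLE_DEVICE = re.compile(r"^/dev/(sd|hd)[a-z]+\d*$")


def unstable_fstab_entries(fstab_text):
    """Return (device, mount_point) pairs that use raw /dev/sdX or /dev/hdX names."""
    found = []
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        device, mount_point = fields[0], fields[1]
        if UNSTABLE_DEVICE.match(device):
            found.append((device, mount_point))
    return found


# Hypothetical fstab contents, for illustration only.
SAMPLE_FSTAB = """\
LABEL=/        /      ext3  defaults  1 1
LABEL=/boot    /boot  ext3  defaults  1 2
/dev/sda3      swap   swap  defaults  0 0
"""

print(unstable_fstab_entries(SAMPLE_FSTAB))  # [('/dev/sda3', 'swap')]
```

Entries that use LABEL= or /dev/mapper/ paths are not reported, since those do not depend on probe order.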

-- Additional comment from cebbert@redhat.com on 2007-08-02 17:56 EST --
Using hardcoded device names will not work.
This should probably be closed as NOTABUG?


-- Additional comment from simon.andrews@bbsrc.ac.uk on 2007-08-03 03:45 EST --
I'm not sure that that's what's happening here.  All of the partitions are
accessed via labels both in fstab and grub.conf.  The only exception is the swap
partition (can you even label a swap partition?).

I have several other servers with identical drive layouts on different hardware
and they're fine.

I'll attach my fstab and grub files in case there is something screwy going on
with it.

-- Additional comment from simon.andrews@bbsrc.ac.uk on 2007-08-03 03:47 EST --
Created an attachment (id=160587)
/etc/fstab


-- Additional comment from simon.andrews@bbsrc.ac.uk on 2007-08-03 03:47 EST --
Created an attachment (id=160588)
/boot/grub/grub.conf


-- Additional comment from mrmazda@ij.net on 2007-08-03 08:22 EST --
(In reply to comment #9)
> can you even label a swap partition?

mkswap -L label device


-- Additional comment from simon.andrews@bbsrc.ac.uk on 2007-08-03 08:46 EST --
Well just for completeness I've moved this machine over to use a labelled swap
(everything else was accessed via label already).  It still won't boot on
2.6.22.x.  Same errors as before except that now it says:

Trying to resume from LABEL=SWAP
Unable to access resume device (LABEL=SWAP)

I'm guessing that the root of this is some change to the ATA driver since
nothing on the disk can be accessed?  The one entry in the changelog which looks
relevant is:

ata_piix: fix pio/mwdma programming

..but I'm clutching at straws here...


-- Additional comment from cebbert@redhat.com on 2007-08-03 18:24 EST --
Is there any way to capture the boot messages from that machine?
Serial console or netconsole would be required.


-- Additional comment from bugzilla@bfeitell.users.panix.com on 2007-08-08 02:25 EST --
Created an attachment (id=160883)
image of boot hanging

It is better for all concerned that I not attempt to retype this error
accurately.

-- Additional comment from bugzilla@bfeitell.users.panix.com on 2007-08-08 02:38 EST --
2.6.22.1-41.fc7 is the problem for me, and I would up the severity if I owned
the initial bug report.


Oh well, I seem to have lost all the text on that one.  The box is a VIA based
Socket A (mobile Athlon/Geode) board with root/boot on the onboard PATA
controller.  

This bug bit me on a yum upgrade from kernel-2.6.21-1.3228.fc7 (works fine) to
kernel 2.6.22.1-41.fc7 (broken, hangs on boot).

I am leery of putting my smolt UUID on here, but here is the IDE controller line:

VIA_IDE  	IDE  	pci  	VIA Technologies, Inc.  
VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE  	VIA Technologies,
Inc.  	VT82C586/B/VT82C686/A/B/VT8233/A/C/VT8235 PIPC Bus Master IDE

and here is the info other than my UUID:

OS:	Fedora release 7 (Moonshine)
platform:	i686
bogomips:	2524.47
CPU Speed:	1394.0
systemMemory:	979
CPUVendor:	AuthenticAMD
numCPUs:	1
language:	en_US.UTF-8
defaultRunlevel:	3
System Vendor:	VIA Technologies, Inc.
System Model:	KM266A-8237
Kernel:	2.6.21-1.3228.fc7
Formfactor:	desktop
Last Modified:	2007-08-07 23:17:27

-- Additional comment from simon.andrews@bbsrc.ac.uk on 2007-08-08 11:37 EST --
Created an attachment (id=160913)
Boot log from failing 2.6.22 kernel

This is a transcription of the messages from lots of photos taken through the
boot process.  I think I've got pretty much all of the messages.

The bit which looks most suspicious is:

ata1: port is slow to respond, please be patient (Status 0x80)
ata1: SRST failed (errno=-16)

..but there may be other bad things which have passed me by.

-- Additional comment from cebbert@redhat.com on 2007-08-08 13:08 EST --
Does the kernel option "pci=nomsi" make a difference?

-- Additional comment from simon.andrews@bbsrc.ac.uk on 2007-08-09 03:59 EST --
(In reply to comment #19)
> Does the kernel option "pci=nomsi" make a difference?

No.  Same errors and failure.

-- Additional comment from tkunze71216@gmx.de on 2007-08-20 11:59 EST --
My system doesn't boot with 2.6.22.1-41.fc7 either. It hangs for a while and then
I get this:
Unable to access resume device (LABEL=SWAP-sda9)
mount: could not find filesystem '/dev/root'
setuproot: moving /dev failed
(some more lines similar to the one above)
Kernel panic: not syncing: Attempted to kill init!

Before that I also get this:
PCI: BIOS Bug: MCFG area at e0000000 is not E820-reserved
PCI: Not using MMCONFIG
But I have been told this probably isn't related.

My system:
OS: Fedora release 7 (Moonshine)
Platform: x86_64
CPU: AMD Athlon64 X2 4200+
Chipset: AMD RD580 (on a MSI K9A Platinum)
HDD: attached to SATA
BIOS setting for SATA: Native IDE
Cool'n'Quiet is disabled
Kernel: 2.6.22-1.41.fc7

-- Additional comment from simon.andrews@bbsrc.ac.uk on 2007-08-21 10:12 EST --
(In reply to comment #18)
> The bit which looks most suspicious is:
> 
> ata1: port is slow to respond, please be patient (Status 0x80)
> ata1: SRST failed (errno=-16)

Looking back at the dmesg output for the 2.6.21 kernel which boots on this
machine, the quoted portion above is where the messages I get differ between the
two kernels.  On 2.6.21 it says:

ata1: port is slow to respond, please be patient (Status 0x80)
ATA: abnormal status 0x3F on port 0x000101f7

..but the second line appears to be just a warning, as it then goes on with the
ATA startup, which doesn't happen in the 2.6.22 kernels:

ata1.00: ata_hpa_resize 1: sectors = 488397168, hpa_sectors = 488397168
ata1.00: ATA-6: WDC WD2500BB-00KEA0, 08.05J08, max UDMA/100
ata1.00: 488397168 sectors, multi 16: LBA48
ata1.00: ata_hpa_resize 1: sectors = 488397168, hpa_sectors = 488397168
ata1.00: configured for UDMA/33

Maybe the abnormal status warning is a better clue to what's going wrong on this
machine?
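
The comparison described above can be sketched as a small script that reports the first line at which two logs diverge. This is only an illustration, not something used in the original investigation: the function name is hypothetical, the inputs are the short excerpts quoted in this thread, and real dmesg output may need normalising (for example, stripping timestamps) before a line-by-line comparison is meaningful.

```python
from itertools import zip_longest


def first_divergence(old_log, new_log):
    """Return (line_index, old_line, new_line) for the first differing line, or None."""
    old_lines = old_log.strip().splitlines()
    new_lines = new_log.strip().splitlines()
    # zip_longest pads the shorter log with None so a truncated log
    # is reported as a divergence rather than silently ignored.
    for index, (old, new) in enumerate(zip_longest(old_lines, new_lines)):
        if old != new:
            return index, old, new
    return None


# Excerpts quoted in this thread; the full logs contain many more lines.
LOG_2_6_21 = """\
ata1: port is slow to respond, please be patient (Status 0x80)
ATA: abnormal status 0x3F on port 0x000101f7
ata1.00: ata_hpa_resize 1: sectors = 488397168, hpa_sectors = 488397168
"""

LOG_2_6_22 = """\
ata1: port is slow to respond, please be patient (Status 0x80)
ata1: SRST failed (errno=-16)
"""

print(first_divergence(LOG_2_6_21, LOG_2_6_22))
```

With these excerpts the divergence is at index 1, where 2.6.21 logs the abnormal status warning and 2.6.22 logs the SRST failure.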




-- Additional comment from simon.andrews@bbsrc.ac.uk on 2007-08-28 07:10 EST --
I've just tried the new 2.6.22.4-65.fc7 kernel.  Still won't boot.  Same errors.

-- Additional comment from snecklifter@gmail.com on 2007-09-21 07:24 EST --
Hello Simon,

I'm reviewing this bug as part of the kernel bug triage project, an attempt to
isolate current bugs in the Fedora kernel.

http://fedoraproject.org/wiki/KernelBugTriage

I'm re-assigning this to the relevant maintainer, who may be able to shed some
more light on the issue. In the meantime, could you please attach the output of
lspci -vvxxx?

Cheers
Chris

-- Additional comment from bugzilla@bfeitell.users.panix.com on 2007-09-21 11:37 EST --
All of the following kernels fail to boot for me on VIA hardware.
kernel-2.6.22.1-41.fc7
kernel-2.6.22.4-65.fc7
kernel-2.6.22.5-76.fc7

As per the request I received via email, I will attach the output of lspci -vvxxx

-- Additional comment from bugzilla@bfeitell.users.panix.com on 2007-09-21 11:43 EST --
Created an attachment (id=202431)
output of lspci -vvxxx

Here is the output of lspci -vvxxx on the machine that will not boot.  The
machine is based on VIA hardware.   It is a Biostar M7-VIG400.  The only odd
bit of hardware in the machine is a Realtek 4 port switch pci card, and boot
fails way too early for that to be an issue.

-- Additional comment from alan@redhat.com on 2007-09-21 11:45 EST --
Please can you open a separate bug for the VIA stuff you are seeing, as it's
actually a different bug (no need to re-add the attachments, just mention this
bug has them).
Comment 1 Bennett Feitell 2007-09-28 06:07:05 EDT
I think this may be another manifestation of BUG# 237415.

On another machine (P4), I recreated the same problem by moving the non-LVM
installation to a new disk using LVM.  With labels in /etc/grub.conf and
/etc/fstab, "--force-lvm-probe --with=lvm" was not enough to have mkinitrd
build an initrd with LVM!  Following comment #3 of bug #237415, I switched
/etc/grub.conf and /etc/fstab to /dev/mapper/foo-bar references, and mkinitrd
then properly built an initrd that boots the system.

NOTE: I have not tested this workaround on the AMD system that is the basis of
this bug report.  I will add an additional comment if the fix works on the AMD
machine.  I think this also might be a workaround for #249853.

This looks like a continuing problem with lvm.static and/or anaconda.

 
Comment 2 Bennett Feitell 2007-09-28 06:08:18 EDT
Here is a direct link to comment #3
https://bugzilla.redhat.com/show_bug.cgi?id=237415#c3
Comment 3 Bennett Feitell 2007-09-28 06:25:46 EDT
BINGO!

Switching to /dev/mapper/foo-bar references for / (rather than using labels) in
both /etc/grub.conf and /etc/fstab allows my AMD system to boot the current kernels.

Once again, for reference, the workaround I employed is based upon:
https://bugzilla.redhat.com/show_bug.cgi?id=237415#c3

I do believe this might be relevant to:
https://bugzilla.redhat.com/show_bug.cgi?id=249853
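
For readers applying the same workaround, the grub.conf part of the change can be sketched as below. This is a hedged illustration rather than the exact edit made on the reporter's machine: the kernel line is a hypothetical example, foo-bar is the placeholder name used in the comment above, and the helper name is invented here. The matching fstab entry needs the same change, and per the comments above the initrd was rebuilt with mkinitrd afterwards.

```python
import re


def set_root_device(kernel_line, new_root):
    """Replace the root= argument on a GRUB 'kernel' line with new_root."""
    # A function is used as the replacement so new_root is inserted
    # literally, without regex escape processing.
    updated, count = re.subn(
        r"\broot=\S+", lambda match: "root=" + new_root, kernel_line, count=1
    )
    if count == 0:
        raise ValueError("no root= argument found on kernel line")
    return updated


# Hypothetical grub.conf kernel line; foo-bar stands in for the real
# volume-group/logical-volume name.
line = "kernel /vmlinuz-2.6.22.5-76.fc7 ro root=LABEL=/ rhgb quiet"
print(set_root_device(line, "/dev/mapper/foo-bar"))
# kernel /vmlinuz-2.6.22.5-76.fc7 ro root=/dev/mapper/foo-bar rhgb quiet
```

Raising an error when no root= argument is present avoids silently writing back an unchanged line.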
Comment 4 Christopher Brown 2007-12-13 11:42:41 EST

*** This bug has been marked as a duplicate of 237415 ***
