Bug 483423 - mkinitrd still fails for me
Summary: mkinitrd still fails for me
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: mkinitrd
Version: 10
Hardware: i386
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Peter Jones
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2009-02-01 04:38 UTC by Wei Vy
Modified: 2009-02-20 07:38 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-02-20 07:38:47 UTC
Type: ---
Embargoed:


Attachments
lspci output (1.24 KB, text/plain)
2009-02-01 04:38 UTC, Wei Vy
fstab from problem system (534 bytes, text/plain)
2009-02-03 14:42 UTC, Wei Vy
blkid output (redacted) (903 bytes, text/plain)
2009-02-03 14:44 UTC, Wei Vy
The init file from the fail (1.83 KB, text/plain)
2009-02-07 20:37 UTC, Wei Vy
log of mkinitrd -v (9.66 KB, text/plain)
2009-02-07 20:39 UTC, Wei Vy
Init from last working (and currently still working) kernel 2.6.26.6-49.fc8 (2.64 KB, text/plain)
2009-02-19 06:49 UTC, Wei Vy

Description Wei Vy 2009-02-01 04:38:59 UTC
Created attachment 330536
lspci output

Description of problem:
mkinitrd creates bad init script for my hardware/filesystem configuration

Version-Release number of selected component (if applicable): 6.0.71-3


How reproducible: 100%


Steps to Reproduce:
1. mkinitrd -f /boot/initrd-2.6.27.12-170.2.5.fc10.i686.img 2.6.27.12-170.2.5.fc10.i686  --with=scsi_wait_scan
2. reboot
3. fail
  
Actual results: system fails to see logical volumes during boot and so does not complete boot


Expected results: boot


Additional info: mkinitrd still makes a bad initrd for my hardware. My system boots just fine with the 2.6.26.6-49.fc8 kernel and initrd. I have tried all the tricks I could find on the common bugs page, including modifying grub.conf, sysconfig/mkinitrd, etc. I've even tried hand-hacking the init script generated by mkinitrd and downgrading to an earlier version of mkinitrd, all with no luck.

My system has two ATA controllers and a SCSI controller. The motherboard ATA controller has nothing attached to it; the second, an add-in PCI controller, has my boot drive and LV on it; and the third, the SCSI controller, has a Zip drive on it.
There's nothing special about the LV: no encryption, only one disk. It's just not on the mobo ATA controller.

Comment 1 Hans de Goede 2009-02-01 08:33:46 UTC
Can you please attach your /etc/fstab and the output of the blkid command?

Comment 2 Wei Vy 2009-02-03 14:42:04 UTC
Created attachment 330731
fstab from problem system

Nothing remarkable here

Comment 3 Wei Vy 2009-02-03 14:44:48 UTC
Created attachment 330732
blkid output (redacted)

System was upgraded from fc8 to fc10.  uuids redacted.  I'll share them via direct email, but I don't want to post them.

Comment 4 Hans de Goede 2009-02-05 10:21:58 UTC
Can you please run the following as root:
zcat /boot/<faultyinitrd.img> | cpio -i init

This extracts the init file from the initrd. Please attach that file here.

Please also run:
mkinitrd -v test.img $(uname -r) > log

And attach the log file?
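
The extraction step above can be wrapped in a small shell helper so the init script lands in a scratch directory instead of the current one. This is only a minimal sketch, assuming the initrd is a gzip-compressed cpio archive (as the zcat | cpio pipeline implies). The function name extract_init and the output directory are hypothetical, and gzip -dc is used here as an equivalent of zcat.

```shell
#!/bin/sh
# extract_init IMG OUTDIR
# Hypothetical helper: pull only the "init" script out of a
# gzip-compressed cpio initrd and place it in OUTDIR.
extract_init() {
    img=$1
    out=$2
    [ -r "$img" ] || { echo "cannot read $img" >&2; return 1; }
    mkdir -p "$out" || return 1
    # Decompress in the caller's directory so a relative IMG path still works,
    # then let cpio extract the single member "init" inside OUTDIR.
    gzip -dc "$img" | ( cd "$out" && cpio -i init )
}
```

For example, extract_init /boot/initrd-2.6.27.12-170.2.5.fc10.i686.img /tmp/initrd-check would leave the script at /tmp/initrd-check/init, ready to attach.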

Comment 5 Wei Vy 2009-02-07 20:37:18 UTC
Created attachment 331216
The init file from the fail

When I use the latest kernel, I never see any text output indicating detection/probe of the dc395x. I do see those messages with the fc8 kernel.

Comment 6 Wei Vy 2009-02-07 20:39:53 UTC
Created attachment 331217
log of mkinitrd -v

Was unable to comply precisely, because I can't run any fc10 kernels.  I am currently only able to boot 2.6.26.6-49.fc8.  This log is from the command 'mkinitrd -v -f /boot/initrd-2.6.27.12-170.2.5.fc10.i686.img 2.6.27.12-170.2.5.fc10.i686 > log.txt'.

Comment 7 Wei Vy 2009-02-07 20:46:36 UTC
I've also gone into the init file and inserted 'stabilized' by hand just after the modprobe -q dc395x... still no joy.

# diff -u init.txt init_2.txt
--- init.txt	2009-02-07 13:28:02.000000000 -0700
+++ init_2.txt	2009-02-07 13:40:10.000000000 -0700
@@ -51,6 +51,8 @@
 modprobe -q pata_via
 echo "Loading dc395x module"
 modprobe -q dc395x
+echo Waiting for driver initialization.
+stabilized --hash --interval 2500 /proc/scsi/scsi
 echo Making device-mapper control node
 mkdmnod
 modprobe scsi_wait_scan

Comment 8 Hans de Goede 2009-02-08 13:50:36 UTC
If I read the logs right, your / filesystem is /dev/VolGroup00/LogVol00, right?

And VolGroup00 uses a single partition, sda3, which is on a PATA disk connected to the Promise PATA controller, correct?

Can you please attach /etc/grub.conf (or at least tell me the root= argument to the kernel)?

Unless there is a mistake there, the initrd is fine and this is a kernel issue.

Comment 9 Wei Vy 2009-02-11 16:14:30 UTC
Your observations are correct.  I've pasted the grub.conf below. I'm a little leery of punting this to a kernel problem just yet, but it could be.  I'm going to try ripping out my dc395 hardware this weekend and seeing if the boot works.

# cat grub.conf
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /boot/, eg.
#          root (hd0,1)
#          kernel /vmlinuz-version ro root=/dev/VolGroup00/LogVol00
#          initrd /initrd-version.img
#boot=/dev/sda
default=2
timeout=5
splashimage=(hd0,1)/grub/splash.xpm.gz
# hiddenmenu
title Fedora (2.6.27.12-170.2.5.fc10.i686) by hand
        root (hd0,1)
        kernel /vmlinuz-2.6.27.12-170.2.5.fc10.i686 ro root=/dev/VolGroup00/LogVol00 selinux=0 mod_scsi.scan=sync
        initrd /initrd.img
title Fedora (2.6.27.12-170.2.5.fc10.i686)
        root (hd0,1)
        kernel /vmlinuz-2.6.27.12-170.2.5.fc10.i686 ro root=/dev/VolGroup00/LogVol00 selinux=0 mod_scsi.scan=sync
        initrd /initrd-2.6.27.12-170.2.5.fc10.i686.img
title Fedora (2.6.26.6-49.fc8)
        root (hd0,1)
        kernel /vmlinuz-2.6.26.6-49.fc8 ro root=/dev/VolGroup00/LogVol00 selinux=0
        initrd /initrd-2.6.26.6-49.fc8.img
title WIN98SE
        rootnoverify (hd0,0)
        chainloader +1
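
To compare the root= argument and the other kernel options across boot entries, as asked for in comment 8, the kernel lines can be pulled out of a grub.conf like the one above. This is a minimal sketch, assuming the GRUB legacy layout shown here (title lines followed by kernel lines); list_kernel_args is a hypothetical helper name.

```shell
#!/bin/sh
# list_kernel_args GRUB_CONF
# Hypothetical helper: print "<title>: <kernel arguments>" for every
# boot entry that has a kernel line. Commented-out lines are skipped.
list_kernel_args() {
    awk '
        /^[ \t]*#/ { next }                  # ignore comments
        $1 == "title" {
            sub(/^[ \t]*title[ \t]+/, "")    # keep only the title text
            title = $0
            next
        }
        $1 == "kernel" {
            # $2 is the kernel image; everything after it is the command line
            args = ""
            for (i = 3; i <= NF; i++)
                args = args (args == "" ? "" : " ") $i
            print title ": " args
        }
    ' "$1"
}
```

Running list_kernel_args /etc/grub.conf on the file above would list the three Fedora entries with their options side by side. One detail worth checking in that output: the SCSI core module is named scsi_mod, and the kernel parameter documentation spells the option scsi_mod.scan=sync, whereas the entries above use mod_scsi.scan=sync.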

Comment 10 Wei Vy 2009-02-18 02:58:08 UTC
I removed the dc395 scsi card from this system, as well as altering the initrd to remove the driver.  The system still fails to boot in what appears to be precisely the same way. I'm open to suggestions of things to try to help diagnose the problem.

Comment 11 Hans de Goede 2009-02-18 21:53:29 UTC
Can you please run the following as root:
zcat /boot/<workinginitrd.img> | cpio -i init

And attach the resulting init file.

Also, can you try the latest mkinitrd from updates-testing:
yum update --enablerepo=updates-testing mkinitrd

And then regenerate the initrd?

Comment 12 Wei Vy 2009-02-19 06:49:42 UTC
Created attachment 332502
Init from last working (and currently still working) kernel 2.6.26.6-49.fc8

Comment 13 Wei Vy 2009-02-19 06:57:11 UTC
Will try the testing mkinitrd/nash sometime after 8am EDT.

Comment 14 Wei Vy 2009-02-19 15:12:34 UTC
System does not boot with updates-testing nash/initrd. Same symptoms.

Comment 15 Hans de Goede 2009-02-19 18:57:38 UTC
(In reply to comment #12)
> Created an attachment (id=332502)
> Init from last working (and currently still working) kernel 2.6.26.6-49.fc8

There are no relevant changes between the working and non-working initrds, so I still believe this is a kernel bug.
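
A comparison of this kind can be reproduced from the two extracted init scripts. The sketch below is one possible way to do it, not the procedure actually used in this comment: it assumes that echo lines and blank lines do not affect boot behaviour and filters them out before diffing. diff_init is a hypothetical helper name.

```shell
#!/bin/sh
# diff_init OLD_INIT NEW_INIT
# Hypothetical helper: compare two extracted init scripts while ignoring
# echo lines and blank lines. Treating those lines as irrelevant is an
# assumption of this sketch, not a rule taken from mkinitrd.
diff_init() {
    [ -r "$1" ] && [ -r "$2" ] || { echo "usage: diff_init OLD NEW" >&2; return 2; }
    a=$(mktemp) || return 2
    b=$(mktemp) || { rm -f "$a"; return 2; }
    # grep -v exits 1 when every line is filtered out; that is not an error here.
    grep -v -e '^echo ' -e '^[[:space:]]*$' "$1" > "$a" || :
    grep -v -e '^echo ' -e '^[[:space:]]*$' "$2" > "$b" || :
    rc=0
    diff -u "$a" "$b" || rc=$?
    rm -f "$a" "$b"
    return "$rc"    # 0 = nothing left that differs, 1 = differences printed
}
```

An exit status of 0 means the scripts differ only in messages and spacing; any remaining lines in the output (module loads, device node creation, LVM activation) are the ones worth reading closely.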

The only other possibility I see is that the "mod_scsi.scan=sync" kernel cmdline option you added is interacting badly with scsi_wait_scan, so the last thing you can try is to remove it. If that doesn't work either, I *really* don't know what the problem is.

Comment 16 Wei Vy 2009-02-20 03:03:16 UTC
The mod_scsi.scan was something I added after the problem started, so I'm sure it's not the culprit.

At this point, I agree that it's a kernel problem.  I decided to try the experiment of using the newer mkinitrd to rebuild the image for the older fc8 kernel.  It still boots just fine.

It looks to me like the 2.6.27 stream is back to hating my motherboard (old tyan tiger-133).  IIRC there was a long period in the early 2.6 kernel days where I had to use noacpi to get this board to boot because of acpi BIOS issues.  It looks like I'm back in that boat.

At this point, you can close this or modify it to be a kernel bug report.

Thanks for all your attention!

Comment 17 Wei Vy 2009-02-20 05:54:34 UTC
Interesting development... I noticed that I was seeing multiple sets of the init script messages and lvm reports.  So, I added nosmp to the kernel boot and succeeded in booting the 2.6.27 kernel.  It sure looks like the kernel is the fail.
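
The nosmp workaround amounts to appending one argument to the kernel line of the failing entry. The following is a minimal sketch of that edit, assuming a GRUB legacy grub.conf like the one in comment 9. add_kernel_arg is a hypothetical helper, and it writes the result to standard output so the original file is left untouched until the change has been reviewed.

```shell
#!/bin/sh
# add_kernel_arg GRUB_CONF KERNEL_VERSION ARG
# Hypothetical helper: print GRUB_CONF with ARG appended to every kernel
# line whose image name contains KERNEL_VERSION. Lines that already carry
# ARG, and all other lines, are printed unchanged.
add_kernel_arg() {
    conf=$1
    ver=$2
    arg=$3
    awk -v ver="$ver" -v arg="$arg" '
        $1 == "kernel" && index($2, ver) {
            present = 0
            for (i = 3; i <= NF; i++)
                if ($i == arg) present = 1
            # Append to $0 rather than a field so the indentation is kept.
            if (!present) $0 = $0 " " arg
        }
        { print }
    ' "$conf"
}
```

For example, add_kernel_arg /etc/grub.conf 2.6.27.12-170.2.5.fc10.i686 nosmp > /tmp/grub.conf.new produces a candidate file that can be compared against the original with diff before it is copied into place.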

Comment 18 Hans de Goede 2009-02-20 07:38:47 UTC
Closing per comment #16. Note that you may want to file a kernel bug for this (and reference this bug there). I think it's better to do this in a new bug, since this one contains too much noise from the kernel bug point of view. Also, you may want to try your luck with a 2.6.29 kernel from rawhide first.

