Bug 447231 - kernel-2.6.26-0.13.rc2.git5.fc10.i686 fails to find all PVs
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: i386
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2008-05-18 23:12 UTC by Clyde E. Kunkel
Modified: 2008-06-11 19:26 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-06-10 03:00:27 UTC
Type: ---
Embargoed:


Attachments
mkinitrd -v -f initrd-2.6.26-0.13.rc2.git5.fc10.i686.img etc (12.05 KB, text/plain)
2008-05-19 00:10 UTC, Clyde E. Kunkel
Serial console output from working FC10 initrd (25.48 KB, text/plain)
2008-05-22 19:05 UTC, Clyde E. Kunkel
serial console output from failing FC10 initrd (27.49 KB, text/plain)
2008-05-22 19:37 UTC, Clyde E. Kunkel

Description Clyde E. Kunkel 2008-05-18 23:12:34 UTC
Description of problem:
kernel-2.6.26-0.13.rc2.git5.fc10.i686 won't boot due to failure to find PVs for
root volgroup

Version-Release number of selected component (if applicable):
kernel-2.6.26-0.13.rc2.git5.fc10.i686

How reproducible:
each boot

Steps to Reproduce:
1. Boot the system.
  
Actual results:
root file system not found, kernel panic

Expected results:
normal boot

Additional info:
Rawhide / is on ext4 LV on LogVol00 and the kernel-2.6.26-0.13.rc2.git5.fc10.i686
initrd can't find the VG.

The only msg that seems pertinent is that one of the VG devices can't be found;
it turns out to be PV0 of the VG, which is on an ide device.  The remaining 3
PVs for the VG are on SATA drives.  2.6.25.2-5.fc10.i686 boots fine.  I don't
have a serial port on my test computer, so I can't capture all the msgs.

Compared initrds kernel-2.6.26-0.13.rc2.git5.fc10.i686 and
kernel-2.6.25.2-5.fc10.i686 and the inits in each are the same.  Modules in each
reflect only expected size differences due to kernel versions and one additional
module in 2.6.26 called dm-log.ko.  usr/lib/lib-elf.so.1 is 0 bytes in 2.6.26.
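
A comparison like this can be scripted.  What follows is only a minimal sketch
in Python, assuming both initrd images have already been unpacked into two
directories; the function names file_sizes and compare_trees are made up for
illustration and are not part of mkinitrd.

```python
import os
import stat


def file_sizes(root):
    """Map each regular file under root (relative path) to its size in bytes.

    Symlinks and special files are skipped.
    """
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            info = os.lstat(full)
            if stat.S_ISREG(info.st_mode):
                sizes[os.path.relpath(full, root)] = info.st_size
    return sizes


def compare_trees(good_root, bad_root):
    """Report files present in only one tree and files that are empty in bad_root."""
    good = file_sizes(good_root)
    bad = file_sizes(bad_root)
    return {
        "only_in_good": sorted(set(good) - set(bad)),
        "only_in_bad": sorted(set(bad) - set(good)),
        "empty_in_bad": sorted(p for p, size in bad.items() if size == 0),
    }
```

Called as compare_trees("initrd-good", "initrd-bad") (hypothetical directory
names), the "empty_in_bad" list is where a 0 byte library would show up.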

Comment 1 Dave Jones 2008-05-18 23:30:07 UTC
The 0 byte libelf is definitely suspect.
afaik, mkinitrd hasn't changed at all in f10 yet, so I'm curious what could've
caused this.  Can you try recreating the initrd with mkinitrd, and pass the -v
option, to see if it spits out anything interesting?

Comment 2 Clyde E. Kunkel 2008-05-19 00:10:31 UTC
Created attachment 305882
mkinitrd -v -f initrd-2.6.26-0.13.rc2.git5.fc10.i686.img etc

As requested.  I don't see anything unusual.  Same error as before.  The issue
seems to be that something is not identifying a PV on an ide drive on a JMicron
controller.  The MOBO is an ASUS P5K-E WiFi.

Comment 3 Clyde E. Kunkel 2008-05-19 13:06:47 UTC
Kernel 2.6.26-0.17.rc3.fc10 fails the same way.

Comment 4 Jeremy Katz 2008-05-19 14:55:51 UTC
Do you see the drive actually being found by the kernel?  

Comment 5 Clyde E. Kunkel 2008-05-19 17:49:03 UTC
/boot is on the drive, so grub is loading the initrd.img from the drive and it
is doing its thing.  Other than that, the msg that the device with
UUID=OzUZGl-cHYu-etc, etc cannot be found indicates that the PV on the drive
cannot be found.

I don't know how to see if there are other msgs indicating drive awareness.  I 
tried a vga setting to get more lines on the screen, but I didn't see any other 
msgs that are suspect.  I will try again, and if I can get some time, I will 
hack the initrd to see if I can insert some debugging statements in the init.  
Any suggestions here?


Comment 6 Clyde E. Kunkel 2008-05-19 19:18:13 UTC
OK, put some sleep statements in init and can see that a msg referencing the ide
drive as a PATA device is being displayed after the jmicron module is loaded,
BUT it is not being seen later as /dev/sda as it should be.  Only the sata
drives are being seen.  

Expected              failing kernels
/dev/sda 300GB ide    /dev/sda 150GB sata
/dev/sdb 150GB sata   /dev/sdb 150GB sata
/dev/sdc 150GB sata   /dev/sdc 150GB sata
/dev/sdd 150GB sata   

Is there a command I can insert in the initrd to verify that the ide drive is or
is not present before the lvm vgscan command?

BIOS order is same as expected column.
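
One way to check which disks the kernel has registered is to read
/proc/partitions before LVM scans for PVs.  The sketch below is in Python,
which the initrd itself would not normally contain, so it only illustrates the
check as it might be run from a rescue or full system.  The function names
parse_partitions and missing_disks are made up for illustration.

```python
def parse_partitions(text):
    """Parse /proc/partitions-style text into {device name: size in blocks}."""
    devices = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have four fields: major, minor, #blocks, name.
        # The header row also has four fields but does not start with a digit.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        devices[fields[3]] = int(fields[2])
    return devices


def missing_disks(text, expected):
    """Return the expected device names that do not appear in the text."""
    present = parse_partitions(text)
    return [name for name in expected if name not in present]


# Possible use on a live system (device names taken from the table above):
#   with open("/proc/partitions") as f:
#       print(missing_disks(f.read(), ["sda", "sdb", "sdc", "sdd"]))
```

With only three disks registered, as on the failing kernels, the fourth
expected name would be the one reported missing.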

Comment 7 Clyde E. Kunkel 2008-05-22 19:05:46 UTC
Created attachment 306412
Serial console output from working FC10 initrd

Comment 8 Clyde E. Kunkel 2008-05-22 19:37:47 UTC
Created attachment 306418
serial console output from failing FC10 initrd

Took the time to install serial port hardware on two computers.  Now can
capture initrd output....lots of fun.

I noticed that the failing initrds start with Decompressing Linux.... Parsing
ELF..done.  Not seeing that on the good initrd.

In addition please note that the failing initrd DOES NOT indicate that the ide
drive was initialized after loading the jmicron driver.

Is this the problem?

Comment 9 Clyde E. Kunkel 2008-05-23 14:04:04 UTC
e-mail response from Alan Cox:

"Check the kernel builders didn't turn on PCIE ASPM. That breaks the Jmicron
totally in the current kernels."

Looking in the config file for the failing initrd listed above:

CONFIG_PCIEAER=y
CONFIG_PCIEASPM=y     <=======================================!!!!
# CONFIG_PCIEASPM_DEBUG is not set
CONFIG_ARCH_SUPPORTS_MSI=y
CONFIG_PCI_MSI=y
CONFIG_PCI_LEGACY=y

I do not see this option in the config file of the last working FC10 kernel.

Could we get a kernel with PCIEASPM turned off?  What does it do anyway?
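
Comparing the config files of a working and a failing kernel by eye is easy to
get wrong, so here is a minimal sketch in Python of how the comparison could be
automated.  The names parse_config and config_diff are made up for
illustration.  It assumes plain config lines with no trailing annotations.

```python
import re

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Parse kernel config text into {option: value}; 'is not set' maps to None."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = None
    return options


def config_diff(old_text, new_text):
    """Return {option: (old, new)} for every option whose state differs.

    An option that does not appear in a file at all is reported as "absent".
    """
    old = parse_config(old_text)
    new = parse_config(new_text)
    diff = {}
    for key in sorted(set(old) | set(new)):
        before = old.get(key, "absent")
        after = new.get(key, "absent")
        if before != after:
            diff[key] = (before, after)
    return diff
```

Fed the two /boot/config-* files, an option such as CONFIG_PCIEASPM that exists
only in the newer config would appear in the result as ("absent", "y").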


Comment 10 Dave Jones 2008-05-23 16:18:56 UTC
It's interesting that 2.6.25.2-5.fc10.i686 boots fine for you, because PCIEASPM
has been on for about four months.


Comment 11 Clyde E. Kunkel 2008-05-23 17:19:55 UTC
Yes, that is interesting, however, the config file installed by the yum update
on my computer for 2.6.25.2-5.fc10.i686 does not include any option named
PCIEASPM.  Did it have a different name then?

Here is that section:
#
# Bus options (PCI etc.)
#
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GOMMCONFIG is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_PCI_DOMAINS=y
CONFIG_PCIEPORTBUS=y
CONFIG_HOTPLUG_PCI_PCIE=m
CONFIG_PCIEAER=y
CONFIG_ARCH_SUPPORTS_MSI=y
CONFIG_PCI_MSI=y
CONFIG_PCI_LEGACY=y
# CONFIG_PCI_DEBUG is not set
CONFIG_HT_IRQ=y
CONFIG_ISA_DMA_API=y
CONFIG_ISA=y
# CONFIG_EISA is not set
# CONFIG_MCA is not set
# CONFIG_SCx200 is not set
CONFIG_K8_NB=y
CONFIG_PCCARD=y
# CONFIG_PCMCIA_DEBUG is not set
CONFIG_PCMCIA=y
CONFIG_PCMCIA_LOAD_CIS=y
CONFIG_PCMCIA_IOCTL=y
CONFIG_CARDBUS=y


Comment 12 Clyde E. Kunkel 2008-05-24 15:46:14 UTC
Just finished building 2.6.26-0.25.rc3.git4.fc10.i686 **without** PCIEASPM and
it works nicely as expected.

Hopefully, we can get this turned off in future kernels, or a switch hacked in,
until fixed.


Comment 13 Clyde E. Kunkel 2008-05-26 14:46:00 UTC
kernel 2.6.26-0.30.rc3.git6.fc10.i686 has PCIEASPM turned off and it boots
nicely, finding the ide drive on the JMicron controller.

Leave the bz open until PCIEASPM or the JMicron driver is fixed?




Comment 14 Chuck Ebbert 2008-05-29 02:10:42 UTC
jmicron is fixed now, we should be able to turn pcieaspm on again:

Commit:     ddc9753fcddfe5f9885dc133824962c047252b43
PCI: don't enable ASPM on devices with mixed PCIe/PCI functions


Comment 15 Clyde E. Kunkel 2008-06-11 19:26:36 UTC
[root@P5K-EWIFI ~]# uname -r
2.6.26-0.57.rc5.git3.fc10.i686
[root@P5K-EWIFI ~]# cat /boot/config-2.6.26-0.57.rc5.git3.fc10.i686 | grep PCIEA
CONFIG_PCIEAER=y
CONFIG_PCIEASPM=y
# CONFIG_PCIEASPM_DEBUG is not set


All is well, thanks.

