Bug 757196

Summary: kernel-3.1.2-1.fc16.x86_64 fails to boot
Product: [Fedora] Fedora Reporter: lars <lars>
Component: kernel Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED WORKSFORME QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: unspecified    
Version: 16 CC: dasergatskov, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, rmy
Target Milestone: --- Keywords: Reopened
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-03-25 07:40:31 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description                                          Flags
dmesg output                                         none
dmesg of 3.1.1-2.fc16.x86_64 (last working kernel)   none
kernel panic                                         none

Description lars@bistromatic.de 2011-11-25 18:23:56 UTC
Created attachment 536401 [details]
dmesg output

Description of problem:

kernel-3.1.2-1.fc16.x86_64 fails to boot. It hangs after the message 'Started Monitoring of LVM2 mirrors, snapshots etc.'

Version-Release number of selected component (if applicable):

kernel-3.1.2-1.fc16.x86_64

kernel 3.1.1-2.fc16.x86_64 works flawlessly


How reproducible:

every time

Steps to Reproduce:
1. install kernel-3.1.2-1.fc16.x86_64
2. boot
3. 
  
Actual results:

kernel fails to boot into runlevel 5


Expected results:

boot into runlevel 5, present gdm login screen.

Additional info:

Comment 1 Chuck Ebbert 2011-11-28 21:47:37 UTC
Can you attach the messages from the older working kernel?

Comment 2 lars@bistromatic.de 2011-11-28 22:11:43 UTC
Created attachment 537693 [details]
dmesg of 3.1.1-2.fc16.x86_64 (last working kernel)

Last working kernel is 3.1.1-2.fc16.x86_64.

Comment 3 lars@bistromatic.de 2011-12-02 21:45:23 UTC
A few more details about this problem.

Kernel 3.1.2 fails to mount /srv and then drops into single-user mode.

/srv is a separate md mirror partition.

/etc/fstab contains the following:
UUID=3b4be705-bb3b-462e-86eb-5b0f2e54de53 /     ext4    defaults        1 1
UUID=0031da00-1149-44e2-a534-0cc08e498b4b /boot ext4    defaults        1 2
UUID=b17947f1-0210-40e0-91a2-af86888f2e61 /home ext4    defaults        1 2
UUID=77ab7452-7984-48ae-b24d-948fc8a8568e /srv  ext4    defaults        1 2
UUID=9f8f08eb-b15f-4e9c-9627-2a3f64af2d0d swap  swap    defaults        0 0

/dev/disk/by-uuid under 3.1.1 looks like this
lrwxrwxrwx. 1 root root 10 Dec  2 21:45 0031da00-1149-44e2-a534-0cc08e498b4b -> ../../sda1
lrwxrwxrwx. 1 root root 11 Dec  2 21:45 3b4be705-bb3b-462e-86eb-5b0f2e54de53 -> ../../md125
lrwxrwxrwx. 1 root root 11 Dec  2 21:45 65ae55a5-428e-464f-985a-9c2729472abf -> ../../md126
lrwxrwxrwx. 1 root root 10 Dec  2 21:45 66864DB2864D8411 -> ../../sdc1
lrwxrwxrwx. 1 root root 11 Dec  2 21:45 77ab7452-7984-48ae-b24d-948fc8a8568e -> ../../md127
lrwxrwxrwx. 1 root root 10 Dec  2 21:45 8565e75e-bdaf-48b7-bd74-41638feb5cc3 -> ../../sdb1
lrwxrwxrwx. 1 root root 11 Dec  2 21:45 9f8f08eb-b15f-4e9c-9627-2a3f64af2d0d -> ../../md124
lrwxrwxrwx. 1 root root 11 Dec  2 21:45 b17947f1-0210-40e0-91a2-af86888f2e61 -> ../../md123

under 3.1.2 it contains only:
lrwxrwxrwx. 1 root root 10 Dec  1 22:25 0031da00-1149-44e2-a534-0cc08e498b4b -> ../../sda1
lrwxrwxrwx. 1 root root 11 Dec  1 22:25 3b4be705-bb3b-462e-86eb-5b0f2e54de53 -> ../../md125
lrwxrwxrwx. 1 root root 11 Dec  1 22:25 65ae55a5-428e-464f-985a-9c2729472abf -> ../../md127
lrwxrwxrwx. 1 root root 10 Dec  1 22:25 66864DB2864D8411 -> ../../sdc1
lrwxrwxrwx. 1 root root 10 Dec  1 22:25 8565e75e-bdaf-48b7-bd74-41638feb5cc3 -> ../../sdb1
lrwxrwxrwx. 1 root root 11 Dec  1 22:25 9f8f08eb-b15f-4e9c-9627-2a3f64af2d0d -> ../../md124
lrwxrwxrwx. 1 root root 11 Dec  1 22:25 b17947f1-0210-40e0-91a2-af86888f2e61 -> ../../md123
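The comparison between fstab and the 3.1.2 listing can be sketched in Python as below. This is only a rough illustration: the function names (fstab_uuids, missing_mounts) are made up for this example, and the data is copied from the fstab and the 3.1.2 by-uuid listing above.

```python
# fstab entries as posted in this comment.
FSTAB = """\
UUID=3b4be705-bb3b-462e-86eb-5b0f2e54de53 /     ext4    defaults        1 1
UUID=0031da00-1149-44e2-a534-0cc08e498b4b /boot ext4    defaults        1 2
UUID=b17947f1-0210-40e0-91a2-af86888f2e61 /home ext4    defaults        1 2
UUID=77ab7452-7984-48ae-b24d-948fc8a8568e /srv  ext4    defaults        1 2
UUID=9f8f08eb-b15f-4e9c-9627-2a3f64af2d0d swap  swap    defaults        0 0
"""

# Names found in /dev/disk/by-uuid under 3.1.2, copied from the listing above.
BY_UUID_312 = {
    "0031da00-1149-44e2-a534-0cc08e498b4b",
    "3b4be705-bb3b-462e-86eb-5b0f2e54de53",
    "65ae55a5-428e-464f-985a-9c2729472abf",
    "66864DB2864D8411",
    "8565e75e-bdaf-48b7-bd74-41638feb5cc3",
    "9f8f08eb-b15f-4e9c-9627-2a3f64af2d0d",
    "b17947f1-0210-40e0-91a2-af86888f2e61",
}


def fstab_uuids(text):
    """Map each UUID= entry in fstab text to its mount point."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].startswith("UUID="):
            result[fields[0][len("UUID="):]] = fields[1]
    return result


def missing_mounts(fstab_text, present_uuids):
    """Return mount points whose UUID has no entry in present_uuids."""
    wanted = fstab_uuids(fstab_text)
    return sorted(mp for uuid, mp in wanted.items() if uuid not in present_uuids)


if __name__ == "__main__":
    print(missing_mounts(FSTAB, BY_UUID_312))
```

On a running system the second argument could presumably be built with set(os.listdir("/dev/disk/by-uuid")) instead of the hard-coded set.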

under 3.1.1 /dev/md/md-device-map contains
md126 0.90 0961aae9:0b98a804:bfe78010:bc810f04 /dev/md/127_0
md125 0.90 f855eb39:443da9de:bfe78010:bc810f04 /dev/md125
md123 0.90 391be442:47601378:bfe78010:bc810f04 /dev/md123
md124 0.90 5dbed8e7:37c995a2:bfe78010:bc810f04 /dev/md124
md127 0.90 7ab39582:ab258c2f:c70c5873:4d3b1ccc /dev/md127

under 3.1.2 /dev/md/md-device-map only contains
md123 0.90 391be442:47601378:bfe78010:bc810f04 /dev/md123
md125 0.90 f855eb39:443da9de:bfe78010:bc810f04 /dev/md125
md124 0.90 5dbed8e7:37c995a2:bfe78010:bc810f04 /dev/md124
md127 0.90 0961aae9:0b98a804:bfe78010:bc810f04 /dev/md/126_0

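So the array that was md126 under 3.1.1 shows up as md127 under 3.1.2, and the array that used to be md127 (the one holding /srv, filesystem UUID 77ab7452-...) is missing.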
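
The same comparison of the two md-device-map files can be done mechanically by keying on the array UUID (third column) rather than the md name. A minimal sketch, assuming the three leading columns are name, metadata version and array UUID as they appear above; parse_map and compare_maps are names invented for this example.

```python
# md-device-map contents as posted above.
MAP_311 = """\
md126 0.90 0961aae9:0b98a804:bfe78010:bc810f04 /dev/md/127_0
md125 0.90 f855eb39:443da9de:bfe78010:bc810f04 /dev/md125
md123 0.90 391be442:47601378:bfe78010:bc810f04 /dev/md123
md124 0.90 5dbed8e7:37c995a2:bfe78010:bc810f04 /dev/md124
md127 0.90 7ab39582:ab258c2f:c70c5873:4d3b1ccc /dev/md127
"""

MAP_312 = """\
md123 0.90 391be442:47601378:bfe78010:bc810f04 /dev/md123
md125 0.90 f855eb39:443da9de:bfe78010:bc810f04 /dev/md125
md124 0.90 5dbed8e7:37c995a2:bfe78010:bc810f04 /dev/md124
md127 0.90 0961aae9:0b98a804:bfe78010:bc810f04 /dev/md/126_0
"""


def parse_map(text):
    """Return {array_uuid: md_name} for every line of md-device-map text."""
    entries = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            name, _metadata, uuid = fields[:3]
            entries[uuid] = name
    return entries


def compare_maps(old_text, new_text):
    """Return (missing, renamed) arrays, both keyed by array UUID."""
    old, new = parse_map(old_text), parse_map(new_text)
    missing = {u: old[u] for u in old if u not in new}
    renamed = {u: (old[u], new[u]) for u in old if u in new and old[u] != new[u]}
    return missing, renamed


if __name__ == "__main__":
    missing, renamed = compare_maps(MAP_311, MAP_312)
    print("missing:", missing)
    print("renamed:", renamed)
```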

Comment 4 Dmitri A. Sergatskov 2011-12-05 17:52:14 UTC
Created attachment 541036 [details]
kernel panic

3.1.2-1 kernel panic

Comment 5 Dmitri A. Sergatskov 2011-12-05 17:54:26 UTC
I lost the comments for the kernel panic attachment I posted.

This is a fresh install/update on a Dell T3500 (sold as certified for RHEL).

Hardware profile:

http://www.smolts.org/client/show/pub_84bf7aaf-f2b3-4c66-a2cd-c01a9c4fb6ba

3.1.0-7.fc16.x86_64 works fine.

Dmitri.

Comment 6 Dmitri A. Sergatskov 2011-12-05 17:55:16 UTC
Comment on attachment 541036 [details]
kernel panic

hardware profile:
http://www.smolts.org/client/show/pub_84bf7aaf-f2b3-4c66-a2cd-c01a9c4fb6ba

Comment 7 Dmitri A. Sergatskov 2011-12-05 21:51:43 UTC
For some reason the update did not work the first time. After I reinstalled,
everything works fine. So disregard my report, and sorry for the noise.

Dmitri.

Comment 8 lars@bistromatic.de 2011-12-07 10:46:57 UTC
Works again with kernel-3.1.4-1.fc16.x86_64.

Comment 9 lars@bistromatic.de 2011-12-07 13:13:04 UTC
Sorry, I have to reopen this bug.

3.1.4-1.fc16.x86_64 works only after warm booting from 3.1.1-2.fc16.x86_64.
Doing a cold boot results in the reported bug.

Comment 10 Josh Boyer 2012-01-24 20:40:43 UTC
Does the 3.2.1 update make this any better?

Comment 11 lars@bistromatic.de 2012-01-24 22:11:46 UTC
No, this bug still exists in kernel-3.2.1-3.

Comment 12 Ron Yorston 2012-02-18 21:55:06 UTC
I've just been struggling with a similar problem. I too have a number of old MD RAID partitions that weren't being assembled properly. This was causing the hang after 'Started monitoring of LVM2 mirrors'.

What seemed to fix it for me was removing 'rd.md=0' from the kernel command line and adding 'ARRAY UUID=' lines to mdadm.conf for each of the arrays, but without specifying a device name. (When I did specify names one of the arrays was somehow grabbing a name that didn't belong to it. This resulted in the array that was supposed to have that name not being assembled.)
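
For illustration, name-less ARRAY lines of that kind could be generated from an md-device-map file. This is a sketch only: the input is the 3.1.1 map posted in comment 3 (the original reporter's arrays, not mine), array_lines is a made-up helper, and the output should be reviewed by hand before anything is written to mdadm.conf.

```python
# md-device-map contents from comment 3 (3.1.1 boot), used as sample input.
DEVICE_MAP = """\
md126 0.90 0961aae9:0b98a804:bfe78010:bc810f04 /dev/md/127_0
md125 0.90 f855eb39:443da9de:bfe78010:bc810f04 /dev/md125
md123 0.90 391be442:47601378:bfe78010:bc810f04 /dev/md123
md124 0.90 5dbed8e7:37c995a2:bfe78010:bc810f04 /dev/md124
md127 0.90 7ab39582:ab258c2f:c70c5873:4d3b1ccc /dev/md127
"""


def array_lines(device_map_text):
    """Build one 'ARRAY UUID=...' line per array, with no device name."""
    lines = []
    for line in device_map_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            # Third column is the array UUID; the md name is deliberately
            # left out so that no array can claim another array's name.
            lines.append("ARRAY UUID=" + fields[2])
    return lines


if __name__ == "__main__":
    print("\n".join(array_lines(DEVICE_MAP)))
```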

Comment 13 lars@bistromatic.de 2012-02-21 11:16:24 UTC
Starting with kernel-3.2.2-1.fc16, the system boots without any hangs or detours into single-user mode (maybe the mdadm-3.2.3-3.fc16 update is related to that change). However, two of my RAID partitions still degrade from time to time (while kernel-3.1.1-2.fc16 still works flawlessly, without any RAID hiccups).

So I have a lot of practice doing things like
mdadm --zero-superblock /dev/sdb6
mdadm --add /dev/md123 /dev/sdb6
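
Since this gets repetitive, the two steps could be wrapped in a small dry-run helper like the sketch below. The function names (readd_commands, run) are hypothetical, and the helper only prints the commands unless dry_run is set to False; actually running them needs root and wipes the md superblock on the given member, so the device names must be double-checked first.

```python
import shlex
import subprocess


def readd_commands(array, member):
    """Return the two mdadm invocations shown above as argv lists."""
    return [
        ["mdadm", "--zero-superblock", member],
        ["mdadm", "--add", array, member],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    printed = []
    for argv in commands:
        text = shlex.join(argv)
        printed.append(text)
        print(text)
        if not dry_run:
            # check=True stops at the first failing command.
            subprocess.run(argv, check=True)
    return printed


if __name__ == "__main__":
    run(readd_commands("/dev/md123", "/dev/sdb6"))
```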

I had to build a new mdadm.conf after installing some updates anyway, because one RAID partition was missing (I don't remember which update it was; maybe mdadm-3.2.3-3.fc16 or kernel-3.2.2/3). The missing array was neither mounted nor listed in fstab at the time of the update, which might be why it was removed from the old mdadm.conf.

Comment 14 Dave Jones 2012-03-22 16:50:23 UTC
[mass update]
kernel-3.3.0-4.fc16 has been pushed to the Fedora 16 stable repository.
Please retest with this update.

Comment 17 lars@bistromatic.de 2012-03-25 07:40:31 UTC
I've replaced my hard disks and switched to 1.2-format superblocks.
The problems are gone now, and I can no longer retest with the original configuration, so I'm closing this bug.