Bug 466867

Summary: kernel-2.6.27-3.fc10.i686 hang/lock up on boot
Product: [Fedora] Fedora
Reporter: Γριφεγ Γ. <grfgguvf+2>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: urgent
Priority: medium
Version: rawhide
CC: gvarisco, kernel-maint, quantumburnz, ralston
Hardware: i686
OS: Linux
Doc Type: Bug Fix
Last Closed: 2008-10-22 19:39:00 UTC

Description Γριφεγ Γ. 2008-10-14 08:41:23 UTC
Description of problem:
kernel-2.6.27-3.fc10.i686 hangs on boot. It appears to be a kernel hang, because not even Alt+SysRq+b reboots the machine; the computer has to be unplugged.
kernel-2.6.27-1.fc10.i686 works fine.

Version-Release number of selected component (if applicable):
kernel-2.6.27-3.fc10.i686

How reproducible:
100%

Steps to Reproduce:
1. Upgrade to kernel-2.6.27-3.fc10.i686.
2. Try booting with the default setup (plymouth, etc.).
Actual results:
The system hangs.

Expected results:
The system boots to a usable Fedora desktop.

Additional info:

Comment 1 Gianluca Varisco 2008-10-14 08:44:45 UTC
Hi,

Can you attach any additional information about the issue? Does it hang with an error displayed on the screen?

Thanks.

Comment 2 Γριφεγ Γ. 2008-10-14 09:16:41 UTC
Hi,

No. Apparently it's not 100% reproducible. Here's what I did:

At bootup I pressed Esc to hide the Plymouth progress bars, in the hope of seeing some error message. But then there was no hang.

The earlier hangs happened at around the time when I am asked for the encrypted partition password.

I will test more.

Comment 3 James Ralston 2008-10-14 15:19:17 UTC
This is 100% reproducible for me, on my Dell Latitude D620 laptop.

When I boot without "rhgb quiet", after unlocking my LUKS-encrypted partitions, I see these messages:

don't know how to make device "loop2"
don't know how to make device "loop3"
don't know how to make device "loop4"
don't know how to make device "loop5"
don't know how to make device "loop6"
don't know how to make device "loop7"
don't know how to make device "lp1"
don't know how to make device "lp2"
don't know how to make device "lp3"
[...]
Remounting root filesystem in read-write mode: [ OK ]
Mounting local filesystems: [ OK ]
Enabling local filesystem quotas: [ OK ]
Session finished...exiting logger [ OK ]

And that's it.  There's no further disk activity; the system just sits there.

If I press Ctrl-Alt-Del, the following two messages appear:

init: rcS main process (726) killed by TERM signal [ OK ]
Entering non-interactive startup

Pressing Ctrl-Alt-Del again just alternates between these two messages, one at a time; my only recourse is to hold the power switch until the system powers off.

As with the original reporter, 2.6.27-1 works fine.

Comment 4 James Ralston 2008-10-14 15:35:11 UTC
Ok, I see now that the "don't know how to make device" messages are bug 466385; they probably aren't related to this problem.

Bizarrely, I am seeing that some older kernels now hang in the exact same manner, whereas before they worked perfectly:

2.6.27-0.166.rc0.git8: works
2.6.27-0.370.rc8: hangs
2.6.27-1: works
2.6.27-3: hangs

Perhaps this is really a plymouth bug, and only certain kernels manage to trigger it?

Comment 5 James Ralston 2008-10-22 19:21:47 UTC
Kernels newer than 2.6.27-3 have *NOT* been hanging for me.  I've explicitly tested:

kernel-2.6.27-13
kernel-2.6.27.3-27
kernel-2.6.27.3-30

I suspect that whatever bug was causing the hangs has been squashed.

Original reporter: do you see hangs with the above kernels?

Comment 6 Christopher D. Stover 2008-10-22 19:39:00 UTC
I'm closing the ticket as fixed. Γριφεγ, please comment on this bug if you're still having problems after upgrading to the newest kernel.

Comment 7 Γριφεγ Γ. 2008-10-23 03:45:10 UTC
The newer kernels have fixed it for me too.

Comment 8 Γριφεγ Γ. 2008-10-29 17:23:44 UTC
Just an interesting bit: this was most probably the same issue that also caused the e1000e corruption: http://lwn.net/SubscriberLink/304105/d32d3f1f921d9b73/