Bug 799834

Summary: Stuck during booting
Product: Red Hat Enterprise Linux 6 Reporter: Andrey Vagin <avagin>
Component: kernel Assignee: Frederic Weisbecker <fweisbec>
Status: CLOSED DUPLICATE QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 6.4 CC: avagin, harald, khorenko, syeghiay, vvs, ykawada
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2012-07-31 16:18:20 UTC Type: ---
Bug Blocks: 782183, 846704    
Attachments (description, flags):
console log with udev messages (flags: none)
A patch which probably fix this bug. (flags: none)

Description Andrey Vagin 2012-03-05 08:08:26 UTC
Created attachment 567500 [details]
console log with udev messages

Description of problem:

I run RHEL 6 in virtual machines. Sometimes they hang at the boot screen. Booting continues if any device is hot-plugged (which provokes a udev event).

I found that "udevadm settle" returns the same error each time:
queue is empty but kernel events still pending [928]<->[925]

Then I added the options rdshell and rdudevinfo and got udev.log, which is attached.
From the log I found that the kernel delivers events out of sequence order:

udevd[77]: seq 924 queued, 'add' 'bdi'
udevd[77]: seq 926 queued, 'add' 'block'
udevd[77]: seq 927 queued, 'add' 'block'
udevd[77]: seq 928 queued, 'add' 'blockserial8250: too much work for irq4
udevd[77]: seq 925 queued, 'add' 'drivers'

This is why the boot process hangs.
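To make the mismatch concrete, here is a minimal sketch in C. It is a simplified model, not udev's code, and the function names are hypothetical. It assumes only what the log and the update_queue() snippet in this report show: the recorded sequence number is overwritten by each queued event, so a late event with a lower number leaves the record behind the highest number already seen.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: each queued event overwrites the recorded seqnum,
 * as the unconditional assignment in the quoted snippet does. */
static unsigned long long last_recorded_unguarded(const unsigned long long *seqnums,
                                                  size_t n)
{
    unsigned long long recorded = 0;

    assert(seqnums != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        recorded = seqnums[i];      /* a late, lower seqnum wins here */
    return recorded;
}

/* Highest seqnum actually observed, regardless of arrival order. */
static unsigned long long highest_seen(const unsigned long long *seqnums, size_t n)
{
    unsigned long long highest = 0;

    assert(seqnums != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (seqnums[i] > highest)
            highest = seqnums[i];
    return highest;
}
```

Fed the arrival order from the log (924, 926, 927, 928, 925), the first function yields 925 and the second yields 928, the same pair that appears in the "udevadm settle" error.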

Let's look at update_queue(). It contains the following code:

/* now write to the queue */
if (state == DEVICE_QUEUED) {
        udev_queue_export->queued_count++;
        udev_queue_export->seqnum_min = seqnum;
}

where seqnum_min is the latest sequence number in the queue file. We should probably check that seqnum is not less than seqnum_min and update seqnum_min only in that case.

I patched udev this way and the bug no longer reproduces.
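The check described above could look roughly like the following sketch. It is not the attached patch: the struct is a hypothetical stand-in that holds only the two fields named in the snippet, and record_queued() is an illustrative name, not a udev function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two udev_queue_export fields used above. */
struct queue_export_sketch {
    unsigned long long seqnum_min;  /* latest sequence number recorded */
    unsigned int queued_count;      /* number of events queued so far */
};

/* Record a queued event. seqnum_min is updated only when the new seqnum
 * is not less than the stored one. Returns true if it was updated. */
static bool record_queued(struct queue_export_sketch *q, unsigned long long seqnum)
{
    assert(q != NULL);

    q->queued_count++;
    if (seqnum < q->seqnum_min)
        return false;               /* late arrival: keep the newer value */
    q->seqnum_min = seqnum;
    return true;
}
```

With the arrival order 924, 926, 927, 928, 925, the stored value stays at 928 after the late event 925, while the count still reflects all five events.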

Version-Release number of selected component (if applicable):
udev-147-2.40.el6.x86_64
Linux localhost.localdomain 2.6.32-220.el6.x86_64 #1 SMP Wed Nov 9 08:03:13 EST 2011 x86_64 x86_64 x86_64 GNU/Linux

How reproducible:
100% in a specific environment. I don't know how to create such an environment. I currently have a host with four virtual machines where this bug reproduces every time.

Steps to Reproduce:
1. Boot RHEL 6
  
Actual results:
The system doesn't boot; it hangs.

Expected results:
The system boots and does something useful.

Additional info:

Comment 2 Andrey Vagin 2012-03-05 08:27:32 UTC
Created attachment 567514 [details]
A patch which probably fix this bug.

Comment 3 Andrey Vagin 2012-03-07 20:30:19 UTC
Here is a fix for this bug: https://lkml.org/lkml/2012/3/7/107. It will be committed to 2.6.32-stable.

Comment 4 RHEL Program Management 2012-05-03 05:13:39 UTC
Since RHEL 6.3 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 5 Andrey Vagin 2012-05-03 07:52:30 UTC
(In reply to comment #3)
> Here is a fix for this bug https://lkml.org/lkml/2012/3/7/107. It will be
> committed in 2.6.32-stable.

It has not been committed to 2.6.32-stable because that branch has reached the end of its life cycle.

Please commit this patch to the RHEL6 kernel.

Comment 6 RHEL Program Management 2012-07-10 08:29:35 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 7 RHEL Program Management 2012-07-10 23:34:27 UTC
This request was erroneously removed from consideration in Red Hat Enterprise Linux 6.4, which is currently under development.  This request will be evaluated for inclusion in Red Hat Enterprise Linux 6.4.

Comment 8 RHEL Program Management 2012-07-26 19:20:24 UTC
This request was evaluated by Red Hat Product Management for
inclusion in a Red Hat Enterprise Linux release.  Product
Management has requested further review of this request by
Red Hat Engineering, for potential inclusion in a Red Hat
Enterprise Linux release for currently deployed products.
This request is not yet committed for inclusion in a release.

Comment 10 Frederic Weisbecker 2012-07-31 13:23:09 UTC
This looks like a duplicate of: https://bugzilla.redhat.com/show_bug.cgi?id=801694

The fix you mentioned has been applied and is available in kernel-2.6.32-264.el6
Do you have any way to test it and tell us whether this solves your issue?

Thanks.

Comment 11 Andrew Vagin 2012-07-31 13:47:30 UTC
This bug is hard to reproduce. You have in fact committed the patch that I tested, so you can close this bug. Thanks.

Comment 13 Frederic Weisbecker 2012-07-31 16:18:20 UTC
Thanks Andrew for your reply.

I'm closing this ticket.

*** This bug has been marked as a duplicate of bug 801694 ***