Bug 198704

Summary: LVM2 init may fail if dm_mod is still checking disks
Product: Red Hat Enterprise Linux 4
Component: udev
Version: 4.0
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Hardware: All
OS: Linux
Reporter: Gordon Messmer <gordon.messmer>
Assignee: Harald Hoyer <harald>
CC: agk, dwysocha, jjneely, k.georgiou, mbroz, notting, pknirsch, sal.scotto, ykopkova
Doc Type: Bug Fix
Last Closed: 2009-05-18 20:28:56 UTC
Attachments:
ls -lR /dev, during LVM2 init in rc.sysinit (flags: none)
ls -lR /dev, during LVM2 init in rc.sysinit, after sleeping (flags: none)
strace of vgscan, during LVM2 init in rc.sysinit (flags: none)
strace of vgscan, during LVM2 init in rc.sysinit, after sleeping (flags: none)

Description Gordon Messmer 2006-07-12 23:18:42 UTC
Description of problem:
Some of the time, the LVM2 init section of rc.sysinit is ineffective.
Specifically, vgscan does not find any volume groups.  I believe that the system
started up normally once, but it has not done so since without modifications to
the rc.sysinit file.

I've determined that if I add a 'sleep' to rc.sysinit, immediately before
calling vgscan, vgscan will locate the volume groups.  I'm using 30 seconds now,
but don't know what minimum is required.
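
A bounded poll for the device node is one possible alternative to a fixed
sleep.  The following is only a minimal sketch, not the fix that was shipped:
the helper name wait_for_device, the 30-second limit, and the choice of
/dev/sdb as the node to wait for are all assumptions made for illustration.

```shell
# Hypothetical helper: wait up to $2 seconds (default 30) for path $1 to exist.
# Returns 0 as soon as the path appears, 1 if the timeout expires first.
wait_for_device() {
    dev="$1"
    timeout="${2:-30}"
    waited=0
    while [ ! -e "$dev" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Possible use just before vgscan (device name and timeout are assumptions):
# wait_for_device /dev/sdb 30 || echo "sdb did not appear; vgscan may find nothing" >&2
```

Unlike a fixed sleep, this returns as soon as the node exists, so it costs
nothing on systems where the device is already present.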

I'm guessing that the device mapper hadn't found the volume on sdb yet, but I'm
not sure how I'd prove it without more knowledge of the way that those kernel
components function.

Version-Release number of selected component (if applicable):
initscripts-7.93.24.EL-1.1

How reproducible:
Unsure, but it seems consistent and reproducible here.

Steps to Reproduce:
1. Create LVM device (ours is a 4.55TB volume with just one PV: /dev/sdb)
2. Reboot
3. Log in and run "vgdisplay"; no volume groups are found.
  
Actual results:
vgscan didn't find the volume we'd created, when run during rc.sysinit, unless
there was a brief pause after loading the dm_mod module.

Expected results:
vgscan should find the volume, and vgchange should activate it.

Additional info:
sdb is a RAID5 volume on a 3ware 9550SX card.  We used LVM because we plan to
put a second card and set of disks in the system at some point, and will want to
include those disks in the same filesystem.

Comment 1 Bill Nottingham 2006-07-13 00:13:06 UTC
Hm, can you get a strace of it failing?

Comment 2 Gordon Messmer 2006-07-13 18:11:13 UTC
Yeah, sure.  I traced the process before and after the sleep, and saw that "sdb"
didn't exist in /dev on the first run.  Afterward, I also got the output of 'ls
-lR /dev' before and after the 'sleep'.

Comment 3 Gordon Messmer 2006-07-13 18:12:19 UTC
Created attachment 132391
ls -lR /dev, during LVM2 init in rc.sysinit

Comment 4 Gordon Messmer 2006-07-13 18:12:45 UTC
Created attachment 132392
ls -lR /dev, during LVM2 init in rc.sysinit, after sleeping

Comment 5 Gordon Messmer 2006-07-13 18:14:24 UTC
Created attachment 132393
strace of vgscan, during LVM2 init in rc.sysinit

Comment 6 Gordon Messmer 2006-07-13 18:14:47 UTC
Created attachment 132394
strace of vgscan, during LVM2 init in rc.sysinit, after sleeping

Comment 7 Bill Nottingham 2006-08-01 15:51:21 UTC
How hard would it be to add something like udevsettle to RHEL 4's udev?
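
For illustration, this is roughly how an init script might call such a tool
before vgscan.  It is a sketch under assumptions: it presumes a udevsettle
binary that accepts --timeout=SECONDS, as in later upstream udev releases,
and the wrapper name settle_udev and the 30-second default are hypothetical.

```shell
# Hypothetical wrapper: block until udev has finished processing its queued
# events, or until the timeout (in seconds, default 30) expires.
# Assumes udevsettle accepts --timeout=SECONDS, as in later upstream udev.
settle_udev() {
    timeout="${1:-30}"
    if command -v udevsettle >/dev/null 2>&1; then
        udevsettle --timeout="$timeout"
    else
        # Tool not installed: report that to the caller with a distinct status.
        return 127
    fi
}

# Possible use just before the LVM2 section (placement is an assumption):
# settle_udev 30
```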


Comment 8 Bill Nottingham 2006-08-01 16:36:55 UTC
*** Bug 178728 has been marked as a duplicate of this bug. ***

Comment 9 Harald Hoyer 2006-08-01 16:51:17 UTC
would be possible, not sooo hard.

Comment 10 Harald Hoyer 2007-03-08 13:33:11 UTC
is this still a problem?

Comment 11 Gordon Messmer 2007-03-14 17:54:31 UTC
I may be able to check next week, when summer vacation begins.  The only system
where we see this problem is a file server for our students.  I don't have a
test system on which to reproduce the problem outside of the production environment.

Comment 12 Gordon Messmer 2007-03-19 21:02:53 UTC
We updated the system today, and it seems to be fixed.  If the problem comes
back, I'll reopen this bug.

Comment 13 Jack Neely 2007-08-10 22:02:22 UTC
I just encountered this bug on a fully updated RHEL 4.4 system (which means it
is effectively RHEL 4.5).  I inserted a "sleep 20" around line 512.  After the
sleep, the device /dev/sdb exists and the vgchange command activates its
logical volumes.

sdb is a 10T JetStor 516F connected via 4Gb FC to a QLogic QLE2460 HBA and the
hardware platform is a Dell 2950.

Comment 15 RHEL Program Management 2008-09-05 17:12:08 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 20 errata-xmlrpc 2009-05-18 20:28:56 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-1004.html