Bug 592967

Summary: Boot fails with LVM logging to file and / not on LV
Product: Red Hat Enterprise Linux 5
Component: lvm2
Version: 5.5
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: high
Reporter: David Kovalsky <dkovalsk>
Assignee: Dave Wysochanski <dwysocha>
QA Contact: Corey Marthaler <cmarthal>
CC: agk, benl, coughlan, dwysocha, heinzm, jbrassow, joe.thornber, jturner, mbroz, prockai
Target Milestone: rc
Keywords: Regression, ZStream
Fixed In Version: lvm2-2.02.56-12.el5
Doc Type: Bug Fix
Last Closed: 2011-01-13 22:41:45 UTC
Bug Blocks: 601079    

Description David Kovalsky 2010-05-17 13:56:57 UTC
When upgrading from the 5.4 kernel 2.6.18-164.15.1 to the 5.5 kernel 2.6.18-194, I hit a regression: the system no longer boots.

My setup: 
/dev/md2 -> boot
/dev/md3 -> root

/dev/md100 -> PV -> VG -> logical volumes


snippets from fstab:
"""
/dev/md3   /        ext3    noatime,nodiratime  1 1
/dev/md2   /boot    ext3    noatime,nodiratime  1 2 
LABEL=XENSAVE   /var/lib/xen/save/  ext3 noatime,nodiratime,noexec 1 2 
"""

# lvs |grep -i xensave
  xensave          vg_store02   -wi-ao   8.00G  

In /etc/lvm/lvm.conf, if I uncomment the log file setting on line 151, the system doesn't boot with the 5.5 kernel. The system logs that / is read-only and that lvm2.log cannot be opened. In the rescue shell I found that vgchange, vgs and lvs all fail because of the log file open error. Since the logical volumes are not available, /var/lib/xen/save is not mounted and the system stops booting.
139 log {
140
141     # Controls the messages sent to stdout or stderr.
142     # There are three levels of verbosity, 3 being the most verbose. 
143     verbose = 0
144
145     # Should we send log messages through syslog?
146     # 1 is yes; 0 is no.
147     syslog = 1
148
149     # Should we log error and debug messages to a file?
150     # By default there is no log file.
151     file = "/var/log/lvm2.log"
152
153     # Should we overwrite the log file each time the program is run?
154     # By default we append.
155     overwrite = 0   
...



Facts: 
 * works OK with all 5.4 kernels
 * works OK if the root partition is on LVM. LVM then gets activated (though the read-only filesystem error is still printed)
 * doesn't boot with 5.5 kernel with root NOT on LVM
 * commenting out line 151 makes the system bootable in all configurations I tried

kernel-2.6.18-194.3.1.el5
lvm2-2.02.56-8.el5_5.1

Comment 2 Alasdair Kergon 2010-05-28 20:21:01 UTC
Review create_toolcontext().

Are the liblvm requirements *really* different from the normal command line tool ones?

What is the 'if (stored_errno)' test actually meant for, given that the function already returns NULL on failure?  Should the field be cleared after operations we don't care about failing?

Comment 3 Dave Wysochanski 2010-06-01 15:49:37 UTC
I don't understand why we added this code in init_lvm():
	if (stored_errno()) {
		destroy_toolcontext(cmd);
		return_NULL;
	}

liblvm returns cmd in this case - it does not tear down the context.  So the tools seem to have become more restrictive than liblvm, which is the bug IMO.  We should revert the above code.

Comment 4 Dave Wysochanski 2010-06-01 16:40:46 UTC
I take back what I said in comment #3 about reverting the code. I agree with comment #2 and the IRC discussion between agk and kabi: we should call reset_lvm_errno(1) at various points in create_toolcontext(), after init functions that do not return an error or whose error is ignored. Perhaps the reset should go inside the specific init function.
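
For readers following the thread, here is a minimal C sketch of the pattern under discussion. It is not the LVM2 source. Every name in it (sticky_errno, record_errno, init_log_file, create_context, and the clear_after_optional flag) is a hypothetical stand-in chosen for illustration. It assumes an init step that records its errno in a shared slot, and a final check that abandons setup if that slot is non-zero.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; none of these are real LVM2 functions. */
static int sticky_errno;                 /* last error recorded by any init step */

static void record_errno(int err)    { sticky_errno = err; }
static int  get_sticky_errno(void)   { return sticky_errno; }
static void clear_sticky_errno(void) { sticky_errno = 0; }

struct context {
    FILE *log;                           /* NULL when file logging is unavailable */
};

/* Optional step: a log file that cannot be opened is reported, not fatal. */
static void init_log_file(struct context *ctx, const char *path)
{
    ctx->log = fopen(path, "a");
    if (!ctx->log) {
        record_errno(errno);
        perror(path);
    }
}

/*
 * Returns 1 when the context is usable, 0 when setup is abandoned.
 * With clear_after_optional == 0 the stale errno left by the optional
 * step trips the final check; with 1 it is cleared first.
 */
static int create_context(struct context *ctx, const char *log_path,
                          int clear_after_optional)
{
    assert(ctx != NULL);
    clear_sticky_errno();
    ctx->log = NULL;

    init_log_file(ctx, log_path);
    if (clear_after_optional)
        clear_sticky_errno();

    /* Mandatory init steps would run here and record their own errors. */

    return get_sticky_errno() == 0;
}
```

With a log path that cannot be opened, create_context(&ctx, path, 0) gives up, which mirrors the boot failure in this report, while create_context(&ctx, path, 1) succeeds with file logging simply disabled.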

Comment 5 Dave Wysochanski 2010-06-01 21:50:37 UTC
Two patches were checked in upstream: one resolves this issue, and the second fixes a related init issue (if init_rand fails).

Comment 9 Milan Broz 2010-06-07 10:37:47 UTC
Fixed in lvm2-2.02.56-12.el5.

Comment 12 Corey Marthaler 2010-11-08 23:29:00 UTC
Testing mentioned in comment #10 passed in the latest rpm (lvm2-2.02.74-1.el5). Marking verified.

[root@grant-01 tmp]# pvscan
  /tmp/log/bar/foo/coreys_fake_file.log: fopen failed: No such file or directory
    Logging initialised at Mon Nov  8 17:27:05 2010
    Set umask to 0077
  read_urandom: /dev/urandom: open failed: No such file or directory
    Wiping cache of LVM-capable devices
    Wiping internal VG cache
    Walking through all physical volumes
  PV /dev/sdc1   VG centipede    lvm2 [54.49 GB / 54.49 GB free]
  PV /dev/sdc2   VG centipede    lvm2 [54.49 GB / 54.49 GB free]
  PV /dev/sdc3   VG centipede    lvm2 [54.48 GB / 54.48 GB free]
  PV /dev/sdc5   VG centipede    lvm2 [54.49 GB / 54.49 GB free]
  PV /dev/sdc6   VG centipede    lvm2 [54.48 GB / 54.48 GB free]
  PV /dev/sdb1   VG centipede    lvm2 [40.87 GB / 40.87 GB free]
  PV /dev/sdb2   VG centipede    lvm2 [40.87 GB / 40.87 GB free]
  PV /dev/sdb3   VG centipede    lvm2 [40.87 GB / 40.87 GB free]
  PV /dev/sdb5   VG centipede    lvm2 [40.88 GB / 40.88 GB free]
  PV /dev/sda2   VG VolGroup00   lvm2 [74.38 GB / 0    free]
  Total: 10 [510.30 GB] / in use: 10 [510.30 GB] / in no VG: 0 [0   ]
    Wiping internal VG cache
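
The behaviour verified above (the log file cannot be opened, a message is printed, and the command carries on) could look roughly like this in C. This is an illustrative sketch only, not LVM2 source. The names open_log_or_null and run_with_optional_log are hypothetical, and the message format merely imitates the output shown in this comment.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: try to open the log file for appending.
 * On failure, print a diagnostic, store the errno value in *saved_errno
 * and return NULL so the caller can decide how to proceed.
 */
static FILE *open_log_or_null(const char *path, int *saved_errno)
{
    assert(saved_errno != NULL);

    FILE *fp = fopen(path, "a");
    *saved_errno = fp ? 0 : errno;
    if (!fp)
        fprintf(stderr, "  %s: fopen failed: %s\n", path,
                strerror(*saved_errno));
    return fp;
}

/*
 * Hypothetical command body: file logging is optional, so a missing
 * log file does not change the return value. Returns 0 on success.
 */
static int run_with_optional_log(const char *log_path)
{
    int err;
    FILE *log = open_log_or_null(log_path, &err);

    /* The real work of the command (scanning, activation) would go here. */

    if (log) {
        fputs("Logging initialised\n", log);
        fclose(log);
    }
    return 0;
}
```

Calling run_with_optional_log() with a path inside a directory that does not exist prints the fopen diagnostic to stderr and still returns 0.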

Comment 14 errata-xmlrpc 2011-01-13 22:41:45 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0052.html