Bug 837603

Summary: [lvmetad] Metadata not rescanned after lvmetad is disabled and enabled in lvm.conf
Product: Red Hat Enterprise Linux 6 Reporter: Marian Csontos <mcsontos>
Component: lvm2 Assignee: Petr Rockai <prockai>
Status: CLOSED ERRATA QA Contact: Cluster QE <mspqa-list>
Severity: high Docs Contact:
Priority: high    
Version: 6.3 CC: agk, cmarthal, coughlan, dwysocha, heinzm, jbrassow, msnitzer, prajnoha, prockai, thornber, zkabelac
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: lvm2-2.02.98-1.el6 Doc Type: Known Issue
Doc Text:
When the administrator disables use of the lvmetad daemon in the lvm.conf file but the daemon is still running, the cached metadata are kept until the daemon is restarted. However, if the use_lvmetad parameter in lvm.conf is reset to 1 without an intervening lvmetad restart, the cached metadata can be incorrect. Consequently, VG metadata can be overwritten with previous versions. To work around this problem, stop the lvmetad daemon manually when disabling use_lvmetad in lvm.conf. The daemon can only be restarted after use_lvmetad has been set to 1. To recover from an out-of-sync lvmetad cache, execute the pvscan --cache command or restart lvmetad. To restore metadata to correct versions, use vgcfgrestore with a corresponding file in /etc/lvm/archive.
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-02-21 08:11:28 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
test none
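
The workaround and recovery steps from the Doc Text above can be sketched as a few shell helpers. This is only an illustration under stated assumptions: the init script name lvm2-lvmetad is assumed for RHEL 6, and DRY_RUN, run and the four function names are hypothetical, not part of lvm2. With DRY_RUN left at its default the helpers only print the commands they would run.

```shell
#!/bin/sh
# Sketch of the Doc Text workaround; nothing here ships with lvm2.
# Assumption: the RHEL 6 init script is called lvm2-lvmetad.
DRY_RUN=${DRY_RUN:-1}   # 1 = print commands only, 0 = really run them

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Call this whenever use_lvmetad is switched to 0 in lvm.conf.
disable_lvmetad() {
    run service lvm2-lvmetad stop
}

# Call this only after use_lvmetad has been set back to 1.
enable_lvmetad() {
    run service lvm2-lvmetad start
}

# Refresh a cache that is already out of sync.
resync_lvmetad() {
    run pvscan --cache
}

# Restore VG metadata from an archive file picked by the administrator.
# usage: restore_vg <archive-file> <vg-name>
restore_vg() {
    run vgcfgrestore -f "$1" "$2"
}
```

The archive file passed to restore_vg has to be chosen by hand from /etc/lvm/archive; the sketch does not try to guess which version is the correct one.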

Description Marian Csontos 2012-07-04 11:18:07 UTC
Created attachment 596193
test

Description of problem:
`lvmetad` is neither deactivated, nor is its cached metadata flushed, when lvmetad is disabled in lvm.conf. This may lead to VG corruption.

Version-Release number of selected component (if applicable):
lvm2-2.02.95-10.el6

How reproducible:
100%

Steps to Reproduce:
# lvmetad is running
1. edit lvm.conf and turn lvmetad off
2. do changes to the VG
3. edit lvm.conf and turn lvmetad on
4. do lvs/vgs/pvs
5. do more changes to VG - see attached test2.sh
  
Actual results:
The old metadata are now considered valid and any LV operations work with them.

Expected results:
This should either fail with CRC check or lvmetad should be stopped when disabled in lvm.conf.

Additional info:

Comment 2 Petr Rockai 2012-09-10 14:24:05 UTC
Sadly, there is no easy way out. Stopping lvmetad is something an admin needs to do manually. I would propose a big warning in the lvm.conf comments, maybe? We have a mechanism to notice filter changes in lvm.conf, but entirely disabling lvmetad means we can no longer talk to it. Moreover, since we can run multiple instances of lvmetad based on different configuration files, we can't really go around killing lvmetad if *this process* is not using it. Also, overriding use_lvmetad to 0 on the command line is a legitimate thing to do in specific situations, and should not get lvmetad killed. Thoughts?

Comment 3 Petr Rockai 2012-10-09 13:59:19 UTC
I have checked in an lvm.conf warning in d414fe28fa1754690e8c72a16df5c2cdc1cc87e1. I don't think much else can be done for this report, apart from mentioning the problem in documentation. Also, it is something much more easily encountered in testing than in production. I suppose a technical solution would be possible, but quite impractical (e.g. forcing a lot of re-scanning, say a full re-scan upon any lvm.conf change). I'll close this based on the documentation fix alone, but if there is a consensus that a technical solution would be preferable even despite the downsides, please re-open the bug.

Comment 5 Alasdair Kergon 2012-10-09 18:52:05 UTC
I'm wondering if we can stop with a warning if the metad setting is off in lvm.conf but we detect it is running (pidfile?).  That combination doesn't seem valid to me except on test systems (not a concern) or as a workaround within vgimportclone (which should be solved a better way in future).  Despite all the potential logging/documentation, I think we should try to do more to make it harder for this situation to arise.
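
The check suggested here can be sketched in shell as follows. This is an illustration only, not the code that went into lvm2: the function name is hypothetical, the caller is assumed to supply the use_lvmetad value and the pidfile path, and the message text is the one shown in the verification output later in this report.

```shell
#!/bin/sh
# Hypothetical helper: warn when lvmetad is disabled in the configuration
# but its pidfile is still present.
# usage: warn_if_lvmetad_stale <use_lvmetad value> <pidfile path>
warn_if_lvmetad_stale() {
    use_lvmetad=$1
    pidfile=$2
    if [ "$use_lvmetad" = 0 ] && [ -e "$pidfile" ]; then
        echo "WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!" >&2
        return 1    # the caller decides whether this is fatal
    fi
    return 0
}
```

A pidfile can outlive a crashed daemon, so a check like this can warn when lvmetad is not actually running; that is a reason to leave the fatal-or-not decision to the caller.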

Comment 6 Petr Rockai 2012-10-09 19:15:10 UTC
I think a non-fatal warning is a reasonable compromise for now. Making the condition fatal risks introducing significant bugs and/or breaking scripts. Either way, we can re-consider in a later release. Knowing whether the warning is seen in the field (and under what conditions) would be a useful input for deciding whether it should be fatal or not.

Testing for this bug should then consist of checking that a warning is printed by all LVM commands whenever lvmetad is running while use_lvmetad is set to 0 in lvm.conf. I'll add an upstream test to that effect.
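
A manual version of that check could look like the sketch below. It assumes lvmetad is running and use_lvmetad is already set to 0; the function name check_warning and the variable WARNING_TEXT are hypothetical, and the pattern is the same string the upstream test in this report greps for.

```shell
#!/bin/sh
# Pattern to look for in each command's output.
WARNING_TEXT="lvmetad is running"

# usage: check_warning <command>...
# Runs each command without arguments and succeeds only if every one of them
# prints the warning on stdout or stderr.
check_warning() {
    missing=0
    for cmd in "$@"; do
        if ! "$cmd" 2>&1 | grep -q "$WARNING_TEXT"; then
            echo "no warning from: $cmd"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_warning pvs vgs lvs pvscan vgscan lvscan` covers the reporting commands. Commands that fail for lack of arguments still count, because only their output is inspected.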

Comment 7 Petr Rockai 2012-10-10 12:56:30 UTC
I have also added a runtime warning based on the presence of the lvmetad pidfile when use_lvmetad is 0 (commit 71d718a4a4202d0e8a10bed1551878c68dd99a59). This should cover our bases well enough. See comment 6 for how to test this. The upstream test looks like this:

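. lib/test

# Skip unless a test-local lvmetad was started (pid recorded in LOCAL_LVMETAD); stop it.
test -e LOCAL_LVMETAD || skip
kill $(cat LOCAL_LVMETAD)

# Skip if a pidfile is still present, then start a fresh daemon.
test -e $LVMETAD_PIDFILE && skip
lvmetad
test -e $LVMETAD_PIDFILE
cp $LVMETAD_PIDFILE LOCAL_LVMETAD
# No warning expected while use_lvmetad is still enabled.
pvs 2>&1 | not grep "lvmetad is running"
# Disable use_lvmetad with the daemon still running: the warning must appear.
aux lvmconf "global/use_lvmetad = 0"
pvs 2>&1 | grep "lvmetad is running"

# Stop the daemon and check that its pidfile is gone.
kill $(cat $LVMETAD_PIDFILE)
not ls $LVMETAD_PIDFILE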

Comment 9 Corey Marthaler 2013-01-25 23:20:10 UTC
Verified that the warning is printed by the following commands after disabling lvmetad in lvm.conf.


2.6.32-354.el6.x86_64
lvm2-2.02.98-9.el6    BUILT: Wed Jan 23 10:06:55 CST 2013
lvm2-libs-2.02.98-9.el6    BUILT: Wed Jan 23 10:06:55 CST 2013
lvm2-cluster-2.02.98-9.el6    BUILT: Wed Jan 23 10:06:55 CST 2013
udev-147-2.43.el6    BUILT: Thu Oct 11 05:59:38 CDT 2012
device-mapper-1.02.77-9.el6    BUILT: Wed Jan 23 10:06:55 CST 2013
device-mapper-libs-1.02.77-9.el6    BUILT: Wed Jan 23 10:06:55 CST 2013
device-mapper-event-1.02.77-9.el6    BUILT: Wed Jan 23 10:06:55 CST 2013
device-mapper-event-libs-1.02.77-9.el6    BUILT: Wed Jan 23 10:06:55 CST 2013
cmirror-2.02.98-9.el6    BUILT: Wed Jan 23 10:06:55 CST 2013


[root@taft-01 ~]# pvs
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  [...]
[root@taft-01 ~]# lvs
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  [...]
[root@taft-01 ~]# vgs
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  [...]
[root@taft-01 ~]# pvscan
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  [...]
[root@taft-01 ~]# vgscan
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  [...]
[root@taft-01 ~]# lvscan
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  [...]
[root@taft-01 ~]# lvcreate
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  [...]
[root@taft-01 ~]# vgcreate
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  [...]
[root@taft-01 ~]# pvcreate
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  [...]
[root@taft-01 ~]# lvconvert
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  [...]
[root@taft-01 ~]# lvchange
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  [...]

Comment 10 errata-xmlrpc 2013-02-21 08:11:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0501.html