Bug 538515

Summary: lvm2-cluster does not properly refresh device cache for newly appeared devices
Product: Red Hat Enterprise Linux 5
Reporter: Shane Bradley <sbradley>
Component: lvm2-cluster
Assignee: Milan Broz <mbroz>
Status: CLOSED ERRATA
QA Contact: Cluster QE <mspqa-list>
Severity: high
Priority: low
Version: 5.4
CC: agk, ccaulfie, cmarthal, cward, dwysocha, edamato, haselden, heinzm, jbrassow, mbroz, prockai, pvrabec, tao
Target Milestone: rc
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2010-03-30 09:02:29 UTC

Description Shane Bradley 2009-11-18 18:13:39 UTC
Description of problem:

Feature Request:

Any time a new device is added, the clvmd man page states that
"clvmd -R" should be run to refresh the device cache on all nodes so
that all nodes have the same view of the shared storage.

However, if this option is not run, it can lead to errors in the
creation of the filesystem.
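
For reference, the manual refresh the man page asks for is a single
command; a minimal example (run on any one node, since clvmd propagates
the request to the other cluster members) would be:

  # After attaching the new device on every node, refresh the clvmd
  # device cache cluster-wide:
  $ clvmd -R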

No errors are displayed when an existing LVM logical volume is
extended: pvcreate, vgextend, lvextend, and gfs_grow all complete
without errors, giving the end user the impression that extending
the volume worked correctly.

In order to fix the error, a reboot was required.  

I believe that when LVM is operating in cluster mode, "-R" should
be run before pvcreate adds the device so that every node's device
cache is in sync. This would remove the need for the user to perform
the refresh manually; it would happen automatically.
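
To illustrate the requested ordering, here is a sketch of the extension
sequence with the refresh done up front (manually today, implicitly in
the proposal); the device names are taken from the reproducer below:

  # Refresh every node's view of the devices before the new PV is used.
  $ clvmd -R
  $ pvcreate /dev/xvdc
  $ vgextend /dev/gfsvg /dev/xvdc
  $ lvextend -l +499 /dev/gfsvg/gfs1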

Version-Release number of selected component (if applicable):
lvm2-cluster-2.02.40-7.el5.x86_64

How reproducible:
Every time

Steps to Reproduce:
1) Set up a 2-node RHEL 5 cluster on domain0, then start the VMs.

2) On domain0, add a new device after the machines start up

  $ xm block-attach rh5node1_iscsi \
    tap:aio:/var/lib/xen/images/disks/rh5nodes_iscsi/disk1.img xvdb w!

  $ xm block-attach rh5node2_iscsi \
    tap:aio:/var/lib/xen/images/disks/rh5nodes_iscsi/disk1.img xvdb w!

3) On the VMs, start the cluster services
 
  $ service cman start; service clvmd start (make sure locking_type=3)

4) Create a logical volume on that device for the vms

  $ pvcreate /dev/xvdb
  $ vgcreate -c y gfsvg /dev/xvdb
  $ lvcreate -l 499 -n gfs1 gfsvg

5) On domain0 add a new device

  $ xm block-attach rh5node1_iscsi \
    tap:aio:/var/lib/xen/images/disks/rh5nodes_iscsi/disk2.img xvdc w!

  $ xm block-attach rh5node2_iscsi \
    tap:aio:/var/lib/xen/images/disks/rh5nodes_iscsi/disk2.img xvdc w!

6) Add the new device and extend the logical volume

  $ pvcreate /dev/xvdc
  $ vgextend /dev/gfsvg /dev/xvdc
  $ lvextend -l +499 /dev/gfsvg/gfs1

7) Check the device-mapper table to see if there is an error (a cross-node check is sketched after the output below)
  
  $ dmsetup table | grep 'error'
  gfsvg-gfs1-missing_1_0: 0 4087808 error 
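
Note: a minimal way to confirm the stale view across the cluster,
assuming the guests are reachable over ssh as rh5node1 and rh5node2
(hypothetical names), is to compare the device-mapper tables on both
nodes:

  # Hypothetical node names; run after step 6 completes.
  $ for node in rh5node1 rh5node2; do
        echo "== $node =="
        ssh $node "dmsetup table | grep gfsvg-gfs1"
    done
  # Whichever node still has a stale device cache shows an 'error'
  # segment (e.g. gfsvg-gfs1-missing_1_0) instead of a complete mapping.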
  
Actual results:
  Error is returned on the extended logical volume.

Expected results:
  The extended logical volume should not error out.

Additional info:

Comment 2 Milan Broz 2009-11-18 21:17:18 UTC
The cache is refreshed when manipulating orphan PVs, here in vgcreate and vgextend (the global lock - taken when manipulating orphans - is propagated to the other nodes and should flush the cache).

I am probably missing something here - are all the commands mentioned run from domain0, or are some of them run inside the VMs?

Comment 3 Shane Bradley 2009-11-19 14:51:33 UTC
This has nothing to do with VMs; that was just the easiest way to recreate the issue and demonstrate it. It happens on physical machines and VMs alike.

Here is my point (I am not sure of the overhead involved, so this might be expensive, and that may be the reason we are not doing it this way): we are asking the end user to be in charge of refreshing the device cache any time a device is added or changed.

$ man clvmd
-R
    Tells all the running clvmd in the cluster to reload their device cache and re-read the lvm configuration file. This command should be run whenever the devices on a cluster system are changed.  


My point is that this seems like a lot of responsibility for an end user. Not all end users know about it, and it is not well documented - I understand that they should read the man pages. It seems that we could detect when we are running in cluster mode and, if so, have operations that manipulate the LVM stack verify that all cluster nodes have a refreshed view of the devices, turning an end-user responsibility into an automated one.
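
To make the request concrete, here is a rough, purely illustrative
sketch (not the fix that was eventually committed) of the kind of check
that could be automated: detect that LVM is configured for cluster
locking (locking_type = 3 in /etc/lvm/lvm.conf) and, if so, refresh
every node's device cache before a newly attached device is used.

  # Illustrative only -- not the actual lvm2-cluster change.
  $ if grep -q '^[[:space:]]*locking_type[[:space:]]*=[[:space:]]*3' /etc/lvm/lvm.conf; then
        clvmd -R    # refresh the device cache on all cluster nodes
    fi
  $ pvcreate /dev/xvdc   # device name from the reproducer above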

Comment 4 Milan Broz 2009-11-19 19:31:25 UTC
Actually I think this is a bug and not an RFE - in this situation no clvmd -R should be needed.

Comment 6 Milan Broz 2009-11-23 19:22:18 UTC
The cache is not properly refreshed; apparently this leads to an incorrect mapping and possible data corruption.
(Reproduced with recent upstream & RHEL 5.4 code.)

Comment 7 Milan Broz 2009-11-24 17:41:03 UTC
Should be fixed in upstream code now.

Comment 9 Milan Broz 2009-11-24 19:22:42 UTC
Fixed in lvm2-cluster-2.02.56-1.el5.

Comment 13 Chris Ward 2010-02-11 10:33:04 UTC
~~ Attention Customers and Partners - RHEL 5.5 Beta is now available on RHN ~~

RHEL 5.5 Beta has been released! There should be a fix present in this 
release that addresses your request. Please test and report back results 
here, by March 3rd 2010 (2010-03-03) or sooner.

Upon successful verification of this request, post your results and update 
the Verified field in Bugzilla with the appropriate value.

If you encounter any issues while testing, please describe them and set 
this bug into NEED_INFO. If you encounter new defects or have additional 
patch(es) to request for inclusion, please clone this bug per each request
and escalate through your support representative.

Comment 15 Corey Marthaler 2010-03-25 19:59:26 UTC
Fix was verified in lvm2-cluster-2.02.56-7.el5.

Comment 16 errata-xmlrpc 2010-03-30 09:02:29 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0299.html