
Bug 1451933

Summary: HA LVM agent needs to update metadata (pvscan --cache) before starting/relocation tagged resource
Product: Red Hat Enterprise Linux 7
Reporter: Corey Marthaler <cmarthal>
Component: resource-agents
Assignee: Oyvind Albrigtsen <oalbrigt>
Status: CLOSED ERRATA
QA Contact: cluster-qe <cluster-qe>
Severity: high
Docs Contact:
Priority: unspecified
Version: 7.4
CC: agk, cfeist, cluster-maint, fdinitto, tlavigne
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: resource-agents-3.9.5-102.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-08-01 15:00:11 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Corey Marthaler 2017-05-17 22:44:03 UTC
Description of problem:
I found this when attempting to verify feature bug 1159328 (lvmcache support for RH Cluster). 

I set up HA cached volumes and then uncached them a couple of ways ('lvconvert --splitcache' and 'lvconvert --uncache'), then attempted to relocate the resources. However, without the clvmd HA method (cache is only supported with tagging), there is no consistent metadata view, so the node being relocated to didn't know of the change. I checked the resource agent and found that on start it does either a 'vgscan $vg' (which isn't supported and will fail) or a plain 'vgscan', which doesn't properly pick up metadata changes the way a 'pvscan --cache' would. This affects every LVM volume type that could potentially be HA using the tagging method and have its metadata changed on the active node.


        ocf_log info "Activating volume group $vg"
        if [ "$LVM_MAJOR" -eq "1" ]; then
                ocf_run vgscan $vg
        else
                ocf_run vgscan
        fi


To illustrate what the agent is basically doing here, I took HA out of the picture and used two machines with shared storage and no locking.


# HARDING-02
[root@harding-02 ~]# lvcreate -n origin -L 100M VG
  Logical volume "origin" created.
[root@harding-02 ~]# lvcreate --type cache-pool -n POOL -L 100M VG /dev/mapper/mpathb1
  Using default stripesize 64.00 KiB.
  Logical volume "POOL" created.
[root@harding-02 ~]# lvconvert --yes --type cache --cachepool VG/POOL VG/origin
  Logical volume VG/origin is now cached.
[root@harding-02 ~]# lvs -a -o +devices
  LV              VG  Attr       LSize   Pool   Origin         Data%  Meta%  Cpy%Sync Devices               
  [POOL]          VG  Cwi---C--- 100.00m                       0.00   0.49   0.00     POOL_cdata(0)         
  [POOL_cdata]    VG  Cwi-ao---- 100.00m                                              /dev/mapper/mpathb1(4)
  [POOL_cmeta]    VG  ewi-ao----   8.00m                                              /dev/mapper/mpathb1(2)
  [lvol0_pmspare] VG  ewi-------   8.00m                                              /dev/mapper/mpathb1(0)
  origin          VG  Cwi-a-C--- 100.00m [POOL] [origin_corig] 0.00   0.49   0.00     origin_corig(0)       
  [origin_corig]  VG  owi-aoC--- 100.00m                                              /dev/mapper/mpatha1(0)


# HARDING-03
[root@harding-03 ~]# pvscan --cache  # it now has a consistent storage view
[root@harding-03 ~]# lvs -a -o +devices
  LV              VG  Attr       LSize    Pool   Origin         Data%  Meta%  Cpy%Sync Devices               
  [POOL]          VG  Cwi---C---  100.00m                                              POOL_cdata(0)         
  [POOL_cdata]    VG  Cwi-------  100.00m                                              /dev/mapper/mpathc1(4)
  [POOL_cmeta]    VG  ewi-------    8.00m                                              /dev/mapper/mpathc1(2)
  [lvol0_pmspare] VG  ewi-------    8.00m                                              /dev/mapper/mpathc1(0)
  origin          VG  Cwi---C---  100.00m [POOL] [origin_corig]                        origin_corig(0)       
  [origin_corig]  VG  owi---C---  100.00m                                              /dev/mapper/mpatha1(0)


# HARDING-02 metadata change
[root@harding-02 ~]# lvconvert --splitcache VG/origin
  Logical volume VG/origin is not cached and cache pool VG/POOL is unused.
[root@harding-02 ~]# lvs -a -o +devices
  LV              VG  Attr       LSize   Pool Origin Data%  Meta%  Cpy%Sync Devices               
  POOL            VG  Cwi---C--- 100.00m                                    POOL_cdata(0)         
  [POOL_cdata]    VG  Cwi------- 100.00m                                    /dev/mapper/mpathb1(4)
  [POOL_cmeta]    VG  ewi-------   8.00m                                    /dev/mapper/mpathb1(2)
  [lvol0_pmspare] VG  ewi-------   8.00m                                    /dev/mapper/mpathb1(0)
  origin          VG  -wi-a----- 100.00m                                    /dev/mapper/mpatha1(0)
[root@harding-02 ~]# lvchange -an VG/origin




# HARDING-03 w/o a consistent storage view now

# vgscan doesn't take an argument so that's invalid in the script
[root@harding-03 ~]# vgscan VG
  Command does not accept argument: VG.

[root@harding-03 ~]# vgscan
  Reading volume groups from cache.
  Found volume group "VG" using metadata type lvm2

# Still thinks this volume is cached when it's not
[root@harding-03 ~]# lvs -a -o +devices
  LV              VG  Attr       LSize    Pool   Origin         Data%  Meta%  Cpy%Sync Devices               
  [POOL]          VG  Cwi---C---  100.00m                                              POOL_cdata(0)         
  [POOL_cdata]    VG  Cwi-------  100.00m                                              /dev/mapper/mpathc1(4)
  [POOL_cmeta]    VG  ewi-------    8.00m                                              /dev/mapper/mpathc1(2)
  [lvol0_pmspare] VG  ewi-------    8.00m                                              /dev/mapper/mpathc1(0)
  origin          VG  Cwi---C---  100.00m [POOL] [origin_corig]                        origin_corig(0)       
  [origin_corig]  VG  owi---C---  100.00m                                              /dev/mapper/mpatha1(0)
[root@harding-03 ~]# lvchange -ay VG/origin
[root@harding-03 ~]# lvs -a -o +devices
  LV              VG  Attr       LSize    Pool   Origin         Data%  Meta%  Cpy%Sync Devices               
  [POOL]          VG  Cwi---C---  100.00m                       0.12   0.68   0.00     POOL_cdata(0)         
  [POOL_cdata]    VG  Cwi-ao----  100.00m                                              /dev/mapper/mpathc1(4)
  [POOL_cmeta]    VG  ewi-ao----    8.00m                                              /dev/mapper/mpathc1(2)
  [lvol0_pmspare] VG  ewi-------    8.00m                                              /dev/mapper/mpathc1(0)
  origin          VG  Cwi-a-C---  100.00m [POOL] [origin_corig] 0.12   0.68   0.00     origin_corig(0)       
  [origin_corig]  VG  owi-aoC---  100.00m                                              /dev/mapper/mpatha1(0)

[root@harding-03 ~]# pvscan --cache  # now it has a consistent view but it's too late.
[root@harding-03 ~]# lvs -a -o +devices
  Internal error: WARNING: Segment type cache found does not match expected type striped for VG/origin.
  LV              VG  Attr       LSize    Pool Origin Data%  Meta%  Cpy%Sync Devices               
  POOL            VG  Cwi---C---  100.00m                                    POOL_cdata(0)         
  [POOL_cdata]    VG  Cwi-ao----  100.00m                                    /dev/mapper/mpathc1(4)
  [POOL_cmeta]    VG  ewi-ao----    8.00m                                    /dev/mapper/mpathc1(2)
  [lvol0_pmspare] VG  ewi-------    8.00m                                    /dev/mapper/mpathc1(0)
  origin          VG  -wi-XX--X-  100.00m                                    /dev/mapper/mpatha1(0)
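
The transcript above suggests that what matters on the receiving node is the order of operations: refresh the cached metadata first, and only then activate. A minimal sketch of that ordering follows. The helper names `run` and `activate_with_fresh_metadata` and the `DRY_RUN` switch are assumptions made for illustration and are not part of the resource agent; by default the sketch only prints the commands it would run.

```shell
#!/bin/sh
# run: print the command when DRY_RUN=1 (the default), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: rescan devices so the cached metadata matches what is
# on disk, then activate the given LV. Activation is skipped if the rescan fails.
activate_with_fresh_metadata() {
    run pvscan --cache && run lvchange -ay "$1"
}

activate_with_fresh_metadata VG/origin
```

Setting DRY_RUN=0 on a test node would run the real commands in the same order, which is the reverse of the failing sequence shown above.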


 

Version-Release number of selected component (if applicable):
resource-agents-3.9.5-99.el7.x86_64

Comment 2 Corey Marthaler 2017-05-19 16:20:18 UTC
It appears that 'vgscan --cache' would also suffice in these cache alteration cases. I edited the resource agent to use it, altered the LVs, relocated resources, and saw no issues.
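
A rough sketch of that kind of edit to the scan step is below. It is not the patch that was merged; see the pull request in comment 3 for the actual change. The helper name `scan_cmd_for_lvm_major`, the stand-in `ocf_run`, and the choice of a plain 'vgscan' for LVM major version 1 are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not from the actual agent): print the scan command to
# run before activating a tagged volume group, based on the LVM major version.
scan_cmd_for_lvm_major() {
    case "$1" in
        1) echo "vgscan" ;;          # assumption: plain rescan, no VG argument
        *) echo "vgscan --cache" ;;  # per this comment: refreshes cached metadata
    esac
}

# Stand-in for the agent's ocf_run so the sketch runs outside a cluster node;
# it only prints what would be executed.
ocf_run() {
    echo "would run: $*"
}

LVM_MAJOR=${LVM_MAJOR:-2}
# Unquoted on purpose so the command and its option split into separate words.
ocf_run $(scan_cmd_for_lvm_major "$LVM_MAJOR")
```

Neither branch passes the VG name to vgscan, since vgscan rejects an argument, as shown in the description.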

Comment 3 Oyvind Albrigtsen 2017-05-22 13:04:00 UTC
https://github.com/ClusterLabs/resource-agents/pull/980

Comment 4 Oyvind Albrigtsen 2017-05-30 14:37:35 UTC
Additional patch for warning message when not using writethrough cache-mode: https://github.com/ClusterLabs/resource-agents/pull/984

Comment 8 Corey Marthaler 2017-06-13 23:42:07 UTC
Verified that the splitcache and uncache scenarios listed in comment #0 now work with the latest resource-agent. Marking verified in resource-agents-3.9.5-104.

That said, any "more intensive" VG metadata alteration scenarios end up resulting in bug 1430948.

Comment 9 errata-xmlrpc 2017-08-01 15:00:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1844