Bug 1430948

Summary: HA LVM: What is the proper/supported way to create/convert lvm volumes when using existing tagged resources
Product: Red Hat Enterprise Linux 7
Component: resource-agents
Version: 7.3
Hardware: x86_64
OS: Linux
Status: CLOSED NOTABUG
Severity: low
Priority: unspecified
Target Milestone: rc
Reporter: Corey Marthaler <cmarthal>
Assignee: Oyvind Albrigtsen <oalbrigt>
QA Contact: cluster-qe <cluster-qe>
CC: agk, cfeist, cluster-maint, cmarthal, fdinitto, heinzm, jbrassow, msnitzer, prajnoha, sbradley, zkabelac
Type: Bug
Last Closed: 2020-04-08 15:24:26 UTC

Description Corey Marthaler 2017-03-10 00:03:31 UTC
Description of problem:
I started wondering about this while updating the QA tools that create different types of HA LVM volumes, and noticed that the same question has even been asked on the HA LVM configuration kbase article:
https://access.redhat.com/solutions/3067
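
(For context: the tag-based HA-LVM setup that article describes hinges on the activation/volume_list filter in /etc/lvm/lvm.conf, roughly like the sketch below. The VG name and tag here are placeholders for illustration, not the values from this cluster.)

  # /etc/lvm/lvm.conf -- illustrative sketch only
  activation {
      # Permit normal activation only of the local VG and of VGs/LVs
      # carrying this node's tag; the cluster VG is left to the agent.
      volume_list = [ "vg_local", "@host-081" ]
  }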



Creating PVs out of /dev/sdf1 /dev/sde1 /dev/sdc1 /dev/sdd1 /dev/sda1 /dev/sdg1 /dev/sdb1 /dev/sdh1                                                                                                                             
Creating single VG STSRHTS31975 out of /dev/sdf1 /dev/sde1 /dev/sdc1 /dev/sdd1 /dev/sda1 /dev/sdg1 /dev/sdb1 /dev/sdh1                                                                                                          
                                                                                                                                                                                                                                
Creating HA raid5 LV(s) and xfs filesystems on VG STSRHTS31975
lvcreate --activate ly --type raid5 --nosync -L 8G -n ha1 STSRHTS31975
  WARNING: New raid5 won't be synchronised. Don't read what you didn't write!
lvcreate --activate ly --type raid5 --nosync -L 8G -n ha2 STSRHTS31975
  WARNING: New raid5 won't be synchronised. Don't read what you didn't write!

Adding volume_list, and remaking initrd on...
        host-081
        host-082
        host-083
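
(The "remaking initrd" step above presumably refers to the usual rebuild so the initramfs picks up the updated lvm.conf, followed by a reboot of each node; something along the lines of:)

  dracut -H -f /boot/initramfs-$(uname -r).img $(uname -r)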

pcs resource create lvm --group HA_LVM LVM volgrpname="STSRHTS31975" exclusive=true
pcs resource create fs1 --group="HA_LVM" Filesystem device="/dev/STSRHTS31975/ha1" directory="/mnt/ha1" fstype="xfs" "options=noatime" op monitor interval=10s on-fail=fence
pcs resource create fs2 --group="HA_LVM" Filesystem device="/dev/STSRHTS31975/ha2" directory="/mnt/ha2" fstype="xfs" "options=noatime" op monitor interval=10s on-fail=fence

Cleaning up to fix any timing issues that often occur at setup
pcs resource cleanup

Checking status of services on all nodes
Current owner for lvm is host-081

Enabling automatic startup
pcs cluster enable --all






[root@host-081 ~]# pcs status
Cluster name: STSRHTS31975
Stack: corosync
Current DC: host-083 (version 1.1.16-2.el7-94ff4df) - partition with quorum
Last updated: Thu Mar  9 17:38:38 2017
Last change: Thu Mar  9 17:38:19 2017 by root via cibadmin on host-081

3 nodes configured
6 resources configured

Online: [ host-081 host-082 host-083 ]

Full list of resources:

 fence-host-081 (stonith:fence_xvm):    Started host-081
 fence-host-082 (stonith:fence_xvm):    Started host-082
 fence-host-083 (stonith:fence_xvm):    Started host-083
 Resource Group: HA_LVM
     lvm        (ocf::heartbeat:LVM):   Started host-081
     fs1        (ocf::heartbeat:Filesystem):    Started host-081
     fs2        (ocf::heartbeat:Filesystem):    Started host-081

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled







# Now let's try to create another raid volume for another fs resource:

[root@host-081 ~]# lvcreate --activate ly --type raid5 --nosync -L 8G -n ha3 STSRHTS31975
  Using default stripesize 64.00 KiB.
  WARNING: New raid5 won't be synchronised. Don't read what you didn't write!
  Volume "STSRHTS31975/ha3_rmeta_0" is not active locally.
  Failed to zero STSRHTS31975/ha3_rmeta_0

# Without a way to activate the new LV, there's no way to make a filesystem, create the resource, etc...

# This appears to be the easiest hack I could think of, but what do we actually recommend/support? Certainly not disabling the resource, editing the volume_list, creating the new LVs, re-editing the volume_list, bringing the resource back, etc.?
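#
# For comparison, the heavyweight procedure dismissed above would look roughly
# like this (a sketch only, not a recommendation; the tag-based workaround that
# was actually used follows below):
#
#   pcs resource disable HA_LVM                   # take the group down
#   vi /etc/lvm/lvm.conf                          # temporarily add the VG to volume_list
#   lvcreate --activate ly --type raid5 --nosync -L 8G -n ha3 STSRHTS31975
#   mkfs.xfs /dev/STSRHTS31975/ha3
#   lvchange -an STSRHTS31975/ha3                 # deactivate before handing back to the cluster
#   vi /etc/lvm/lvm.conf                          # revert volume_list (and rebuild the initrd if needed)
#   pcs resource enable HA_LVM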

[root@host-081 ~]# lvcreate --addtag foo --activate ly --type raid5 --nosync -L 8G -n ha4 STSRHTS31975 --config 'activation { volume_list = [ "@foo" ] }'
  Using default stripesize 64.00 KiB.
  WARNING: New raid5 won't be synchronised. Don't read what you didn't write!
  Logical volume "ha4" created.

[root@host-081 ~]# lvs -a -o +devices STSRHTS31975/ha4
  LV   VG           Attr       LSize Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices                                        
  ha4  STSRHTS31975 Rwi-a-r--- 8.00g                                    100.00           ha4_rimage_0(0),ha4_rimage_1(0),ha4_rimage_2(0)

[root@host-081 ~]# mkfs.xfs /dev/STSRHTS31975/ha4
meta-data=/dev/STSRHTS31975/ha4  isize=512    agcount=8, agsize=262128 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0, sparse=0
data     =                       bsize=4096   blocks=2097024, imaxpct=25
         =                       sunit=16     swidth=32 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=16 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

[root@host-081 ~]# pcs resource create fs4 --group="HA_LVM" Filesystem device="/dev/STSRHTS31975/ha4" directory="/mnt/ha4" fstype="xfs" "options=noatime" op monitor interval=10s on-fail=fence
Assumed agent name 'ocf:heartbeat:Filesystem' (deduced from 'Filesystem')

[root@host-081 ~]# pcs status
[...]
 Resource Group: HA_LVM
     lvm        (ocf::heartbeat:LVM):   Started host-081
     fs1        (ocf::heartbeat:Filesystem):    Started host-081
     fs2        (ocf::heartbeat:Filesystem):    Started host-081
     fs4        (ocf::heartbeat:Filesystem):    Started host-081





Version-Release number of selected component (if applicable):
3.10.0-589.el7.x86_64

resource-agents-3.9.5-87.el7   BUILT: Thu 02 Feb 2017 08:17:38 AM CST
lvm2-2.02.166-1.el7_3.3    BUILT: Thu Feb  9 08:15:52 CST 2017
lvm2-libs-2.02.166-1.el7_3.3    BUILT: Thu Feb  9 08:15:52 CST 2017
lvm2-cluster-2.02.166-1.el7_3.3    BUILT: Thu Feb  9 08:15:52 CST 2017
device-mapper-1.02.135-1.el7_3.3    BUILT: Thu Feb  9 08:15:52 CST 2017
device-mapper-libs-1.02.135-1.el7_3.3    BUILT: Thu Feb  9 08:15:52 CST 2017
device-mapper-event-1.02.135-1.el7_3.3    BUILT: Thu Feb  9 08:15:52 CST 2017
device-mapper-event-libs-1.02.135-1.el7_3.3    BUILT: Thu Feb  9 08:15:52 CST 2017
device-mapper-persistent-data-0.6.3-1.el7    BUILT: Fri Jul 22 05:29:13 CDT 2016
cmirror-2.02.166-1.el7_3.3    BUILT: Thu Feb  9 08:15:52 CST 2017

Comment 2 Corey Marthaler 2017-05-30 19:21:32 UTC
Changing the subject to reflect that this issue affects convert operations as well.



# Attempt to convert HA linear to HA cache
[root@host-122 ~]# lvchange -aly STSRHTS73701/pool
[root@host-122 ~]# lvs -a -o +devices
  LV              VG            Attr       LSize  Pool   Origin     Data%  Meta%  Move Log Cpy%Sync Convert Devices       
  ha              STSRHTS73701  -wi-ao----  4.00g                                                          /dev/sdh2(0)  
  [lvol0_pmspare] STSRHTS73701  ewi------- 12.00m                                                          /dev/sdd2(0)  
  pool            STSRHTS73701  Cwi---C---  6.00g                                                          pool_cdata(0) 
  [pool_cdata]    STSRHTS73701  Cwi-------  6.00g                                                          /dev/sdd2(6)  
  [pool_cmeta]    STSRHTS73701  ewi------- 12.00m                                                          /dev/sdd2(3)  

# w/o the tag hack
[root@host-122 ~]# lvconvert --yes --type cache --cachepool STSRHTS73701/pool STSRHTS73701/ha
  Volume "STSRHTS73701/pool" is not active locally.
  Aborting. Failed to wipe cache pool STSRHTS73701/pool.

# w/ the tag hack
[root@host-122 ~]# lvconvert --yes --type cache --cachepool STSRHTS73701/pool  --config 'activation { volume_list = [ "@foo" ] }' STSRHTS73701/ha
  Logical volume STSRHTS73701/ha is now cached.





# Attempt to convert HA linear to HA raid
 Resource Group: HA_LVM2
     lvm2       (ocf::heartbeat:LVM):   Started host-121
     fs2        (ocf::heartbeat:Filesystem):    Started host-121

# w/o the tag hack
[root@host-121 ~]# lvconvert --type raid1 -m 1 STSRHTS73702/ha
Are you sure you want to convert linear LV STSRHTS73702/ha to raid1 with 2 images enhancing resilience? [y/n]: y
  Volume "STSRHTS73702/ha_rmeta_0" is not active locally.
  Failed to zero STSRHTS73702/ha_rmeta_0.

[root@host-121 ~]# lvs -a -o +devices
  LV           VG            Attr       LSize   Pool   Origin     Data%  Meta%  Move Log Cpy%Sync Convert Devices        
  ha           STSRHTS73702  -wi-ao---- 4.00g                                                           /dev/sdc1(0)   
  ha_rimage_1  STSRHTS73702  -wi------- 4.00g                                                           /dev/sdb2(1)   
  ha_rmeta_0   STSRHTS73702  -wi------- 4.00m                                                           /dev/sdc1(1024)
  ha_rmeta_1   STSRHTS73702  -wi------- 4.00m                                                           /dev/sdb2(0)   

# w/ the tag hack
# After removing the zombie rimage and rmeta volumes, the tag hack still doesn't work in this case.
[root@host-121 ~]# lvconvert --yes --type raid1 -m 1 --config 'activation { volume_list = [ "@foo" ] }' STSRHTS73702/ha
  Volume "STSRHTS73702/ha_rmeta_0" is not active locally.
  Failed to zero STSRHTS73702/ha_rmeta_0.
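
# (The "zombie" cleanup mentioned above isn't shown in the log; since the leftover
# ha_rimage_1/ha_rmeta_* volumes appear as ordinary inactive LVs after the failed
# convert, presumably something like the following was used:)
#
#   lvremove STSRHTS73702/ha_rimage_1 STSRHTS73702/ha_rmeta_0 STSRHTS73702/ha_rmeta_1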

Comment 4 Oyvind Albrigtsen 2017-11-01 14:54:18 UTC
Bumping to 7.6.

Comment 6 Red Hat Bugzilla 2023-09-14 03:55:02 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days