Bug 1332909

Summary:          LVM resource agent activates partial vg volume even though partial_activation=false
Product:          Red Hat Enterprise Linux 6
Component:        resource-agents
Version:          6.8
Status:           CLOSED ERRATA
Severity:         unspecified
Priority:         unspecified
Reporter:         michal novacek <mnovacek>
Assignee:         Oyvind Albrigtsen <oalbrigt>
QA Contact:       cluster-qe <cluster-qe>
CC:               agk, cluster-maint, fdinitto, mnovacek, oalbrigt, prajnoha, tlavigne, zkabelac
Target Milestone: rc
Hardware:         Unspecified
OS:               Unspecified
Fixed In Version: resource-agents-3.9.5-46.el6
Doc Type:         Bug Fix
Cloned As:        1392432 (view as bug list)
Bug Blocks:       1392432
Type:             Bug
Last Closed:      2017-03-21 09:27:44 UTC
Attachments:      'pcs cluster report' output (attachment 1153785, see the description below)
And which lvm2 package version? (Obviously "(null)" should not be appearing in the lvm message that reports a detected misconfiguration.)

(In reply to Alasdair Kergon from comment #1)
> And which lvm2 package version?

$ rpm -q lvm2
lvm2-2.02.143-7.el6.x86_64

The same issue occurs in RHEL 7.3: the lvm resource agent activates a partial vg with partial_activation set to false.

lvm2-2.02.166-1.el7.x86_64
resource-agents-3.9.5-82.el7.x86_64

New patch to solve the issue when the error is reported to stdout:
https://github.com/ClusterLabs/resource-agents/pull/919

stderr (not stdout). Updated patch:
https://github.com/ClusterLabs/resource-agents/pull/921
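For context, the agent's partial_activation check has to cope with the fact that LVM prints its "Couldn't find device" warnings on stderr, not stdout (the detail the updated patch addresses). One way to detect a partial VG without parsing warnings at all is to look at the vg_attr bits; a minimal sketch, not the agent's actual code:

    # Report whether a VG is partial: the 4th character of vg_attr is 'p'
    # when one or more PVs are missing (compare "wz-pn-" vs "wz--n-" below).
    vg=raidvg
    attrs=$(vgs --noheadings -o vg_attr "$vg" 2>/dev/null | tr -d '[:space:]')
    case "$attrs" in
        ???p*)
            echo "volume group $vg is partial; refusing to activate" >&2
            exit 1
            ;;
    esac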
I have verified that the LVM resource agent will not activate lvm volumes marked partial when partial_activation=false with resource-agents-3.9.5-46.

---

Common setup:
* have a configured running cluster (1)
* have a raid vg configured atop several pvs (2)
* create and start an LVM resource for this vg

before the fix (resource-agents-3.9.5-34.el6.x86_64)
====================================================
[root@virt-151 ~]# pcs resource
 havg (ocf::heartbeat:LVM): Started virt-151
[root@virt-151 ~]# echo offline > /sys/block/sda/device/state
[root@virt-151 ~]# pvs
  Couldn't find device with uuid J3IjmU-47ju-nkE8-786h-aKGZ-0Fpp-TpXFip.
  PV             VG         Fmt  Attr PSize PFree
  /dev/sdb       raidvg     lvm2 a--u 5,00g    0
  /dev/sdc       raidvg     lvm2 a--u 5,00g    0
  /dev/sdd       raidvg     lvm2 a--u 5,00g    0
  /dev/sde       raidvg     lvm2 a--u 5,00g    0
  /dev/sdf       raidvg     lvm2 a--u 5,00g    0
  /dev/vda2      vg_virt151 lvm2 a--u 8,07g    0
  unknown device raidvg     lvm2 a-mu 5,00g    0   << sda is missing
[root@virt-151 ~]# vgs
  Couldn't find device with uuid J3IjmU-47ju-nkE8-786h-aKGZ-0Fpp-TpXFip.
  VG         #PV #LV #SN Attr   VSize  VFree
  raidvg       6   1   0 wz-pn- 29,98g    0
  vg_virt155   1   2   0 wz--n-  8,07g    0
[root@virt-151 ~]# sleep 30

>> the resource is not recognized as partial and continues to run
[root@virt-151 ~]# pcs resource
 havg (ocf::heartbeat:LVM): Started virt-155

>> the monitor operation still reports success (returns 0)
[root@virt-151 ~]# pcs resource debug-monitor havg
Operation monitor for havg (ocf:heartbeat:LVM) returned 0
 > stdout: volume_list="vg_virt151"
 > stderr: Couldn't find device with uuid J3IjmU-47ju-nkE8-786h-aKGZ-0Fpp-TpXFip.
 > stderr: Couldn't find device with uuid J3IjmU-47ju-nkE8-786h-aKGZ-0Fpp-TpXFip.
[root@virt-151 ~]# echo $?
0

after the fix (resource-agents-3.9.5-46.el6)
============================================
[root@virt-163 ~]# pcs resource
 havg (ocf::heartbeat:LVM): Started virt-163
[root@virt-163 ~]# echo offline > /sys/block/sda/device/state
[root@virt-163 ~]# pvs
  Couldn't find device with uuid 2Iest2-mRcy-41rD-0GQt-NGDZ-tiYo-J3C2bZ.
  PV             VG         Fmt  Attr PSize PFree
  /dev/sdb       raidvg     lvm2 a--u 5,00g    0
  /dev/sdc       raidvg     lvm2 a--u 5,00g    0
  /dev/sdd       raidvg     lvm2 a--u 5,00g    0
  /dev/sde       raidvg     lvm2 a--u 5,00g    0
  /dev/sdf       raidvg     lvm2 a--u 5,00g    0
  /dev/vda2      vg_virt163 lvm2 a--u 8,07g    0
  unknown device raidvg     lvm2 a-mu 5,00g    0
[root@virt-163 ~]# vgs
  Couldn't find device with uuid 2Iest2-mRcy-41rD-0GQt-NGDZ-tiYo-J3C2bZ.
  VG         #PV #LV #SN Attr   VSize  VFree
  raidvg       6   1   0 wz-pn- 29,98g    0
  vg_virt163   1   2   0 wz--n-  8,07g    0

>> the resource is moved to a different node
[root@virt-163 ~]# pcs resource
 havg (ocf::heartbeat:LVM): Started virt-151
[root@virt-163 ~]# ssh virt-151
[root@virt-151 ~]# vgs
  VG         #PV #LV #SN Attr   VSize  VFree
  raidvg       6   1   0 wz--n- 29,98g    0
  vg_virt151   1   2   0 wz--n-  8,07g    0
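For completeness (not part of the original verification): the sysfs failure injection used above is reversible, so the test VG can usually be restored afterwards along these lines:

    # bring the offlined disk back and let LVM rescan it
    echo running > /sys/block/sda/device/state
    pvscan                # the "unknown device" entry should disappear
    vgs raidvg            # Attr should read wz--n- again (no 'p' bit)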
-----

(1) pcs config

[root@virt-155 ~]# pcs config
Cluster Name: STSRHTS2281
Corosync Nodes:
 virt-163 virt-155 virt-156 virt-157 virt-158 virt-159 virt-163 virt-167 virt-168 virt-169 virt-170 virt-171 virt-172 virt-174 virt-184 virt-185
Pacemaker Nodes:
 virt-163 virt-155 virt-156 virt-157 virt-158 virt-159 virt-163 virt-167 virt-168 virt-169 virt-170 virt-171 virt-172 virt-174 virt-184 virt-185
Resources:
 Resource: havg (class=ocf provider=heartbeat type=LVM)
  Attributes: volgrpname=raidvg exclusive=true partial_activation=false
  Operations: start interval=0s timeout=30 (havg-start-interval-0s)
              stop interval=0s timeout=30 (havg-stop-interval-0s)
              monitor interval=10 timeout=30 (havg-monitor-interval-10)
Stonith Devices:
 Resource: fence-virt-163 (class=stonith type=fence_xvm)
  Attributes: delay=5 pcmk_host_check=static-list pcmk_host_list=virt-163 pcmk_host_map=virt-151:virt-151.cluster-qe.lab.eng.brq.redhat.com
  Operations: monitor interval=60s (fence-virt-163-monitor-interval-60s)
 Resource: fence-virt-155 (class=stonith type=fence_xvm)
  Attributes: pcmk_host_check=static-list pcmk_host_list=virt-155 pcmk_host_map=virt-155:virt-155.cluster-qe.lab.eng.brq.redhat.com
  Operations: monitor interval=60s (fence-virt-155-monitor-interval-60s)
 Resource: fence-virt-156 (class=stonith type=fence_xvm)
  Attributes: pcmk_host_check=static-list pcmk_host_list=virt-156 pcmk_host_map=virt-156:virt-156.cluster-qe.lab.eng.brq.redhat.com
  Operations: monitor interval=60s (fence-virt-156-monitor-interval-60s)
 Resource: fence-virt-157 (class=stonith type=fence_xvm)
  Attributes: pcmk_host_check=static-list pcmk_host_list=virt-157 pcmk_host_map=virt-157:virt-157.cluster-qe.lab.eng.brq.redhat.com
  Operations: monitor interval=60s (fence-virt-157-monitor-interval-60s)
 Resource: fence-virt-158 (class=stonith type=fence_xvm)
  Attributes: pcmk_host_check=static-list pcmk_host_list=virt-158 pcmk_host_map=virt-158:virt-158.cluster-qe.lab.eng.brq.redhat.com
  Operations: monitor interval=60s (fence-virt-158-monitor-interval-60s)
 Resource: fence-virt-159 (class=stonith type=fence_xvm)
  Attributes: pcmk_host_check=static-list pcmk_host_list=virt-159 pcmk_host_map=virt-159:virt-159.cluster-qe.lab.eng.brq.redhat.com
  Operations: monitor interval=60s (fence-virt-159-monitor-interval-60s)
 Resource: fence-virt-163 (class=stonith type=fence_xvm)
  Attributes: pcmk_host_check=static-list pcmk_host_list=virt-163 pcmk_host_map=virt-163:virt-163.cluster-qe.lab.eng.brq.redhat.com
  Operations: monitor interval=60s (fence-virt-163-monitor-interval-60s)
 Resource: fence-virt-167 (class=stonith type=fence_xvm)
  Attributes: pcmk_host_check=static-list pcmk_host_list=virt-167 pcmk_host_map=virt-167:virt-167.cluster-qe.lab.eng.brq.redhat.com
  Operations: monitor interval=60s (fence-virt-167-monitor-interval-60s)
 Resource: fence-virt-168 (class=stonith type=fence_xvm)
  Attributes: pcmk_host_check=static-list pcmk_host_list=virt-168 pcmk_host_map=virt-168:virt-168.cluster-qe.lab.eng.brq.redhat.com
  Operations: monitor interval=60s (fence-virt-168-monitor-interval-60s)
 Resource: fence-virt-169 (class=stonith type=fence_xvm)
  Attributes: pcmk_host_check=static-list pcmk_host_list=virt-169 pcmk_host_map=virt-169:virt-169.cluster-qe.lab.eng.brq.redhat.com
  Operations: monitor interval=60s (fence-virt-169-monitor-interval-60s)
 Resource: fence-virt-170 (class=stonith type=fence_xvm)
  Attributes: pcmk_host_check=static-list pcmk_host_list=virt-170 pcmk_host_map=virt-170:virt-170.cluster-qe.lab.eng.brq.redhat.com
  Operations: monitor interval=60s (fence-virt-170-monitor-interval-60s)
 Resource: fence-virt-171 (class=stonith type=fence_xvm)
  Attributes: pcmk_host_check=static-list pcmk_host_list=virt-171 pcmk_host_map=virt-171:virt-171.cluster-qe.lab.eng.brq.redhat.com
  Operations: monitor interval=60s (fence-virt-171-monitor-interval-60s)
 Resource: fence-virt-172 (class=stonith type=fence_xvm)
  Attributes: pcmk_host_check=static-list pcmk_host_list=virt-172 pcmk_host_map=virt-172:virt-172.cluster-qe.lab.eng.brq.redhat.com
  Operations: monitor interval=60s (fence-virt-172-monitor-interval-60s)
 Resource: fence-virt-174 (class=stonith type=fence_xvm)
  Attributes: pcmk_host_check=static-list pcmk_host_list=virt-174 pcmk_host_map=virt-174:virt-174.cluster-qe.lab.eng.brq.redhat.com
  Operations: monitor interval=60s (fence-virt-174-monitor-interval-60s)
 Resource: fence-virt-184 (class=stonith type=fence_xvm)
  Attributes: pcmk_host_check=static-list pcmk_host_list=virt-184 pcmk_host_map=virt-184:virt-184.cluster-qe.lab.eng.brq.redhat.com
  Operations: monitor interval=60s (fence-virt-184-monitor-interval-60s)
 Resource: fence-virt-185 (class=stonith type=fence_xvm)
  Attributes: pcmk_host_check=static-list pcmk_host_list=virt-185 pcmk_host_map=virt-185:virt-185.cluster-qe.lab.eng.brq.redhat.com
  Operations: monitor interval=60s (fence-virt-185-monitor-interval-60s)
Fencing Levels:
Location Constraints:
Ordering Constraints:
Colocation Constraints:
Ticket Constraints:
Alerts:
 No alerts defined
Resources Defaults:
 No defaults set
Operations Defaults:
 No defaults set
Cluster Properties:
 cluster-infrastructure: cman
 dc-version: 1.1.15-4.el6-e174ec8
 have-watchdog: false

(2) configuration of pv/vg/lv

[root@virt-155 ~]# lvs -a
  LV                VG         Attr       LSize   Pool Origin Data% Meta% Move Log Cpy%Sync Convert Devices
  raidlv            raidvg     Rwi-a-r---  14,98g                                 100,00            raidlv_rimage_0(0),raidlv_rimage_1(0)
  [raidlv_rimage_0] raidvg     iwi-aor---  14,98g                                                   /dev/sdd(1)
  [raidlv_rimage_0] raidvg     iwi-aor---  14,98g                                                   /dev/sde(0)
  [raidlv_rimage_0] raidvg     iwi-aor---  14,98g                                                   /dev/sdf(0)
  [raidlv_rimage_1] raidvg     iwi-aor---  14,98g                                                   /dev/sdb(1)
  [raidlv_rimage_1] raidvg     iwi-aor---  14,98g                                                   /dev/sda(0)
  [raidlv_rimage_1] raidvg     iwi-aor---  14,98g                                                   /dev/sdc(0)
  [raidlv_rmeta_0]  raidvg     ewi-aor---   4,00m                                                   /dev/sdd(0)
  [raidlv_rmeta_1]  raidvg     ewi-aor---   4,00m                                                   /dev/sdb(0)
  lv_root           vg_virt156 -wi-ao----   7,21g                                                   /dev/vda2(0)
  lv_swap           vg_virt156 -wi-ao---- 876,00m                                                   /dev/vda2(1847)

[root@virt-155 ~]# vgs -a
  VG         #PV #LV #SN Attr   VSize  VFree
  raidvg       6   1   0 wz--nc 29,98g    0
  vg_virt155   1   2   0 wz--n-  8,07g    0

[root@virt-155 ~]# pvs -o +devices
  PV        VG         Fmt  Attr PSize PFree Devices
  /dev/sda  raidvg     lvm2 a--u 5,00g    0  /dev/sda(0)
  /dev/sdb  raidvg     lvm2 a--u 5,00g    0  /dev/sdb(0)
  /dev/sdb  raidvg     lvm2 a--u 5,00g    0  /dev/sdb(1)
  /dev/sdc  raidvg     lvm2 a--u 5,00g    0  /dev/sdc(0)
  /dev/sdd  raidvg     lvm2 a--u 5,00g    0  /dev/sdd(0)
  /dev/sdd  raidvg     lvm2 a--u 5,00g    0  /dev/sdd(1)
  /dev/sde  raidvg     lvm2 a--u 5,00g    0  /dev/sde(0)
  /dev/sdf  raidvg     lvm2 a--u 5,00g    0  /dev/sdf(0)
  /dev/vda2 vg_virt156 lvm2 a--u 8,07g    0  /dev/vda2(0)
  /dev/vda2 vg_virt156 lvm2 a--u 8,07g    0  /dev/vda2(1847)
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0602.html
Created attachment 1153785 [details]
'pcs cluster report' output

Description of problem:
The heartbeat:LVM resource agent activates a partial vg even though it has partial_activation=false set.

Version-Release number of selected component (if applicable):
resource-agents-3.9.5-34.el6.x86_64

How reproducible: always

Steps to Reproduce:
. create a vg consisting of several shared pvs
. create an lvm raid on this vg; I used the following command:

    lvcreate -ay --name raidlv --type raid1 --extents 100%VG \
        --nosync --mirrors 1 \
        raidvg

. create a cluster ready to use HA-LVM
. create an lvm resource with partial_activation=false atop of raidvg (see the sketch after this list)
. start it
. fail one of the disks on the node running the resource:
    # echo offline > /sys/block/sda/device/state
. disable the resource
. check that the vg is seen as partial on all nodes and that the 'offline'd disk is recognized as missing
. enable the resource

Actual results:
the lvm resource starts

Expected results:
the lvm resource refuses to start on any of the nodes
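For the "create lvm resource" step, the resource can be created with pcs roughly as follows (a sketch matching the attributes shown in the pcs config below; exact option spelling may vary between pcs versions):

    # pcs resource create havg ocf:heartbeat:LVM \
          volgrpname=raidvg exclusive=true partial_activation=false \
          op start timeout=30 op stop timeout=30 \
          op monitor interval=10 timeout=30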
Additional info:
---
# pcs config
Cluster Name: STSRHTS2774
Corosync Nodes:
 virt-131.cluster-qe.lab.eng.brq.redhat.com virt-145.cluster-qe.lab.eng.brq.redhat.com virt-155.cluster-qe.lab.eng.brq.redhat.com
Pacemaker Nodes:
 virt-131.cluster-qe.lab.eng.brq.redhat.com virt-145.cluster-qe.lab.eng.brq.redhat.com virt-155.cluster-qe.lab.eng.brq.redhat.com
Resources:
 Resource: havg (class=ocf provider=heartbeat type=LVM)
  Attributes: exclusive=true partial_activation=false volgrpname=raidvg
  Operations: start interval=0s timeout=30 (havg-start-interval-0s)
              stop interval=0s timeout=30 (havg-stop-interval-0s)
              monitor interval=10 timeout=30 (havg-monitor-interval-10)
Stonith Devices:
 Resource: fence-virt-131 (class=stonith type=fence_xvm)
  Attributes: delay=5 action=reboot debug=1 pcmk_host_check=static-list pcmk_host_list=virt-131.cluster-qe.lab.eng.brq.redhat.com
  Operations: monitor interval=60s (fence-virt-131-monitor-interval-60s)
 Resource: fence-virt-145 (class=stonith type=fence_xvm)
  Attributes: action=reboot debug=1 pcmk_host_check=static-list pcmk_host_list=virt-145.cluster-qe.lab.eng.brq.redhat.com
  Operations: monitor interval=60s (fence-virt-145-monitor-interval-60s)
 Resource: fence-virt-155 (class=stonith type=fence_xvm)
  Attributes: action=reboot debug=1 pcmk_host_check=static-list pcmk_host_list=virt-155.cluster-qe.lab.eng.brq.redhat.com
  Operations: monitor interval=60s (fence-virt-155-monitor-interval-60s)
Fencing Levels:
Location Constraints:
Ordering Constraints:
Colocation Constraints:
Resources Defaults:
 No defaults set
Operations Defaults:
 No defaults set
Cluster Properties:
 cluster-infrastructure: cman
 dc-version: 1.1.14-8.el6-70404b0
 have-watchdog: false

[root@virt-131 ~]# lvs -a
  LV                VG         Attr       LSize   Pool Origin Data%  Meta% Move Log Cpy%Sync Convert
  raidlv            raidvg     Rwi-a-r---  14.97g                                100.00
  [raidlv_rimage_0] raidvg     iwi-aor---  14.97g
  [raidlv_rimage_1] raidvg     iwi-aor---  14.97g
  [raidlv_rmeta_0]  raidvg     ewi-aor---   4.00m
  [raidlv_rmeta_1]  raidvg     ewi-aor---   4.00m
  lv_root           vg_virt189 -wi-ao----   6.86g
  lv_swap           vg_virt189 -wi-ao---- 832.00m

[root@virt-131 ~]# vgs -a
  VG         #PV #LV #SN Attr   VSize  VFree
  raidvg       6   1   0 wz--n- 29.95g    0
  vg_virt189   1   2   0 wz--n-  7.67g    0

[root@virt-131 ~]# rpm -q resource-agents
resource-agents-3.9.5-34.el6.x86_64

[root@virt-131 ~]# pvs
  PV        VG         Fmt  Attr PSize PFree
  /dev/sda1 raidvg     lvm2 a--u 4.99g    0
  /dev/sdb1 raidvg     lvm2 a--u 4.99g    0
  /dev/sdc1 raidvg     lvm2 a--u 4.99g    0
  /dev/sdd1 raidvg     lvm2 a--u 4.99g    0
  /dev/sde1 raidvg     lvm2 a--u 4.99g    0
  /dev/sdf1 raidvg     lvm2 a--u 4.99g    0
  /dev/vda2 vg_virt189 lvm2 a--u 7.67g    0

[root@virt-131 ~]# echo offline > /sys/block/sdd/device/state
[root@virt-131 ~]# pvs
  /dev/sdd1: read failed after 0 of 512 at 5362745344: Input/output error
  /dev/sdd1: read failed after 0 of 512 at 5362839552: Input/output error
  /dev/sdd1: read failed after 0 of 512 at 0: Input/output error
  /dev/sdd1: read failed after 0 of 512 at 4096: Input/output error
  /dev/sdd1: read failed after 0 of 2048 at 0: Input/output error
  Couldn't find device with uuid lqs9vf-0to6-aTCH-CnSk-dZyl-hc9q-LQMYBc.
  Couldn't find device for segment belonging to raidvg/raidlv_rimage_1 while checking used and assumed devices.
  PV             VG         Fmt  Attr PSize PFree
  /dev/sda1      raidvg     lvm2 a--u 4.99g    0
  /dev/sdb1      raidvg     lvm2 a--u 4.99g    0
  /dev/sdc1      raidvg     lvm2 a--u 4.99g    0
  /dev/sde1      raidvg     lvm2 a--u 4.99g    0
  /dev/sdf1      raidvg     lvm2 a--u 4.99g    0
  /dev/vda2      vg_virt189 lvm2 a--u 7.67g    0
  unknown device raidvg     lvm2 a-mu 4.99g    0

[root@virt-131 ~]# pcs resource disable havg
[root@virt-131 ~]# pcs resource enable havg
[root@virt-131 ~]# pcs resource
 havg (ocf::heartbeat:LVM): Started virt-145.cluster-qe.lab.eng.brq.redhat.com

[root@virt-131 ~]# ssh virt-145 vgs raidvg
  WARNING: Device mismatch detected for raidvg/raidlv_rimage_1 which is accessing /dev/sdd1 instead of (null).
  VG     #PV #LV #SN Attr   VSize  VFree
  raidvg   6   1   0 wz-pn- 29.95g    0

[root@virt-131 ~]# ssh virt-145 pvs
  WARNING: Device mismatch detected for raidvg/raidlv_rimage_1 which is accessing /dev/sdd1 instead of (null).
  PV        VG         Fmt  Attr PSize PFree
  /dev/sda1 raidvg     lvm2 a--u 4.99g    0
  /dev/sdb1 raidvg     lvm2 a--u 4.99g    0
  /dev/sdc1 raidvg     lvm2 a--u 4.99g    0
  /dev/sdd1 raidvg     lvm2 a-mu 4.99g    0
  /dev/sde1 raidvg     lvm2 a--u 4.99g    0
  /dev/sdf1 raidvg     lvm2 a--u 4.99g    0
  /dev/vda2 vg_virt189 lvm2 a--u 7.67g    0

[root@virt-131 ~]# ssh virt-145 lvs -a
  WARNING: Device mismatch detected for raidvg/raidlv_rimage_1 which is accessing /dev/sdd1 instead of (null).
  LV                VG         Attr       LSize   Pool Origin Data%  Meta% Move Log Cpy%Sync Convert
  raidlv            raidvg     Rwi-a-r-p-  14.97g                                100.00
  [raidlv_rimage_0] raidvg     iwi-aor---  14.97g
  [raidlv_rimage_1] raidvg     iwi-aor-p-  14.97g
  [raidlv_rmeta_0]  raidvg     ewi-aor---   4.00m
  [raidlv_rmeta_1]  raidvg     ewi-aor---   4.00m
  lv_root           vg_virt189 -wi-ao----   6.86g
  lv_swap           vg_virt189 -wi-ao---- 832.00m
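With the fixed resource-agents package, the same partial-VG state should make the start action fail instead. A quick single-node check (suggested here, not taken from the original report; the exact error output will differ):

    # with raidvg still partial, start should now be refused
    pcs resource debug-start havg
    echo $?        # expect a nonzero return code instead of 0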