Bug 1619428
| Summary: | HA LVM-activate: warn user they provided an invalid value for vg_access_mode | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Corey Marthaler <cmarthal> |
| Component: | resource-agents | Assignee: | Oyvind Albrigtsen <oalbrigt> |
| Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 7.6 | CC: | agk, cfeist, cluster-maint, fdinitto, lmiksik, mlisik, rbednar |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | resource-agents-4.1.1-12.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-10-30 11:39:24 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Corey Marthaler
2018-08-20 20:09:28 UTC
[root@mckinley-02 ~]# pcs resource describe LVM-activate
[...]
 vg_access_mode (required): This option decides which solution will be used to
   protect the volume group in cluster environment. Optional solutions are:
   lvmlockd, clvmd, system_id and tagging.

Option validation for fence agents was added in RFE 1434936. Perhaps resource agents should have been covered by this as well? It seems the agent just needs to return OCF_ERR_CONFIGURED instead of OCF_ERR_ARGS from validate.
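For illustration, here is a minimal sketch (not the shipped LVM-activate code) of the kind of check being suggested: reject an unrecognized vg_access_mode from the validate path with a clear exit reason and OCF_ERR_CONFIGURED rather than OCF_ERR_ARGS. The function name validate_vg_access_mode is invented for the sketch; the four accepted values come from the agent metadata quoted above, and the exit-reason string mirrors the one the agent already logs (see the harding-03 messages further down).

```sh
#!/bin/sh
# Sketch only -- not the actual LVM-activate source. It assumes the standard
# resource-agents shell helpers; validate_vg_access_mode is a made-up name.
: ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/lib/heartbeat}
. ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs

validate_vg_access_mode() {
    case "$OCF_RESKEY_vg_access_mode" in
    lvmlockd|clvmd|system_id|tagging)
        return $OCF_SUCCESS
        ;;
    *)
        # Tell the user exactly which value was rejected and exit
        # "not configured" so Pacemaker reports a configuration error
        # instead of retrying (or fencing) its way through the cluster.
        ocf_exit_reason "You specified an invalid value for vg_access_mode: $OCF_RESKEY_vg_access_mode"
        exit $OCF_ERR_CONFIGURED
        ;;
    esac
}
```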
It appears the latest scratch build does not fix this issue. The behavior is still the same.

[root@mckinley-01 ~]# rpm -qi resource-agents
Name        : resource-agents
Version     : 4.1.1
Release     : 8.el7
Architecture: x86_64
Install Date: Tue 21 Aug 2018 11:51:46 AM CDT
Group       : System Environment/Base
Size        : 1366639
License     : GPLv2+ and LGPLv2+ and ASL 2.0
Signature   : (none)
Source RPM  : resource-agents-4.1.1-8.el7.src.rpm
Build Date  : Tue 21 Aug 2018 05:29:33 AM CDT

Cluster name: MCKINLEY
Stack: corosync
Current DC: mckinley-02 (version 1.1.19-3.el7-c3c624ea3d) - partition with quorum
Last updated: Tue Aug 21 13:14:28 2018
Last change: Tue Aug 21 12:53:03 2018 by root via cibadmin on mckinley-03

3 nodes configured
7 resources configured

Online: [ mckinley-01 mckinley-02 mckinley-03 ]

Full list of resources:

 mckinley-apc (stonith:fence_apc): Started mckinley-01
 Clone Set: dlm_for_lvmlockd-clone [dlm_for_lvmlockd]
     Started: [ mckinley-01 mckinley-02 mckinley-03 ]
 Clone Set: lvmlockd-clone [lvmlockd]
     Started: [ mckinley-01 mckinley-02 mckinley-03 ]

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: inactive/disabled

[root@mckinley-01 ~]# lvs
  LV   VG               Attr       LSize    Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  ha   MCKINLEY1        rwi-a-r---    8.00g
  ha   MCKINLEY2        rwi-a-r---    8.00g
  home rhel_mckinley-01 -wi-ao---- <502.75g
  root rhel_mckinley-01 -wi-ao----   50.00g
  swap rhel_mckinley-01 -wi-ao----    4.00g

[root@mckinley-01 ~]# pcs resource create lvm1 --group HA_LVM1 ocf:heartbeat:LVM-activate vgname=MCKINLEY1 vg_access_mode=exclusive

# mckinley-02
Aug 21 13:14:34 mckinley-02 crmd[2342]: notice: State transition S_IDLE -> S_POLICY_ENGINE
Aug 21 13:14:34 mckinley-02 pengine[2341]: notice: * Start lvm1 ( mckinley-02 )
Aug 21 13:14:34 mckinley-02 pengine[2341]: notice: Calculated transition 1, saving inputs in /var/lib/pacemaker/pengine/pe-input-130.bz2
Aug 21 13:14:34 mckinley-02 crmd[2342]: notice: Initiating monitor operation lvm1_monitor_0 on mckinley-03
Aug 21 13:14:34 mckinley-02 crmd[2342]: notice: Initiating monitor operation lvm1_monitor_0 locally on mckinley-02
Aug 21 13:14:34 mckinley-02 crmd[2342]: notice: Initiating monitor operation lvm1_monitor_0 on mckinley-01
Aug 21 13:14:34 mckinley-02 crmd[2342]: notice: Result of probe operation for lvm1 on mckinley-02: 7 (not running)
Aug 21 13:14:34 mckinley-02 crmd[2342]: warning: Action 9 (lvm1_monitor_0) on mckinley-01 failed (target: 7 vs. rc: 0): Error
Aug 21 13:14:34 mckinley-02 crmd[2342]: notice: Transition aborted by operation lvm1_monitor_0 'modify' on mckinley-01: Event failed
Aug 21 13:14:34 mckinley-02 crmd[2342]: warning: Action 9 (lvm1_monitor_0) on mckinley-01 failed (target: 7 vs. rc: 0): Error
Aug 21 13:14:34 mckinley-02 crmd[2342]: notice: Transition 1 (Complete=4, Pending=0, Fired=0, Skipped=0, Incomplete=3, Source=/var/lib/pacemaker/pengine/pe-input-130.bz2): Complete
Aug 21 13:14:34 mckinley-02 pengine[2341]: notice: * Move lvm1 ( mckinley-01 -> mckinley-02 )
Aug 21 13:14:34 mckinley-02 pengine[2341]: notice: Calculated transition 2, saving inputs in /var/lib/pacemaker/pengine/pe-input-131.bz2
Aug 21 13:14:34 mckinley-02 crmd[2342]: notice: Initiating stop operation lvm1_stop_0 on mckinley-01
Aug 21 13:14:34 mckinley-02 crmd[2342]: warning: Action 31 (lvm1_stop_0) on mckinley-01 failed (target: 0 vs. rc: 6): Error
Aug 21 13:14:34 mckinley-02 crmd[2342]: notice: Transition aborted by operation lvm1_stop_0 'modify' on mckinley-01: Event failed
Aug 21 13:14:34 mckinley-02 crmd[2342]: warning: Action 31 (lvm1_stop_0) on mckinley-01 failed (target: 0 vs. rc: 6): Error
Aug 21 13:14:34 mckinley-02 crmd[2342]: notice: Transition 2 (Complete=2, Pending=0, Fired=0, Skipped=0, Incomplete=6, Source=/var/lib/pacemaker/pengine/pe-input-131.bz2): Complete
Aug 21 13:14:34 mckinley-02 pengine[2341]: warning: Processing failed stop of lvm1 on mckinley-01: not configured
Aug 21 13:14:34 mckinley-02 pengine[2341]: error: Preventing lvm1 from re-starting anywhere: operation stop failed 'not configured' (6)
Aug 21 13:14:34 mckinley-02 pengine[2341]: warning: Processing failed stop of lvm1 on mckinley-01: not configured
Aug 21 13:14:34 mckinley-02 pengine[2341]: error: Preventing lvm1 from re-starting anywhere: operation stop failed 'not configured' (6)
Aug 21 13:14:34 mckinley-02 pengine[2341]: warning: Cluster node mckinley-01 will be fenced: lvm1 failed there
Aug 21 13:14:34 mckinley-02 pengine[2341]: warning: Scheduling Node mckinley-01 for STONITH
Aug 21 13:14:34 mckinley-02 pengine[2341]: notice: Stop of failed resource lvm1 is implicit after mckinley-01 is fenced
Aug 21 13:14:34 mckinley-02 pengine[2341]: notice: * Fence (reboot) mckinley-01 'lvm1 failed there'
Aug 21 13:14:34 mckinley-02 pengine[2341]: notice: * Move mckinley-apc ( mckinley-01 -> mckinley-02 )
Aug 21 13:14:34 mckinley-02 pengine[2341]: notice: * Stop dlm_for_lvmlockd:2 ( mckinley-01 ) due to node availability
Aug 21 13:14:34 mckinley-02 pengine[2341]: notice: * Stop lvmlockd:2 ( mckinley-01 ) due to node availability
Aug 21 13:14:34 mckinley-02 pengine[2341]: notice: * Stop lvm1 ( mckinley-01 ) due to node availability
Aug 21 13:14:34 mckinley-02 pengine[2341]: warning: Calculated transition 3 (with warnings), saving inputs in /var/lib/pacemaker/pengine/pe-warn-17.bz2
Aug 21 13:14:34 mckinley-02 pengine[2341]: warning: Processing failed stop of lvm1 on mckinley-01: not configured
Aug 21 13:14:34 mckinley-02 pengine[2341]: error: Preventing lvm1 from re-starting anywhere: operation stop failed 'not configured' (6)
Aug 21 13:14:34 mckinley-02 pengine[2341]: warning: Processing failed stop of lvm1 on mckinley-01: not configured
Aug 21 13:14:34 mckinley-02 pengine[2341]: error: Preventing lvm1 from re-starting anywhere: operation stop failed 'not configured' (6)
Aug 21 13:14:34 mckinley-02 pengine[2341]: warning: Cluster node mckinley-01 will be fenced: lvm1 failed there
Aug 21 13:14:34 mckinley-02 pengine[2341]: warning: Forcing lvm1 away from mckinley-01 after 1000000 failures (max=1000000)
Aug 21 13:14:34 mckinley-02 pengine[2341]: warning: Scheduling Node mckinley-01 for STONITH
Aug 21 13:14:34 mckinley-02 pengine[2341]: notice: Stop of failed resource lvm1 is implicit after mckinley-01 is fenced
Aug 21 13:14:34 mckinley-02 pengine[2341]: notice: * Fence (reboot) mckinley-01 'lvm1 failed there'

# mckinley-03
Aug 21 13:14:34 mckinley-03 crmd[2206]: notice: Result of probe operation for lvm1 on mckinley-03: 7 (not running)
Aug 21 13:14:34 mckinley-03 stonith-ng[2202]: notice: mckinley-apc can fence (reboot) mckinley-01 (aka. '1'): static-list
Aug 21 13:14:34 mckinley-03 stonith-ng[2202]: notice: mckinley-apc can fence (reboot) mckinley-01 (aka. '1'): static-list
Aug 21 13:14:34 mckinley-03 fence_apc: Unable to connect/login to fencing device
Aug 21 13:14:34 mckinley-03 stonith-ng[2202]: warning: fence_apc[6022] stderr: [ 2018-08-21 13:14:34,711 ERROR: Unable to connect/login to fencing device ]
Aug 21 13:14:34 mckinley-03 stonith-ng[2202]: warning: fence_apc[6022] stderr: [ ]
Aug 21 13:14:34 mckinley-03 stonith-ng[2202]: warning: fence_apc[6022] stderr: [ ]
Aug 21 13:14:41 mckinley-03 corosync[1786]: [TOTEM ] A processor failed, forming new configuration.
Aug 21 13:14:43 mckinley-03 stonith-ng[2202]: notice: Operation 'reboot' [6028] (call 2 from crmd.2342) for host 'mckinley-01' with device 'mckinley-apc' returned: 0 (OK)
Aug 21 13:14:45 mckinley-03 corosync[1786]: [TOTEM ] A new membership (10.15.104.62:640) was formed. Members left: 1
Aug 21 13:14:45 mckinley-03 corosync[1786]: [TOTEM ] Failed to receive the leave message. failed: 1
Aug 21 13:14:45 mckinley-03 attrd[2204]: notice: Node mckinley-01 state is now lost
Aug 21 13:14:45 mckinley-03 cib[2201]: notice: Node mckinley-01 state is now lost
Aug 21 13:14:45 mckinley-03 attrd[2204]: notice: Removing all mckinley-01 attributes for peer loss
Aug 21 13:14:45 mckinley-03 attrd[2204]: notice: Lost attribute writer mckinley-01
Aug 21 13:14:45 mckinley-03 cib[2201]: notice: Purged 1 peer with id=1 and/or uname=mckinley-01 from the membership cache
Aug 21 13:14:45 mckinley-03 attrd[2204]: notice: Purged 1 peer with id=1 and/or uname=mckinley-01 from the membership cache
Aug 21 13:14:45 mckinley-03 stonith-ng[2202]: notice: Node mckinley-01 state is now lost
Aug 21 13:14:45 mckinley-03 stonith-ng[2202]: notice: Purged 1 peer with id=1 and/or uname=mckinley-01 from the membership cache
Aug 21 13:14:45 mckinley-03 corosync[1786]: [QUORUM] Members[2]: 2 3
Aug 21 13:14:45 mckinley-03 corosync[1786]: [MAIN ] Completed service synchronization, ready to provide service.
Aug 21 13:14:45 mckinley-03 crmd[2206]: notice: Node mckinley-01 state is now lost
Aug 21 13:14:45 mckinley-03 pacemakerd[2187]: notice: Node mckinley-01 state is now lost
Aug 21 13:14:45 mckinley-03 kernel: dlm: closing connection to node 1
Aug 21 13:14:46 mckinley-03 stonith-ng[2202]: notice: Operation reboot of mckinley-01 by mckinley-03 for crmd.2342: OK

The behavior appears to be the same in the latest rpms.
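As a reading aid for the numeric return codes in the transition above, these are the standard OCF exit codes; the ocf-returncodes path in the example is the usual resource-agents location on RHEL 7 and is an assumption here.

```sh
# Standard OCF exit codes seen in the mckinley logs above:
#   0 = OCF_SUCCESS         - the unexpected probe result for lvm1 on mckinley-01
#   2 = OCF_ERR_ARGS        - the code the comment above asks to replace in validate
#   6 = OCF_ERR_CONFIGURED  - logged as "not configured"; the resource is not retried anywhere
#   7 = OCF_NOT_RUNNING     - the expected probe result on a node where the VG is inactive
# Because the operation that failed with 6 was a stop, the only recovery left is
# fencing, which is why mckinley-01 ends up scheduled for STONITH.
grep -E 'OCF_(ERR_ARGS|ERR_CONFIGURED|NOT_RUNNING)=' /usr/lib/ocf/lib/heartbeat/ocf-returncodes
```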
[root@harding-02 ~]# rpm -qi resource-agents
Name        : resource-agents
Version     : 4.1.1
Release     : 10.el7
Architecture: x86_64
Install Date: Mon 24 Sep 2018 11:07:42 AM CDT
Group       : System Environment/Base
Size        : 1368011
License     : GPLv2+ and LGPLv2+ and ASL 2.0
Signature   : (none)
Source RPM  : resource-agents-4.1.1-10.el7.src.rpm
Build Date  : Wed 05 Sep 2018 02:48:08 AM CDT

[root@harding-02 ~]# pcs status
Cluster name: HARDING
Stack: corosync
Current DC: harding-03 (version 1.1.19-7.el7-c3c624ea3d) - partition with quorum
Last updated: Mon Sep 24 11:38:18 2018
Last change: Mon Sep 24 11:38:12 2018 by root via cibadmin on harding-02

2 nodes configured
5 resources configured

Online: [ harding-02 harding-03 ]

Full list of resources:

 smoke-apc (stonith:fence_apc): Started harding-02
 Clone Set: dlm_for_lvmlockd-clone [dlm_for_lvmlockd]
     Started: [ harding-02 harding-03 ]
 Clone Set: lvmlockd-clone [lvmlockd]
     Started: [ harding-02 harding-03 ]

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

[root@harding-02 ~]# pcs status
Cluster name: HARDING
Stack: corosync
Current DC: harding-03 (version 1.1.19-7.el7-c3c624ea3d) - partition with quorum
Last updated: Mon Sep 24 11:40:05 2018
Last change: Mon Sep 24 11:38:12 2018 by root via cibadmin on harding-02

2 nodes configured
5 resources configured

Online: [ harding-02 harding-03 ]

Full list of resources:

 smoke-apc (stonith:fence_apc): Started harding-02
 Clone Set: dlm_for_lvmlockd-clone [dlm_for_lvmlockd]
     Started: [ harding-02 harding-03 ]
 Clone Set: lvmlockd-clone [lvmlockd]
     Started: [ harding-02 harding-03 ]

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Creating VG HARDING1 out of /dev/mapper/mpatha1 /dev/mapper/mpathb1 /dev/mapper/mpathc1
harding-02: vgchange --lock-start HARDING1
harding-03: vgchange --lock-start HARDING1

Creating HA raid1 LV(s) and ext4 filesystems on VG HARDING1
lvcreate --activate ey --type raid1 --nosync -L 8G -n ha HARDING1
  WARNING: New raid1 won't be synchronised. Don't read what you didn't write!
  WARNING: ext4 signature detected on /dev/HARDING1/ha at offset 1080. Wipe it? [y/n]: [n]
  Aborted wiping of ext4.
  1 existing signature left on the device.
Creating ext4 filesystem
mke2fs 1.42.9 (28-Dec-2013)

[root@harding-02 ~]# lvs -a -o +devices
  LV            VG       Attr       LSize Cpy%Sync Convert Devices
  ha            HARDING1 Rwi-a-r--- 8.00g 100.00           ha_rimage_0(0),ha_rimage_1(0)
  [ha_rimage_0] HARDING1 iwi-aor--- 8.00g                  /dev/mapper/mpatha1(1)
  [ha_rimage_1] HARDING1 iwi-aor--- 8.00g                  /dev/mapper/mpathb1(1)
  [ha_rmeta_0]  HARDING1 ewi-aor--- 4.00m                  /dev/mapper/mpatha1(0)
  [ha_rmeta_1]  HARDING1 ewi-aor--- 4.00m                  /dev/mapper/mpathb1(0)

[root@harding-02 ~]# pcs resource create lvm1 --group HA_LVM1 ocf:heartbeat:LVM-activate volgrpname=HARDING1 activation_mode=exclusive
Error: invalid resource option 'volgrpname', allowed options are: activation_mode, lvname, tag, trace_file, trace_ra, vg_access_mode, vgname, use --force to override
Error: required resource options 'vg_access_mode', 'vgname' are missing, use --force to override

[root@harding-02 ~]# pcs resource create lvm1 --group HA_LVM1 ocf:heartbeat:LVM-activate vgname=HARDING1 activation_mode=exclusive
Error: required resource option 'vg_access_mode' is missing, use --force to override

[root@harding-02 ~]# pcs resource create lvm1 --group HA_LVM1 ocf:heartbeat:LVM-activate vgname=HARDING1 vg_access_mode=exclusive
[root@harding-02 ~]# echo $?
0

# the error still exists in the messages (on another node in the cluster)
Sep 24 11:42:30 harding-03 crmd[3464]: notice: Initiating start operation lvm1_start_0 locally on harding-03
Sep 24 11:42:30 harding-03 LVM-activate(lvm1)[11518]: ERROR: You specified an invalid value for vg_access_mode: exclusive
Sep 24 11:42:30 harding-03 lrmd[3461]: notice: lvm1_start_0:11518:stderr [ ocf-exit-reason:You specified an invalid value for vg_access_mode: excl]
Sep 24 11:42:30 harding-03 crmd[3464]: notice: Result of start operation for lvm1 on harding-03: 6 (not configured)
Sep 24 11:42:30 harding-03 crmd[3464]: notice: harding-03-lvm1_start_0:31 [ ocf-exit-reason:You specified an invalid value for vg_access_mode: exc]
Sep 24 11:42:30 harding-03 crmd[3464]: warning: Action 26 (lvm1_start_0) on harding-03 failed (target: 0 vs. rc: 6): Error
Sep 24 11:42:30 harding-03 crmd[3464]: notice: Transition aborted by operation lvm1_start_0 'modify' on harding-03: Event failed
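For comparison, this is what the create command looks like with one of the four documented access modes. It is only an illustration reusing the resource, group and VG names from the output above, and it assumes lvmlockd is the intended mode given the lvmlockd-clone already running in this cluster; "exclusive" reads like a value meant for activation_mode rather than vg_access_mode.

```sh
# Illustration only: same command as above, but with a documented vg_access_mode
# value ("lvmlockd" matches the lvmlockd-clone this cluster is already running).
pcs resource create lvm1 --group HA_LVM1 ocf:heartbeat:LVM-activate \
    vgname=HARDING1 vg_access_mode=lvmlockd
```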
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3278