Bug 1901357 - crypt resource agent appears incapable of opening the crypt device itself at resource definition time
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: resource-agents
Version: 8.4
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: 8.0
Assignee: Oyvind Albrigtsen
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-11-24 22:59 UTC by Corey Marthaler
Modified: 2024-10-01 17:07 UTC (History)

Fixed In Version: resource-agents-4.1.1-80.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-18 15:12:05 UTC
Type: Bug
Target Upstream Version:
Embargoed:
pm-rhel: mirror+


Links
System                             ID     Private  Priority  Status  Summary  Last Updated
Red Hat Knowledge Base (Solution)  31567  0        None      None    None     2020-12-14 16:02:39 UTC

Description Corey Marthaler 2020-11-24 22:59:04 UTC
Description of problem:
When going through the current GFS+crypt documentation I noticed that the 'cryptsetup luksOpen' cmd was absent, which makes sense, since if the cluster is going to manage the location and starting/stopping of this resource it will need to do the luksOpen itself. 

I was only able to get the GFS+crypt resource working by cloning the HA group that contained the shared LVM volume early on, disabling stonith, and doing a manual luksOpen myself to ensure the volume was available and active on all nodes before the crypt resource definition.

I then set out to see whether, without these manual hacks, I could get an exclusive crypt resource running without GFS in the picture, and to debug why it was failing otherwise.


/tmp/luks_key_file -> host-073:/etc/luks_key_file
/tmp/luks_key_file -> host-092:/etc/luks_key_file
/tmp/luks_key_file -> host-093:/etc/luks_key_file
Creating single VG STSRHTS13085 out of /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1
host-073: vgchange --lock-start STSRHTS13085
host-092: vgchange --lock-start STSRHTS13085
host-093: vgchange --lock-start STSRHTS13085

Creating HA striped LV(s) and ext4 filesystems on VG STSRHTS13085
        lvcreate --yes --activate ey --type striped -L 8G -i 2 -n lv1 STSRHTS13085

cryptsetup luksFormat /dev/STSRHTS13085/lv1 --type luks2 --key-file=/etc/luks_key_file
LUKS_UUID=487adddd-fd62-46a8-98d8-98940fe39d7e

pcs resource create STSRHTS13085 --group HA_STSRHTS13085 ocf:heartbeat:LVM-activate vgname="STSRHTS13085" activation_mode=exclusive vg_access_mode=lvmlockd

# didn't want to deal with nodes being fenced
[root@host-092 ~]# pcs property set stonith-enabled=false

# Currently a healthy cluster with one LVM resource, exclusive and active on only one node (host-092):
[root@host-092 ~]# pcs status
Cluster name: STSRHTS13085
Cluster Summary:
  * Stack: corosync
  * Current DC: host-073 (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum
  * Last updated: Tue Nov 24 16:23:41 2020
  * Last change:  Tue Nov 24 16:23:24 2020 by root via cibadmin on host-092
  * 3 nodes configured
  * 10 resource instances configured

Node List:
  * Online: [ host-073 host-092 host-093 ]

Full List of Resources:
  * fence-host-073      (stonith:fence_xvm):     Started host-073
  * fence-host-092      (stonith:fence_xvm):     Started host-092
  * fence-host-093      (stonith:fence_xvm):     Started host-093
  * Clone Set: locking-clone [locking]:
    * Started: [ host-073 host-092 host-093 ]
  * Resource Group: HA_STSRHTS13085:
    * STSRHTS13085      (ocf::heartbeat:LVM-activate):   Started host-092

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled


# Quick verification that this is a valid/active LV w/ luks formatting and able to be opened and closed:
[root@host-092 ~]# lvs -a -o +devices
  LV              VG            Attr       LSize   Pool   Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices                  
  lv1             STSRHTS13085  -wi-a-----   8.00g                                                       /dev/sda1(0),/dev/sdb1(0)
  [lvol0_pmspare] rhel_host-092 ewi-------   4.00m                                                       /dev/vda2(0)             
  pool00          rhel_host-092 twi-aotz--  <4.79g               68.51  51.76                            pool00_tdata(0)          
  [pool00_tdata]  rhel_host-092 Twi-ao----  <4.79g                                                       /dev/vda2(1)             
  [pool00_tmeta]  rhel_host-092 ewi-ao----   4.00m                                                       /dev/vda2(1226)          
  root            rhel_host-092 Vwi-aotz--  <4.79g pool00        68.51                                                            
  swap            rhel_host-092 -wi-ao---- 820.00m                                                       /dev/vda2(1227)          
[root@host-092 ~]# cryptsetup luksOpen /dev/STSRHTS13085/lv1 luks_lv1 --key-file=/etc/luks_key_file
[root@host-092 ~]# dmsetup ls
luks_lv1        (253:7)
STSRHTS13085-lv1        (253:6)
[root@host-092 ~]# cryptsetup luksClose luks_lv1 

# So far so good. Now to define the crypt resource and attach it to the HA group that has the LVM resource in it so they run and are activated together

[root@host-092 ~]# pcs resource create crypt1 --force --group HA_STSRHTS13085 ocf:heartbeat:crypt crypt_dev="luks_lv1" crypt_type=luks2 key_file=/etc/luks_key_file encrypted_dev=487adddd-fd62-46a8-98d8-98940fe39d7e
[root@host-092 ~]# 

# This quickly fails.

# from host-092 log:
Nov 24 16:28:55 host-092 pacemaker-controld[1449112]: notice: Result of probe operation for crypt1 on host-092: not running
Nov 24 16:28:55 host-092 pacemaker-attrd[1449110]: notice: Setting fail-count-crypt1#stop_0[host-073]: (unset) -> INFINITY
Nov 24 16:28:55 host-092 pacemaker-attrd[1449110]: notice: Setting last-failure-crypt1#stop_0[host-073]: (unset) -> 1606256935
Nov 24 16:28:55 host-092 pacemaker-attrd[1449110]: notice: Setting fail-count-crypt1#stop_0[host-093]: (unset) -> INFINITY
Nov 24 16:28:55 host-092 pacemaker-attrd[1449110]: notice: Setting last-failure-crypt1#stop_0[host-093]: (unset) -> 1606256935



[root@host-092 ~]# pcs status
Cluster name: STSRHTS13085
Cluster Summary:
  * Stack: corosync
  * Current DC: host-073 (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum
  * Last updated: Tue Nov 24 16:32:39 2020
  * Last change:  Tue Nov 24 16:28:54 2020 by root via cibadmin on host-092
  * 3 nodes configured
  * 11 resource instances configured (1 BLOCKED from further action due to failure)

Node List:
  * Online: [ host-073 host-092 host-093 ]

Full List of Resources:
  * fence-host-073      (stonith:fence_xvm):     Started host-073
  * fence-host-092      (stonith:fence_xvm):     Started host-092
  * fence-host-093      (stonith:fence_xvm):     Started host-093
  * Clone Set: locking-clone [locking]:
    * Started: [ host-073 host-092 host-093 ]
  * Resource Group: HA_STSRHTS13085:
    * STSRHTS13085      (ocf::heartbeat:LVM-activate):   Started host-092
    * crypt1    (ocf::heartbeat:crypt):  FAILED (blocked) [ host-093 host-073 ]

Failed Resource Actions:
  * crypt1_stop_0 on host-093 'invalid parameter' (2): call=38, status='complete', exitreason='Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible', last-rc-change='2020-11-24 16:28:55 -06:00', queued=0ms, exec=73ms
  * crypt1_stop_0 on host-073 'invalid parameter' (2): call=38, status='complete', exitreason='Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible', last-rc-change='2020-11-24 16:28:55 -06:00', queued=0ms, exec=69ms

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
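
The resource was created with a bare UUID in encrypted_dev, while the exitreason above names /dev/disk/by-uuid/<UUID>. A minimal sketch of that mapping, inferred only from the error text; `resolve_encrypted_dev` is a hypothetical helper and is not copied from the crypt agent:

```shell
#!/bin/sh
# Hypothetical helper: turn an encrypted_dev value into a device path.
# Assumption: absolute paths are used as given, anything else is treated
# as a UUID under /dev/disk/by-uuid (the path seen in the exitreason).
resolve_encrypted_dev() {
    case $1 in
        /*) printf '%s\n' "$1" ;;
        *)  printf '/dev/disk/by-uuid/%s\n' "$1" ;;
    esac
}

resolve_encrypted_dev 487adddd-fd62-46a8-98d8-98940fe39d7e
```

udev only creates the by-uuid symlink while the underlying block device is present, so on a node where the exclusive LV is not active the path does not exist. That would be consistent with the "not accessible" failures on the nodes that are not running the LVM-activate resource.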


# Why is the cluster trying to run the crypt resource on host-073 when the LVM resource is currently active on host-092? If crypt1 is going to run on host-073, the cluster first needs to relocate the LVM resource, which is part of the same group. Then, once (and only once) the LVM volume is active, the luksOpen can be attempted and the resource brought online on host-073.
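
The logs in this report show probes and stops on the nodes without the active LV returning 'invalid parameter' where Pacemaker expected 'not running' or 'ok'. A rough sketch of that distinction, using the standard OCF return code values; `check_backing_device` is a hypothetical helper for illustration only, not the agent's code and not necessarily what the eventual fix does:

```shell
#!/bin/sh
# OCF return codes (values defined by the OCF resource agent API).
OCF_SUCCESS=0
OCF_ERR_ARGS=2
OCF_NOT_RUNNING=7

# Hypothetical helper: decide what an action could return when the
# encrypted backing device may not be visible on this node.
#   $1 = action (probe | start | stop)
#   $2 = path to the encrypted device
check_backing_device() {
    action=$1
    device=$2
    if [ -e "$device" ]; then
        return $OCF_SUCCESS
    fi
    case $action in
        probe) return $OCF_NOT_RUNNING ;;  # nothing can be open on this node
        stop)  return $OCF_SUCCESS ;;      # already stopped, nothing to close
        *)     return $OCF_ERR_ARGS ;;     # start needs the device (code seen in the logs)
    esac
}
```

With return codes like these, a probe on a node that cannot see the LV would match the 'not running' result the scheduler expects, instead of being recorded as a hard failure that blocks the resource.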


# From host-073 log:
Nov 24 16:28:54 host-073 pacemaker-controld[760698]: notice: State transition S_IDLE -> S_POLICY_ENGINE
Nov 24 16:28:54 host-073 pacemaker-schedulerd[760697]: notice:  * Start      crypt1             ( host-092 )
Nov 24 16:28:54 host-073 pacemaker-schedulerd[760697]: notice: Calculated transition 530, saving inputs in /var/lib/pacemaker/pengine/pe-input-57.bz2
Nov 24 16:28:54 host-073 pacemaker-controld[760698]: notice: Initiating monitor operation crypt1_monitor_0 on host-093
Nov 24 16:28:54 host-073 pacemaker-controld[760698]: notice: Initiating monitor operation crypt1_monitor_0 on host-092
Nov 24 16:28:54 host-073 pacemaker-controld[760698]: notice: Initiating monitor operation crypt1_monitor_0 locally on host-073
Nov 24 16:28:55 host-073 crypt(crypt1)[1755199]: ERROR: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible
Nov 24 16:28:55 host-073 pacemaker-execd[760695]: notice: crypt1_monitor_0[1755199] error output [ ocf-exit-reason:Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible ]
Nov 24 16:28:55 host-073 pacemaker-controld[760698]: notice: Result of probe operation for crypt1 on host-073: invalid parameter
Nov 24 16:28:55 host-073 pacemaker-controld[760698]: notice: host-073-crypt1_monitor_0:37 [ ocf-exit-reason:Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible\n ]
Nov 24 16:28:55 host-073 pacemaker-controld[760698]: notice: Transition 530 aborted by operation crypt1_monitor_0 'modify' on host-073: Event failed
Nov 24 16:28:55 host-073 pacemaker-controld[760698]: notice: Transition 530 action 11 (crypt1_monitor_0 on host-073): expected 'not running' but got 'invalid parameter'
Nov 24 16:28:55 host-073 pacemaker-controld[760698]: notice: Transition 530 action 13 (crypt1_monitor_0 on host-093): expected 'not running' but got 'invalid parameter'
Nov 24 16:28:55 host-073 pacemaker-controld[760698]: notice: Transition 530 (Complete=4, Pending=0, Fired=0, Skipped=0, Incomplete=3, Source=/var/lib/pacemaker/pengine/pe-input-57.bz2): Complete
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: warning: Unexpected result (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible) was recorded for probe of crypt1 on host-093 at Nov 24 16:28:54 2020
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: notice: If it is not possible for crypt1 to run on host-093, see the resource-discovery option for location constraints
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: error: Preventing crypt1 from restarting on host-093 because of hard failure (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible)
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: warning: Unexpected result (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible) was recorded for probe of crypt1 on host-093 at Nov 24 16:28:54 2020
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: notice: If it is not possible for crypt1 to run on host-093, see the resource-discovery option for location constraints
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: error: Preventing crypt1 from restarting on host-093 because of hard failure (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible)
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: warning: Unexpected result (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible) was recorded for probe of crypt1 on host-073 at Nov 24 16:28:54 2020
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: notice: If it is not possible for crypt1 to run on host-073, see the resource-discovery option for location constraints
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: error: Preventing crypt1 from restarting on host-073 because of hard failure (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible)
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: warning: Unexpected result (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible) was recorded for probe of crypt1 on host-073 at Nov 24 16:28:54 2020
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: notice: If it is not possible for crypt1 to run on host-073, see the resource-discovery option for location constraints
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: error: Preventing crypt1 from restarting on host-073 because of hard failure (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible)
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: error: Resource crypt1 is active on 2 nodes (attempting recovery)
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: notice: See https://wiki.clusterlabs.org/wiki/FAQ#Resource_is_Too_Active for more information
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: notice:  * Recover    crypt1             ( host-093 -> host-092 )
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: error: Calculated transition 531 (with errors), saving inputs in /var/lib/pacemaker/pengine/pe-error-7.bz2
Nov 24 16:28:55 host-073 pacemaker-controld[760698]: notice: Initiating stop operation crypt1_stop_0 locally on host-073
Nov 24 16:28:55 host-073 pacemaker-controld[760698]: notice: Initiating stop operation crypt1_stop_0 on host-093
Nov 24 16:28:55 host-073 crypt(crypt1)[1755217]: ERROR: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible
Nov 24 16:28:55 host-073 pacemaker-execd[760695]: notice: crypt1_stop_0[1755217] error output [ ocf-exit-reason:Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible ]
Nov 24 16:28:55 host-073 pacemaker-controld[760698]: notice: Result of stop operation for crypt1 on host-073: invalid parameter
Nov 24 16:28:55 host-073 pacemaker-controld[760698]: notice: host-073-crypt1_stop_0:38 [ ocf-exit-reason:Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible\n ]
Nov 24 16:28:55 host-073 pacemaker-controld[760698]: notice: Transition 531 aborted by operation crypt1_stop_0 'modify' on host-073: Event failed
Nov 24 16:28:55 host-073 pacemaker-controld[760698]: notice: Transition 531 action 12 (crypt1_stop_0 on host-073): expected 'ok' but got 'invalid parameter'
Nov 24 16:28:55 host-073 pacemaker-controld[760698]: notice: Transition 531 action 8 (crypt1_stop_0 on host-093): expected 'ok' but got 'invalid parameter'
Nov 24 16:28:55 host-073 pacemaker-controld[760698]: notice: Transition 531 (Complete=3, Pending=0, Fired=0, Skipped=0, Incomplete=5, Source=/var/lib/pacemaker/pengine/pe-error-7.bz2): Complete
Nov 24 16:28:55 host-073 pacemaker-attrd[760696]: notice: Setting fail-count-crypt1#stop_0[host-073]: (unset) -> INFINITY
Nov 24 16:28:55 host-073 pacemaker-attrd[760696]: notice: Setting last-failure-crypt1#stop_0[host-073]: (unset) -> 1606256935
Nov 24 16:28:55 host-073 pacemaker-attrd[760696]: notice: Setting fail-count-crypt1#stop_0[host-093]: (unset) -> INFINITY
Nov 24 16:28:55 host-073 pacemaker-attrd[760696]: notice: Setting last-failure-crypt1#stop_0[host-093]: (unset) -> 1606256935
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: error: No further recovery can be attempted for crypt1 because stop on host-093 failed (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible) at Nov 24 16:28:55 2020
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: warning: Unexpected result (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible) was recorded for stop of crypt1 on host-093 at Nov 24 16:28:55 2020
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: error: Preventing crypt1 from restarting on host-093 because of hard failure (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible)
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: error: No further recovery can be attempted for crypt1 because stop on host-093 failed (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible) at Nov 24 16:28:55 2020
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: warning: Unexpected result (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible) was recorded for stop of crypt1 on host-093 at Nov 24 16:28:55 2020
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: error: Preventing crypt1 from restarting on host-093 because of hard failure (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible)
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: error: No further recovery can be attempted for crypt1 because stop on host-073 failed (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible) at Nov 24 16:28:55 2020
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: warning: Unexpected result (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible) was recorded for stop of crypt1 on host-073 at Nov 24 16:28:55 2020
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: error: Preventing crypt1 from restarting on host-073 because of hard failure (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible)
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: error: No further recovery can be attempted for crypt1 because stop on host-073 failed (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible) at Nov 24 16:28:55 2020
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: warning: Unexpected result (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible) was recorded for stop of crypt1 on host-073 at Nov 24 16:28:55 2020
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: error: Preventing crypt1 from restarting on host-073 because of hard failure (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible)
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: error: Resource crypt1 is active on 2 nodes (attempting recovery)
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: notice: See https://wiki.clusterlabs.org/wiki/FAQ#Resource_is_Too_Active for more information
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: error: Calculated transition 532 (with errors), saving inputs in /var/lib/pacemaker/pengine/pe-error-8.bz2
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: error: No further recovery can be attempted for crypt1 because stop on host-093 failed (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible) at Nov 24 16:28:55 2020
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: warning: Unexpected result (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible) was recorded for stop of crypt1 on host-093 at Nov 24 16:28:55 2020
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: error: Preventing crypt1 from restarting on host-093 because of hard failure (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible)
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: error: No further recovery can be attempted for crypt1 because stop on host-093 failed (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible) at Nov 24 16:28:55 2020
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: warning: Unexpected result (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible) was recorded for stop of crypt1 on host-093 at Nov 24 16:28:55 2020
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: error: Preventing crypt1 from restarting on host-093 because of hard failure (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible)
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: error: No further recovery can be attempted for crypt1 because stop on host-073 failed (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible) at Nov 24 16:28:55 2020
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: warning: Unexpected result (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible) was recorded for stop of crypt1 on host-073 at Nov 24 16:28:55 2020
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: error: Preventing crypt1 from restarting on host-073 because of hard failure (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible)
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: error: No further recovery can be attempted for crypt1 because stop on host-073 failed (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible) at Nov 24 16:28:55 2020
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: warning: Unexpected result (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible) was recorded for stop of crypt1 on host-073 at Nov 24 16:28:55 2020
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: error: Preventing crypt1 from restarting on host-073 because of hard failure (invalid parameter: Encrypted device /dev/disk/by-uuid/487adddd-fd62-46a8-98d8-98940fe39d7e not accessible)
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: warning: Forcing crypt1 away from host-073 after 1000000 failures (max=1000000)
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: warning: Forcing crypt1 away from host-093 after 1000000 failures (max=1000000)
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: error: Resource crypt1 is active on 2 nodes (attempting recovery)
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: notice: See https://wiki.clusterlabs.org/wiki/FAQ#Resource_is_Too_Active for more information
Nov 24 16:28:55 host-073 pacemaker-schedulerd[760697]: error: Calculated transition 533 (with errors), saving inputs in /var/lib/pacemaker/pengine/pe-error-9.bz2
Nov 24 16:28:55 host-073 pacemaker-controld[760698]: notice: Transition 533 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-error-9.bz2): Complete
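
The useful lines in the log above are the "expected '...' but got '...'" ones. A small sketch for pulling just those out of a pacemaker log; `summarize_unexpected_results` is a hypothetical name, and the pattern assumes the message format shown in this report:

```shell
#!/bin/sh
# Print "node operation expected=... got=..." for every controld line
# of the form: (<operation> on <node>): expected '<x>' but got '<y>'
summarize_unexpected_results() {
    sed -n "s/.*(\([^ ]*\) on \([^)]*\)): expected '\([^']*\)' but got '\([^']*\)'.*/\2 \1 expected=\3 got=\4/p"
}
```

Usage would be something like `summarize_unexpected_results < /var/log/messages`, assuming that is where the pacemaker messages land on the node.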



Version-Release number of selected component (if applicable):
resource-agents-4.1.1-74.el8.x86_64


How reproducible:
Every time

Comment 1 Oyvind Albrigtsen 2020-11-25 08:28:40 UTC
Can you add output from starting the crypt resource with "pcs resource debug-start --full"?

Comment 3 Oyvind Albrigtsen 2020-11-26 12:00:38 UTC
https://github.com/ClusterLabs/resource-agents/pull/1587

Comment 6 Corey Marthaler 2020-11-30 18:06:04 UTC
I don't see a difference in behavior here with the latest rpm.

I set up a "perfectly healthy" LVM-activate resource, and even relocated it to ensure all is well before attempting the crypt resource definition.

[root@host-073 ~]# rpm -qi resource-agents
Name        : resource-agents
Version     : 4.1.1
Release     : 79.el8
Architecture: x86_64
Install Date: Mon 30 Nov 2020 10:59:38 AM CST
Group       : System Environment/Base
Size        : 1509374
License     : GPLv2+ and LGPLv2+
Signature   : (none)
Source RPM  : resource-agents-4.1.1-79.el8.src.rpm
Build Date  : Mon 30 Nov 2020 04:49:33 AM CST
Build Host  : x86-vm-09.build.eng.bos.redhat.com
Relocations : (not relocatable)
Packager    : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
Vendor      : Red Hat, Inc.
URL         : https://github.com/ClusterLabs/resource-agents
Summary     : Open Source HA Reusable Cluster Resource Scripts
Description :
A set of scripts to interface with several services to operate in a
High Availability environment for both Pacemaker and rgmanager
service managers.



/tmp/luks_key_file -> host-073:/etc/luks_key_file
/tmp/luks_key_file -> host-092:/etc/luks_key_file
/tmp/luks_key_file -> host-093:/etc/luks_key_file
Creating single VG STSRHTS13085 out of /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1
host-073: vgchange --lock-start STSRHTS13085
host-092: vgchange --lock-start STSRHTS13085
host-093: vgchange --lock-start STSRHTS13085

Creating HA striped LV(s) and ext4 filesystems on VG STSRHTS13085
        lvcreate --yes --activate ey --type striped -L 8G -i 2 -n lv1 STSRHTS13085

cryptsetup luksFormat /dev/STSRHTS13085/lv1 --type luks2 --key-file=/etc/luks_key_file
LUKS_UUID=f84b9e75-72c0-43aa-b550-a563b13ae517

pcs resource create STSRHTS13085 --group HA_STSRHTS13085 ocf:heartbeat:LVM-activate vgname="STSRHTS13085" activation_mode=exclusive vg_access_mode=lvmlockd

[root@host-093 ~]# pcs status
Cluster name: STSRHTS13085
Cluster Summary:
  * Stack: corosync
  * Current DC: host-073 (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum
  * Last updated: Mon Nov 30 11:47:13 2020
  * Last change:  Mon Nov 30 11:47:01 2020 by root via cibadmin on host-093
  * 3 nodes configured
  * 10 resource instances configured

Node List:
  * Online: [ host-073 host-092 host-093 ]

Full List of Resources:
  * fence-host-073      (stonith:fence_xvm):     Started host-073
  * fence-host-092      (stonith:fence_xvm):     Started host-092
  * fence-host-093      (stonith:fence_xvm):     Started host-093
  * Clone Set: locking-clone [locking]:
    * Started: [ host-073 host-092 host-093 ]
  * Resource Group: HA_STSRHTS13085:
    * STSRHTS13085      (ocf::heartbeat:LVM-activate):   Started host-093

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
[root@host-093 ~]# lvs -a -o +devices
  LV              VG            Attr       LSize   Pool   Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices                  
  lv1             STSRHTS13085  -wi-a-----   8.00g                                                       /dev/sda1(0),/dev/sdb1(0)

[root@host-093 ~]# pcs resource move HA_STSRHTS13085 host-073
[root@host-093 ~]# pcs status
Cluster name: STSRHTS13085
Cluster Summary:
  * Stack: corosync
  * Current DC: host-073 (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum
  * Last updated: Mon Nov 30 11:49:04 2020
  * Last change:  Mon Nov 30 11:48:53 2020 by root via crm_resource on host-093
  * 3 nodes configured
  * 10 resource instances configured

Node List:
  * Online: [ host-073 host-092 host-093 ]

Full List of Resources:
  * fence-host-073      (stonith:fence_xvm):     Started host-073
  * fence-host-092      (stonith:fence_xvm):     Started host-092
  * fence-host-093      (stonith:fence_xvm):     Started host-093
  * Clone Set: locking-clone [locking]:
    * Started: [ host-073 host-092 host-093 ]
  * Resource Group: HA_STSRHTS13085:
    * STSRHTS13085      (ocf::heartbeat:LVM-activate):   Started host-073

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

# LVM resource is now properly active on host-073:
[root@host-073 ~]# lvs -a -o +devices
  LV              VG            Attr       LSize   Pool   Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices                  
  lv1             STSRHTS13085  -wi-a-----   8.00g                                                       /dev/sdd1(0),/dev/sdb1(0)


# Attempt crypt definition:
[root@host-073 ~]# pcs resource create crypt1 --force --group HA_STSRHTS13085 ocf:heartbeat:crypt crypt_dev="luks_lv1" crypt_type=luks2 key_file=/etc/luks_key_file encrypted_dev=f84b9e75-72c0-43aa-b550-a563b13ae517

[root@host-073 ~]# pcs status
Cluster name: STSRHTS13085
Cluster Summary:
  * Stack: corosync
  * Current DC: host-073 (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum
  * Last updated: Mon Nov 30 11:51:31 2020
  * Last change:  Mon Nov 30 11:50:57 2020 by root via cibadmin on host-073
  * 3 nodes configured
  * 11 resource instances configured (1 BLOCKED from further action due to failure)

Node List:
  * Online: [ host-073 host-092 host-093 ]

Full List of Resources:
  * fence-host-073      (stonith:fence_xvm):     Started host-073
  * fence-host-092      (stonith:fence_xvm):     Started host-092
  * fence-host-093      (stonith:fence_xvm):     Started host-093
  * Clone Set: locking-clone [locking]:
    * Started: [ host-073 host-092 host-093 ]
  * Resource Group: HA_STSRHTS13085:
    * STSRHTS13085      (ocf::heartbeat:LVM-activate):   Started host-073
    * crypt1    (ocf::heartbeat:crypt):  FAILED (blocked) [ host-093 host-092 ]

Failed Resource Actions:
  * crypt1_stop_0 on host-093 'invalid parameter' (2): call=41, status='complete', exitreason='Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible', last-rc-change='2020-11-30 11:50:57 -06:00', queued=0ms, exec=75ms
  * crypt1_stop_0 on host-092 'invalid parameter' (2): call=38, status='complete', exitreason='Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible', last-rc-change='2020-11-30 11:50:57 -06:00', queued=0ms, exec=55ms

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

Nov 30 11:48:57 host-073 LVM-activate(STSRHTS13085)[5179]: INFO: STSRHTS13085: activated successfully.
Nov 30 11:48:57 host-073 pacemaker-controld[3415]: notice: Result of start operation for STSRHTS13085 on host-073: ok
Nov 30 11:48:57 host-073 pacemaker-controld[3415]: notice: Initiating monitor operation STSRHTS13085_monitor_30000 locally on host-073
Nov 30 11:48:57 host-073 pacemaker-controld[3415]: notice: Result of monitor operation for STSRHTS13085 on host-073: ok
Nov 30 11:48:57 host-073 pacemaker-controld[3415]: notice: Transition 8 (Complete=7, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-8.bz2): Complete
Nov 30 11:48:57 host-073 pacemaker-controld[3415]: notice: State transition S_TRANSITION_ENGINE -> S_IDLE
Nov 30 11:48:57 host-073 systemd[1]: dnf-makecache.service: Succeeded.
Nov 30 11:48:57 host-073 systemd[1]: Started dnf makecache.
Nov 30 11:49:53 host-073 pcsd[1899]: INFO:tornado.access:200 GET /remote/get_configs?cluster_name=STSRHTS13085 (10.15.105.92) 7.74ms
Nov 30 11:49:53 host-073 restraintd[1714]: *** Current Time: Mon Nov 30 11:49:53 2020 Localwatchdog at:  * Disabled! *
Nov 30 11:50:53 host-073 restraintd[1714]: *** Current Time: Mon Nov 30 11:50:53 2020 Localwatchdog at:  * Disabled! *
Nov 30 11:50:57 host-073 pacemaker-controld[3415]: notice: State transition S_IDLE -> S_POLICY_ENGINE
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: notice:  * Start      crypt1             (             host-073 )
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: notice: Calculated transition 9, saving inputs in /var/lib/pacemaker/pengine/pe-input-9.bz2
Nov 30 11:50:57 host-073 pacemaker-controld[3415]: notice: Initiating monitor operation crypt1_monitor_0 on host-093
Nov 30 11:50:57 host-073 pacemaker-controld[3415]: notice: Initiating monitor operation crypt1_monitor_0 on host-092
Nov 30 11:50:57 host-073 pacemaker-controld[3415]: notice: Initiating monitor operation crypt1_monitor_0 locally on host-073
Nov 30 11:50:57 host-073 pacemaker-controld[3415]: notice: Transition 9 aborted by operation crypt1_monitor_0 'modify' on host-092: Event failed
Nov 30 11:50:57 host-073 pacemaker-controld[3415]: notice: Transition 9 action 12 (crypt1_monitor_0 on host-092): expected 'not running' but got 'invalid parameter'
Nov 30 11:50:57 host-073 pacemaker-controld[3415]: notice: Transition 9 action 13 (crypt1_monitor_0 on host-093): expected 'not running' but got 'invalid parameter'
Nov 30 11:50:57 host-073 pacemaker-controld[3415]: notice: Result of probe operation for crypt1 on host-073: not running
Nov 30 11:50:57 host-073 pacemaker-controld[3415]: notice: Transition 9 (Complete=4, Pending=0, Fired=0, Skipped=0, Incomplete=3, Source=/var/lib/pacemaker/pengine/pe-input-9.bz2): Complete
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: warning: Unexpected result (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible) was recorded for probe of crypt1 on host-093 at Nov 30 11:50:57 2020
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: notice: If it is not possible for crypt1 to run on host-093, see the resource-discovery option for location constraints
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: error: Preventing crypt1 from restarting on host-093 because of hard failure (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible)
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: warning: Unexpected result (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible) was recorded for probe of crypt1 on host-093 at Nov 30 11:50:57 2020
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: notice: If it is not possible for crypt1 to run on host-093, see the resource-discovery option for location constraints
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: error: Preventing crypt1 from restarting on host-093 because of hard failure (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible)
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: warning: Unexpected result (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible) was recorded for probe of crypt1 on host-092 at Nov 30 11:50:57 2020
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: notice: If it is not possible for crypt1 to run on host-092, see the resource-discovery option for location constraints
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: error: Preventing crypt1 from restarting on host-092 because of hard failure (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible)
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: warning: Unexpected result (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible) was recorded for probe of crypt1 on host-092 at Nov 30 11:50:57 2020
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: notice: If it is not possible for crypt1 to run on host-092, see the resource-discovery option for location constraints
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: error: Preventing crypt1 from restarting on host-092 because of hard failure (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible)
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: error: Resource crypt1 is active on 2 nodes (attempting recovery)
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: notice: See https://wiki.clusterlabs.org/wiki/FAQ#Resource_is_Too_Active for more information
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: notice:  * Recover    crypt1             ( host-093 -> host-073 )
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: error: Calculated transition 10 (with errors), saving inputs in /var/lib/pacemaker/pengine/pe-error-0.bz2
Nov 30 11:50:57 host-073 pacemaker-controld[3415]: notice: Initiating stop operation crypt1_stop_0 on host-092
Nov 30 11:50:57 host-073 pacemaker-controld[3415]: notice: Initiating stop operation crypt1_stop_0 on host-093
Nov 30 11:50:57 host-073 pacemaker-controld[3415]: notice: Transition 10 aborted by operation crypt1_stop_0 'modify' on host-092: Event failed
Nov 30 11:50:57 host-073 pacemaker-controld[3415]: notice: Transition 10 action 12 (crypt1_stop_0 on host-092): expected 'ok' but got 'invalid parameter'
Nov 30 11:50:57 host-073 pacemaker-attrd[3413]: notice: Setting fail-count-crypt1#stop_0[host-092]: (unset) -> INFINITY
Nov 30 11:50:57 host-073 pacemaker-attrd[3413]: notice: Setting last-failure-crypt1#stop_0[host-092]: (unset) -> 1606758657
Nov 30 11:50:57 host-073 pacemaker-controld[3415]: notice: Transition 10 aborted by transient_attributes.2 'create': Transient attribute change
Nov 30 11:50:57 host-073 pacemaker-controld[3415]: notice: Transition 10 action 8 (crypt1_stop_0 on host-093): expected 'ok' but got 'invalid parameter'
Nov 30 11:50:57 host-073 pacemaker-controld[3415]: notice: Transition 10 (Complete=3, Pending=0, Fired=0, Skipped=0, Incomplete=5, Source=/var/lib/pacemaker/pengine/pe-error-0.bz2): Complete
Nov 30 11:50:57 host-073 pacemaker-attrd[3413]: notice: Setting fail-count-crypt1#stop_0[host-093]: (unset) -> INFINITY
Nov 30 11:50:57 host-073 pacemaker-attrd[3413]: notice: Setting last-failure-crypt1#stop_0[host-093]: (unset) -> 1606758657
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: error: No further recovery can be attempted for crypt1 because stop on host-093 failed (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible) at Nov 30 11:50:57 2020
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: warning: Unexpected result (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible) was recorded for stop of crypt1 on host-093 at Nov 30 11:50:57 2020
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: error: Preventing crypt1 from restarting on host-093 because of hard failure (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible)
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: error: No further recovery can be attempted for crypt1 because stop on host-093 failed (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible) at Nov 30 11:50:57 2020
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: warning: Unexpected result (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible) was recorded for stop of crypt1 on host-093 at Nov 30 11:50:57 2020
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: error: Preventing crypt1 from restarting on host-093 because of hard failure (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible)
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: error: No further recovery can be attempted for crypt1 because stop on host-092 failed (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible) at Nov 30 11:50:57 2020
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: warning: Unexpected result (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible) was recorded for stop of crypt1 on host-092 at Nov 30 11:50:57 2020
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: error: Preventing crypt1 from restarting on host-092 because of hard failure (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible)
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: error: No further recovery can be attempted for crypt1 because stop on host-092 failed (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible) at Nov 30 11:50:57 2020
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: warning: Unexpected result (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible) was recorded for stop of crypt1 on host-092 at Nov 30 11:50:57 2020
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: error: Preventing crypt1 from restarting on host-092 because of hard failure (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible)
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: warning: Forcing crypt1 away from host-092 after 1000000 failures (max=1000000)
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: error: Resource crypt1 is active on 2 nodes (attempting recovery)
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: notice: See https://wiki.clusterlabs.org/wiki/FAQ#Resource_is_Too_Active for more information
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: error: Calculated transition 11 (with errors), saving inputs in /var/lib/pacemaker/pengine/pe-error-1.bz2
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: error: No further recovery can be attempted for crypt1 because stop on host-093 failed (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible) at Nov 30 11:50:57 2020
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: warning: Unexpected result (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible) was recorded for stop of crypt1 on host-093 at Nov 30 11:50:57 2020
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: error: Preventing crypt1 from restarting on host-093 because of hard failure (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible)
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: error: No further recovery can be attempted for crypt1 because stop on host-093 failed (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible) at Nov 30 11:50:57 2020
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: warning: Unexpected result (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible) was recorded for stop of crypt1 on host-093 at Nov 30 11:50:57 2020
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: error: Preventing crypt1 from restarting on host-093 because of hard failure (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible)
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: error: No further recovery can be attempted for crypt1 because stop on host-092 failed (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible) at Nov 30 11:50:57 2020
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: warning: Unexpected result (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible) was recorded for stop of crypt1 on host-092 at Nov 30 11:50:57 2020
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: error: Preventing crypt1 from restarting on host-092 because of hard failure (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible)
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: error: No further recovery can be attempted for crypt1 because stop on host-092 failed (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible) at Nov 30 11:50:57 2020
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: warning: Unexpected result (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible) was recorded for stop of crypt1 on host-092 at Nov 30 11:50:57 2020
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: error: Preventing crypt1 from restarting on host-092 because of hard failure (invalid parameter: Encrypted device /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 not accessible)
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: warning: Forcing crypt1 away from host-092 after 1000000 failures (max=1000000)
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: warning: Forcing crypt1 away from host-093 after 1000000 failures (max=1000000)
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: error: Resource crypt1 is active on 2 nodes (attempting recovery)
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: notice: See https://wiki.clusterlabs.org/wiki/FAQ#Resource_is_Too_Active for more information
Nov 30 11:50:57 host-073 pacemaker-schedulerd[3414]: error: Calculated transition 12 (with errors), saving inputs in /var/lib/pacemaker/pengine/pe-error-2.bz2
Nov 30 11:50:57 host-073 pacemaker-controld[3415]: notice: Transition 12 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-error-2.bz2): Complete
Nov 30 11:50:57 host-073 pacemaker-controld[3415]: notice: State transition S_TRANSITION_ENGINE -> S_IDLE
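
The log above is long, but the key information is in the "expected ... but got ..." lines from pacemaker-controld. As a rough sketch (the function name summarize_mismatches is hypothetical, and the pattern assumes the message format shown in this log), those lines can be condensed like this:

```shell
#!/bin/sh
# Hypothetical helper: read journal lines on stdin and print one short
# line per action whose result differed from what Pacemaker expected.
# Assumes the "Transition N action M (<op> on <node>): expected 'X' but got 'Y'"
# format seen in the log above.
summarize_mismatches() {
    sed -n "s/.*action [0-9]* (\(.*\) on \(.*\)): expected '\(.*\)' but got '\(.*\)'\$/\1 on \2: expected \3, got \4/p"
}

# Example with one line copied from the log above.
printf '%s\n' "Nov 30 11:50:57 host-073 pacemaker-controld[3415]: notice: Transition 9 action 12 (crypt1_monitor_0 on host-092): expected 'not running' but got 'invalid parameter'" \
    | summarize_mismatches
```

On a live node the input would come from the system journal or log file rather than from a literal string.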

Comment 7 Corey Marthaler 2020-11-30 18:16:28 UTC
Also, to verify again that the device is present and can be luksOpen'ed, I ran that command manually on the node where the LVM-activate resource is running, and it worked fine.

[root@host-073 ~]# lvs
  LV     VG            Attr       LSize   Pool   Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lv1    STSRHTS13085  -wi-a-----   8.00g                                                      
[root@host-073 ~]# ls /dev/STSRHTS13085/lv1 
/dev/STSRHTS13085/lv1

[root@host-073 ~]# cryptsetup luksOpen /dev/STSRHTS13085/lv1 luks_lv1 --key-file=/etc/luks_key_file
[root@host-073 ~]# dmsetup ls
luks_lv1        (253:7)
STSRHTS13085-lv1        (253:6)

I assume the resource agent is basically just running the open command above to present this crypt volume?
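
For scripting the manual check above, a minimal sketch follows. The helper name has_mapping is hypothetical, and it assumes the two-column `dmsetup ls` output format shown in this comment; it says nothing about how the crypt agent itself performs its checks.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a device-mapper name appears in
# `dmsetup ls`-style output read from stdin.
has_mapping() {
    # $1 = mapping name, e.g. luks_lv1
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Example using the output shown above. On a live node this would be:
#   dmsetup ls | has_mapping luks_lv1
sample='luks_lv1        (253:7)
STSRHTS13085-lv1        (253:6)'

if printf '%s\n' "$sample" | has_mapping luks_lv1; then
    echo "luks_lv1 is mapped"
fi
```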

Comment 8 Corey Marthaler 2020-11-30 18:28:31 UTC
FWIW, I also tried the UUID instead of the LVM name, and that worked as well.

[root@host-073 ~]# cryptsetup luksClose luks_lv1

[root@host-073 ~]# cryptsetup luksOpen /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 luks_lv1 --key-file=/etc/luks_key_file
[root@host-073 ~]# dmsetup ls
luks_lv1        (253:7)
STSRHTS13085-lv1        (253:6)

Which raises the question: if we're suggesting that LVM volumes/resources be used underneath, one benefit of which is unified logical volume naming, shouldn't we just use the LVM LV name/path instead of this LUKS UUID name?
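
Since both the LV path and the by-uuid path opened successfully, one way to confirm that they point at the same underlying device node is to compare their resolved targets. This is only a sketch: same_target is a hypothetical helper, and it assumes GNU `readlink -f` is available, as it is on RHEL.

```shell
#!/bin/sh
# Hypothetical helper: succeed if two paths resolve to the same target
# after following symlinks. Fails if either path cannot be resolved.
same_target() {
    a=$(readlink -f -- "$1") || return 1
    b=$(readlink -f -- "$2") || return 1
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Usage on a cluster node (paths taken from the commands above):
#   same_target /dev/STSRHTS13085/lv1 \
#       /dev/disk/by-uuid/f84b9e75-72c0-43aa-b550-a563b13ae517 \
#       && echo "same device"
```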

Comment 9 Oyvind Albrigtsen 2020-12-01 09:14:40 UTC
https://github.com/ClusterLabs/resource-agents/pull/1593

Comment 10 Corey Marthaler 2020-12-01 20:05:00 UTC
Fixed in the latest rpm resource-agents-4.1.1-80.el8.x86_64, marking Verified.

/tmp/luks_key_file -> host-073:/etc/luks_key_file
/tmp/luks_key_file -> host-092:/etc/luks_key_file
/tmp/luks_key_file -> host-093:/etc/luks_key_file

Creating single VG STSRHTS13085 out of /dev/sdg1 /dev/sda1 /dev/sdh1 /dev/sde1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sdf1
Creating HA striped LV(s) and gfs2 filesystems on VG STSRHTS13085
        lvcreate --yes --activate sy --type striped -L 8G -i 2 -n lv1 STSRHTS13085
cryptsetup luksFormat /dev/STSRHTS13085/lv1 --type luks2 --key-file=/etc/luks_key_file
LUKS_UUID=2b471d0c-2fde-4a67-99c5-359db40a0f1c
cryptsetup luksOpen /dev/STSRHTS13085/lv1 luks_lv1 --key-file=/etc/luks_key_file
        mkfs.gfs2 -j 3 -J 32 -t STSRHTS13085:STSRHTS13085-lv1 /dev/mapper/luks_lv1 -O

cryptsetup luksClose luks_lv1
pcs resource create STSRHTS13085 --group HA_STSRHTS13085 ocf:heartbeat:LVM-activate vgname="STSRHTS13085" activation_mode=shared vg_access_mode=lvmlockd
pcs resource create crypt1 --force --group HA_STSRHTS13085 ocf:heartbeat:crypt crypt_dev="luks_lv1" crypt_type=luks2 key_file=/etc/luks_key_file encrypted_dev=2b471d0c-2fde-4a67-99c5-359db40a0f1c
pcs resource create fs1 --group HA_STSRHTS13085 Filesystem device="/dev/mapper/luks_lv1" directory="/mnt/fs1" fstype="gfs2" "options=noatime" op monitor interval=10s
pcs resource clone HA_STSRHTS13085
        lvcreate --yes --activate sy --type striped -L 8G -i 2 -n lv2 STSRHTS13085
cryptsetup luksFormat /dev/STSRHTS13085/lv2 --type luks2 --key-file=/etc/luks_key_file
LUKS_UUID=9c8013cf-a7d0-4f1e-a620-8a086a305b2e
cryptsetup luksOpen /dev/STSRHTS13085/lv2 luks_lv2 --key-file=/etc/luks_key_file
        mkfs.gfs2 -j 3 -J 32 -t STSRHTS13085:STSRHTS13085-lv2 /dev/mapper/luks_lv2 -O

cryptsetup luksClose luks_lv2
pcs resource create crypt2 --force --group HA_STSRHTS13085 ocf:heartbeat:crypt crypt_dev="luks_lv2" crypt_type=luks2 key_file=/etc/luks_key_file encrypted_dev=9c8013cf-a7d0-4f1e-a620-8a086a305b2e
pcs resource create fs2 --group HA_STSRHTS13085 Filesystem device="/dev/mapper/luks_lv2" directory="/mnt/fs2" fstype="gfs2" "options=noatime" op monitor interval=10s

pcs constraint order start locking-clone then HA_STSRHTS13085-clone

Running cleanup to fix any potential timing issues during setup
pcs resource cleanup

Checking status of resources on all nodes

[root@host-073 ~]# pcs status
Cluster name: STSRHTS13085
Cluster Summary:
  * Stack: corosync
  * Current DC: host-093 (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum
  * Last updated: Tue Dec  1 14:00:42 2020
  * Last change:  Tue Dec  1 13:58:05 2020 by root via cibadmin on host-093
  * 3 nodes configured
  * 24 resource instances configured

Node List:
  * Online: [ host-073 host-092 host-093 ]

Full List of Resources:
  * fence-host-073      (stonith:fence_xvm):     Started host-073
  * fence-host-092      (stonith:fence_xvm):     Started host-092
  * fence-host-093      (stonith:fence_xvm):     Started host-093
  * Clone Set: locking-clone [locking]:
    * Started: [ host-073 host-092 host-093 ]
  * Clone Set: HA_STSRHTS13085-clone [HA_STSRHTS13085]:
    * Started: [ host-073 host-092 host-093 ]

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
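
The status check above can be automated in a test harness. The following is a sketch only: status_is_clean is a hypothetical helper, and the strings it looks for (FAILED, BLOCKED, "Failed Resource Actions") are taken from the failing `pcs status` output earlier in this bug, not from any documented list of failure markers.

```shell
#!/bin/sh
# Hypothetical helper: read `pcs status` output on stdin and succeed only
# if none of the failure markers seen earlier in this bug are present.
status_is_clean() {
    ! grep -Eq 'FAILED|BLOCKED|Failed Resource Actions'
}

# Usage on a cluster node:
#   pcs status | status_is_clean && echo "no failed resources reported"
```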

Comment 12 errata-xmlrpc 2021-05-18 15:12:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (resource-agents bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1736

