Bug 1870873 - "cibsecret sync" fails if node name is different from hostname
Summary: "cibsecret sync" fails if node name is different from hostname
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: pacemaker
Version: 8.3
Hardware: All
OS: All
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 8.3
Assignee: Ken Gaillot
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks: 1793860
 
Reported: 2020-08-20 21:47 UTC by Ken Gaillot
Modified: 2021-03-02 16:42 UTC
CC List: 2 users

Fixed In Version: pacemaker-2.0.4-6.el8
Doc Type: No Doc Update
Doc Text:
This fix is for a build that has not been released
Clone Of:
Environment:
Last Closed: 2020-11-04 04:00:53 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments: None


Links:
  Red Hat Product Errata RHEA-2020:4804 (last updated 2020-11-04 04:01:08 UTC)

Description Ken Gaillot 2020-08-20 21:47:46 UTC
Description of problem: "cibsecret sync" filters the local node out of the list of all cluster nodes by searching for the output of "uname -n". This causes two issues: if the node name is different from the hostname, local secrets are removed; and if another node's name contains the local node's name (e.g. "node10" vs. "node1"), secrets are not synced to that node.
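
For illustration, a minimal sketch of the hostname-based filtering described above (hypothetical shell, not the actual cibsecret code; node names and hostnames are made up):

>   # Hypothetical peer-list filtering keyed on "uname -n" (illustrative only).
>   all_nodes="node1 node10 node2"
>   local_host=$(uname -n)          # e.g. "node1.example.com"
>   peers=$(echo "$all_nodes" | tr ' ' '\n' | grep -v "$local_host")
>   # If the cluster node name is "node1" but the hostname is
>   # "node1.example.com", nothing is filtered out, so the local node is
>   # treated as a sync target and its secrets end up being removed.
>   # If the hostname is plain "node1", the substring match also drops
>   # "node10", so secrets are never synced to that node.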


Version-Release number of selected component (if applicable): 2.0.4-5


How reproducible: consistent


Steps to Reproduce:
1. Configure a cluster with node names different from hostnames.
2. Configure a resource and use cibsecret to make a parameter secret.
3. Run "cibsecret sync".

Actual results: Secrets are removed from the local node.


Expected results: Local secrets are synced to all other nodes.
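
A minimal sketch of the steps above, assuming a hypothetical two-node cluster whose node names differ from the hosts' FQDNs (all names, the resource, and the secret value are placeholders):

>   # Create a cluster whose node names ("node1", "node2") differ from the
>   # hosts' hostnames (pcs 0.10 syntax; names are examples only).
>   pcs host auth node1 addr=host1.example.com node2 addr=host2.example.com
>   pcs cluster setup demo-cluster node1 addr=host1.example.com node2 addr=host2.example.com
>   pcs cluster start --all
>
>   # (create a resource with a sensitive parameter here, e.g. a fence agent)
>   cibsecret set some-resource passwd mysecret   # store the parameter as a local secret
>   cibsecret sync                                # before the fix: local secrets are removed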

Comment 1 Ken Gaillot 2020-08-20 21:54:23 UTC
Fixed upstream by commit afca6af
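
One way to avoid both failure modes is to ask the cluster itself for the local node name and filter with an exact match instead of a substring match; a rough sketch of that idea (a sketch only, not necessarily what commit afca6af does):

>   # Use the node name Pacemaker knows, not "uname -n", and match whole
>   # lines so "node1" no longer also matches "node10".
>   # (crm_node -l prints "<id> <name> <state>" per node.)
>   local_node=$(crm_node -n)
>   peers=$(crm_node -l | awk '{print $2}' | grep -Fxv "$local_node")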

Comment 5 Markéta Smazová 2020-09-23 11:30:58 UTC
before fix
----------
Please see bug 1793860, comment 9 (case 2).

after fix
----------

>   [root@virt-023 ~]# rpm -q pacemaker
>   pacemaker-2.0.4-6.el8.x86_64

>   [root@virt-023 ~]# pcs status
>   Cluster name: STSRHTS25177
>   Cluster Summary:
>     * Stack: corosync
>     * Current DC: virt-024 (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum
>     * Last updated: Thu Sep 10 13:17:44 2020
>     * Last change:  Thu Sep 10 12:45:22 2020 by hacluster via crmd on virt-024
>     * 3 nodes configured
>     * 9 resource instances configured
>
>   Node List:
>     * Online: [ virt-023 virt-024 virt-031 ]
>
>   Full List of Resources:
>     * fence-virt-023	(stonith:fence_xvm):	 Started virt-023
>     * fence-virt-024	(stonith:fence_xvm):	 Started virt-024
>     * fence-virt-031	(stonith:fence_xvm):	 Started virt-031
>     * Clone Set: locking-clone [locking]:
>       * Started: [ virt-023 virt-024 virt-031 ]
>
>   Daemon Status:
>     corosync: active/enabled
>     pacemaker: active/enabled
>     pcsd: active/enabled

Check that hostnames are different from cluster node names.

>   [root@virt-023 ~]# uname -n
>   virt-023.cluster-qe.lab.eng.brq.redhat.com

>   [root@virt-024 ~]# uname -n
>   virt-024.cluster-qe.lab.eng.brq.redhat.com

>   [root@virt-031 ~]# uname -n
>   virt-031.cluster-qe.lab.eng.brq.redhat.com
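
For contrast, the node name Pacemaker itself uses can be printed with `crm_node -n`; the output shown here is inferred from the node list above (short names, not FQDNs), not copied from the original transcript:

>   [root@virt-023 ~]# crm_node -n
>   virt-023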

Remove node `virt-031` from the cluster.

>   [root@virt-023 ~]# pcs cluster node remove virt-031
>   Destroying cluster on hosts: 'virt-031'...
>   virt-031: Successfully destroyed cluster
>   Sending updated corosync.conf to nodes...
>   virt-023: Succeeded
>   virt-024: Succeeded
>   virt-023: Corosync configuration reloaded

Put the cluster in maintenance mode.

>   [root@virt-023 ~]# pcs property set maintenance-mode=true
>   [root@virt-023 ~]# echo $?
>   0

Set the `delay` attribute of the `fence-virt-023` stonith resource as a secret.

>   [root@virt-023 ~]# cibsecret set fence-virt-023 delay 10
>   INFO: syncing /var/lib/pacemaker/lrm/secrets/fence-virt-023/delay to  virt-024  ...
>   Set 'fence-virt-023' option: id=fence-virt-023-instance_attributes-delay name=delay value=lrm://

Add node `virt-031` back to the cluster.

>   [root@virt-023 ~]# pcs cluster node add virt-031
>   No addresses specified for host 'virt-031', using 'virt-031'
>   Disabling sbd...
>   virt-031: sbd disabled
>   Sending 'corosync authkey', 'pacemaker authkey' to 'virt-031'
>   virt-031: successful distribution of the file 'corosync authkey'
>   virt-031: successful distribution of the file 'pacemaker authkey'
>   Sending updated corosync.conf to nodes...
>   virt-024: Succeeded
>   virt-031: Succeeded
>   virt-023: Succeeded
>   virt-023: Corosync configuration reloaded

On the new cluster node `virt-031`, start Corosync and verify that it started.

>   [root@virt-031 ~]# systemctl start corosync.service
>   [root@virt-031 ~]# systemctl is-active corosync.service
>   active

Check the status of the cluster nodes.

>   [root@virt-023 ~]# pcs status nodes
>   Pacemaker Nodes:
>    Online: virt-023 virt-024
>    Standby: virt-031
>    Standby with resource(s) running:
>    Maintenance:
>    Offline:
>   [...]

Run `cibsecret sync` to synchronize the secret file to the new node `virt-031`.

>   [root@virt-023 ~]# cibsecret sync
>   INFO: syncing /var/lib/pacemaker/lrm/secrets to  virt-024 virt-031  ...

Check that the secret files are synchronized across all cluster nodes.

>   [root@virt-023 ~]# ls -l /var/lib/pacemaker/lrm/secrets/fence-virt-023
>   total 8
>   -rw-------. 1 root root  3 Sep 10 13:19 delay
>   -rw-------. 1 root root 33 Sep 10 13:19 delay.sign

>   [root@virt-024 ~]# ls -l /var/lib/pacemaker/lrm/secrets/fence-virt-023
>   total 8
>   -rw-------. 1 root root  3 Sep 10 13:19 delay
>   -rw-------. 1 root root 33 Sep 10 13:19 delay.sign

>   [root@virt-031 ~]# ls -l /var/lib/pacemaker/lrm/secrets/fence-virt-023
>   total 8
>   -rw-------. 1 root root  3 Sep 10 13:19 delay
>   -rw-------. 1 root root 33 Sep 10 13:19 delay.sign
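
As an additional consistency check, not part of the original verification, the checksum of the secret value could be compared across nodes (hypothetical step):

>   # Hypothetical extra check: this checksum should be identical on
>   # virt-023, virt-024 and virt-031.
>   md5sum /var/lib/pacemaker/lrm/secrets/fence-virt-023/delay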

Running `cibsecret get` on the new node `virt-031` does not work until Pacemaker is started on the node.

>   [root@virt-031 ~]# cibsecret get fence-virt-023 delay
>   ERROR: pacemaker not running? cibsecret needs pacemaker
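
A quick way to confirm the cause (an assumed check, not in the original transcript) is to ask systemd whether Pacemaker is running on that node:

>   # Hypothetical check: pacemaker.service is not yet active on virt-031
>   # at this point, which is why cibsecret refuses to run.
>   systemctl is-active pacemaker.service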

Start cluster services (including Pacemaker) on the new node `virt-031`.

>   [root@virt-023 ~]# pcs cluster start virt-031
>   virt-031: Starting Cluster...

Turn off cluster maintenance mode.

>   [root@virt-023 ~]# pcs property set maintenance-mode=false
>   [root@virt-023 ~]# pcs status
>   Cluster name: STSRHTS25177
>   Cluster Summary:
>     * Stack: corosync
>     * Current DC: virt-024 (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum
>     * Last updated: Thu Sep 10 13:23:29 2020
>     * Last change:  Thu Sep 10 13:20:42 2020 by hacluster via crmd on virt-024
>     * 3 nodes configured
>     * 9 resource instances configured

>   Node List:
>     * Online: [ virt-023 virt-024 virt-031 ]

>   Full List of Resources:
>     * fence-virt-023	(stonith:fence_xvm):	 Started virt-023
>     * fence-virt-024	(stonith:fence_xvm):	 Started virt-024
>     * fence-virt-031	(stonith:fence_xvm):	 Started virt-031
>     * Clone Set: locking-clone [locking]:
>       * Started: [ virt-023 virt-024 virt-031 ]

>   Daemon Status:
>     corosync: active/enabled
>     pacemaker: active/enabled
>     pcsd: active/enabled

Use `cibsecret get` to verify that the `delay` secret value can be displayed on all cluster nodes.

>   [root@virt-024 ~]# cibsecret get fence-virt-023 delay
>   10

>   [root@virt-031 ~]# cibsecret get fence-virt-023 delay
>   10

>   [root@virt-023 ~]# cibsecret get fence-virt-023 delay
>   10


Marking verified in pacemaker-2.0.4-6.el8.

Comment 8 errata-xmlrpc 2020-11-04 04:00:53 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (pacemaker bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:4804

