Bug 1659114 - 'pcs host auth' shows error message asking to re-authenticate the earlier authenticated node again.
Summary: 'pcs host auth' shows error message asking to re-authenticate the earlier authenticated node again.
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: pcs
Version: 8.0
Hardware: Unspecified
OS: Linux
Priority: low
Severity: low
Target Milestone: rc
Target Release: 8.0
Assignee: Tomas Jelinek
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-12-13 15:39 UTC by hemel
Modified: 2019-06-12 14:29 UTC
CC List: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-12 14:29:42 UTC
Type: Bug
Target Upstream Version:
Embargoed:



Description hemel 2018-12-13 15:39:17 UTC
Description of problem:
One node is authenticated successfully; however, when the other node is authenticated afterwards, an error message appears asking to re-authenticate the node that was already authenticated.

Version-Release number of selected component (if applicable):
pacemaker-2.0.0-10.el8.x86_64

How reproducible:
always

Steps to Reproduce:
1. On node1:
   ===========
***********************************************************************************************
[root@hbiswas_rhel8_node1 ~]# pcs host deauth		---> deauthorized all nodes
[root@hbiswas_rhel8_node1 ~]# pcs cluster auth		
rhel8_node1: Not authorized
rhel8_node2: Not authorized
Nodes to authorize: rhel8_node1, rhel8_node2
Username:
***********************************************************************************************

TEST1: After de-authorizing all nodes from node1, authorizing node1 first and then authorizing node2 works as expected (WORKING)

***********************************************************************************************
[root@hbiswas_rhel8_node1 ~]# pcs host deauth		---> deauthorizing all nodes
[root@hbiswas_rhel8_node1 ~]# pcs host auth rhel8_node1		---> Authorizing node1
Username: hacluster
Password: 
rhel8_node1: Authorized						---> node1 authorized success
Error: Unable to synchronize and save known-hosts on nodes: rhel8_node2. Run 'pcs host auth rhel8_node2' to make sure the nodes are authorized.
[root@hbiswas_rhel8_node1 ~]# pcs host auth rhel8_node2		---> Authorizing node2
Username: hacluster
Password: 
rhel8_node2: Authorized						---> node2 authorized success	 	
***********************************************************************************************


TEST2: After de-authorizing all nodes from node1, authorizing node2 first and then authorizing node1 results in the error "Unable to synchronize and save known-hosts on nodes: rhel8_node2. Run 'pcs host auth rhel8_node2' to make sure the nodes are authorized." (SHOWS ERROR)

***********************************************************************************************
[root@hbiswas_rhel8_node1 ~]# pcs host deauth		---> deauthorizing all nodes
[root@hbiswas_rhel8_node1 ~]# pcs host auth rhel8_node2		---> Authorizing node2
Username: hacluster
Password: 
rhel8_node2: Authorized						---> node2 authorized success	
Error: Unable to synchronize and save known-hosts on nodes: rhel8_node1. Run 'pcs host auth rhel8_node1' to make sure the nodes are authorized.
[root@hbiswas_rhel8_node1 ~]# 
[root@hbiswas_rhel8_node1 ~]# pcs host auth rhel8_node1		---> Authorizing node1
Username: hacluster
Password: 
rhel8_node1: Authorized						---> node1 authorized success
Error: Unable to synchronize and save known-hosts on nodes: rhel8_node2. Run 'pcs host auth rhel8_node2' to make sure the nodes are authorized.		---> node2 is already authorized, but throws error
[root@hbiswas_rhel8_node1 ~]# 
[root@hbiswas_rhel8_node1 ~]# pcs host auth rhel8_node1		---> Retrying the authorization
Username: hacluster
Password: 
rhel8_node1: Authorized						---> node1 authorized success
Error: Unable to synchronize and save known-hosts on nodes: rhel8_node2. Run 'pcs host auth rhel8_node2' to make sure the nodes are authorized.		---> node2 is already authorized, but throws same error
[root@hbiswas_rhel8_node1 ~]#
***********************************************************************************************



2. On node2:
===========

TEST1: After deauthorizing all nodes from node2, authorizing node2 first and then authorizing node1 works as expected (WORKING)
***********************************************************************************************
[root@hbiswas_rhel8_node2 ~]# pcs host deauth
[root@hbiswas_rhel8_node2 ~]# pcs host auth rhel8_node2		---> Authorizing node2
Username: hacluster
Password: 
rhel8_node2: Authorized						---> node2 authorized success
Error: Unable to synchronize and save known-hosts on nodes: rhel8_node1. Run 'pcs host auth rhel8_node1' to make sure the nodes are authorized.
[root@hbiswas_rhel8_node2 ~]# 
[root@hbiswas_rhel8_node2 ~]# pcs host auth rhel8_node1		---> Authorizing node1
Username: hacluster
Password: 
rhel8_node1: Authorized						---> node1 authorized success
[root@hbiswas_rhel8_node2 ~]#
***********************************************************************************************

TEST2: After deauthorizing all nodes from node2, authorizing node1 first and then authorizing node2 results in the error "Unable to synchronize and save known-hosts on nodes: rhel8_node1. Run 'pcs host auth rhel8_node1' to make sure the nodes are authorized." (SHOWS ERROR)

***********************************************************************************************
[root@hbiswas_rhel8_node2 ~]# pcs host deauth
[root@hbiswas_rhel8_node2 ~]# pcs host auth rhel8_node1		---> Authorizing node1
Username: hacluster
Password: 
rhel8_node1: Authorized						---> node1 authorized success
Error: Unable to synchronize and save known-hosts on nodes: rhel8_node2. Run 'pcs host auth rhel8_node2' to make sure the nodes are authorized.
[root@hbiswas_rhel8_node2 ~]# 
[root@hbiswas_rhel8_node2 ~]# pcs host auth rhel8_node2		---> Authorizing node2 
Username: hacluster
Password: 
rhel8_node2: Authorized						---> node2 authorized success	
Error: Unable to synchronize and save known-hosts on nodes: rhel8_node1. Run 'pcs host auth rhel8_node1' to make sure the nodes are authorized.		---> node1 is already authorized, but throws error
[root@hbiswas_rhel8_node2 ~]# 
[root@hbiswas_rhel8_node2 ~]# pcs host auth rhel8_node2		---> Retrying the authorization
Username: hacluster
Password: 
rhel8_node2: Authorized						---> node2 authorized success
Error: Unable to synchronize and save known-hosts on nodes: rhel8_node1. Run 'pcs host auth rhel8_node1' to make sure the nodes are authorized.		---> node1 is already authorized, but throws same error
[root@hbiswas_rhel8_node2 ~]#
***********************************************************************************************

Actual results:
One node is authenticated successfully; however, when the other node is authenticated afterwards, an error message appears asking to re-authenticate the node that was already authenticated.

Expected results:
Once a node has been authorized successfully, an error message asking to authenticate that same node again should not appear.

Additional info:
This seems to happen only when the other node is authenticated first, followed by the node from which the commands are being executed.

Comment 1 Ken Gaillot 2018-12-13 15:48:17 UTC
re-assigning to pcs (it has its own component)

Comment 2 Ondrej Mular 2019-01-14 11:26:45 UTC
The described behavior is expected, but it may seem unintuitive because of the complex internals of pcs token synchronization.

If a node is in a cluster, auth token synchronization across the whole cluster is always used to distribute new tokens. This synchronization mechanism uses network communication, so for the sync to succeed it must be able to communicate, using a token, with all cluster nodes, including the node on which the 'pcs host auth' command was executed.

So if you have a cluster of nodes A and B and you remove all tokens from node A (by running 'pcs host deauth'), node A can communicate neither with node B nor with itself.
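
For reference, the tokens discussed here are kept in pcsd's known-hosts file on each node. A quick way to see which peers a node currently holds tokens for (the file path below is the pcs 0.10 / RHEL 8 default and is an assumption on my part, not something taken from this report):

***********************************************************************************************
[root@hbiswas_rhel8_node1 ~]# cat /var/lib/pcsd/known-hosts	---> lists the hosts node1 holds tokens for
[root@hbiswas_rhel8_node1 ~]# pcs host deauth			---> remove all tokens from node1
[root@hbiswas_rhel8_node1 ~]# cat /var/lib/pcsd/known-hosts	---> no tokens remain; node1 can reach neither node2 nor itself
***********************************************************************************************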

You can see in TEST1 that if you first authenticate node A against itself, it cannot send the new token to node B because it does not have node B's token. If you then auth node B on node A, node A is able to send the new token both to itself and to node B.

In TEST2, node A auths against node B first. The new token is sent only to node B, because at that moment node A holds a token only for node B. Note that pcs does not know the name of the node on which the authentication command was executed; lacking this information, pcs cannot detect this situation and save the new token locally (rather than over the network). After the first 'pcs host auth' command, node A therefore has no tokens for any node in the cluster, so authing node A on node A is unable to sync the new token to node B.

So to avoid this issue, you should either authenticate the local node to itself first, or authenticate a set of nodes that includes the local node (see the sketch below). Another option is to run 'pcs cluster auth', which tries to authenticate all nodes in the cluster against each other, on a node that already has tokens for all nodes.
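
A minimal transcript of the recommended approach, assuming the same two-node setup as above. Because both hosts are authorized in a single command, pcs holds both new tokens in memory and can sync them to both nodes, so no error is expected (the 'Authorized' lines are the expected output based on the transcripts above, not captured output):

***********************************************************************************************
[root@hbiswas_rhel8_node1 ~]# pcs host deauth
[root@hbiswas_rhel8_node1 ~]# pcs host auth rhel8_node1 rhel8_node2	---> the local node is included in the list
Username: hacluster
Password: 
rhel8_node1: Authorized
rhel8_node2: Authorized
***********************************************************************************************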

Comment 3 Tomas Jelinek 2019-06-12 14:29:42 UTC
We went through this issue again to see if there is anything to be done in pcs to improve the described situation.

As Ondrej already explained, the described behavior is expected. We discussed several ideas, including storing tokens locally if storing them over the network fails. We found that such functionality would not ensure that all nodes are authenticated; moreover, it would hide the fact that they are not (see the explanation below). That would only lead to users getting errors later, which is something we want to avoid. That being said, the described behavior is not only expected, it is also correct.

The proper way to deal with this issue is:
A) run "pcs cluster auth" which makes sure all nodes in the cluster are authenticated
or
B) run "pcs host auth" for the local node first or include the local node in the list of nodes to auth


Explanation - why saving tokens locally would not help:
1. 2-node cluster, nodes A and B are authenticated to themselves and each other
2. "pcs host deauth" on node A
3. node B is authenticated to nodes B and A, node A is not authenticated to anything
4. "pcs host auth B" on node A - Pcs process on node A gets a token for node B stored in memory and sends it to nodes A and B. Sending to node A fails since there is no token for node A on node A. Detecting this, pcs stores the token for node B from memory to the local file on node A.
5. node B is authenticated to nodes B and A, node A is authenticated to node B only
6. "pcs cluster start --all" on node A fails since node A does not have any token for node A and therefore it cannot connect to it

