Bug 1328870

Summary: pcs+service: `pcs cluster auth` fails immediately after `service pcsd start`, succeeds few seconds later
Product: Red Hat Enterprise Linux 6 Reporter: Miroslav Lisik <mlisik>
Component: pcs Assignee: Tomas Jelinek <tojeline>
Status: CLOSED ERRATA QA Contact: cluster-qe <cluster-qe>
Severity: medium Docs Contact:
Priority: medium    
Version: 6.8 CC: cfeist, cluster-maint, idevat, mcsontos, mspqa-list, omular, rsteiger, tojeline
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: pcs-0.9.154-1.el6 Doc Type: Bug Fix
Doc Text:
Cause: The user tries to communicate with pcsd right after it has been started.
Consequence: The connection to the pcsd instance fails.
Fix: The pcsd init script now waits for pcsd to fully start.
Result: It is possible to connect to pcsd right after the start command finishes.
Story Points: ---
Clone Of: 968877
: 1428392 Environment:
Last Closed: 2017-03-21 11:03:38 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On: 968877    
Bug Blocks: 1428392    
Attachments:
Description Flags
proposed fix none

Description Miroslav Lisik 2016-04-20 12:58:24 UTC
+++ This bug was initially created as a clone of Bug #968877 +++

Description of problem:
`pcs cluster auth` fails with "Unable to communicate with $node" when run soon after the pcsd service is started on all nodes. After a short delay, it works fine.

Version-Release number of selected component (if applicable):
pcs-0.9.148-7.el6

How reproducible:
always

Steps to Reproduce:

1. Start the pcsd service on all nodes.

2. On one node, stop the pcsd service and delete the files related to pcsd authorization.
[root@virt-254 ~]# service pcsd stop && rm -rf /var/lib/pcsd/{pcs_users.conf,tokens}

3. On the chosen node, start the pcsd service and run pcs authorization.
[root@virt-254 ~]# service pcsd start && pcs cluster auth -u hacluster -p password virt-{254,256,259}
Starting pcsd: [  OK  ]
virt-259: Authorized
virt-256: Authorized
Error: Unable to communicate with virt-254
Error: Unable to synchronize and save tokens on nodes: virt-254. Are they authorized?
[root@virt-254 ~]# echo $?
1
 
Actual results:
pcs cluster auth fails with:
Error: Unable to communicate with virt-254

Expected results:
pcs cluster auth should pass.

Additional info:
A subsequent auth call passes:

[root@virt-254 ~]# pcs cluster auth -u hacluster -p password virt-{254,256,259}
virt-259: Authorized
virt-256: Authorized
virt-254: Authorized
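
Since a second attempt succeeds, one possible workaround on unfixed versions is to retry the auth call a few times with a short pause. The following is only a sketch: the `retry` helper, the attempt count and the delay are illustrative assumptions and are not part of pcs.

```shell
#!/bin/bash
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper (not shipped with pcs): runs COMMAND until it exits 0
# or until it has been tried MAX_ATTEMPTS times.
retry() {
    local max_attempts=$1 delay=$2 attempt=1
    shift 2
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            return 1    # give up, the last attempt failed as well
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 0
}

# Example use (attempt count and delay are arbitrary):
#   service pcsd start && retry 5 2 pcs cluster auth -u hacluster -p password virt-{254,256,259}
```

The helper returns the failure only after the last attempt, so `echo $?` still reports 1 if pcsd never becomes reachable.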

Comment 2 Tomas Jelinek 2016-09-09 14:37:59 UTC
Created attachment 1199497
proposed fix

Test:

[root@rh68-node1:~]# service pcsd stop
Stopping pcsd:                                             [  OK  ]
[root@rh68-node1:~]# service pcsd start && pcs cluster auth rh68-node1 rh68-node2 rh68-node3 -u hacluster -p beslo --force
Starting pcsd:                                             [  OK  ]
rh68-node1: Authorized
rh68-node3: Authorized
rh68-node2: Authorized
[root@rh68-node1:~]# echo $?
0
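
The attached patch itself is not reproduced in this report. As a rough illustration of what "wait for pcsd to fully start" can look like from a script, the sketch below polls a TCP port until it accepts connections or a timeout expires. It is not the actual fix: the `wait_for_port` name and the timeout are assumptions, it relies on bash's /dev/tcp redirection, and port 2224 in the example is the pcsd default, which may differ on a customized setup.

```shell
#!/bin/bash
# wait_for_port HOST PORT TIMEOUT_SECONDS
# Hypothetical helper: returns 0 as soon as HOST:PORT accepts a TCP
# connection, or 1 if that has not happened after TIMEOUT_SECONDS.
wait_for_port() {
    local host=$1 port=$2 timeout=$3 waited=0
    # The subshell opens the connection through bash's /dev/tcp
    # pseudo-device; the descriptor is closed when the subshell exits.
    until (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Example use (2224 assumed to be the pcsd port, 30 s chosen arbitrarily):
#   service pcsd start && wait_for_port localhost 2224 30 && \
#       pcs cluster auth rh68-node1 rh68-node2 rh68-node3 -u hacluster -p beslo --force
```

With the fixed package this extra wait should be unnecessary, because the start command only returns once pcsd is reachable.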

Comment 3 Ivan Devat 2016-10-19 07:08:08 UTC
Before Fix:

[vm-rhel67-1 ~] $ rpm -q pcs
pcs-0.9.148-7.el6_8.1.x86_64
[vm-rhel67-1 ~] $ service pcsd stop
Stopping pcsd:                                             [  OK  ]
[vm-rhel67-1 ~] $ service pcsd start && pcs cluster auth vm-rhel67-1 vm-rhel67-2 -u hacluster -p hh --force
Starting pcsd:                                             [  OK  ]
Error: Unable to communicate with vm-rhel67-1
vm-rhel67-2: Authorized


After Fix:

[vm-rhel67-1 ~] $ rpm -q pcs
pcs-0.9.154-1.el6.x86_64
[vm-rhel67-1 ~] $ service pcsd stop
Stopping pcsd:                                             [  OK  ]
[vm-rhel67-1 ~] $ service pcsd start && pcs cluster auth vm-rhel67-1 vm-rhel67-2 -u hacluster -p hh --force
Starting pcsd:                                             [  OK  ]
vm-rhel67-1: Authorized
vm-rhel67-2: Authorized

Comment 7 errata-xmlrpc 2017-03-21 11:03:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0707.html