Bug 1464781

Summary: Unable to add remote/guest node forcibly to a cluster if error with pacemaker_remote daemon occurs on remote/guest node
Product: Red Hat Enterprise Linux 7
Reporter: Miroslav Lisik <mlisik>
Component: pcs
Assignee: Ivan Devat <idevat>
Status: CLOSED ERRATA
QA Contact: cluster-qe <cluster-qe>
Severity: unspecified
Docs Contact:
Priority: high
Version: 7.4
CC: cfeist, cluster-maint, idevat, omular, rsteiger, tojeline
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: pcs-0.9.162-3.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-04-10 15:39:15 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
  Description               Flags
  proposed fix              none
  fix for exit code issue   none

Description Miroslav Lisik 2017-06-25 17:07:13 UTC
Description of problem:

Unable to add a remote/guest node forcibly to a cluster when the
pacemaker-remote package is not installed, or when an error occurs during the
enable and start actions of the pacemaker_remote daemon on the remote/guest node.


Version-Release number of selected component (if applicable):
pcs-0.9.158-6.el7


How reproducible:
always


Steps to Reproduce:

1. Have a cluster authenticated against remote or guest node.

2. Make sure that the package pacemaker-remote is not installed, or otherwise
cause an error during the start/enable of the pacemaker_remote daemon.

3. Issue the command for adding remote/guest node.
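
One way to set up the failure in step 2 (a sketch, run as root on the prospective remote/guest node; assumes a systemd-based node where masking the unit makes both the start and enable actions fail):

```shell
# Option A: ensure the pacemaker-remote package is not installed at all.
yum remove -y pacemaker-remote

# Option B: keep the package installed but force start/enable to fail
# by masking the systemd unit.
systemctl mask pacemaker_remote
```

Either option should make the subsequent `pcs cluster node add-remote` / `add-guest` command hit the service errors shown below.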


Actual results:

Adding remote node without forcing works as expected:

[root@duck-01 ~]# pcs cluster node add-remote virt-136
Sending remote node configuration files to 'virt-136'
virt-136: successful distribution of the file 'pacemaker_remote authkey'
Requesting start of service pacemaker_remote on 'virt-136'
Error: virt-136: service command failed: pacemaker_remote start: Operation failed., use --force to override
Error: virt-136: service command failed: pacemaker_remote enable: Operation failed., use --force to override
[root@duck-01 ~]# echo $?
1
[root@duck-01 ~]# pcs status nodes | sed -n '/Pacemaker Remote/,$ p'
Pacemaker Remote Nodes:
 Online:
 Standby:
 Maintenance:
 Offline:

Adding remote node with forcing does not work:

[root@duck-01 ~]# pcs cluster node add-remote virt-136 --force
Sending remote node configuration files to 'virt-136'
virt-136: successful distribution of the file 'pacemaker_remote authkey'
Requesting start of service pacemaker_remote on 'virt-136'
Error: virt-136: service command failed: pacemaker_remote start: Operation failed., use --force to override
Error: virt-136: service command failed: pacemaker_remote enable: Operation failed., use --force to override
[root@duck-01 ~]# echo $?
1
[root@duck-01 ~]# pcs status nodes | sed -n '/Pacemaker Remote/,$ p'
Pacemaker Remote Nodes:
 Online:
 Standby:
 Maintenance:
 Offline:

Adding guest node without forcing works as expected:

[root@duck-01 ~]# pcs cluster node add-guest pool-10-34-70-90 guest-01
Sending remote node configuration files to 'pool-10-34-70-90'
pool-10-34-70-90: successful distribution of the file 'pacemaker_remote authkey'
Requesting start of service pacemaker_remote on 'pool-10-34-70-90'
Error: pool-10-34-70-90: service command failed: pacemaker_remote start: Operation failed., use --force to override
Error: pool-10-34-70-90: service command failed: pacemaker_remote enable: Operation failed., use --force to override
[root@duck-01 ~]# echo $?
1
[root@duck-01 ~]# pcs status nodes | sed -n '/Pacemaker Remote/,$ p'
Pacemaker Remote Nodes:
 Online:
 Standby:
 Maintenance:
 Offline:

Adding guest node with forcing does not work:

[root@duck-01 ~]# pcs cluster node add-guest pool-10-34-70-90 guest-01 --force
Sending remote node configuration files to 'pool-10-34-70-90'
pool-10-34-70-90: successful distribution of the file 'pacemaker_remote authkey'
Requesting start of service pacemaker_remote on 'pool-10-34-70-90'
Error: pool-10-34-70-90: service command failed: pacemaker_remote start: Operation failed., use --force to override
Error: pool-10-34-70-90: service command failed: pacemaker_remote enable: Operation failed., use --force to override
[root@duck-01 ~]# echo $?
1
[root@duck-01 ~]# pcs status nodes | sed -n '/Pacemaker Remote/,$ p'
Pacemaker Remote Nodes:
 Online:
 Standby:
 Maintenance:
 Offline:


Expected results:

When a remote/guest node is added forcibly, the node should always be added to
the cluster, regardless of failed actions on the remote/guest node.


Additional info:

Possible workaround: use the --skip-offline option; however, it is not suggested in the error message.

[root@duck-01 ~]# pcs cluster node add-remote virt-136 --skip-offline
Sending remote node configuration files to 'virt-136'
virt-136: successful distribution of the file 'pacemaker_remote authkey'
Requesting start of service pacemaker_remote on 'virt-136'
Warning: virt-136: service command failed: pacemaker_remote start: Operation failed.
Warning: virt-136: service command failed: pacemaker_remote enable: Operation failed.
[root@duck-01 ~]# echo $?
0
[root@duck-01 ~]# pcs status nodes | sed -n '/Pacemaker Remote/,$ p'
Pacemaker Remote Nodes:
 Online:
 Standby:
 Maintenance:
 Offline: virt-136

[root@duck-01 ~]# pcs cluster node add-guest pool-10-34-70-90 guest-01 --skip-offline
Sending remote node configuration files to 'pool-10-34-70-90'
pool-10-34-70-90: successful distribution of the file 'pacemaker_remote authkey'
Requesting start of service pacemaker_remote on 'pool-10-34-70-90'
Warning: pool-10-34-70-90: service command failed: pacemaker_remote start: Operation failed.
Warning: pool-10-34-70-90: service command failed: pacemaker_remote enable: Operation failed.
[root@duck-01 ~]# echo $?
0
[root@duck-01 ~]# pcs status nodes | sed -n '/Pacemaker Remote/,$ p'
Pacemaker Remote Nodes:
 Online:
 Standby:
 Maintenance:
 Offline: pool-10-34-70-90 virt-136
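
For reference, the `sed -n '/Pacemaker Remote/,$ p'` filter used in the transcripts above simply prints everything from the first line matching "Pacemaker Remote" to the end of input, trimming the cluster-node section from the `pcs status nodes` output. A minimal standalone illustration (the sample output below is made up for the demonstration):

```shell
# Simulate a fragment of "pcs status nodes" output and keep only the
# Pacemaker Remote section (from the matching line to end of input).
printf '%s\n' \
  'Pacemaker Nodes:' \
  ' Online: duck-01 duck-02' \
  'Pacemaker Remote Nodes:' \
  ' Online:' \
  ' Offline: virt-136' \
| sed -n '/Pacemaker Remote/,$ p'
# Prints:
# Pacemaker Remote Nodes:
#  Online:
#  Offline: virt-136
```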

Comment 2 Tomas Jelinek 2017-06-30 14:36:42 UTC
Created attachment 1293252 [details]
proposed fix

Comment 4 Ivan Devat 2017-10-11 08:01:21 UTC
After Fix:

[vm-rhel72-1 ~] $ rpm -q pcs
pcs-0.9.160-1.el7.x86_64

[vm-rhel72-1 ~] $ pcs cluster node add-remote vm-rhel72-2 --force
Sending remote node configuration files to 'vm-rhel72-2'
vm-rhel72-2: successful distribution of the file 'pacemaker_remote authkey'
Requesting start of service pacemaker_remote on 'vm-rhel72-2'
Warning: vm-rhel72-2: service command failed: pacemaker_remote enable: Operation failed.
Warning: vm-rhel72-2: service command failed: pacemaker_remote start: Operation failed.

[vm-rhel72-1 ~] $ echo $?
0

Comment 7 Tomas Jelinek 2018-01-04 09:49:11 UTC
Created attachment 1376747 [details]
fix for exit code issue

Comment 8 Ivan Devat 2018-01-08 08:59:19 UTC
After Fix

[ant ~] $ rpm -q pcs
pcs-0.9.162-3.el7.x86_64

[ant ~] $ pcs cluster node add-guest cat REMOTE-NODE
Sending remote node configuration files to 'cat'
cat: successful distribution of the file 'pacemaker_remote authkey'
Requesting start of service pacemaker_remote on 'cat'
Error: cat: service command failed: pacemaker_remote enable: Operation failed.
Error: cat: service command failed: pacemaker_remote start: Operation failed.
Error: Errors have occurred, therefore pcs is unable to continue
[ant ~] $ echo $?
1

Comment 13 errata-xmlrpc 2018-04-10 15:39:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0866