Description of problem:
Sometimes, when gdeploy fails to add a node to an existing ganesha cluster, it does not report any failure. Instead, it reports the add-node operation as successful. It should report the status as failed when the add-node operation does not succeed.

Version-Release number of selected component (if applicable):
gdeploy-2.0.2-7.el7rhgs.noarch

How reproducible:

Steps to Reproduce:
1. Create a 3-node ganesha cluster.
2. Add a node to the existing ganesha cluster via gdeploy.

# cat add_node.conf
[hosts]
dhcp37-122.lab.eng.blr.redhat.com
dhcp37-102.lab.eng.blr.redhat.com
dhcp37-92.lab.eng.blr.redhat.com
dhcp37-119.lab.eng.blr.redhat.com

[peer]
action=probe

[clients]
action=mount
volname=gluster_shared_storage
hosts=dhcp37-122.lab.eng.blr.redhat.com
fstype=glusterfs
client_mount_points=/var/run/gluster/shared_storage/

[nfs-ganesha]
action=add-node
cluster_nodes=dhcp37-102.lab.eng.blr.redhat.com,dhcp37-92.lab.eng.blr.redhat.com,dhcp37-119.lab.eng.blr.redhat.com
nodes=dhcp37-122.lab.eng.blr.redhat.com
vip=10.70.36.220

Actual results:
gdeploy shows success even if it failed to add the node to the existing ganesha cluster.

Expected results:
gdeploy should show the correct status of add-node, whether it failed or passed.

Additional info:
https://github.com/gluster/gdeploy/commit/0a53e9a7b fixes the issue.
Verified this bug on:

# rpm -qa | grep gdeploy
gdeploy-2.0.2-10.el7rhgs.noarch

Gdeploy shows status of add-node:

[root@dhcp42-88 home]# gdeploy -c add_node.conf

PLAY [master] ******************************************************************

TASK [Creates a Trusted Storage Pool] ******************************************
changed: [dhcp42-125.lab.eng.blr.redhat.com]

TASK [Pause for a few seconds] *************************************************
Pausing for 5 seconds
(ctrl+C then 'C' = continue early, ctrl+C then 'A' = abort)
ok: [dhcp42-125.lab.eng.blr.redhat.com]

PLAY RECAP *********************************************************************
dhcp42-125.lab.eng.blr.redhat.com : ok=2 changed=1 unreachable=0 failed=0

PLAY [clients] *****************************************************************

TASK [Create the dir to mount the volume, skips if present] ********************
ok: [dhcp42-117.lab.eng.blr.redhat.com] => (item={u'mountpoint': u'/var/run/gluster/shared_storage/', u'fstype': u'fuse'})

PLAY RECAP *********************************************************************
dhcp42-117.lab.eng.blr.redhat.com : ok=1 changed=0 unreachable=0 failed=0

PLAY [clients] *****************************************************************

TASK [Mount the volumes, if fstype is glusterfs] *******************************
ok: [dhcp42-117.lab.eng.blr.redhat.com] => (item={u'mountpoint': u'/var/run/gluster/shared_storage/', u'fstype': u'fuse'})

PLAY RECAP *********************************************************************
dhcp42-117.lab.eng.blr.redhat.com : ok=1 changed=0 unreachable=0 failed=0

PLAY [clients] *****************************************************************

TASK [setup] *******************************************************************
ok: [dhcp42-117.lab.eng.blr.redhat.com]

TASK [Uncomment STATD_PORT for rpc.statd to listen on] *************************
skipping: [dhcp42-117.lab.eng.blr.redhat.com] => (item={u'mountpoint': u'/var/run/gluster/shared_storage/', u'fstype': u'fuse'})

TASK [Uncomment LOCKD_TCPPORT for rpc.lockd to listen on] **********************
skipping: [dhcp42-117.lab.eng.blr.redhat.com] => (item={u'mountpoint': u'/var/run/gluster/shared_storage/', u'fstype': u'fuse'})

TASK [Uncomment LOCKD_UDPPORT for rpc.lockd to listen on] **********************
skipping: [dhcp42-117.lab.eng.blr.redhat.com] => (item={u'mountpoint': u'/var/run/gluster/shared_storage/', u'fstype': u'fuse'})

TASK [Uncomment MOUNTD_PORT for rpc.mountd to listen on] ***********************
skipping: [dhcp42-117.lab.eng.blr.redhat.com] => (item={u'mountpoint': u'/var/run/gluster/shared_storage/', u'fstype': u'fuse'})

TASK [Restart nfs service (RHEL 6 only)] ***************************************
skipping: [dhcp42-117.lab.eng.blr.redhat.com] => (item={u'mountpoint': u'/var/run/gluster/shared_storage/', u'fstype': u'fuse'})

TASK [Restart rpc-statd service] ***********************************************
skipping: [dhcp42-117.lab.eng.blr.redhat.com] => (item={u'mountpoint': u'/var/run/gluster/shared_storage/', u'fstype': u'fuse'})

TASK [Restart nfs-config service] **********************************************
skipping: [dhcp42-117.lab.eng.blr.redhat.com] => (item={u'mountpoint': u'/var/run/gluster/shared_storage/', u'fstype': u'fuse'})

TASK [Restart nfs-mountd service] **********************************************
skipping: [dhcp42-117.lab.eng.blr.redhat.com] => (item={u'mountpoint': u'/var/run/gluster/shared_storage/', u'fstype': u'fuse'})

TASK [Restart nfslock service (RHEL 6 & 7)] ************************************
skipping: [dhcp42-117.lab.eng.blr.redhat.com] => (item={u'mountpoint': u'/var/run/gluster/shared_storage/', u'fstype': u'fuse'})

TASK [Mount the volumes if fstype is NFS] **************************************
skipping: [dhcp42-117.lab.eng.blr.redhat.com] => (item={u'mountpoint': u'/var/run/gluster/shared_storage/', u'fstype': u'fuse'})

PLAY RECAP *********************************************************************
dhcp42-117.lab.eng.blr.redhat.com : ok=1 changed=0 unreachable=0 failed=0

PLAY [clients] *****************************************************************

TASK [Mount the volumes, if fstype is CIFS] ************************************
skipping: [dhcp42-117.lab.eng.blr.redhat.com] => (item={u'mountpoint': u'/var/run/gluster/shared_storage/', u'fstype': u'fuse'})

PLAY RECAP *********************************************************************
dhcp42-117.lab.eng.blr.redhat.com : ok=0 changed=0 unreachable=0 failed=0

PLAY [master_node] *************************************************************

TASK [setup] *******************************************************************
ok: [dhcp42-125.lab.eng.blr.redhat.com]

TASK [Copy the public key to the local] ****************************************
changed: [dhcp42-125.lab.eng.blr.redhat.com]

TASK [Copy the private key to the local] ***************************************
changed: [dhcp42-125.lab.eng.blr.redhat.com]

PLAY RECAP *********************************************************************
dhcp42-125.lab.eng.blr.redhat.com : ok=3 changed=2 unreachable=0 failed=0

PLAY [new_nodes] ***************************************************************

TASK [setup] *******************************************************************
ok: [dhcp42-117.lab.eng.blr.redhat.com]

TASK [Check if nfs-ganesha is installed] ***************************************
changed: [dhcp42-117.lab.eng.blr.redhat.com]
 [WARNING]: Consider using yum, dnf or zypper module rather than running rpm

TASK [fail] ********************************************************************
skipping: [dhcp42-117.lab.eng.blr.redhat.com]

TASK [Check if corosync is installed] ******************************************
changed: [dhcp42-117.lab.eng.blr.redhat.com]

TASK [fail] ********************************************************************
skipping: [dhcp42-117.lab.eng.blr.redhat.com]

TASK [Check if pacemaker is installed] *****************************************
changed: [dhcp42-117.lab.eng.blr.redhat.com]

TASK [fail] ********************************************************************
skipping: [dhcp42-117.lab.eng.blr.redhat.com]

TASK [Check if libntirpc is installed] *****************************************
changed: [dhcp42-117.lab.eng.blr.redhat.com]

TASK [fail] ********************************************************************
skipping: [dhcp42-117.lab.eng.blr.redhat.com]

TASK [Check if pcs is installed] ***********************************************
changed: [dhcp42-117.lab.eng.blr.redhat.com]

TASK [fail] ********************************************************************
skipping: [dhcp42-117.lab.eng.blr.redhat.com]

TASK [Stop kernel NFS] *********************************************************
ok: [dhcp42-117.lab.eng.blr.redhat.com]

TASK [Stop network manager service] ********************************************
ok: [dhcp42-117.lab.eng.blr.redhat.com]

TASK [Disable network manager service] *****************************************
ok: [dhcp42-117.lab.eng.blr.redhat.com]

TASK [Start network service] ***************************************************
ok: [dhcp42-117.lab.eng.blr.redhat.com]

TASK [Enable network service] **************************************************
ok: [dhcp42-117.lab.eng.blr.redhat.com]

TASK [Start pcsd service] ******************************************************
ok: [dhcp42-117.lab.eng.blr.redhat.com]

TASK [Enable pcsd service] *****************************************************
ok: [dhcp42-117.lab.eng.blr.redhat.com]

TASK [Enable pacemaker service] ************************************************
changed: [dhcp42-117.lab.eng.blr.redhat.com]

TASK [Create a user hacluster on new nodes] ************************************
ok: [dhcp42-117.lab.eng.blr.redhat.com]

TASK [Set the hacluster user the same password on new nodes] *******************
changed: [dhcp42-117.lab.eng.blr.redhat.com]

TASK [Copy the public key to remote nodes] *************************************
ok: [dhcp42-117.lab.eng.blr.redhat.com] => (item=dhcp42-125.lab.eng.blr.redhat.com)

TASK [Copy the private key to remote node] *************************************
ok: [dhcp42-117.lab.eng.blr.redhat.com] => (item=dhcp42-125.lab.eng.blr.redhat.com)

TASK [Deploy the pubkey ~/root/.ssh/authorized_keys on all nodes] **************
changed: [dhcp42-117.lab.eng.blr.redhat.com]

TASK [Define service port] *****************************************************
changed: [dhcp42-117.lab.eng.blr.redhat.com]

TASK [Restart statd service (RHEL 6 only)] *************************************
skipping: [dhcp42-117.lab.eng.blr.redhat.com]

TASK [Restart nfs-config service] **********************************************
changed: [dhcp42-117.lab.eng.blr.redhat.com]

TASK [Restart rpc-statd service] ***********************************************
changed: [dhcp42-117.lab.eng.blr.redhat.com]

TASK [Pcs cluster authenticate the hacluster on new nodes] *********************
changed: [dhcp42-117.lab.eng.blr.redhat.com] => (item=dhcp42-117.lab.eng.blr.redhat.com)

TASK [Pcs cluster authenticate the hacluster on existing nodes] ****************
changed: [dhcp42-117.lab.eng.blr.redhat.com] => (item=dhcp42-125.lab.eng.blr.redhat.com)
changed: [dhcp42-117.lab.eng.blr.redhat.com] => (item=dhcp42-127.lab.eng.blr.redhat.com)
changed: [dhcp42-117.lab.eng.blr.redhat.com] => (item=dhcp42-129.lab.eng.blr.redhat.com)
changed: [dhcp42-117.lab.eng.blr.redhat.com] => (item=dhcp42-119.lab.eng.blr.redhat.com)

TASK [Pause for a few seconds after pcs auth] **********************************
Pausing for 3 seconds
(ctrl+C then 'C' = continue early, ctrl+C then 'A' = abort)
ok: [dhcp42-117.lab.eng.blr.redhat.com]

PLAY RECAP *********************************************************************
dhcp42-117.lab.eng.blr.redhat.com : ok=25 changed=13 unreachable=0 failed=0

PLAY [cluster_nodes] ***********************************************************

TASK [Pcs cluster authenticate the hacluster on new nodes] *********************
changed: [dhcp42-119.lab.eng.blr.redhat.com] => (item=dhcp42-117.lab.eng.blr.redhat.com)
changed: [dhcp42-129.lab.eng.blr.redhat.com] => (item=dhcp42-117.lab.eng.blr.redhat.com)
failed: [dhcp42-125.lab.eng.blr.redhat.com] (item=dhcp42-117.lab.eng.blr.redhat.com) => {"changed": true, "cmd": "pcs cluster auth -u hacluster -p hacluster dhcp42-117.lab.eng.blr.redhat.com", "delta": "0:00:05.560134", "end": "2019-05-05 14:04:54.257981", "failed": true, "item": "dhcp42-117.lab.eng.blr.redhat.com", "rc": 1, "start": "2019-05-05 14:04:48.697847", "stderr": "Error: Some nodes had a newer tokens than the local node. Local node's tokens were updated. Please repeat the authentication if needed.\nError: Unable to communicate with pcsd", "stdout": "", "stdout_lines": [], "warnings": []}
failed: [dhcp42-127.lab.eng.blr.redhat.com] (item=dhcp42-117.lab.eng.blr.redhat.com) => {"changed": true, "cmd": "pcs cluster auth -u hacluster -p hacluster dhcp42-117.lab.eng.blr.redhat.com", "delta": "0:00:05.749889", "end": "2017-06-15 23:30:25.778969", "failed": true, "item": "dhcp42-117.lab.eng.blr.redhat.com", "rc": 1, "start": "2017-06-15 23:30:20.029080", "stderr": "Error: Some nodes had a newer tokens than the local node. Local node's tokens were updated. Please repeat the authentication if needed.\nError: Unable to communicate with pcsd", "stdout": "", "stdout_lines": [], "warnings": []}

TASK [Pause for a few seconds] *************************************************
Pausing for 5 seconds
(ctrl+C then 'C' = continue early, ctrl+C then 'A' = abort)
ok: [dhcp42-129.lab.eng.blr.redhat.com]
	to retry, use: --limit @/tmp/tmp0mgSRh/ganesha-pcs-auth-new-nodes.retry

PLAY RECAP *********************************************************************
dhcp42-119.lab.eng.blr.redhat.com : ok=1 changed=1 unreachable=0 failed=0
dhcp42-125.lab.eng.blr.redhat.com : ok=0 changed=0 unreachable=0 failed=1
dhcp42-127.lab.eng.blr.redhat.com : ok=0 changed=0 unreachable=0 failed=1
dhcp42-129.lab.eng.blr.redhat.com : ok=2 changed=1 unreachable=0 failed=0

Ignoring errors...

PLAY [master] ******************************************************************

TASK [Adds a node to the cluster] **********************************************
changed: [dhcp42-125.lab.eng.blr.redhat.com] => (item={u'host': u'dhcp42-117.lab.eng.blr.redhat.com', u'vip': u'10.70.42.44'})

TASK [Report ganesha add-node status] ******************************************
ok: [dhcp42-125.lab.eng.blr.redhat.com] => {
    "msg": [
        "Disabling SBD service...",
        "dhcp42-117.lab.eng.blr.redhat.com: sbd disabled",
        "dhcp42-125.lab.eng.blr.redhat.com: Corosync updated",
        "dhcp42-127.lab.eng.blr.redhat.com: Corosync updated",
        "dhcp42-129.lab.eng.blr.redhat.com: Corosync updated",
        "dhcp42-119.lab.eng.blr.redhat.com: Corosync updated",
        "Setting up corosync...",
        "dhcp42-117.lab.eng.blr.redhat.com: Succeeded",
        "Synchronizing pcsd certificates on nodes dhcp42-117.lab.eng.blr.redhat.com...",
        "dhcp42-117.lab.eng.blr.redhat.com: Success",
        "Restarting pcsd on the nodes in order to reload the certificates...",
        "dhcp42-117.lab.eng.blr.redhat.com: Success",
        "dhcp42-117.lab.eng.blr.redhat.com: Stopping Cluster (pacemaker)...",
        "dhcp42-127.lab.eng.blr.redhat.com: Stopping Cluster (pacemaker)...",
        "dhcp42-125.lab.eng.blr.redhat.com: Stopping Cluster (pacemaker)...",
        "dhcp42-119.lab.eng.blr.redhat.com: Stopping Cluster (pacemaker)...",
        "dhcp42-129.lab.eng.blr.redhat.com: Stopping Cluster (pacemaker)...",
        "dhcp42-117.lab.eng.blr.redhat.com: Stopping Cluster (corosync)...",
        "dhcp42-119.lab.eng.blr.redhat.com: Stopping Cluster (corosync)...",
        "dhcp42-127.lab.eng.blr.redhat.com: Stopping Cluster (corosync)...",
        "dhcp42-129.lab.eng.blr.redhat.com: Stopping Cluster (corosync)...",
        "dhcp42-125.lab.eng.blr.redhat.com: Stopping Cluster (corosync)...",
        "dhcp42-117.lab.eng.blr.redhat.com: Starting Cluster...",
        "dhcp42-129.lab.eng.blr.redhat.com: Starting Cluster...",
        "dhcp42-119.lab.eng.blr.redhat.com: Starting Cluster...",
        "dhcp42-127.lab.eng.blr.redhat.com: Starting Cluster...",
        "dhcp42-125.lab.eng.blr.redhat.com: Starting Cluster...",
        "Removing group: dhcp42-119.lab.eng.blr.redhat.com-group (and all resources within group)",
        "Stopping all resources in group: dhcp42-119.lab.eng.blr.redhat.com-group...",
        "Deleting Resource - dhcp42-119.lab.eng.blr.redhat.com-nfs_block",
        "Removing Constraint - order-nfs-grace-clone-dhcp42-119.lab.eng.blr.redhat.com-cluster_ip-1-mandatory",
        "Deleting Resource - dhcp42-119.lab.eng.blr.redhat.com-cluster_ip-1",
        "Removing Constraint - location-dhcp42-119.lab.eng.blr.redhat.com-group",
        "Removing Constraint - location-dhcp42-119.lab.eng.blr.redhat.com-group-dhcp42-125.lab.eng.blr.redhat.com-1000",
        "Removing Constraint - location-dhcp42-119.lab.eng.blr.redhat.com-group-dhcp42-127.lab.eng.blr.redhat.com-2000",
        "Removing Constraint - location-dhcp42-119.lab.eng.blr.redhat.com-group-dhcp42-119.lab.eng.blr.redhat.com-3000",
        "Deleting Resource (and group) - dhcp42-119.lab.eng.blr.redhat.com-nfs_unblock",
        "Removing group: dhcp42-125.lab.eng.blr.redhat.com-group (and all resources within group)",
        "Stopping all resources in group: dhcp42-125.lab.eng.blr.redhat.com-group...",
        "Deleting Resource - dhcp42-125.lab.eng.blr.redhat.com-nfs_block",
        "Removing Constraint - order-nfs-grace-clone-dhcp42-125.lab.eng.blr.redhat.com-cluster_ip-1-mandatory",
        "Deleting Resource - dhcp42-125.lab.eng.blr.redhat.com-cluster_ip-1",
        "Removing Constraint - location-dhcp42-125.lab.eng.blr.redhat.com-group",
        "Removing Constraint - location-dhcp42-125.lab.eng.blr.redhat.com-group-dhcp42-127.lab.eng.blr.redhat.com-1000",
        "Removing Constraint - location-dhcp42-125.lab.eng.blr.redhat.com-group-dhcp42-119.lab.eng.blr.redhat.com-2000",
        "Removing Constraint - location-dhcp42-125.lab.eng.blr.redhat.com-group-dhcp42-125.lab.eng.blr.redhat.com-3000",
        "Deleting Resource (and group) - dhcp42-125.lab.eng.blr.redhat.com-nfs_unblock",
        "Removing group: dhcp42-127.lab.eng.blr.redhat.com-group (and all resources within group)",
        "Stopping all resources in group: dhcp42-127.lab.eng.blr.redhat.com-group...",
        "Deleting Resource - dhcp42-127.lab.eng.blr.redhat.com-nfs_block",
        "Removing Constraint - order-nfs-grace-clone-dhcp42-127.lab.eng.blr.redhat.com-cluster_ip-1-mandatory",
        "Deleting Resource - dhcp42-127.lab.eng.blr.redhat.com-cluster_ip-1",
        "Removing Constraint - location-dhcp42-127.lab.eng.blr.redhat.com-group",
        "Removing Constraint - location-dhcp42-127.lab.eng.blr.redhat.com-group-dhcp42-119.lab.eng.blr.redhat.com-1000",
        "Removing Constraint - location-dhcp42-127.lab.eng.blr.redhat.com-group-dhcp42-125.lab.eng.blr.redhat.com-2000",
        "Removing Constraint - location-dhcp42-127.lab.eng.blr.redhat.com-group-dhcp42-127.lab.eng.blr.redhat.com-3000",
        "Deleting Resource (and group) - dhcp42-127.lab.eng.blr.redhat.com-nfs_unblock",
        "Removing group: dhcp42-129.lab.eng.blr.redhat.com-group (and all resources within group)",
        "Stopping all resources in group: dhcp42-129.lab.eng.blr.redhat.com-group...",
        "Deleting Resource - dhcp42-129.lab.eng.blr.redhat.com-nfs_block",
        "Removing Constraint - order-nfs-grace-clone-dhcp42-129.lab.eng.blr.redhat.com-cluster_ip-1-mandatory",
        "Deleting Resource - dhcp42-129.lab.eng.blr.redhat.com-cluster_ip-1",
        "Removing Constraint - location-dhcp42-129.lab.eng.blr.redhat.com-group",
        "Removing Constraint - location-dhcp42-129.lab.eng.blr.redhat.com-group-dhcp42-117.lab.eng.blr.redhat.com-1000",
        "Removing Constraint - location-dhcp42-129.lab.eng.blr.redhat.com-group-dhcp42-119.lab.eng.blr.redhat.com-2000",
        "Removing Constraint - location-dhcp42-129.lab.eng.blr.redhat.com-group-dhcp42-125.lab.eng.blr.redhat.com-3000",
        "Removing Constraint - location-dhcp42-129.lab.eng.blr.redhat.com-group-dhcp42-127.lab.eng.blr.redhat.com-4000",
        "Removing Constraint - location-dhcp42-129.lab.eng.blr.redhat.com-group-dhcp42-129.lab.eng.blr.redhat.com-5000",
        "Deleting Resource (and group) - dhcp42-129.lab.eng.blr.redhat.com-nfs_unblock",
        "Adding nfs-grace-clone dhcp42-119.lab.eng.blr.redhat.com-cluster_ip-1 (kind: Mandatory) (Options: first-action=start then-action=start)",
        "Adding nfs-grace-clone dhcp42-125.lab.eng.blr.redhat.com-cluster_ip-1 (kind: Mandatory) (Options: first-action=start then-action=start)",
        "Adding nfs-grace-clone dhcp42-127.lab.eng.blr.redhat.com-cluster_ip-1 (kind: Mandatory) (Options: first-action=start then-action=start)",
        "Adding nfs-grace-clone dhcp42-129.lab.eng.blr.redhat.com-cluster_ip-1 (kind: Mandatory) (Options: first-action=start then-action=start)",
        "CIB updated"
    ]
}

PLAY RECAP *********************************************************************

Moving this bug to verified state.
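For anyone scripting a check around a gdeploy run, the per-host counters in the PLAY RECAP lines above may be enough to tell whether any play failed. The following is only a minimal sketch, assuming the recap format shown in this log; `failed_hosts` and `RECAP_RE` are hypothetical names used for illustration and are not part of gdeploy or Ansible.

```python
import re

# Matches Ansible recap entries such as:
#   host.example.com : ok=0 changed=0 unreachable=0 failed=1
# (format assumed from the log in this bug report)
RECAP_RE = re.compile(
    r"(?P<host>\S+)\s*:\s*ok=(?P<ok>\d+)\s+changed=(?P<changed>\d+)\s+"
    r"unreachable=(?P<unreachable>\d+)\s+failed=(?P<failed>\d+)"
)


def failed_hosts(output):
    """Return the sorted hosts whose recap shows failed or unreachable > 0."""
    bad = set()
    for match in RECAP_RE.finditer(output):
        if int(match.group("failed")) or int(match.group("unreachable")):
            bad.add(match.group("host"))
    return sorted(bad)


# Recap lines copied from the [cluster_nodes] play in this report.
sample = """\
dhcp42-119.lab.eng.blr.redhat.com : ok=1 changed=1 unreachable=0 failed=0
dhcp42-125.lab.eng.blr.redhat.com : ok=0 changed=0 unreachable=0 failed=1
dhcp42-127.lab.eng.blr.redhat.com : ok=0 changed=0 unreachable=0 failed=1
dhcp42-129.lab.eng.blr.redhat.com : ok=2 changed=1 unreachable=0 failed=0
"""

if __name__ == "__main__":
    for host in failed_hosts(sample):
        print("recap reports a failure on", host)
```

Because gdeploy prints "Ignoring errors..." and continues after a failed play, a wrapper like this would flag the two nodes where `pcs cluster auth` failed even though later plays report success.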
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2777