Bug 1810042 - Changes to gluster peer probe in nightly build breaks ansible:gluster_volume call
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: mainline
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Sanju
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-03-04 13:28 UTC by Sachin Prabhu
Modified: 2020-03-13 06:23 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-03-13 06:23:12 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Gluster.org Gerrit | 24211 | 0 | None | Merged | cli: display the error while probing the localhost | 2020-03-13 06:23:11 UTC

Description Sachin Prabhu 2020-03-04 13:28:40 UTC
As part of setting up the CI for gluster-samba, I was attempting to install a gluster node using the ansible scripts.

I hit a problem in the ansible gluster_volume module when it runs a peer probe against the machine it is running on:
TASK [gluster.cluster/roles/gluster_volume : Create a volume] ******************
fatal: [storage0]: FAILED! => {"changed": false, "msg": "failed to probe peer storage0 on storage0"}

I've narrowed this down to the probe() function in
https://github.com/ansible/ansible/blob/devel/lib/ansible/modules/storage/glusterfs/gluster_volume.py:

def probe(host, myhostname):
    global module
    out = run_gluster(['peer', 'probe', host])
    if out.find('localhost') == -1 and not wait_for_peer(host):
        module.fail_json(msg='failed to probe peer %s on %s' % (host, myhostname))

probe() in the ansible gluster_volume module treats the probe as successful only if the command output contains 'localhost' or the host subsequently appears in the list of peers.
Since we were probing the same machine, the ansible code expects to see output like:
[root@vm145-91 ~]# gluster peer probe vm145-91
peer probe: success. Probe on localhost not needed

However, on the affected machine I see:
[root@storage0 ~]# gluster peer probe storage0
peer probe: success
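
To make the failure concrete, the condition in probe() can be replayed against the two outputs above. This is only a minimal sketch: `probe_would_fail` is a hypothetical helper, not part of the ansible module, and it assumes wait_for_peer() returns False for the local host (on the assumption that a node does not appear in its own peer list).

```python
# Outputs copied from the two shell sessions above.
OLD_OUTPUT = "peer probe: success. Probe on localhost not needed"
NEW_OUTPUT = "peer probe: success"


def probe_would_fail(out, peer_seen):
    """Mirror the condition in probe(): fail when the output does not
    mention 'localhost' and the host was never seen as a peer.

    peer_seen stands in for the result of wait_for_peer(host).
    """
    return out.find('localhost') == -1 and not peer_seen


# Assumption: wait_for_peer() is False when probing the local machine.
print(probe_would_fail(OLD_OUTPUT, peer_seen=False))  # old CLI output
print(probe_would_fail(NEW_OUTPUT, peer_seen=False))  # nightly CLI output
```

With the old output the substring check short-circuits the failure; with the nightly output it does not, so the module falls through to fail_json().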

This change in output was introduced by the glusterfs commit
bc6e206c6 cli-rpc-ops.c: cleanups

I was able to work around the problem by using the 20191218.a7eeab9-0.0.el7 versions from the nightly builds at
https://ci.centos.org/artifacts/gluster/nightly/master/7/x86_64/

Comment 1 Sachin Prabhu 2020-03-04 13:53:24 UTC
The quick patch below confirmed the cause. I suspect that we may be better off fixing ansible:gluster_volume.

commit 9972018b1a77db3becddd99b2e872917566f3539 (HEAD -> probe_regression)
Author: Sachin Prabhu <sprabhu>
Date:   Wed Mar 4 11:37:42 2020 +0000

    Fix regression in gluster probe
    
    Change-Id: Ibb8037b27b5cc246f2b4ac86a315e4a2a7c92e46

diff --git a/cli/src/cli-rpc-ops.c b/cli/src/cli-rpc-ops.c
index 0f57d94b5..3205f2895 100644
--- a/cli/src/cli-rpc-ops.c
+++ b/cli/src/cli-rpc-ops.c
@@ -158,9 +158,9 @@ gf_cli_probe_cbk(struct rpc_req *req, struct iovec *iov, int count,
 
     gf_log("cli", GF_LOG_INFO, "Received resp to probe");
 
-    if (rsp.op_ret) {
-        if (rsp.op_errstr && rsp.op_errstr[0] != '\0') {
-            snprintf(msg, sizeof(msg), "%s", rsp.op_errstr);
+    if (rsp.op_errstr && rsp.op_errstr[0] != '\0') {
+        snprintf(msg, sizeof(msg), "%s", rsp.op_errstr);
+        if (rsp.op_ret) {
             gf_log("cli", GF_LOG_ERROR, "%s", msg);
         }
     }
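
The effect of the hunk above can be modelled in a few lines of Python. This is a rough sketch of the control flow only, not the CLI code: the function names are hypothetical, logging is omitted, and the assumption that glusterd answers a localhost probe with op_ret == 0 plus the string "Probe on localhost not needed" is inferred from the old CLI output rather than taken from the source.

```python
def msg_before_patch(op_ret, op_errstr):
    """Pre-patch flow: op_errstr is copied into msg only on failure."""
    msg = None
    if op_ret:
        if op_errstr:
            msg = op_errstr
    return msg


def msg_after_patch(op_ret, op_errstr):
    """Post-patch flow: a non-empty op_errstr is always copied; op_ret
    only decides whether it is also logged as an error (not modelled)."""
    msg = None
    if op_errstr:
        msg = op_errstr
    return msg


# Assumed response for a probe of the local host (see note above).
local_probe = dict(op_ret=0, op_errstr="Probe on localhost not needed")

print(msg_before_patch(**local_probe))
print(msg_after_patch(**local_probe))
```

Under that assumption, the pre-patch flow drops the message on a successful localhost probe, while the patched flow keeps it, which is the text the ansible module searches for.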

Comment 2 Sanju 2020-03-09 06:55:10 UTC
Hi Sachin, thanks for the detailed report. Would you like to push the patch to Gerrit, or shall I push it for you?

-Sanju

Comment 3 Worker Ant 2020-03-10 10:27:33 UTC
REVIEW: https://review.gluster.org/24211 (cli: display the error while probing the localhost) posted (#1) for review on master by Sanju Rakonde

Comment 4 Sachin Prabhu 2020-03-11 23:25:23 UTC
Sanju, 

Thanks for following up on this issue. I have reviewed the patch.

Sachin Prabhu

Comment 5 Worker Ant 2020-03-13 06:23:12 UTC
REVIEW: https://review.gluster.org/24211 (cli: display the error while probing the localhost) merged (#2) on master by Sanju Rakonde

