Bug 1647968

Summary: Seeing defunct translator entries and a discrepancy in volume info when issued from a node which doesn't host bricks in that volume
Product: [Community] GlusterFS
Reporter: Sanju <srakonde>
Component: glusterd
Assignee: Sanju <srakonde>
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: high
Docs Contact:
Priority: unspecified
Version: 5
CC: bmekala, bugs, nchilaka, rhs-bugs, sankarshan, storage-qa-internal, vbellur
Target Milestone: ---
Keywords: ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-5.1
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1635820
Environment:
Last Closed: 2018-11-29 15:21:29 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1635136, 1635820
Bug Blocks: 1643052

Description Sanju 2018-11-08 16:10:52 UTC
+++ This bug was initially created as a clone of Bug #1635820 +++

+++ This bug was initially created as a clone of Bug #1635136 +++

Description of problem:
--------------------------
When we issue "gluster volume info" from a node in the cluster which doesn't host any of the bricks of that volume, we see ambiguous data, as shown below.
However, the same is not seen when the command is issued from a node which hosts one of the bricks (in the case below, from dhcp35-140/38/184).
This is always reproducible.
Note that the problem appears only while the volume is merely created; once the volume is started, the discrepancy is no longer seen.



vol info from node which doesn't host a brick
==========================================
Volume Name: rep3-8
Type: Replicate
Volume ID: ecff9a6e-0c14-4124-acfb-e41aff40debf
Status: Created
Snapshot Count: 0
Xlator 1: BD
Capability 1: thin
Capability 2: offload_copy
Capability 3: offload_snapshot
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: dhcp35-140.lab.eng.blr.redhat.com:/gluster/brick8/rep3-8
Brick1 VG: 
Brick2: dhcp35-38.lab.eng.blr.redhat.com:/gluster/brick8/rep3-8
Brick2 VG: 
Brick3: dhcp35-184.lab.eng.blr.redhat.com:/gluster/brick8/rep3-8
Brick3 VG: 
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off


vol info from node which hosts a brick
=====================================
 
Volume Name: rep3-8
Type: Replicate
Volume ID: ecff9a6e-0c14-4124-acfb-e41aff40debf
Status: Created
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: dhcp35-140.lab.eng.blr.redhat.com:/gluster/brick8/rep3-8
Brick2: dhcp35-38.lab.eng.blr.redhat.com:/gluster/brick8/rep3-8
Brick3: dhcp35-184.lab.eng.blr.redhat.com:/gluster/brick8/rep3-8
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off



I went through some of the on-disk brick files under /var/lib/glusterd and found a discrepancy here:


[root@dhcp35-140 bricks]# cat dhcp35-140.lab.eng.blr.redhat.com\:-gluster-brick8-rep3-8 
uuid=c63a0191-c6b0-4073-a0b0-f5ca0cf3f128
hostname=dhcp35-140.lab.eng.blr.redhat.com
path=/gluster/brick8/rep3-8
real_path=/gluster/brick8/rep3-8
listen-port=0
rdma.listen-port=0
decommissioned=0
brick-id=rep3-8-client-0
mount_dir=/rep3-8
snap-status=0
brick-fsid=64810
[root@dhcp35-140 bricks]# 



[root@dhcp35-140 bricks]# cat /var/lib/glusterd/vols/rep3-8/bricks/dhcp35-140.lab.eng.blr.redhat.com\:-gluster-brick8-rep3-8 
uuid=c63a0191-c6b0-4073-a0b0-f5ca0cf3f128
hostname=dhcp35-140.lab.eng.blr.redhat.com
path=/gluster/brick8/rep3-8
real_path=/gluster/brick8/rep3-8
listen-port=0
rdma.listen-port=0
decommissioned=0
brick-id=rep3-8-client-0
mount_dir=/rep3-8
snap-status=0
brick-fsid=64810  ------------------> from a node which hosts a brick of the volume
[root@dhcp35-140 bricks]# 



[root@dhcp35-218 bricks]# cat /var/lib/glusterd/vols/rep3-8/bricks/dhcp35-140.lab.eng.blr.redhat.com:-gluster-brick8-rep3-8
uuid=c63a0191-c6b0-4073-a0b0-f5ca0cf3f128
hostname=dhcp35-140.lab.eng.blr.redhat.com
path=/gluster/brick8/rep3-8
real_path=/gluster/brick8/rep3-8
listen-port=0
rdma.listen-port=0
decommissioned=0
brick-id=rep3-8-client-0
mount_dir=/rep3-8
snap-status=0
brick-fsid=0            ------------------> from a node which doesn't host any brick of the volume
[root@dhcp35-218 bricks]# 




Version-Release number of selected component (if applicable):
------------------------------
3.12.2-18

How reproducible:
=================
always

Steps to Reproduce:
1. Have a 6-node cluster.
2. Create a 1x3 (or arbiter) volume with bricks from only 3 of the 6 nodes.
3. Issue "gluster volume info" from any node which hosts a brick of the volume.
4. Issue "gluster volume info" from any node which does *not* host a brick of the volume.

You can notice the discrepancy.
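For reference, a minimal command sequence for the steps above (a sketch reusing the hostnames and brick paths from this report; do not start the volume, since the discrepancy is visible only in the Created state):

[root@dhcp35-218 ~]# gluster volume create rep3-8 replica 3 \
    dhcp35-140.lab.eng.blr.redhat.com:/gluster/brick8/rep3-8 \
    dhcp35-38.lab.eng.blr.redhat.com:/gluster/brick8/rep3-8 \
    dhcp35-184.lab.eng.blr.redhat.com:/gluster/brick8/rep3-8
[root@dhcp35-140 ~]# gluster volume info rep3-8    <-- from a brick-hosting node
[root@dhcp35-218 ~]# gluster volume info rep3-8    <-- from a non-brick node; compare the output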
 

--- Additional comment from Atin Mukherjee on 2018-10-02 19:24:50 IST ---

Sanju has started working on this. Moving the state to Assigned.

--- Additional comment from Sanju on 2018-10-03 23:09:11 IST ---

RCA:
When "gluster v info" command is issued, glusterd_add_volume_detail_to_dict() will add all the data related to the respective volume into a dictionary.

A snippet from glusterd_add_volume_detail_to_dict() which adds the BD xlator related information to the dict:

#ifdef HAVE_BD_XLATOR
        if (volinfo->caps) {
                caps = 0;
                snprintf (key, 256, "volume%d.xlator0", count);
                buf = GF_MALLOC (256, gf_common_mt_char);
                if (!buf) {
                        ret = ENOMEM;
                        goto out;
                }
                if (volinfo->caps & CAPS_BD)
                        snprintf (buf, 256, "BD");
                ret = dict_set_dynstr (volumes, key, buf);
                if (ret) {
                        GF_FREE (buf);
                        goto out;
                }

                if (volinfo->caps & CAPS_THIN) {
                        snprintf (key, 256, "volume%d.xlator0.caps%d", count,
                                  caps++);
                        buf = GF_MALLOC (256, gf_common_mt_char);
                        if (!buf) {
                                ret = ENOMEM;
                                goto out;
                        }
                        snprintf (buf, 256, "thin");
                        ret = dict_set_dynstr (volumes, key, buf);
                        if (ret) {
                                GF_FREE (buf);
                                goto out;
                        }
                }

                if (volinfo->caps & CAPS_OFFLOAD_COPY) {
                        snprintf (key, 256, "volume%d.xlator0.caps%d", count,
                                  caps++);
                        buf = GF_MALLOC (256, gf_common_mt_char);
                        if (!buf) {
                                ret = ENOMEM;
                                goto out;
                        }
                        snprintf (buf, 256, "offload_copy");
                        ret = dict_set_dynstr (volumes, key, buf);
                        if (ret) {
                                GF_FREE (buf);
                                goto out;
                        }
                }

                if (volinfo->caps & CAPS_OFFLOAD_SNAPSHOT) {
                        snprintf (key, 256, "volume%d.xlator0.caps%d", count,
                                  caps++);
                        buf = GF_MALLOC (256, gf_common_mt_char);
                        if (!buf) {
                                ret = ENOMEM;
                                goto out;
                        }
                        snprintf (buf, 256, "offload_snapshot");
                        ret = dict_set_dynstr (volumes, key, buf);
                        if (ret)  {
                                GF_FREE (buf);
                                goto out;
                        }
                }

                if (volinfo->caps & CAPS_OFFLOAD_ZERO) {
                        snprintf (key, 256, "volume%d.xlator0.caps%d", count,
                                  caps++);
                        buf = GF_MALLOC (256, gf_common_mt_char);
                        if (!buf) {
                                ret = ENOMEM;
                                goto out;
                        }
                        snprintf (buf, 256, "offload_zerofill");
                        ret = dict_set_dynstr (volumes, key, buf);
                        if (ret)  {
                                GF_FREE (buf);
                                goto out;
                        }
                }

        }
#endif

A point to be noted here is that the BD xlator is disabled at source-compilation time only when the RHEL version is < 6; on RHEL >= 6, HAVE_BD_XLATOR is defined.

So, the above block of code will be executed whenever volinfo->caps has a non-zero value (i.e., on RHEL >= 6). These dict entries are what the CLI renders as the spurious "Xlator 1: BD" and "Capability N: ..." lines seen in the volume info output above.

We set volinfo->caps during volume creation in glusterd_op_create_volume(). At the beginning, caps is initialized to
caps = CAPS_BD | CAPS_THIN | CAPS_OFFLOAD_COPY | CAPS_OFFLOAD_SNAPSHOT.

                if (!gf_uuid_compare (brickinfo->uuid, MY_UUID)) {
                        ret = sys_statvfs (brickinfo->path, &brickstat);
                        if (ret) {
                                gf_log ("brick-op", GF_LOG_ERROR, "Failed to fetch disk"
                                        " utilization from the brick (%s:%s). Please "
                                        "check health of the brick. Error code was %s",
                                        brickinfo->hostname, brickinfo->path,
                                        strerror (errno));
                                goto out;
                        }
                        brickinfo->statfs_fsid = brickstat.f_fsid;

#ifdef HAVE_BD_XLATOR
                        if (brickinfo->vg[0]) {
                                ret = glusterd_is_valid_vg (brickinfo, 0, msg);
                                if (ret) {
                                        gf_msg (this->name, GF_LOG_ERROR, 0,
                                                GD_MSG_INVALID_VG, "%s", msg);
                                        goto out;
                                }

                                /* if anyone of the brick does not have thin
                                   support, disable it for entire volume */
                                caps &= brickinfo->caps;
                        } else {
                                caps = 0;
                        }
#endif
                }

The caps value is reset in the above code block, but only for bricks that belong to the current node. In earlier versions of RHGS we used to set caps to 0 when either the brick did not belong to the same node or brickinfo->vg[0] was null. With commit febf5ed4848, caps is set to 0 only when the brick belongs to the same node but brickinfo->vg[0] is null. We need to also set caps to 0 when the brick doesn't belong to the same node.

The value of caps is then assigned to volinfo->caps, and glusterd_add_volume_detail_to_dict() uses volinfo->caps to add details into the dictionary. So, setting the correct value for volinfo->caps fixes the issue.
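A minimal sketch of the corrected control flow (illustrative only; it reuses the names from the snippet above and is not the verbatim patch, which is linked in the comments below):

#ifdef HAVE_BD_XLATOR
        if (!gf_uuid_compare (brickinfo->uuid, MY_UUID)) {
                if (brickinfo->vg[0]) {
                        ret = glusterd_is_valid_vg (brickinfo, 0, msg);
                        if (ret) {
                                gf_msg (this->name, GF_LOG_ERROR, 0,
                                        GD_MSG_INVALID_VG, "%s", msg);
                                goto out;
                        }
                        /* if any one of the bricks does not have thin
                           support, disable it for the entire volume */
                        caps &= brickinfo->caps;
                } else {
                        /* local brick without a VG: no BD support */
                        caps = 0;
                }
        } else {
                /* brick hosted on another node: this node cannot validate
                   a VG for it, so drop the BD caps here as well (this is
                   the branch that commit febf5ed4848 lost) */
                caps = 0;
        }
#endif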

Thanks,
Sanju

--- Additional comment from Worker Ant on 2018-10-04 06:31:26 IST ---

REVIEW: https://review.gluster.org/21336 (glusterd: ensure volinfo->caps is set to correct value.) posted (#1) for review on master by Sanju Rakonde

--- Additional comment from Worker Ant on 2018-10-25 17:42:41 IST ---

COMMIT: https://review.gluster.org/21336 committed in master by "Atin Mukherjee" <amukherj> with a commit message- glusterd: ensure volinfo->caps is set to correct value.

With the commit febf5ed4848, during the volume create op,
we are setting volinfo->caps to 0, only if any of the bricks
belong to the same node and brickinfo->vg[0] is null.
Previously, we used to set volinfo->caps to 0, when
either brick doesn't belong to the same node or brickinfo->vg[0]
is null.

With this patch, we set volinfo->caps to 0, when either brick
doesn't belong to the same node or brickinfo->vg[0] is null.
(as we do earlier without commit febf5ed4848).

fixes: bz#1635820
Change-Id: I00a97415786b775fb088ac45566ad52b402f1a49
Signed-off-by: Sanju Rakonde <srakonde>

Comment 1 Worker Ant 2018-11-08 16:13:27 UTC
REVIEW: https://review.gluster.org/21600 (glusterd: ensure volinfo->caps is set to correct value.) posted (#1) for review on release-5 by Sanju Rakonde

Comment 2 Worker Ant 2018-11-09 18:47:04 UTC
REVIEW: https://review.gluster.org/21600 (glusterd: ensure volinfo->caps is set to correct value.) posted (#2) for review on release-5 by Shyamsundar Ranganathan

Comment 3 Shyamsundar 2018-11-29 15:21:29 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-5.1, please open a new bug report.

glusterfs-5.1 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2018-November/000116.html
[2] https://www.gluster.org/pipermail/gluster-users/