Bug 1633479

Summary: 'df' shows half as much space on volume after upgrade to RHGS 3.4
Product: [Community] GlusterFS
Reporter: Sanju <srakonde>
Component: glusterd
Assignee: Sanju <srakonde>
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: urgent
Docs Contact:
Priority: high
Version: 4.1
CC: abhishku, amukherj, anrobins, bkunal, bmekala, bugs, nbalacha, rcyriac, rhs-bugs, sankarshan, sarora, storage-qa-internal, vbellur
Target Milestone: ---
Keywords: ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-4.1.6
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1632889
Environment:
Last Closed: 2018-11-29 15:25:05 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1632889
Bug Blocks: 1630997, 1633242

Comment 1 Worker Ant 2018-09-27 11:04:02 UTC
REVISION POSTED: https://review.gluster.org/21288 (glusterd: make sure that brickinfo->uuid is not null) posted (#2) for review on release-4.1 by Sanju Rakonde

Comment 2 Worker Ant 2018-09-27 11:04:11 UTC
REVIEW: https://review.gluster.org/21288 (glusterd: make sure that brickinfo->uuid is not null) posted (#2) for review on release-4.1 by Sanju Rakonde

Comment 3 Worker Ant 2018-10-05 14:40:22 UTC
COMMIT: https://review.gluster.org/21288 committed in release-4.1 by "Shyamsundar Ranganathan" <srangana> with a commit message - glusterd: make sure that brickinfo->uuid is not null

Problem: Upgrading from a version without the shared-brick-count
option to a version that introduced it causes an issue at the mount
point: the size of the volume reported there is reduced by a factor
of the shared-brick-count value (e.g., with a wrongly computed
shared-brick-count of 2, df shows half the real size).

Cause: shared-brick-count is the number of bricks that share a file
system. gd_set_shared_brick_count() calculates the shared-brick-count
value from the uuid of the node and the fsid of the brick.
https://review.gluster.org/#/c/glusterfs/+/19484 handles setting the
fsid properly during the upgrade path, but it assumed that
brickinfo->uuid is non-null when that code path is reached. In fact
brickinfo->uuid was null for all bricks, so the code path that sets
the fsid was never reached and every brick was left with an fsid of 0.
With fsid 0 everywhere, the logic in gd_set_shared_brick_count() did
not work as expected and computed shared-brick-count incorrectly.
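
The following minimal C sketch (hypothetical, simplified structures;
not the actual glusterd source) illustrates why a uuid/fsid based
grouping collapses when every brick keeps fsid 0:

    #include <stdint.h>
    #include <string.h>

    /* Hypothetical, simplified stand-in for glusterd's brickinfo. */
    typedef struct {
        unsigned char uuid[16]; /* uuid of the node hosting the brick */
        uint64_t      fsid;     /* id of the file system backing the brick */
    } brickinfo_t;

    /* Bricks share a file system when they live on the same node (same
     * uuid) and on the same backing fs (same fsid) -- the idea behind
     * gd_set_shared_brick_count(). */
    static int
    shared_brick_count(const brickinfo_t *bricks, int n,
                       const brickinfo_t *target)
    {
        int count = 0;

        for (int i = 0; i < n; i++) {
            if (memcmp(bricks[i].uuid, target->uuid, 16) == 0 &&
                bricks[i].fsid == target->fsid)
                count++;
        }
        return count;
    }

    /* If the upgrade never set the fsid, every brick keeps fsid == 0,
     * every brick "matches" every other brick, the count is inflated,
     * and the size df reports is divided by that inflated count. */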

Solution: Before control reaches the code path added by
https://review.gluster.org/#/c/glusterfs/+/19484, check whether
brickinfo->uuid is null; if it is, call glusterd_resolve_brick() to
set brickinfo->uuid to a proper value. With a proper uuid, the fsid
is set correctly for each brick and the shared-brick-count value is
calculated correctly.
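
A minimal sketch of the shape of the fix, with hypothetical simplified
types and a stubbed resolver (in the actual patch the check uses the
real glusterd helpers gf_uuid_is_null() and glusterd_resolve_brick()):

    #include <stdint.h>
    #include <string.h>

    /* Hypothetical stand-ins for glusterd_brickinfo_t and friends. */
    typedef struct {
        unsigned char uuid[16];
        uint64_t      fsid;
    } brickinfo_t;

    static int
    uuid_is_null(const unsigned char *u)
    {
        static const unsigned char zero[16];
        return memcmp(u, zero, 16) == 0;
    }

    /* Stub: the real glusterd_resolve_brick() looks up the peer that
     * owns the brick and fills in brickinfo->uuid. */
    static int
    resolve_brick(brickinfo_t *b)
    {
        (void)b;
        return 0;
    }

    static void
    set_brick_fsid(brickinfo_t *b, const unsigned char *my_uuid,
                   uint64_t local_fsid)
    {
        /* The fix: resolve the uuid first if it is still null ... */
        if (uuid_is_null(b->uuid))
            (void)resolve_brick(b);

        /* ... so the pre-existing upgrade path can recognize local
         * bricks and set a non-zero fsid for them. */
        if (memcmp(b->uuid, my_uuid, 16) == 0)
            b->fsid = local_fsid;
    }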

Please take a look at bug https://bugzilla.redhat.com/show_bug.cgi?id=1632889
for the complete RCA.

Steps followed to test the fix:
1. Created a 2-node cluster running a binary that does not have the
shared-brick-count option.
2. Created a 2x(2+1) volume and started it.
3. Mounted the volume and checked the size of the volume using df.
4. Upgraded to a version where shared-brick-count is introduced
(upgraded the nodes one by one, i.e., stopped glusterd, upgraded the
node, and started glusterd).
5. After upgrading both nodes, bumped up the cluster.op-version.
6. At the mount point, df shows the correct size for the volume.

> BUG: 1632889
> Change-Id: Ib9f078aafb15e899a01086eae113270657ea916b
> Signed-off-by: Sanju Rakonde <srakonde>
(cherry picked from commit f1e9b878ce2067db83a0baa5f384eda87287719d)

fixes: bz#1633479
Change-Id: Ib9f078aafb15e899a01086eae113270657ea916b
Signed-off-by: Sanju Rakonde <srakonde>

Comment 4 Shyamsundar 2018-11-29 15:25:05 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-4.1.6, please open a new bug report.

glusterfs-4.1.6 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2018-November/000116.html
[2] https://www.gluster.org/pipermail/gluster-users/