Bug 1644279 - Disperse volume 'df' usage is extremely incorrect after replace-brick.
Summary: Disperse volume 'df' usage is extremely incorrect after replace-brick.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterd
Version: rhgs-3.4
Hardware: x86_64
OS: Linux
Priority: high
Severity: medium
Target Milestone: ---
Target Release: RHGS 3.4.z Batch Update 2
Assignee: Sanju
QA Contact: Bala Konda Reddy M
URL:
Whiteboard:
Depends On: 1637196
Blocks:
 
Reported: 2018-10-30 11:27 UTC by Sanju
Modified: 2018-12-17 17:07 UTC (History)

Fixed In Version: glusterfs-3.12.2-27
Doc Type: Bug Fix
Doc Text:
Previously, when multiple replace brick operations were performed on a volume, a df command run on the mount point showed a reduced volume size. With this update, the df command shows correct volume size at mount point.
Clone Of: 1637196
Environment:
Last Closed: 2018-12-17 17:07:11 UTC
Target Upstream Version:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Gluster.org Gerrit 21513 0 None None None 2018-10-30 11:27:41 UTC
Red Hat Product Errata RHBA-2018:3827 0 None None None 2018-12-17 17:07:27 UTC

Description Sanju 2018-10-30 11:27:42 UTC
+++ This bug was initially created as a clone of Bug #1637196 +++

Disperse volume 'df' usage is extremely incorrect after replace-brick.

Disperse volume 'df' usage statistics are extremely incorrect
after replace brick where the source brick is down. On a 3
brick redundancy 1 disperse volume, the available space is
reduced by 50%, and the used 'inode' count goes up by 50% even
on empty volumes. The 'df' usage numbers are wrong on both
FUSE and NFS v3 mounts. Starting/stopping the disperse volume,
and remounting the client does not correct the 'df' usage
numbers.

When the replace-brick is done while the source brick is
running, the 'df' usage statistics after a replace brick seem
to be OK.

It looks as though only the statfs() numbers that 'df' is
using are incorrect; the actual disperse volume space and
inode usage looks OK. In some ways, that makes the issue
cosmetic, except for any applications or features that use and
believe these numbers.

Test plan:

# gluster --version
glusterfs 3.12.14

##### Start with empty bricks, on separate file-systems.
# df -h /exports/brick-*
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdd        100G   33M  100G   1% /exports/brick-1
/dev/sde        100G   33M  100G   1% /exports/brick-2
/dev/sdf        100G   33M  100G   1% /exports/brick-3
/dev/sdg        100G   33M  100G   1% /exports/brick-4
/dev/sdh        100G   33M  100G   1% /exports/brick-5
/dev/sdi        100G   33M  100G   1% /exports/brick-6
# df -h -i /exports/brick-*
Filesystem     Inodes IUsed IFree IUse% Mounted on
/dev/sdd          50M     3   50M    1% /exports/brick-1
/dev/sde          50M     3   50M    1% /exports/brick-2
/dev/sdf          50M     3   50M    1% /exports/brick-3
/dev/sdg          50M     3   50M    1% /exports/brick-4
/dev/sdh          50M     3   50M    1% /exports/brick-5
/dev/sdi          50M     3   50M    1% /exports/brick-6

##### Create the disperse volume:
# mkdir /exports/brick-1/disp-vol /exports/brick-2/disp-vol /exports/brick-3/disp-vol /exports/brick-4/disp-vol /exports/brick-5/disp-vol /exports/brick-6/disp-vol
# gluster volume create disp-vol disperse-data 2 redundancy 1 transport tcp 10.0.0.28:/exports/brick-1/disp-vol/ 10.0.0.28:/exports/brick-2/disp-vol/ 10.0.0.28:/exports/brick-3/disp-vol/ force
volume create: disp-vol: success: please start the volume to access data
# gluster volume start disp-vol
volume start: disp-vol: success

##### Mount the disperse volume using both FUSE and NFS v3:
# mkdir /mnt/disp-vol-fuse
# mkdir /mnt/disp-vol-nfs
# mount -t glusterfs -o acl,log-level=WARNING,fuse-mountopts=noatime 127.0.0.1:/disp-vol /mnt/disp-vol-fuse/
# gluster volume set disp-vol nfs.disable off
Gluster NFS is being deprecated in favor of NFS-Ganesha Enter "yes" to continue using Gluster NFS (y/n) yes
volume set: success
# mount 127.0.0.1:/disp-vol /mnt/disp-vol-nfs/

##### Initially, the space and inode usage numbers are correct:
# df -h /mnt/disp-vol-*
Filesystem           Size  Used Avail Use% Mounted on
127.0.0.1:/disp-vol  200G   65M  200G   1% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol  200G   64M  200G   1% /mnt/disp-vol-nfs
# df -h -i /mnt/disp-vol-*
Filesystem          Inodes IUsed IFree IUse% Mounted on
127.0.0.1:/disp-vol    50M    22   50M    1% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol    50M    22   50M    1% /mnt/disp-vol-nfs
# df -h -i /exports/brick-*
Filesystem     Inodes IUsed IFree IUse% Mounted on
/dev/sdd          50M    22   50M    1% /exports/brick-1
/dev/sde          50M    20   50M    1% /exports/brick-2
/dev/sdf          50M    20   50M    1% /exports/brick-3

##### Create a file to use up some space:
# fallocate -l 25G /mnt/disp-vol-fuse/file.1
# df -h /mnt/disp-vol-*
Filesystem           Size  Used Avail Use% Mounted on
127.0.0.1:/disp-vol  200G   26G  175G  13% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol  200G   26G  175G  13% /mnt/disp-vol-nfs
# df -h -i /mnt/disp-vol-*
Filesystem          Inodes IUsed IFree IUse% Mounted on
127.0.0.1:/disp-vol    50M    26   50M    1% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol    50M    26   50M    1% /mnt/disp-vol-nfs
# df -h /exports/brick-*
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdd        100G   13G   88G  13% /exports/brick-1
/dev/sde        100G   13G   88G  13% /exports/brick-2
/dev/sdf        100G   13G   88G  13% /exports/brick-3
# df -h -i /exports/brick-*
Filesystem     Inodes IUsed IFree IUse% Mounted on
/dev/sdd          50M    26   50M    1% /exports/brick-1
/dev/sde          50M    24   50M    1% /exports/brick-2
/dev/sdf          50M    24   50M    1% /exports/brick-3

##### Perform the first replace-brick with the source brick being up:
# gluster volume replace-brick disp-vol 10.0.0.28:/exports/brick-1/disp-vol/ 10.0.0.28:/exports/brick-4/disp-vol/ commit force
volume replace-brick: success: replace-brick commit force operation successful
# gluster volume heal disp-vol info
Brick 10.0.0.28:/exports/brick-4/disp-vol
Status: Connected
Number of entries: 0
Brick 10.0.0.28:/exports/brick-2/disp-vol
/file.1
Status: Connected
Number of entries: 1
Brick 10.0.0.28:/exports/brick-3/disp-vol
/file.1
Status: Connected
Number of entries: 1

##### After first replace-brick with up source brick, the space and inode usage numbers are correct:
# df -h /mnt/disp-vol-*
Filesystem           Size  Used Avail Use% Mounted on
127.0.0.1:/disp-vol  200G   26G  175G  13% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol  200G   26G  175G  13% /mnt/disp-vol-nfs
# df -h -i /mnt/disp-vol-*
Filesystem          Inodes IUsed IFree IUse% Mounted on
127.0.0.1:/disp-vol    50M    24   50M    1% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol    50M    24   50M    1% /mnt/disp-vol-nfs
# df -h /exports/brick-*
Filesystem      Size  Used Avail Use% Mounted on
/dev/sde        100G   13G   88G  13% /exports/brick-2
/dev/sdf        100G   13G   88G  13% /exports/brick-3
/dev/sdg        100G  8.1G   92G   9% /exports/brick-4
# gluster volume heal disp-vol info
Brick 10.0.0.28:/exports/brick-4/disp-vol
Status: Connected
Number of entries: 0
Brick 10.0.0.28:/exports/brick-2/disp-vol
Status: Connected
Number of entries: 0
Brick 10.0.0.28:/exports/brick-3/disp-vol
Status: Connected
Number of entries: 0
##### Still good after healing is done:
# df -h /exports/brick-*
Filesystem      Size  Used Avail Use% Mounted on
/dev/sde        100G   13G   88G  13% /exports/brick-2
/dev/sdf        100G   13G   88G  13% /exports/brick-3
/dev/sdg        100G   13G   88G  13% /exports/brick-4
# df -h -i /exports/brick-*
Filesystem     Inodes IUsed IFree IUse% Mounted on
/dev/sde          50M    24   50M    1% /exports/brick-2
/dev/sdf          50M    24   50M    1% /exports/brick-3
/dev/sdg          50M    24   50M    1% /exports/brick-4
# df -h /mnt/disp-vol-*
Filesystem           Size  Used Avail Use% Mounted on
127.0.0.1:/disp-vol  200G   26G  175G  13% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol  200G   26G  175G  13% /mnt/disp-vol-nfs
# df -h -i /mnt/disp-vol-*
Filesystem          Inodes IUsed IFree IUse% Mounted on
127.0.0.1:/disp-vol    50M    24   50M    1% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol    50M    24   50M    1% /mnt/disp-vol-nfs

##### Kill brick-2 process to simulate failure:
# gluster volume status disp-vol
Status of volume: disp-vol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.0.0.28:/exports/brick-4/disp-vol   62003     0          Y       110996
Brick 10.0.0.28:/exports/brick-2/disp-vol   62001     0          Y       107148
Brick 10.0.0.28:/exports/brick-3/disp-vol   62002     0          Y       107179
NFS Server on localhost                     2049      0          Y       111004
Self-heal Daemon on localhost               N/A       N/A        Y       111015
Task Status of Volume disp-vol
------------------------------------------------------------------------------
There are no active volume tasks
# kill 107148
##### Before the replace-brick, the 'df' numbers are still good:
# df -h /mnt/disp-vol-*
Filesystem           Size  Used Avail Use% Mounted on
127.0.0.1:/disp-vol  200G   26G  175G  13% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol  200G   26G  175G  13% /mnt/disp-vol-nfs
# df -h -i /mnt/disp-vol-*
Filesystem          Inodes IUsed IFree IUse% Mounted on
127.0.0.1:/disp-vol    50M    24   50M    1% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol    50M    24   50M    1% /mnt/disp-vol-nfs

##### After the replace-brick with a down source brick, the 'df' numbers go haywire: the volume size is reduced by 50%, and inode use jumps from 1% to 51%:

# gluster volume replace-brick disp-vol 10.0.0.28:/exports/brick-2/disp-vol/ 10.0.0.28:/exports/brick-5/disp-vol/ commit force
volume replace-brick: success: replace-brick commit force operation successful
# df -h /mnt/disp-vol-*
Filesystem           Size  Used Avail Use% Mounted on
127.0.0.1:/disp-vol  100G   13G   88G  13% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol  100G   13G   88G  13% /mnt/disp-vol-nfs
# df -h -i /mnt/disp-vol-*
Filesystem          Inodes IUsed IFree IUse% Mounted on
127.0.0.1:/disp-vol    50M   26M   25M   51% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol    50M   26M   25M   51% /mnt/disp-vol-nfs
# gluster volume heal disp-vol info
Brick 10.0.0.28:/exports/brick-4/disp-vol
/file.1
Status: Connected
Number of entries: 1
Brick 10.0.0.28:/exports/brick-5/disp-vol
Status: Connected
Number of entries: 0
Brick 10.0.0.28:/exports/brick-3/disp-vol
/file.1
Status: Connected
Number of entries: 1
# df -h /exports/brick-*
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdf        100G   13G   88G  13% /exports/brick-3
/dev/sdg        100G   13G   88G  13% /exports/brick-4
/dev/sdh        100G  2.1G   98G   3% /exports/brick-5
# df -h -i /exports/brick-*
Filesystem     Inodes IUsed IFree IUse% Mounted on
/dev/sdf          50M    24   50M    1% /exports/brick-3
/dev/sdg          50M    24   50M    1% /exports/brick-4
/dev/sdh          50M    24   50M    1% /exports/brick-5

##### 'df' numbers are no better after healing is done:
# gluster volume heal disp-vol info
Brick 10.0.0.28:/exports/brick-4/disp-vol
Status: Connected
Number of entries: 0
Brick 10.0.0.28:/exports/brick-5/disp-vol
Status: Connected
Number of entries: 0
Brick 10.0.0.28:/exports/brick-3/disp-vol
Status: Connected
Number of entries: 0
# df -h /mnt/disp-vol-*
Filesystem           Size  Used Avail Use% Mounted on
127.0.0.1:/disp-vol  100G   13G   88G  13% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol  100G   13G   88G  13% /mnt/disp-vol-nfs
# df -h -i /mnt/disp-vol-*
Filesystem          Inodes IUsed IFree IUse% Mounted on
127.0.0.1:/disp-vol    50M   26M   25M   51% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol    50M   26M   25M   51% /mnt/disp-vol-nfs
# df -h /exports/brick-*
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdf        100G   13G   88G  13% /exports/brick-3
/dev/sdg        100G   13G   88G  13% /exports/brick-4
/dev/sdh        100G   13G   88G  13% /exports/brick-5
# df -h -i /exports/brick-*
Filesystem     Inodes IUsed IFree IUse% Mounted on
/dev/sdf          50M    24   50M    1% /exports/brick-3
/dev/sdg          50M    24   50M    1% /exports/brick-4
/dev/sdh          50M    24   50M    1% /exports/brick-5

##### Stopping/starting the disperse volume, and remounting the clients, does not help:
# gluster volume stop disp-vol
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: disp-vol: success
# gluster volume start disp-vol
volume start: disp-vol: success
# mount -t glusterfs -o acl,log-level=WARNING,fuse-mountopts=noatime 127.0.0.1:/disp-vol /mnt/disp-vol-fuse/
# mount 127.0.0.1:/disp-vol /mnt/disp-vol-nfs/
# df -h /mnt/disp-vol-*
Filesystem           Size  Used Avail Use% Mounted on
127.0.0.1:/disp-vol  100G   13G   88G  13% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol  100G   13G   88G  13% /mnt/disp-vol-nfs
# df -h -i /mnt/disp-vol-*
Filesystem          Inodes IUsed IFree IUse% Mounted on
127.0.0.1:/disp-vol    50M   26M   25M   51% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol    50M   26M   25M   51% /mnt/disp-vol-nfs

##### Simulate a second brick failure, and replacement:
# gluster volume status disp-vol
Status of volume: disp-vol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.0.0.28:/exports/brick-4/disp-vol   62001     0          Y       121258
Brick 10.0.0.28:/exports/brick-5/disp-vol   62004     0          Y       121278
Brick 10.0.0.28:/exports/brick-3/disp-vol   62005     0          Y       121298
NFS Server on localhost                     2049      0          Y       121319
Self-heal Daemon on localhost               N/A       N/A        Y       121328
Task Status of Volume disp-vol
------------------------------------------------------------------------------
There are no active volume tasks

# kill 121298
# df -h /mnt/disp-vol-*
Filesystem           Size  Used Avail Use% Mounted on
127.0.0.1:/disp-vol  100G   13G   88G  13% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol  100G   13G   88G  13% /mnt/disp-vol-nfs
# df -h -i /mnt/disp-vol-*
Filesystem          Inodes IUsed IFree IUse% Mounted on
127.0.0.1:/disp-vol    25M    12   25M    1% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol    25M    12   25M    1% /mnt/disp-vol-nfs
# gluster volume replace-brick disp-vol 10.0.0.28:/exports/brick-3/disp-vol/ 10.0.0.28:/exports/brick-6/disp-vol/ commit force
volume replace-brick: success: replace-brick commit force operation successful

##### After the second replace-brick with a down source brick, the volume size reported by 'df' goes down by another 33%. The inode usage went back down from 51%, but it is now lower than the value the volume started with, which is suspicious, and the total number of inodes has gone from a starting value of 50M down to 17M!

# df -h /mnt/disp-vol-*
Filesystem           Size  Used Avail Use% Mounted on
127.0.0.1:/disp-vol   67G  8.4G   59G  13% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol   67G  8.4G   59G  13% /mnt/disp-vol-nfs
# df -h -i /mnt/disp-vol-*
Filesystem          Inodes IUsed IFree IUse% Mounted on
127.0.0.1:/disp-vol    17M     8   17M    1% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol    17M     8   17M    1% /mnt/disp-vol-nfs
# df -h /exports/brick-*
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdg        100G   13G   88G  13% /exports/brick-4
/dev/sdh        100G   13G   88G  13% /exports/brick-5
/dev/sdi        100G  2.1G   98G   3% /exports/brick-6
# df -h -i /exports/brick-*
Filesystem     Inodes IUsed IFree IUse% Mounted on
/dev/sdg          50M    24   50M    1% /exports/brick-4
/dev/sdh          50M    24   50M    1% /exports/brick-5
/dev/sdi          50M    24   50M    1% /exports/brick-6
# gluster volume heal disp-vol info
Brick 10.0.0.28:/exports/brick-4/disp-vol
Status: Connected
Number of entries: 0
Brick 10.0.0.28:/exports/brick-5/disp-vol
Status: Connected
Number of entries: 0
Brick 10.0.0.28:/exports/brick-6/disp-vol
Status: Connected
Number of entries: 0
##### 'df' numbers are no better after healing is done:
# df -h /mnt/disp-vol-*
Filesystem           Size  Used Avail Use% Mounted on
127.0.0.1:/disp-vol   67G  8.4G   59G  13% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol   67G  8.4G   59G  13% /mnt/disp-vol-nfs
# df -h -i /mnt/disp-vol-*
Filesystem          Inodes IUsed IFree IUse% Mounted on
127.0.0.1:/disp-vol    17M     8   17M    1% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol    17M     8   17M    1% /mnt/disp-vol-nfs
# df -h /exports/brick-*
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdg        100G   13G   88G  13% /exports/brick-4
/dev/sdh        100G   13G   88G  13% /exports/brick-5
/dev/sdi        100G   13G   88G  13% /exports/brick-6
# df -h -i /exports/brick-*
Filesystem     Inodes IUsed IFree IUse% Mounted on
/dev/sdg          50M    24   50M    1% /exports/brick-4
/dev/sdh          50M    24   50M    1% /exports/brick-5
/dev/sdi          50M    24   50M    1% /exports/brick-6
# gluster volume info disp-vol
Volume Name: disp-vol
Type: Disperse
Volume ID: fb9cccb8-311f-49ac-948d-60e4894da0b6
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 10.0.0.28:/exports/brick-4/disp-vol
Brick2: 10.0.0.28:/exports/brick-5/disp-vol
Brick3: 10.0.0.28:/exports/brick-6/disp-vol
Options Reconfigured:
transport.address-family: inet
nfs.disable: off

##### Note that although 'df' says the disperse volume is only 67G, it really still has 200G of space:

# df -h /mnt/disp-vol-*
Filesystem           Size  Used Avail Use% Mounted on
127.0.0.1:/disp-vol   67G  8.4G   59G  13% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol   67G  8.4G   59G  13% /mnt/disp-vol-nfs

# fallocate -l 25G /mnt/disp-vol-fuse/file.2
# fallocate -l 25G /mnt/disp-vol-fuse/file.3
# fallocate -l 25G /mnt/disp-vol-fuse/file.4
# fallocate -l 25G /mnt/disp-vol-fuse/file.5
# fallocate -l 25G /mnt/disp-vol-fuse/file.6
# fallocate -l 25G /mnt/disp-vol-fuse/file.7
# fallocate -l 25G /mnt/disp-vol-fuse/file.8
fallocate: /mnt/disp-vol-fuse/file.8: fallocate failed: No space left on device
# df -h /mnt/disp-vol-*
Filesystem           Size  Used Avail Use% Mounted on
127.0.0.1:/disp-vol   67G   62G  5.4G  93% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol   67G   62G  5.4G  93% /mnt/disp-vol-nfs
# du -sh /mnt/disp-vol-fuse/
176G    /mnt/disp-vol-fuse/
# du -sh /mnt/disp-vol-nfs/
176G    /mnt/disp-vol-nfs/

# df -h -i /mnt/disp-vol-*
Filesystem          Inodes IUsed IFree IUse% Mounted on
127.0.0.1:/disp-vol   5.4M    15  5.4M    1% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol   5.4M    15  5.4M    1% /mnt/disp-vol-nfs

--- Additional comment from Jeff Byers on 2018-10-09 03:19:12 IST ---

Note that with a cursory test, this issue does not appear to occur on an older GlusterFS version:

# gluster --version
glusterfs 3.7.18 built on May 25 2018 16:07:41

--- Additional comment from Jeff Byers on 2018-10-13 02:35:10 IST ---

The 'df' usage is wrong because the GlusterFS
'shared-brick-count' for the bricks is being incremented when
it should always have been 1. 'shared-brick-count' exists for
the case where a volume has multiple bricks on the same
file-system. In such cases, since the bricks share the
file-system and cannot each use its full space, the space of
only one of them is counted.

However, in this case no brick file-system sharing was going
on. GlusterFS uses the file-system ID, the f_fsid field
returned by statvfs(), to determine when multiple bricks are
on the same file-system.
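As an illustration (not Gluster source code), Python's os.statvfs() exposes the same f_fsid field, so the check Gluster performs can be sketched as:

```python
import os

def same_filesystem(path_a: str, path_b: str) -> bool:
    """Return True if both paths report the same statvfs f_fsid,
    i.e. they live on the same mounted file-system.  This mirrors
    how GlusterFS decides whether two bricks share a file-system."""
    return os.statvfs(path_a).f_fsid == os.statvfs(path_b).f_fsid

if __name__ == "__main__":
    # Any path compared with itself is trivially on the same file-system.
    print(same_filesystem("/", "/"))  # True
```

(f_fsid is available in Python 3.7+ on Linux; the brick paths from the transcript above could be substituted for the placeholders.)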

Unfortunately, 'replace-brick' was not reading the
sys_statvfs() 'f_fsid' value from the new brick, so 'brick-
fsid' in the brick spec file was being set to 0. For the first
'replace-brick' this is harmless, but when another brick is
replaced, also with 'brick-fsid' set to 0, there can then be
multiple bricks with a 'statfs_fsid' value of zero. As a
result, 'shared-brick-count' is incremented, and the space of
the apparently 'shared' bricks is subtracted from the volume.
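For reference, the affected values can be inspected on a glusterd node; the brick spec file and the generated brick volfile contain entries along these lines (paths and values below are illustrative, not taken from this reproduction):

```
# /var/lib/glusterd/vols/disp-vol/bricks/<brick-id>
brick-fsid=0                      <-- should be the f_fsid of the brick FS

# brick volfile, posix translator section
volume disp-vol-posix
    type storage/posix
    option shared-brick-count 2   <-- should be 1; inflates the 'df' math
end-volume
```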

--- Additional comment from Shyamsundar on 2018-10-23 20:25:06 IST ---

Release 3.12 has been EOLd and this bug was still found to be in the NEW state, hence moving the version to mainline, to triage the same and take appropriate actions.

--- Additional comment from Nithya Balachandran on 2018-10-29 10:00:13 IST ---

Sanju,

Can you take a look at this?

Thanks,
Nithya

--- Additional comment from Xavi Hernandez on 2018-10-29 23:35:43 IST ---

Moving this to 'distribute' component.

--- Additional comment from Nithya Balachandran on 2018-10-30 09:17:24 IST ---

This is not a dht issue - moving this to glusterd.

--- Additional comment from Worker Ant on 2018-10-30 16:42:42 IST ---

REVIEW: https://review.gluster.org/21513 (glusterd: set fsid while performing replace brick) posted (#1) for review on master by Sanju Rakonde

--- Additional comment from Sanju on 2018-10-30 16:56:33 IST ---

Updated reproducer:
1. Create any type of volume that supports the replace-brick operation, with at least two bricks (B1, B2, ...).
2. Start the volume.
3. Mount the volume and check the volume size using df.
4. Perform a replace-brick operation on B1.
5. Check the size at the mount point using df; it should be the same as in step 3.
6. Perform a replace-brick operation on B2.
7. Check the size at the mount point using df; it will be reduced by half.

RCA:
While performing the replace-brick operation we do not set the fsid for the new brick, so the new brick ends up with an fsid of 0. When a second replace-brick operation is performed, the second new brick also gets an fsid of 0, leaving two bricks with fsid 0.

While calculating shared-brick-count, we compare the fsid values of the bricks: bricks with the same fsid are assumed to share the same file system, and shared-brick-count is the number of bricks sharing that file system. Here, shared-brick-count becomes 2 (both new bricks have fsid 0), so after the second replace-brick operation the volume size at the mount point is reduced by half.
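A minimal sketch (again, not glusterd code) of how the duplicate fsids distort the aggregated 'df' numbers: each brick's reported size is divided by the number of bricks sharing its fsid, so two bricks that wrongly share fsid 0 together contribute only one brick's worth of space.

```python
from collections import Counter

def aggregate_size(bricks):
    """bricks: list of (fsid, size_bytes) pairs.
    Returns the total size 'df' would report: each brick's size is
    divided by its shared-brick-count (bricks sharing an fsid are
    counted as one file-system)."""
    shared = Counter(fsid for fsid, _ in bricks)  # shared-brick-count per fsid
    return sum(size // shared[fsid] for fsid, size in bricks)

GIB = 1024 ** 3
# Healthy volume: three 100G bricks on distinct file-systems (fsids 1..3).
healthy = [(1, 100 * GIB), (2, 100 * GIB), (3, 100 * GIB)]
# After two buggy replace-bricks: two bricks left with fsid 0.
buggy = [(1, 100 * GIB), (0, 100 * GIB), (0, 100 * GIB)]

print(aggregate_size(healthy) // GIB)  # 300
print(aggregate_size(buggy) // GIB)    # 200 -- one brick's space vanishes
```

This matches the transcript: after the second bad replace-brick, one brick's worth of capacity silently drops out of the statfs aggregation even though the bricks themselves still have the space.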

Thanks,
Sanju

Comment 2 Sanju 2018-10-30 11:28:59 UTC
upstream patch: https://review.gluster.org/21513

Comment 12 Anjana KD 2018-12-04 10:02:18 UTC
Updated Doc text field. Kindly review for technical accuracy.

Comment 15 errata-xmlrpc 2018-12-17 17:07:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3827

