Bug 1637196
| Summary: | Disperse volume 'df' usage is extremely incorrect after replace-brick. | | |
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Jeff Byers <jbyers> |
| Component: | glusterd | Assignee: | Sanju <srakonde> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | mainline | CC: | bugs, jahernan, jbyers, nbalacha, pasik, vnosov |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-6.0 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1644279 (view as bug list) | Environment: | |
| Last Closed: | 2019-03-25 16:31:17 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1644279 | | |
Description
Jeff Byers
2018-10-08 21:16:47 UTC
Note that with a cursory test, this issue does not appear to occur on an older GlusterFS version:

    # gluster --version
    glusterfs 3.7.18 built on May 25 2018 16:07:41

The 'df' usage is wrong because the GlusterFS brick shared count, 'shared-brick-count', is being incremented for these bricks when it should always have been 1. 'shared-brick-count' exists for the case where a volume has multiple bricks on the same file-system: since such bricks share the file-system and cannot each use its full space, the space is counted for only one of them. In this case, however, no brick file-system sharing was going on.

GlusterFS uses the file-system ID from the f_fsid field of statvfs() to determine when multiple bricks are on the same file-system. Unfortunately, 'replace-brick' was not reading the sys_statvfs() 'f_fsid' value from the new brick, so 'brick-fsid' in the brick spec file was being set to 0. For the first 'replace-brick' this would be OK, but when another brick was replaced, also with 'brick-fsid' set to 0, there could then be multiple bricks with a 'statfs_fsid' value of zero. 'shared-brick-count' would then be incremented, and that brick's space would be subtracted from the volume.

Release 3.12 has been EOLed and this bug was still found to be in the NEW state, hence moving the version to mainline to triage it and take appropriate action.

Sanju, can you take a look at this? Thanks, Nithya

Moving this to the 'distribute' component.

This is not a dht issue - moving this to glusterd.

REVIEW: https://review.gluster.org/21513 (glusterd: set fsid while performing replace brick) posted (#1) for review on master by Sanju Rakonde

Updated reproducer:

1. Create any type of volume which supports the replace-brick operation, having at least two bricks (B1, B2, ...).
2. Start the volume.
3. Mount the volume and check the volume size using df.
4. Perform a replace-brick operation on B1.
5. Check the size at the mount point using df; it should be the same as in step 3.
6. Perform a replace-brick operation on B2.
7. Check the size at the mount point using df; it will be reduced by half.

RCA: While performing the replace-brick operation we do not set the fsid for the new brick, so the new brick has an fsid of 0. When the second replace-brick operation is performed, the second new brick also gets an fsid of 0, so there are now two bricks whose fsid is 0. While calculating shared-brick-count we compare the fsid values of the bricks: bricks with the same fsid are treated as sharing the same file system, and shared-brick-count is the number of bricks sharing that file system. Here shared-brick-count becomes 2 (as both new bricks have an fsid of 0), so after the second replace-brick operation the volume size at the mount point is reduced by half.

Thanks,
Sanju

REVIEW: https://review.gluster.org/21513 (glusterd: set fsid while performing replace brick) posted (#3) for review on master by Atin Mukherjee

This bug is being closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-6.0, please open a new bug report. glusterfs-6.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2019-March/000120.html
[2] https://www.gluster.org/pipermail/gluster-users/
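
For reference, the reproducer described above can be condensed into a shell sketch. This is not taken from the report: the volume name, host names, and brick paths (testvol, node1/node2, /bricks/...) are placeholders, and a replica 2 volume is used only because it is a simple volume type that supports replace-brick.

```sh
# Hypothetical two-brick replica volume; any volume type that supports
# replace-brick should show the same behaviour on an affected release.
# (Recent releases ask for confirmation when creating a replica 2 volume.)
gluster volume create testvol replica 2 node1:/bricks/b1 node2:/bricks/b2
gluster volume start testvol

mount -t glusterfs node1:/testvol /mnt/testvol
df -h /mnt/testvol        # step 3: note the reported size

# Steps 4-5: replace the first brick; the reported size should be unchanged.
gluster volume replace-brick testvol node1:/bricks/b1 node1:/bricks/b1-new commit force
df -h /mnt/testvol

# Steps 6-7: replace the second brick; with the bug present, the reported size
# drops, because both replacement bricks were written out with brick-fsid = 0
# and are therefore counted as sharing a single file system.
gluster volume replace-brick testvol node2:/bricks/b2 node2:/bricks/b2-new commit force
df -h /mnt/testvol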
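
The root cause can also be checked directly on an affected node. The commands below are a sketch using the placeholder names from the previous block; they assume glusterd's default working directory /var/lib/glusterd, and the exact files in which 'brick-fsid' and 'shared-brick-count' appear may differ between releases.

```sh
# Look for the stored brick-fsid and the computed shared-brick-count under the
# volume's glusterd state directory. On an affected volume, both replaced
# bricks show a brick-fsid of 0 and a shared-brick-count of 2.
grep -r -E 'brick-fsid|shared-brick-count' /var/lib/glusterd/vols/testvol/

# Compare against the real file system ID of a replacement brick
# (GNU coreutils stat; in file-system mode, %i prints the FSID in hex).
stat -f -c %i /bricks/b1-new
```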