+++ This bug was initially created as a clone of Bug #1248998 +++

Issue:
======
Files created in the mount point when the volume type is Distributed are not available after converting the volume type to Replicated.

RHGS Version:
=============
Red Hat Gluster Storage Server 3.1 (glusterfs-3.7.1-11.el6rhs)

Steps to Reproduce:
===================
1. Create a Distributed volume with one brick.
2. Mount the volume and create some files (e.g. touch file{1..30}).
3. Convert the Distributed volume to Replicated by adding one more brick (1x2 configuration).
4. Check the files created in step 2 in the mount point. // Files created in step 2 are not available.

Actual Result:
==============
Files created in the mount point while the volume type is Distributed are not available in the mount point after converting the volume type to Replicated.

Expected Result:
================
Files created under one volume type should still be present after converting the volume to any other supported type (e.g. Distributed to Replicated, Distributed to Distributed-Replicated).

--- Additional comment from Red Hat Bugzilla Rules Engine on 2015-07-31 06:00:30 EDT ---

This bug is automatically being proposed for Red Hat Gluster Storage 3.1.0 by setting the release flag 'rhgs-3.1.0' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from Byreddy on 2015-08-04 00:29:51 EDT ---

Console CLI steps:

1) Create the Distributed volume:
[root@vm4 ~]# gluster volume create DIS 192.168.122.37:/bricks/a1 force
volume create: DIS: success: please start the volume to access data
[root@vm4 ~]#

2) Check the volume type:
[root@vm4 ~]# gluster volume info

Volume Name: DIS
Type: Distribute
Volume ID: e3fa6293-3234-4517-b1af-b88051023e9e
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: 192.168.122.37:/bricks/a1
Options Reconfigured:
performance.readdir-ahead: on

3) Mount the volume:
[root@vm4 ~]# mount -t glusterfs 192.168.122.37:/DIS /xyz

4) Create some files in the mount point:
[root@vm4 xyz]# touch tmp{1..30}
[root@vm4 xyz]# ls
tmp1   tmp11  tmp13  tmp15  tmp17  tmp19  tmp20  tmp22  tmp24  tmp26  tmp28  tmp3   tmp4  tmp6  tmp8
tmp10  tmp12  tmp14  tmp16  tmp18  tmp2   tmp21  tmp23  tmp25  tmp27  tmp29  tmp30  tmp5  tmp7  tmp9
[root@vm4 xyz]#

5) Files present in the brick:
[root@vm4 a1]# ls
tmp1   tmp11  tmp13  tmp15  tmp17  tmp19  tmp20  tmp22  tmp24  tmp26  tmp28  tmp3   tmp4  tmp6  tmp8
tmp10  tmp12  tmp14  tmp16  tmp18  tmp2   tmp21  tmp23  tmp25  tmp27  tmp29  tmp30  tmp5  tmp7  tmp9
[root@vm4 a1]#

6) Convert the Distributed volume to Replicated by adding one more brick:
[root@vm4 bricks]# gluster volume add-brick DIS replica 2 192.168.122.37:/lt_bricks/a2 force
volume add-brick: success
[root@vm4 bricks]#

7) Check the volume type:
[root@vm4 bricks]# gluster v info

Volume Name: DIS
Type: Replicate
Volume ID: e3fa6293-3234-4517-b1af-b88051023e9e
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.122.37:/bricks/a1
Brick2: 192.168.122.37:/lt_bricks/a2
Options Reconfigured:
performance.readdir-ahead: on

8) Check for files in the mount point:
[root@vm4 ~]# cd /xyz/
[root@vm4 xyz]# ls
[root@vm4 xyz]#

--- Additional comment from Pranith Kumar K on 2015-12-09 08:46:10 EST ---

Anuradha,
      Assigning it to you as your add-brick patch upstream should fix this.

Pranith
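[Editor's note: as a side check on the console steps above, the symptom can be confirmed as a missing heal rather than data loss: the files remain on the original brick and are simply never replicated to the new one. A minimal inspection sketch, reusing the volume name and brick paths from the transcript above (illustrative only, not captured output):

ls /bricks/a1 | wc -l              # the 30 files are still present on the legacy brick
ls /lt_bricks/a2                   # the newly added brick is empty
gluster volume heal DIS info       # reports no pending heals even though the bricks differ
]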
--- Additional comment from Krutika Dhananjay on 2016-02-26 00:39:52 EST ---

http://review.gluster.org/#/c/12451/
http://review.gluster.org/#/c/12454/

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-03-08 03:16:53 EST ---

Since this bug has been approved for the z-stream release of Red Hat Gluster Storage 3 through release flag 'rhgs-3.1.z+', and has been marked for the RHGS 3.1 Update 3 release through the Internal Whiteboard entry of '3.1.3', the Target Release is being automatically set to 'RHGS 3.1.3'.

--- Additional comment from Mike McCune on 2016-03-28 18:17:27 EDT ---

This bug was accidentally moved from POST to MODIFIED via an error in automation; please see mmccune with any questions.

--- Additional comment from Anuradha on 2016-04-05 07:17:10 EDT ---

Patch links for downstream:
https://code.engineering.redhat.com/gerrit/#/c/71426/
https://code.engineering.redhat.com/gerrit/#/c/71424/

--- Additional comment from errata-xmlrpc on 2016-04-20 07:23:20 EDT ---

Bug report changed to ON_QA status by Errata System.
A QE request has been submitted for advisory RHEA-2016:23053-01
https://errata.devel.redhat.com/advisory/23053

--- Additional comment from nchilaka on 2016-04-26 08:03:32 EDT ---

Following is the planned QA Test Plan (QATP) to verify this bug (TC#3 is the main case that tests the fix):

TC#1: Automatic heal should be triggered and all files must be available on both bricks (data, metadata and entry heals must pass)
1. Create a single-brick volume.
2. Start the volume, create some files and directories, and note them.
3. Add a brick such that it makes the volume a 1x2 replica volume, using the command below:
   gluster v add-brick <vname> replica 2 <newbrick>
4. From the mount point, check that all the files and directories created in step 2 are visible and accessible.
5. Check the heal info command output to see that all heals are complete.
6. Check the backend brick to make sure all files are replicated.
7. Create new files and directories and make sure they are replicated to both bricks.
8. Make sure data, metadata and entry heals pass.

TC#2: Bringing down the original brick must not cause any I/O issue (as it is now an AFR volume) --> test on both FUSE and NFS ---> test case failed on the FUSE client
Steps 1, 2 and 3 same as above.
4. Check that the heal is complete using heal info.
5. After the heal completes, bring down the first brick (the one used to create the original volume).
6. Without doing any lookup on the FUSE mount, try to create a new file; make sure the file gets created.
Failure details: the case failed at the first file create in step 6 with a transport endpoint error.

TC#3: The old (legacy) brick must be the heal source, not the new brick (tests the actual fix)
Steps 1 to 4 same as TC#1.
5. Make sure that all the files and directories are created on the new brick.
6. The mount point should display all the files and their contents (not just the file names, but the contents as well).

TC#4: AFR client bits must be visible on the old (legacy) brick, with client bits set for client-1, i.e. the newly added brick
Steps 1 to 4 same as TC#1.
5. Even before self-heal is triggered, check the xattrs on any file or directory on the old brick. It must carry the following information about its pair brick (the newly added brick):
   trusted.afr.jan-client-1=0x000000010000000100000000

TC#5: Lookup (named self-heal) should trigger heal of the file
Steps 1 to 6 same as TC#1.
Then do a cat of a file created previously and check whether this heals the file by comparing the file size on the backend brick of both bricks. Also check the md5sum, which should be the same.

TC#6: Manual heal command should trigger heal of the file
Steps 1 to 6 same as TC#1.
7. Trigger a heal on the volume; this should heal. ---> Fails, because this command is the "heal operation to perform index self heal on volume", but the index itself is not getting created.

TC#7: Manual heal full command should trigger heal of the file
Steps 1 to 6 same as TC#1.
7. Trigger a heal full on the volume; this should heal.

Run all of the above cases on the following configurations:
- FUSE mount
- NFS mount
- x3 dist-rep volume
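[Editor's note: for reference, the CLI invocations behind the checks in the test plan above would look roughly as follows. The volume name DIS and the brick paths are carried over from the earlier console steps and are only illustrative:

# TC#1/TC#6/TC#7: heal status and manual heal triggers
gluster volume heal DIS info          # list entries still pending heal, per brick
gluster volume heal DIS               # index self-heal (TC#6)
gluster volume heal DIS full          # full-crawl self-heal (TC#7)

# TC#4: inspect AFR pending xattrs on the legacy brick before any heal runs
getfattr -d -m trusted.afr -e hex /bricks/a1/tmp1

# TC#5: compare checksums on both bricks after a named lookup (cat) from the mount
md5sum /bricks/a1/tmp1 /lt_bricks/a2/tmp1
]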
--- Additional comment from nchilaka on 2016-04-26 08:55:26 EDT ---

Validation on the 3.7.9-2 build:

[root@dhcp35-191 feb]# rpm -qa|grep gluster
glusterfs-client-xlators-3.7.9-2.el7rhgs.x86_64
glusterfs-server-3.7.9-2.el7rhgs.x86_64
python-gluster-3.7.5-19.el7rhgs.noarch
gluster-nagios-addons-0.2.5-1.el7rhgs.x86_64
vdsm-gluster-4.16.30-1.3.el7rhgs.noarch
glusterfs-3.7.9-2.el7rhgs.x86_64
glusterfs-api-3.7.9-2.el7rhgs.x86_64
glusterfs-cli-3.7.9-2.el7rhgs.x86_64
glusterfs-geo-replication-3.7.9-2.el7rhgs.x86_64
gluster-nagios-common-0.2.3-1.el7rhgs.noarch
glusterfs-libs-3.7.9-2.el7rhgs.x86_64
glusterfs-fuse-3.7.9-2.el7rhgs.x86_64
glusterfs-rdma-3.7.9-2.el7rhgs.x86_64

From the QATP I found the following issues and pass/fail status.

On FUSE mount:
TC#1: FAIL --> raised bug 1330526 - adding brick to a single brick volume to convert to replica is not triggering self heal
TC#2: FAIL --> test case failed on the FUSE client; raised bug 1330399 - "Transport endpoint is not connected" error on fuse mount when we bring down the legacy brick of a volume after converting it to replicate
TC#3: PASS
TC#4: PASS
TC#5: PASS
TC#6: FAIL, due to the bug raised in TC#1
TC#7: PASS

On NFS mount:
Hit the following issue: the first ls shows the file size as zero for all files, though a second ls reads them correctly. Raised bug 1330555 - file sizes showing as zero on issuing ls on an nfs mount after adding brick to convert a single brick to replica volume (1x2).

Since TC#1 failed with self-heal not triggering, and after discussing with dev, moving this to failed_qa.

--- Additional comment from Anuradha on 2016-04-28 02:26:50 EDT ---

The patches sent for this BZ were supposed to address two things:
1) Allow increasing the replica count from 2 to 3 such that heal to the newly added brick is done automatically.
2) Allow converting a distribute volume to a replicate one.

The patches address point 1 but not point 2.

Brief explanation of the fix and why it fails for point 2:

a) When a new brick is added to a volume such that the replica count is increased, the existing bricks accuse the new bricks, indicating that some heal needs to be performed.
b) This accusing is a two-step process: marking pending xattrs on '/' of the old bricks, and subsequently adding an index link file in the <brickpath>/.glusterfs/indices/xattrop directory. This index file is used by the self-heal daemon to pick up the files to be healed.
c) The mentioned index file is added by the "index" xlator. This xlator keeps track of a watchlist containing xattr key patterns; whenever a non-zero value is provided against any of the xattr key patterns in the watchlist, the xlator adds the index file.
d) This watchlist is populated during init of the xlator, based on the xlator option provided to it in a *replicate* volume. So the fix works fine for a replicate volume whose replica count is increased.
But in the case of a distribute volume being converted to replicate, the index xlator doesn't have this watchlist.
e) So upon adding a new brick to the volume, the second step mentioned in point (b) fails, resulting in 0 files needing heal (as seen from indices/xattrop) even though the pending markers are set.
f) Hence, the healing fails.

Proposed solution:
When new bricks are added to a volume to increase the replica count, reconfigure the existing bricks so that the index xlator residing on these bricks is made aware of the watchlist and the index files can be added.

Planning to make this fix in upcoming releases. Dropping it from 3.1.3.

--- Additional comment from RHEL Product and Program Management on 2016-04-28 02:31:46 EDT ---

This bug report previously had all acks and the release flag approved. However, since at least one of its acks has been changed, the release flag has been reset to ? by the bugbot (pm-rhel). The ack needs to become approved before the release flag can become approved again.

--- Additional comment from nchilaka on 2016-05-02 06:42:24 EDT ---

Removed fixed-in version due to failed_qa.

--- Additional comment from errata-xmlrpc on 2016-06-03 04:10:21 EDT ---

This bug has been dropped from advisory RHEA-2016:23053 by Rejy Cyriac (rcyriac).
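[Editor's note: to make points (b) through (e) above concrete, the two-step accusal can be observed directly on the legacy brick. On a healthy replicate volume both steps leave visible traces; on a freshly converted distribute volume only the first step does, which is why heal info reports nothing to heal. A minimal inspection sketch, reusing the volume name and brick path from the earlier console steps (illustrative only; the exact client index in the xattr name depends on the volume's brick numbering):

# Step 1 of the accusal: pending AFR xattrs on '/' of the old brick
getfattr -d -m trusted.afr -e hex /bricks/a1/
# expect a non-zero trusted.afr.DIS-client-1 value once the new brick is added

# Step 2 of the accusal: index link files consumed by the self-heal daemon
ls /bricks/a1/.glusterfs/indices/xattrop/
# on a converted distribute volume this stays empty, so heal info shows 0 entries
]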
REVIEW: http://review.gluster.org/15118 (glusterd: Convert volume to replica after adding brick self heal is not triggered) posted (#1) for review on master by MOHIT AGRAWAL (moagrawa)
REVIEW: http://review.gluster.org/15118 (glusterd: Convert volume to replica after adding brick self heal is not triggered) posted (#2) for review on master by MOHIT AGRAWAL (moagrawa)
REVIEW: http://review.gluster.org/15118 (glusterd: Convert volume to replica after adding brick self heal is not triggered) posted (#3) for review on master by MOHIT AGRAWAL (moagrawa)
REVIEW: http://review.gluster.org/15118 (glusterd: Convert volume to replica after adding brick self heal is not triggered) posted (#4) for review on master by MOHIT AGRAWAL (moagrawa)
REVIEW: http://review.gluster.org/15118 (glusterd: Convert volume to replica after adding brick self heal is not triggered) posted (#5) for review on master by MOHIT AGRAWAL (moagrawa)
REVIEW: http://review.gluster.org/15118 (glusterd: Convert volume to replica after adding brick self heal is not triggered) posted (#6) for review on master by MOHIT AGRAWAL (moagrawa)
COMMIT: http://review.gluster.org/15118 committed in master by Atin Mukherjee (amukherj)
------
commit 87bb8d0400d4ed18dd3954b1d9e5ca6ee0fb9742
Author: Mohit Agrawal <moagrawa>
Date:   Tue Aug 9 15:53:27 2016 +0530

    glusterd: Convert volume to replica after adding brick self heal is not triggered

    Problem: After adding a brick to a distribute volume to convert it to replica, self-heal is not triggered.

    Solution: Modify the condition in brick_graph_add_index to set the trusted.afr.dirty attribute in the xlator.

    Test: To verify the patch, follow the steps below:
    1) Create a single-node volume
       gluster volume create <DIS> <IP:/dist1/brick1>
    2) Start the volume and create a mount point
       mount -t glusterfs <IP>:/DIS /mnt
    3) Touch some files and write some data to them
    4) Add another brick along with replica 2
       gluster volume add-brick DIS replica 2 <IP>:/dist2/brick2
    5) Before applying the patch, the file size is 0 bytes at the mount point.

    BUG: 1365455
    Change-Id: Ief0ccbf98ea21b53d0e27edef177db6cabb3397f
    Signed-off-by: Mohit Agrawal <moagrawa>
    Reviewed-on: http://review.gluster.org/15118
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Ravishankar N <ravishankar>
    Reviewed-by: Anuradha Talur <atalur>
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Atin Mukherjee <amukherj>
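[Editor's note: following the test steps in the commit message, a quick post-fix verification on the volume from the original report might look like the sketch below. Volume, mount and brick names are reused from the earlier console steps; the expected outcomes are what the fix is intended to produce, not output captured from an actual run:

gluster volume add-brick DIS replica 2 192.168.122.37:/lt_bricks/a2 force
ls -l /xyz                                  # files should be listed with their real, non-zero sizes
gluster volume heal DIS info                # pending entries should drain to zero as self-heal runs
md5sum /bricks/a1/tmp1 /lt_bricks/a2/tmp1   # checksums should match once heal completes
]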
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.9.0, please open a new bug report.

glusterfs-3.9.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2016-November/029281.html
[2] https://www.gluster.org/pipermail/gluster-users/