Description of problem:
While running automation runs, found that healing does not complete on a Distributed-Replicated (Arbiter) volume.

Version-Release number of selected component (if applicable):
glusterfs-3.12.2-18.2.el7rhgs.x86_64

How reproducible:

Steps to Reproduce:
1) Create a distributed-replicated volume (Arbiter: 2 x (2 + 1)) and mount the volume
2) Disable client side heals
3) Write IO using the below script:
   # python /usr/share/glustolibs/io/scripts/file_dir_ops.py create_deep_dirs_with_files --dir-length 2 --dir-depth 2 --max-num-of-dirs 2 --num-of-files 20 /mnt/testvol_distributed-replicated_glusterfs/files
4) Disable self-heal-daemon
5) Bring bricks offline from each set (brick1 and brick5)
6) Create files from the mount point:
   # python /usr/share/glustolibs/io/scripts/file_dir_ops.py create_files -f 20 /mnt/testvol_distributed-replicated_glusterfs/files
7) Bring bricks online
8) Enable self-heal-daemon
9) Issue volume heal
10) Wait for heal to complete
11) Disable self-heal-daemon
12) Bring bricks offline from each set (brick1 and brick5)
13) Modify data:
   # python /usr/share/glustolibs/io/scripts/file_dir_ops.py mv /mnt/testvol_distributed-replicated_glusterfs/files
14) Bring bricks online
15) Enable self-heal-daemon
16) Issue volume heal
17) Wait for heal to complete (a heal-wait polling sketch is given at the end of this report)

Actual results:
After step 17, heal info is still pending:

[root@rhsauto049 ~]# gluster vol heal testvol_distributed-replicated info
Brick rhsauto049.lab.eng.blr.redhat.com:/bricks/brick0/testvol_distributed-replicated_brick0
Status: Connected
Number of entries: 0

Brick rhsauto029.lab.eng.blr.redhat.com:/bricks/brick0/testvol_distributed-replicated_brick1
Status: Connected
Number of entries: 0

Brick rhsauto034.lab.eng.blr.redhat.com:/bricks/brick0/testvol_distributed-replicated_brick2
Status: Connected
Number of entries: 0

Brick rhsauto039.lab.eng.blr.redhat.com:/bricks/brick0/testvol_distributed-replicated_brick3
/files/user1_a/dir1_a/dir0_a/testfile2_a.txt
/files/user1_a/dir1_a/dir0_a
/files/user1_a/dir1_a/dir0_a/testfile10_a.txt
/files/user1_a/dir1_a
/files/user1_a/dir1_a/dir1_a/testfile2_a.txt
/files/user1_a/dir1_a/dir1_a
/files/user1_a/dir1_a/dir1_a/testfile10_a.txt
/files/user1_a/dir1_a/testfile2_a.txt
/files/user1_a/dir1_a/testfile10_a.txt
Status: Connected
Number of entries: 9

Brick rhsauto040.lab.eng.blr.redhat.com:/bricks/brick0/testvol_distributed-replicated_brick4
/files/user1_a/dir1_a/dir0_a/testfile2_a.txt
/files/user1_a/dir1_a/dir0_a
/files/user1_a/dir1_a/dir0_a/testfile10_a.txt
/files/user1_a/dir1_a
/files/user1_a/dir1_a/dir1_a/testfile2_a.txt
/files/user1_a/dir1_a/dir1_a
/files/user1_a/dir1_a/dir1_a/testfile10_a.txt
/files/user1_a/dir1_a/testfile2_a.txt
/files/user1_a/dir1_a/testfile10_a.txt
Status: Connected
Number of entries: 9

Brick rhsauto041.lab.eng.blr.redhat.com:/bricks/brick0/testvol_distributed-replicated_brick5
<gfid:f8395fc2-fd6f-4408-a7d3-d039273bd7f2>/user1_a/dir1_a
Status: Connected
Number of entries: 1

[root@rhsauto049 ~]#

Expected results:
Healing should complete.

Additional info:

[root@rhsauto049 ~]# gluster vol info

Volume Name: testvol_distributed-replicated
Type: Distributed-Replicate
Volume ID: 8edc5266-545b-4b97-8e94-d8e41d1dad4d
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x (2 + 1) = 6
Transport-type: tcp
Bricks:
Brick1: rhsauto049.lab.eng.blr.redhat.com:/bricks/brick0/testvol_distributed-replicated_brick0
Brick2: rhsauto029.lab.eng.blr.redhat.com:/bricks/brick0/testvol_distributed-replicated_brick1
Brick3: rhsauto034.lab.eng.blr.redhat.com:/bricks/brick0/testvol_distributed-replicated_brick2 (arbiter)
Brick4: rhsauto039.lab.eng.blr.redhat.com:/bricks/brick0/testvol_distributed-replicated_brick3
Brick5: rhsauto040.lab.eng.blr.redhat.com:/bricks/brick0/testvol_distributed-replicated_brick4
Brick6: rhsauto041.lab.eng.blr.redhat.com:/bricks/brick0/testvol_distributed-replicated_brick5 (arbiter)
Options Reconfigured:
cluster.self-heal-daemon: on
cluster.data-self-heal: off
cluster.metadata-self-heal: off
cluster.entry-self-heal: off
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
cluster.brick-multiplex: enable
cluster.server-quorum-ratio: 51
[root@rhsauto049 ~]#

[root@rhsauto049 ~]# gluster vol status
Status of volume: testvol_distributed-replicated
Gluster process                                                                              TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick rhsauto049.lab.eng.blr.redhat.com:/bricks/brick0/testvol_distributed-replicated_brick0  49153     0          Y       26694
Brick rhsauto029.lab.eng.blr.redhat.com:/bricks/brick0/testvol_distributed-replicated_brick1  49153     0          Y       3368
Brick rhsauto034.lab.eng.blr.redhat.com:/bricks/brick0/testvol_distributed-replicated_brick2  49152     0          Y       31682
Brick rhsauto039.lab.eng.blr.redhat.com:/bricks/brick0/testvol_distributed-replicated_brick3  49153     0          Y       29522
Brick rhsauto040.lab.eng.blr.redhat.com:/bricks/brick0/testvol_distributed-replicated_brick4  49153     0          Y       26953
Brick rhsauto041.lab.eng.blr.redhat.com:/bricks/brick0/testvol_distributed-replicated_brick5  49153     0          Y       25772
Self-heal Daemon on localhost                                                                 N/A       N/A        Y       29604
Self-heal Daemon on rhsauto029.lab.eng.blr.redhat.com                                         N/A       N/A        Y       3442
Self-heal Daemon on rhsauto034.lab.eng.blr.redhat.com                                         N/A       N/A        Y       32481
Self-heal Daemon on rhsauto041.lab.eng.blr.redhat.com                                         N/A       N/A        Y       25845
Self-heal Daemon on rhsauto039.lab.eng.blr.redhat.com                                         N/A       N/A        Y       30318
Self-heal Daemon on rhsauto040.lab.eng.blr.redhat.com                                         N/A       N/A        Y       27807

Task Status of Volume testvol_distributed-replicated
------------------------------------------------------------------------------
There are no active volume tasks

[root@rhsauto049 ~]#

SOS Reports: http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/vavuthu/heal_issue_gfid_symlink_missing/
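The "wait for heal to complete" checks in steps 10 and 17 amount to polling heal info until every brick reports zero pending entries. Below is a minimal bash sketch of such a wait; it is not the actual glusto-tests helper, and the volume name and timeout are placeholders:

#!/bin/bash
# Minimal heal-wait sketch: poll heal info until all bricks report 0 pending
# entries, or give up after TIMEOUT seconds. VOL and TIMEOUT are placeholders.
VOL=testvol_distributed-replicated
TIMEOUT=600
end=$((SECONDS + TIMEOUT))
while [ $SECONDS -lt $end ]; do
    # Sum the "Number of entries:" values across all bricks of the volume.
    pending=$(gluster volume heal "$VOL" info |
              awk '/Number of entries:/ {sum += $NF} END {print sum+0}')
    if [ "$pending" -eq 0 ]; then
        echo "heal complete"
        exit 0
    fi
    echo "pending heals: $pending; retrying..."
    sleep 10
done
echo "heal did not complete within ${TIMEOUT}s" >&2
exit 1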
Healing is not able to complete because the gfid handle (symlink) for a directory is missing.

Steps to reproduce in a single node setup:

0. "pip install python-docx" if you don't have it.
1. Create and fuse mount a 2x (2+1) volume on /mnt/fuse_mnt.
2. Fill data on the mount:
   python file_dir_ops.py create_deep_dirs_with_files --dir-length 2 --dir-depth 2 --max-num-of-dirs 2 --num-of-files 20 /mnt/fuse_mnt/files
   python file_dir_ops.py create_files -f 20 /mnt/fuse_mnt/files
3. Kill the 1st data brick of each replica.
4. Rename files using:
   python file_dir_ops.py mv /mnt/fuse_mnt/files
5. gluster volume start volname force
6. gluster volume heal volname
7. You will still see a directory and the files under it not getting healed. If you look at the bricks you killed in step 3, they won't have the symlink for the directory (a quick check for this is sketched at the end of this comment).

-------------------------------------------------------------------------------
Pranith initially had a dirty fix which solves the problem, but he found some more races between the janitor thread unlinking the gfid handle and posix_lookup/posix_mkdir. He and Xavi are discussing some solutions to solve this in a clean way for both AFR and EC volumes. See BZ 1636902 comments 14 and 15 for some proposed solutions.

diff --git a/xlators/storage/posix/src/posix.c b/xlators/storage/posix/src/posix.c
index 7bfe780bb..82f44a012 100644
--- a/xlators/storage/posix/src/posix.c
+++ b/xlators/storage/posix/src/posix.c
@@ -1732,7 +1732,11 @@ posix_mkdir (call_frame_t *frame, xlator_t *this,
                          * posix_gfid_set to set the symlink to the
                          * new dir.*/
                         posix_handle_unset (this, stbuf.ia_gfid, NULL);
+                } else if (op_ret < 0) {
+                        MAKE_HANDLE_GFID_PATH (gfid_path, this, uuid_req, NULL);
+                        sys_unlink(gfid_path);
                 }
+
         } else if (!uuid_req && frame->root->pid != GF_SERVER_PID_TRASH) {
                 op_ret = -1;
                 op_errno = EPERM;
-------------------------------------------------------------------------------

@Pranith, feel free to assign the bug back to me if you won't be working on this. Leaving a need-info on you for this.
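To confirm the missing handle from step 7, one can check the .glusterfs handle path directly on the brick. This is a minimal bash sketch assuming the standard posix handle layout (for a directory, a symlink at <brick>/.glusterfs/<gfid[0:2]>/<gfid[2:4]>/<full-gfid>); the brick paths and directory below are placeholders for a single-node setup, not values from this bug:

#!/bin/bash
# Sketch: verify whether the gfid handle symlink for a directory is present.
# GOOD_BRICK, BAD_BRICK and DIR are placeholders.
GOOD_BRICK=/bricks/brick1
BAD_BRICK=/bricks/brick0
DIR=files/user1_a/dir1_a

# Read the directory's gfid from a brick that has it, and format it as a UUID.
hex=$(getfattr -n trusted.gfid -e hex "$GOOD_BRICK/$DIR" 2>/dev/null |
      awk -F= '/trusted.gfid/ {print $2}' | sed 's/^0x//')
gfid=$(echo "$hex" |
       sed 's/\(.\{8\}\)\(.\{4\}\)\(.\{4\}\)\(.\{4\}\)\(.\{12\}\)/\1-\2-\3-\4-\5/')

# On a healthy brick this is a symlink into .glusterfs; on the bricks killed
# in step 3 it is missing, which is what blocks the entry heal.
ls -l "$BAD_BRICK/.glusterfs/${gfid:0:2}/${gfid:2:2}/$gfid"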
Hit this issue manually too, while upgrading from 3.4.0-async to 3.4.1. For more details, refer to BZ#1643919 and https://bugzilla.redhat.com/show_bug.cgi?id=1643919#c10
*** Bug 1658870 has been marked as a duplicate of this bug. ***
*** Bug 1712225 has been marked as a duplicate of this bug. ***
Work on this bug will start once https://review.gluster.org/c/glusterfs/+/23937 and a subsequent patch in which shd removes stale entries are merged, so moving this to the next BU for now.
The design for https://review.gluster.org/c/glusterfs/+/23937 has changed; the new patch we need for this work is https://review.gluster.org/c/glusterfs/+/24284
https://review.gluster.org/c/glusterfs/+/24373
(In reply to Ravishankar N from comment #2)
> Healing is not able to complete because the gfid handle (symlink) for a
> directory is missing. Steps to reproduce in a single node setup:
> [...]
> @Pranith, feel free to assign the bug back to me if you won't be working on
> this. Leaving a need-info on you for this.

Ravi,

Could you attach the file_dir_ops.py script to the bz?

Pranith
Upasana gave the link on chat. https://github.com/gluster/glusto-tests/blob/master/glustolibs-io/shared_files/scripts/file_dir_ops.py
Tested this by running the case in comment 2 ten times and printing the number of pending heals after each run. Before the fix it was failing about once in 2 times:

[root@localhost-live ~]# bash testcase.sh
pending-heals: 26
pending-heals: 0
pending-heals: 4
pending-heals: 0
pending-heals: 52
pending-heals: 0
pending-heals: 18
pending-heals: 0
pending-heals: 108
pending-heals: 12

With the fix it doesn't fail:

[root@localhost-live ~]# bash testcase.sh
pending-heals: 0
pending-heals: 0
pending-heals: 0
pending-heals: 0
pending-heals: 0
pending-heals: 0
pending-heals: 0
pending-heals: 0
pending-heals: 0
pending-heals: 0
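testcase.sh itself is not attached to the bug; a loop roughly like the following would produce output of this shape. reproduce_once is a placeholder standing in for the comment-2 steps (kill bricks, rename, force start, issue heal), and the volume name is a placeholder:

#!/bin/bash
# Hypothetical sketch of a testcase.sh-style loop: re-run the comment-2
# reproduction ten times and print the pending heal count after each run.
VOL=testvol
for run in $(seq 1 10); do
    reproduce_once "$VOL"      # placeholder for the comment-2 reproduction steps
    sleep 60                   # give shd time to finish its crawl
    gluster volume heal "$VOL" info |
        awk '/Number of entries:/ {sum += $NF} END {printf "pending-heals: %d\n", sum}'
done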
*** Bug 1901154 has been marked as a duplicate of this bug. ***
Hi Karthik,

Could you please set the "Doc Type" field and fill out the "Doc Text" template with the relevant information?

Thanks,
Amrita
Thanks Karthik, LGTM.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (glusterfs bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:1462