Created attachment 1319232 [details]
self-heal daemon log file of all 3 nodes

Description of problem:

I have a 3-node replica 2 + arbiter setup where a single zero-byte file is stuck in self-heal and never gets healed. The whole issue has been extensively discussed and described on the gluster-users mailing list here:
http://lists.gluster.org/pipermail/gluster-users/2017-August/032105.html

For ease of reference I have pasted a few relevant details below, starting with the heal info output:

Brick node1.domain.tld:/data/myvolume/brick
/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
Status: Connected
Number of entries: 1

Brick node2.domain.tld:/data/myvolume/brick
/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
Status: Connected
Number of entries: 1

Brick node3.domain.tld:/srv/glusterfs/myvolume/brick
/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
Status: Connected
Number of entries: 1

A stat and getfattr of the file on each brick:

NODE1:
STAT:
  File: ‘/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png’
  Size: 0          Blocks: 38         IO Block: 131072  regular empty file
Device: 24h/36d    Inode: 10033884    Links: 2
Access: (0644/-rw-r--r--)  Uid: (   33/www-data)   Gid: (   33/www-data)
Access: 2017-08-14 17:04:55.530681000 +0200
Modify: 2017-08-14 17:11:46.407404779 +0200
Change: 2017-08-14 17:11:46.407404779 +0200
 Birth: -
GETFATTR:
trusted.afr.dirty=0sAAAAAQAAAAAAAAAA
trusted.bit-rot.version=0sAgAAAAAAAABZhuknAAlJAg==
trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g==
trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOyo=

NODE2:
STAT:
  File: ‘/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png’
  Size: 0          Blocks: 38         IO Block: 131072  regular empty file
Device: 26h/38d    Inode: 10031330    Links: 2
Access: (0644/-rw-r--r--)  Uid: (   33/www-data)   Gid: (   33/www-data)
Access: 2017-08-14 17:04:55.530681000 +0200
Modify: 2017-08-14 17:11:46.403704181 +0200
Change: 2017-08-14 17:11:46.403704181 +0200
 Birth: -
GETFATTR:
trusted.afr.dirty=0sAAAAAQAAAAAAAAAA
trusted.bit-rot.version=0sAgAAAAAAAABZhu6wAA8Hpw==
trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g==
trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOVE=

NODE3:
STAT:
  File: /srv/glusterfs/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
  Size: 0          Blocks: 0          IO Block: 4096    regular empty file
Device: ca11h/51729d  Inode: 405208959  Links: 2
Access: (0644/-rw-r--r--)  Uid: (   33/www-data)   Gid: (   33/www-data)
Access: 2017-08-14 17:04:55.530681000 +0200
Modify: 2017-08-14 17:04:55.530681000 +0200
Change: 2017-08-14 17:11:46.604380051 +0200
 Birth: -
GETFATTR:
trusted.afr.dirty=0sAAAAAQAAAAAAAAAA
trusted.bit-rot.version=0sAgAAAAAAAABZe6ejAAKPAg==
trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g==
trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOc4=

CLIENT GLUSTER MOUNT:
STAT:
  File: '/mnt/myvolume/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png'
  Size: 0          Blocks: 0          IO Block: 131072  regular empty file
Device: 1eh/30d    Inode: 11897049013408443114  Links: 1
Access: (0644/-rw-r--r--)  Uid: (   33/www-data)   Gid: (   33/www-data)
Access: 2017-08-14 17:04:55.530681000 +0200
Modify: 2017-08-14 17:11:46.407404779 +0200
Change: 2017-08-14 17:11:46.407404779 +0200
 Birth: -

Version-Release number of selected component (if applicable):
GlusterFS 3.8.11 on Debian 8

How reproducible:
AFAIK Ravishankar managed to reproduce the problem.

Steps to Reproduce:
1. Ask Ravi

Additional info:
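The `trusted.afr.dirty` values above can be decoded by hand; here is a minimal sketch with standard tools, using NODE1's brick path from above and assuming the usual AFR xattr layout of three big-endian 32-bit counters (data, metadata, entry):

```
# Show the afr xattrs in hex instead of base64 (run directly on a brick):
getfattr -d -m trusted.afr -e hex \
  /data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
# trusted.afr.dirty=0x000000010000000000000000

# Or decode the base64 value reported above:
echo "AAAAAQAAAAAAAAAA" | base64 -d | xxd
# 00000000: 0000 0001 0000 0000 0000 0000
```

Read that way, only the data part of the dirty counter is set, identically on all three bricks, and there are no pending trusted.afr.<volname>-client-* xattrs blaming any particular brick.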
> Steps to Reproduce:
> 1. Ask Ravi

1. Create a zero-byte file (`touch file`) on an arbiter volume via a fuse mount.
2. Attach gdb to the mount process and put breakpoints before winding the pre-op and the write (see the sketch after this list).
3. `echo "hello" >> file`
4. After the pre-op is stack-wound on all bricks, `pkill gluster && pkill gdb`.
5. Now we have the dirty bit set (and no pending bits) for the file on all bricks, plus an entry inside .glusterfs/indices/dirty.
6. Restart all gluster processes.
7. `heal info` will show the entry as needing heal.
8. Heal won't complete because the data self-heal algorithm picks the brick with the latest ctime as the source. In this case the arbiter is likely the source, because the pre-op in step 4 happened last on the arbiter; since an arbiter brick holds no file data, it cannot actually serve as the source, and the heal never completes.
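For step 2, a rough sketch of the gdb invocation; the breakpoint symbols are my assumption of the relevant AFR transaction entry points (taken from afr-transaction.c) and may differ across GlusterFS versions:

```
# Attach to the fuse client for the volume (mounted at /mnt/myvolume in
# this report) and stop before the pre-op and the write are wound to the
# bricks. Symbol names below are assumptions and vary between releases.
gdb -p "$(pgrep -f '/mnt/myvolume')" \
    -ex 'break afr_changelog_pre_op' \
    -ex 'break afr_transaction_perform_fop' \
    -ex 'continue'
```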
Sent patch https://review.gluster.org/#/c/18283/ against BZ 1491670.
(In reply to Ravishankar N from comment #2)
> Sent patch https://review.gluster.org/#/c/18283/ against BZ 1491670.

The addendum patch addressing the review comments on 18283 has also been merged in master: https://review.gluster.org/#/c/18391/

Note: I'm backporting these patches only to the 3.12 and 3.8 branches. glusterfs-3.8 is EOL, hence moving this bug to CLOSED.
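Not part of the original comment, but as a sanity check on a build carrying both patches, one would expect the stuck entry to drain from heal info; a minimal sketch, assuming the volume name from the report:

```
# The entry should disappear once the self-heal daemon picks a valid
# data source instead of the arbiter:
gluster volume heal myvolume info
gluster volume heal myvolume statistics heal-count   # should drop to 0
```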