This is actually 3.3.1; there is no option for 3.3.1 in the version list. After replacing a brick using the proposed replace-brick...commit force followed by a heal...full, the client showed two directory entries for each file that had a DHT sticky pointer. On the new brick, the zero-length, mode 1000 files did not have the trusted.glusterfs.dht.linkto attribute. Deleting these broken sticky pointers did not resolve the problem, as they were simply healed back in the same broken state. This is a replica 3 volume (4x3), and the migration was for the 2nd replica subvolume in the 3rd distribute subvolume.
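The symptom described above (a zero-length, mode 1000 file with no linkto attribute) can be checked for directly on a brick. The following is a minimal sketch, not a Gluster tool: the function names are hypothetical, the attribute name is taken from the getfattr output in this report, and the walk assumes Linux, Python 3, and root privileges (the trusted.* namespace is not listed for unprivileged users).

```python
import os
import stat

# Attribute name as it appears in the getfattr output in this report.
LINKTO_XATTR = "trusted.glusterfs.dht.linkto"


def is_broken_linkfile(mode, size, xattr_names):
    """Return True for a zero-length regular file whose only permission
    bit is the sticky bit (mode 1000) and which lacks the linkto xattr."""
    looks_like_pointer = (
        stat.S_ISREG(mode)
        and stat.S_IMODE(mode) == stat.S_ISVTX
        and size == 0
    )
    return looks_like_pointer and LINKTO_XATTR not in xattr_names


def find_broken_linkfiles(brick_root):
    """Walk a brick directory (hypothetical helper) and yield suspect paths.

    Must run as root on Linux, otherwise trusted.* names are not visible
    and every pointer file would be reported as broken.
    """
    for dirpath, dirnames, filenames in os.walk(brick_root):
        # Skip the .glusterfs tree, which holds hardlinks to the same inodes.
        if ".glusterfs" in dirnames:
            dirnames.remove(".glusterfs")
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
                names = os.listxattr(path, follow_symlinks=False)
            except OSError:
                continue  # file vanished or xattrs unsupported
            if is_broken_linkfile(st.st_mode, st.st_size, names):
                yield path
```

Something like `for p in find_broken_linkfiles("/data/glusterfs/share1/c"): print(p)` would list candidates on one brick; it only reports and does not modify anything.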
Joe, I tried re-creating the issue as per our discussion. It is not happening for me :-(. We must be missing some detail. Let me know if I did something wrong or if you find any more information about this issue.

[root@pranithk-laptop r2]# getfattr -d -m . -e hex /gfs/r2_?/10
getfattr: Removing leading '/' from absolute path names
# file: gfs/r2_0/10
trusted.afr.r2-client-0=0x000000000000000000000000
trusted.afr.r2-client-1=0x000000000000000000000000
trusted.afr.r2-client-2=0x000000000000000000000000
trusted.gfid=0x5fd4bd0068e540bba53097f545325af3

# file: gfs/r2_1/10
trusted.afr.r2-client-0=0x000000000000000000000000
trusted.afr.r2-client-1=0x000000000000000000000000
trusted.afr.r2-client-2=0x000000000000000000000000
trusted.gfid=0x5fd4bd0068e540bba53097f545325af3

# file: gfs/r2_2/10
trusted.afr.r2-client-0=0x000000000000000000000000
trusted.afr.r2-client-1=0x000000000000000000000000
trusted.afr.r2-client-2=0x000000000000000000000000
trusted.gfid=0x5fd4bd0068e540bba53097f545325af3

# file: gfs/r2_3/10
trusted.afr.r2-client-3=0x000000000000000000000000
trusted.afr.r2-client-4=0x000000000000000000000000
trusted.afr.r2-client-5=0x000000000000000000000000
trusted.gfid=0x5fd4bd0068e540bba53097f545325af3
trusted.glusterfs.dht.linkto=0x72322d7265706c69636174652d3000

# file: gfs/r2_4/10
trusted.afr.r2-client-3=0x000000000000000000000000
trusted.afr.r2-client-4=0x000000000000000000000000
trusted.afr.r2-client-5=0x000000000000000000000000
trusted.gfid=0x5fd4bd0068e540bba53097f545325af3
trusted.glusterfs.dht.linkto=0x72322d7265706c69636174652d3000

# file: gfs/r2_5/10
trusted.afr.r2-client-3=0x000000000000000000000000
trusted.afr.r2-client-4=0x000000000000000000000000
trusted.afr.r2-client-5=0x000000000000000000000000
trusted.gfid=0x5fd4bd0068e540bba53097f545325af3
trusted.glusterfs.dht.linkto=0x72322d7265706c69636174652d3000

[root@pranithk-laptop r2]# rm -f /gfs/r2_3/10
[root@pranithk-laptop r2]# getfattr -d -m . -e hex /gfs/r2_?/10
getfattr: Removing leading '/' from absolute path names
# file: gfs/r2_0/10
trusted.afr.r2-client-0=0x000000000000000000000000
trusted.afr.r2-client-1=0x000000000000000000000000
trusted.afr.r2-client-2=0x000000000000000000000000
trusted.gfid=0x5fd4bd0068e540bba53097f545325af3

# file: gfs/r2_1/10
trusted.afr.r2-client-0=0x000000000000000000000000
trusted.afr.r2-client-1=0x000000000000000000000000
trusted.afr.r2-client-2=0x000000000000000000000000
trusted.gfid=0x5fd4bd0068e540bba53097f545325af3

# file: gfs/r2_2/10
trusted.afr.r2-client-0=0x000000000000000000000000
trusted.afr.r2-client-1=0x000000000000000000000000
trusted.afr.r2-client-2=0x000000000000000000000000
trusted.gfid=0x5fd4bd0068e540bba53097f545325af3

# file: gfs/r2_4/10
trusted.afr.r2-client-3=0x000000000000000000000000
trusted.afr.r2-client-4=0x000000000000000000000000
trusted.afr.r2-client-5=0x000000000000000000000000
trusted.gfid=0x5fd4bd0068e540bba53097f545325af3
trusted.glusterfs.dht.linkto=0x72322d7265706c69636174652d3000

# file: gfs/r2_5/10
trusted.afr.r2-client-3=0x000000000000000000000000
trusted.afr.r2-client-4=0x000000000000000000000000
trusted.afr.r2-client-5=0x000000000000000000000000
trusted.gfid=0x5fd4bd0068e540bba53097f545325af3
trusted.glusterfs.dht.linkto=0x72322d7265706c69636174652d3000

[root@pranithk-laptop r2]# gluster volume heal r2 full
Heal operation on volume r2 has been successful
[root@pranithk-laptop r2]# getfattr -d -m . -e hex /gfs/r2_?/10
getfattr: Removing leading '/' from absolute path names
# file: gfs/r2_0/10
trusted.afr.r2-client-0=0x000000000000000000000000
trusted.afr.r2-client-1=0x000000000000000000000000
trusted.afr.r2-client-2=0x000000000000000000000000
trusted.gfid=0x5fd4bd0068e540bba53097f545325af3

# file: gfs/r2_1/10
trusted.afr.r2-client-0=0x000000000000000000000000
trusted.afr.r2-client-1=0x000000000000000000000000
trusted.afr.r2-client-2=0x000000000000000000000000
trusted.gfid=0x5fd4bd0068e540bba53097f545325af3

# file: gfs/r2_2/10
trusted.afr.r2-client-0=0x000000000000000000000000
trusted.afr.r2-client-1=0x000000000000000000000000
trusted.afr.r2-client-2=0x000000000000000000000000
trusted.gfid=0x5fd4bd0068e540bba53097f545325af3

# file: gfs/r2_3/10
trusted.afr.r2-client-3=0x000000000000000000000000
trusted.afr.r2-client-4=0x000000000000000000000000
trusted.afr.r2-client-5=0x000000000000000000000000
trusted.gfid=0x5fd4bd0068e540bba53097f545325af3
trusted.glusterfs.dht.linkto=0x72322d7265706c69636174652d3000

# file: gfs/r2_4/10
trusted.afr.r2-client-3=0x000000000000000000000000
trusted.afr.r2-client-4=0x000000000000000000000000
trusted.afr.r2-client-5=0x000000000000000000000000
trusted.gfid=0x5fd4bd0068e540bba53097f545325af3
trusted.glusterfs.dht.linkto=0x72322d7265706c69636174652d3000

# file: gfs/r2_5/10
trusted.afr.r2-client-3=0x000000000000000000000000
trusted.afr.r2-client-4=0x000000000000000000000000
trusted.afr.r2-client-5=0x000000000000000000000000
trusted.gfid=0x5fd4bd0068e540bba53097f545325af3
trusted.glusterfs.dht.linkto=0x72322d7265706c69636174652d3000
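The linkto values in the output above are hex-encoded because of getfattr's -e hex option. As a small illustration of how to read them, the sketch below decodes such a value into text; the helper name is hypothetical, and it assumes the value is a NUL-terminated ASCII string, which holds for the linkto value shown in this comment.

```python
def decode_xattr_hex(value):
    """Decode a getfattr '-e hex' value (0x-prefixed) into text.

    Assumes the payload is ASCII with optional trailing NUL bytes.
    """
    if not value.startswith("0x"):
        raise ValueError("expected a 0x-prefixed hex string")
    raw = bytes.fromhex(value[2:])
    return raw.rstrip(b"\x00").decode("ascii")


# The linkto value from the getfattr output above:
print(decode_xattr_hex("0x72322d7265706c69636174652d3000"))  # r2-replicate-0
```

The decoded value names the replicate subvolume that the pointer file refers to, which is why the files on r2_3 to r2_5 carry it and the files on r2_0 to r2_2 do not.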
Joe, I had forgotten to delete the hardlink in the attempt in the previous comment. I tried again, deleting the hardlink as well, and it worked fine. Pranith.
Volume Name: share1
Type: Distributed-Replicate
Volume ID: 9fbd655f-e060-41b4-9597-3a1ec2e41509
Status: Started
Number of Bricks: 4 x 3 = 12
Transport-type: tcp
Bricks:
Brick1: ewcs2:/var/spool/glusterfs/a_share1
Brick2: ewcs10:/data/glusterfs/share1/a
Brick3: ewcs7:/var/spool/glusterfs/a_share1
Brick4: ewcs2:/var/spool/glusterfs/b_share1
Brick5: ewcs10:/data/glusterfs/share1/b
Brick6: ewcs7:/var/spool/glusterfs/b_share1
Brick7: ewcs2:/var/spool/glusterfs/c_share1
Brick8: ewcs10:/data/glusterfs/share1/c
Brick9: ewcs7:/var/spool/glusterfs/c_share1
Brick10: ewcs2:/var/spool/glusterfs/d_share1
Brick11: ewcs10:/data/glusterfs/share1/d
Brick12: ewcs7:/var/spool/glusterfs/d_share1
Options Reconfigured:
performance.cache-size: 8MB
performance.io-cache: off
nfs.disable: on
nfs.rpc-auth-allow: on

# ls -l public/Installs/openssl-0.9.7e/crypto/cast
total 125
---------T 1 root root 0 Nov 3 12:36 casttest.c
---------T 1 root root 0 Nov 3 12:36 casttest.c

ewcs10:
# file: /data/share1/c/public/Installs/openssl-0.9.7e/crypto/cast/casttest.c
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a7661725f73706f6f6c5f743a733000
trusted.afr.share1-client-6=0x000000000000000000000000
trusted.afr.share1-client-7=0x000000000000000000000000
trusted.afr.share1-client-8=0x000000000000000000000000
trusted.gfid=0x732e66cd9496413da32bad8a47a1b6c7

# file: /data/share1/d/public/Installs/openssl-0.9.7e/crypto/cast/casttest.c
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a64656661756c745f743a733000
trusted.afr.share1-client-10=0x000000000000000000000000
trusted.afr.share1-client-11=0x000000000000000000000000
trusted.afr.share1-client-9=0x000000000000000000000000
trusted.afr.share1-io-threads=0x000000000000000000000000
trusted.afr.share1-replace-brick=0x000000000000000000000000
trusted.gfid=0x732e66cd9496413da32bad8a47a1b6c7
trusted.share1-posix.gen=0x4de74e2900000540

# file: c/.glusterfs/73/2e/732e66cd-9496-413d-a32b-ad8a47a1b6c7
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a7661725f73706f6f6c5f743a733000
trusted.afr.share1-client-6=0x000000000000000000000000
trusted.afr.share1-client-7=0x000000000000000000000000
trusted.afr.share1-client-8=0x000000000000000000000000
trusted.gfid=0x732e66cd9496413da32bad8a47a1b6c7

# file: d/.glusterfs/73/2e/732e66cd-9496-413d-a32b-ad8a47a1b6c7
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a64656661756c745f743a733000
trusted.afr.share1-client-10=0x000000000000000000000000
trusted.afr.share1-client-11=0x000000000000000000000000
trusted.afr.share1-client-9=0x000000000000000000000000
trusted.afr.share1-io-threads=0x000000000000000000000000
trusted.afr.share1-replace-brick=0x000000000000000000000000
trusted.gfid=0x732e66cd9496413da32bad8a47a1b6c7
trusted.share1-posix.gen=0x4de74e2900000540

ewcs2 and ewcs7 match with:
# file: c_share1/public/Installs/openssl-0.9.7e/crypto/cast/casttest.c
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a7661725f73706f6f6c5f743a733000
trusted.afr.share1-client-6=0x000000000000000000000000
trusted.afr.share1-client-7=0x000000000000000000000000
trusted.afr.share1-client-8=0x000000000000000000000000
trusted.gfid=0x732e66cd9496413da32bad8a47a1b6c7

# file: d_share1/public/Installs/openssl-0.9.7e/crypto/cast/casttest.c
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a7661725f73706f6f6c5f743a733000
trusted.afr.share1-client-10=0x000000000000000000000000
trusted.afr.share1-client-11=0x000000000000000000000000
trusted.afr.share1-client-9=0x000000000000000000000000
trusted.gfid=0x732e66cd9496413da32bad8a47a1b6c7
trusted.share1-posix.gen=0x4de74e2900000540

# file: c_share1/.glusterfs/73/2e/732e66cd-9496-413d-a32b-ad8a47a1b6c7
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a7661725f73706f6f6c5f743a733000
trusted.afr.share1-client-6=0x000000000000000000000000
trusted.afr.share1-client-7=0x000000000000000000000000
trusted.afr.share1-client-8=0x000000000000000000000000
trusted.gfid=0x732e66cd9496413da32bad8a47a1b6c7

# file: d_share1/.glusterfs/73/2e/732e66cd-9496-413d-a32b-ad8a47a1b6c7
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a7661725f73706f6f6c5f743a733000
trusted.afr.share1-client-10=0x000000000000000000000000
trusted.afr.share1-client-11=0x000000000000000000000000
trusted.afr.share1-client-9=0x000000000000000000000000
trusted.gfid=0x732e66cd9496413da32bad8a47a1b6c7
trusted.share1-posix.gen=0x4de74e2900000540
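The .glusterfs paths in the output above follow directly from the trusted.gfid value: the first two hex byte pairs become directory levels and the file name is the GFID in UUID form. The sketch below reproduces that mapping as observed in this output; the function name is hypothetical and the layout is inferred from the paths shown here, not taken from Gluster source.

```python
import uuid


def gfid_backend_path(gfid_hex):
    """Map a trusted.gfid hex value to its path under the brick's
    .glusterfs directory, following the layout seen in this report."""
    hexdigits = gfid_hex[2:] if gfid_hex.startswith("0x") else gfid_hex
    gfid = str(uuid.UUID(hex=hexdigits))  # canonical dashed form
    return ".glusterfs/{}/{}/{}".format(gfid[0:2], gfid[2:4], gfid)


print(gfid_backend_path("0x732e66cd9496413da32bad8a47a1b6c7"))
# .glusterfs/73/2e/732e66cd-9496-413d-a32b-ad8a47a1b6c7
```

This is the hardlink that has to be removed together with the named file when deleting a copy from a brick by hand; otherwise the inode survives under .glusterfs.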
Almost forgot to indicate which ones are which (identical on all three replicas):

File: `c/public/Installs/openssl-0.9.7e/crypto/cast/casttest.c'
Size: 0 Blocks: 0 IO Block: 4096 regular empty file
Device: fd37h/64823d Inode: 1251 Links: 2
Access: (1000/---------T) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2011-12-28 01:48:32.648356000 -0800
Modify: 2012-11-03 12:36:37.896017000 -0700
Change: 2012-11-03 12:36:37.899813131 -0700

File: `d/public/Installs/openssl-0.9.7e/crypto/cast/casttest.c'
Size: 7496 Blocks: 16 IO Block: 4096 regular file
Device: fd19h/64793d Inode: 6292381 Links: 2
Access: (0664/-rw-rw-r--) Uid: ( 0/ root) Gid: ( 3/ sys)
Access: 2012-11-01 01:23:19.828815123 -0700
Modify: 2012-10-01 15:02:10.489055000 -0700
Change: 2012-10-19 06:20:58.713815692 -0700
Log entries for that file:

bricks/data-glusterfs-share1-c.log.1352028876:[2012-11-03 12:37:06.728311] I [server3_1-fops.c:1183:server_link_cbk] 0-share1-server: 63354: LINK /public/Installs/openssl-0.9.7e/test/casttest.c (b26b9d9a-e1f6-4df4-bc84-23dbcac91a2f) ==> -1 (File exists)
bricks/data-glusterfs-share1-c.log.1352028876:[2012-11-03 12:37:06.822605] I [server3_1-fops.c:1183:server_link_cbk] 0-share1-server: 63549: LINK /public/Installs/openssl-0.9.7e/test/casttest.c (b26b9d9a-e1f6-4df4-bc84-23dbcac91a2f) ==> -1 (File exists)
glustershd.log:[2012-10-29 16:49:12.182185] W [client3_1-fops.c:2457:client3_1_link_cbk] 0-share1-client-8: remote operation failed: File exists (00000000-0000-0000-0000-000000000000 -> /public/Installs/openssl-0.9.7e/test/casttest.c)

That last entry was the heal..full after the replace-brick..commit force.
Joe, since casttest.c is missing the dht.linkto xattr on all of the copies of 'c/public/Installs/openssl-0.9.7e/crypto/cast/casttest.c', the problem is not with replace-brick/self-heal. Somehow we got into a situation where the files did not have the dht xattr set. Please let us know if you have any information about this. Pranith.
The version that this bug has been reported against does not get any updates from the Gluster Community anymore. Please verify whether this report is still valid against a current (3.4, 3.5 or 3.6) release and update the version, or close this bug. If there has been no update before 9 December 2014, this bug will be closed automatically.