Description of problem:
=====================
I sometimes observe that heal info shows duplicate entries for the same file or gfid.

[root@dhcp35-191 glusterfs]# gluster v heal tinker info
Brick 10.70.35.191:/rhs/brick1/tinker
/newdata - Possibly undergoing heal
/newdata - Possibly undergoing heal
Number of entries: 2

Brick 10.70.35.27:/rhs/brick1/tinker
/newdata - Possibly undergoing heal
Number of entries: 1

Brick 10.70.35.191:/rhs/brick2/tinker
Status: Transport endpoint is not connected

[root@dhcp35-191 glusterfs]# gluster v status tinker
Status of volume: tinker
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.191:/rhs/brick1/tinker       49187     0          Y       5597
Brick 10.70.35.27:/rhs/brick1/tinker        49166     0          Y       29082
Brick 10.70.35.191:/rhs/brick2/tinker       N/A       N/A        N       N/A
NFS Server on localhost                     2049      0          Y       5617
Self-heal Daemon on localhost               N/A       N/A        Y       5625
NFS Server on 10.70.35.44                   2049      0          Y       25745
Self-heal Daemon on 10.70.35.44             N/A       N/A        Y       25753
NFS Server on 10.70.35.64                   2049      0          Y       1523
Self-heal Daemon on 10.70.35.64             N/A       N/A        Y       1531
NFS Server on 10.70.35.98                   2049      0          Y       31548
Self-heal Daemon on 10.70.35.98             N/A       N/A        Y       31556
NFS Server on 10.70.35.27                   2049      0          Y       30478
Self-heal Daemon on 10.70.35.27             N/A       N/A        Y       30487
NFS Server on 10.70.35.114                  2049      0          Y       30209
Self-heal Daemon on 10.70.35.114            N/A       N/A        Y       30217

Task Status of Volume tinker
------------------------------------------------------------------------------
There are no active volume tasks

[root@dhcp35-191 glusterfs]# gluster v info tinker
Volume Name: tinker
Type: Replicate
Volume ID: 5f00ff1a-410f-4a9b-82a0-91d5cfae89c3
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.70.35.191:/rhs/brick1/tinker
Brick2: 10.70.35.27:/rhs/brick1/tinker
Brick3: 10.70.35.191:/rhs/brick2/tinker
Options Reconfigured:
performance.readdir-ahead: on

########another instance###################
[root@dhcp35-191 glusterfs]# gluster v heal tinker
Launching heal operation to perform index self heal on volume tinker has been unsuccessful on bricks that are down. Please check if all brick processes are running.
[root@dhcp35-191 glusterfs]# gluster v heal tinker info;
Brick 10.70.35.191:/rhs/brick1/tinker
Status: Transport endpoint is not connected

Brick 10.70.35.27:/rhs/brick1/tinker
/newdata - Possibly undergoing heal
/newdata - Possibly undergoing heal
Number of entries: 2

Brick 10.70.35.191:/rhs/brick2/tinker
<gfid:d08eb8a1-7ae6-4994-bc79-c14830b5d7d8> - Possibly undergoing heal
<gfid:d08eb8a1-7ae6-4994-bc79-c14830b5d7d8> - Possibly undergoing heal
Number of entries: 2

Version-Release number of selected component (if applicable):
==========
3.7.9-2

Steps to Reproduce:
1. Create a 1x3 volume (in my case the 1st and 3rd bricks are hosted on the same node).
2. Mount the volume over FUSE and write a 1 GB file.
3. Keep writing to the file and bring down the first brick.
4. Keep the brick down until there is at least 2-3 GB of data to heal.
5. With writes still in progress, force start the volume to bring the brick back up.
6. Kill the 3rd brick while I/O continues.
7. Issue a heal command to heal brick1, since brick2 is still available as a source.
8. Issue a heal info command. The same file is shown twice in the output.

This behavior goes away after some time.

#############################################################
File and brick xattrs info
##while duplicate entries are getting listed########
[root@dhcp35-191 ~]# ll /rhs/brick*/tinker/;du -sh /rhs/brick*/tinker/*
/rhs/brick1/tinker/:
total 6430956
-rw-r--r--. 2 root root 8841353216 May 2 17:52 newdata

/rhs/brick2/tinker/:
total 9771904
-rw-r--r--. 2 root root 8841353216 May 2 17:52 newdata
6.2G /rhs/brick1/tinker/newdata
9.4G /rhs/brick2/tinker/newdata

[root@dhcp35-191 ~]# ll /rhs/brick*/tinker/;du -sh /rhs/brick*/tinker/*;getfattr -d -m . -e hex /rhs/brick*/tinker/
/rhs/brick1/tinker/:
total 6511580
-rw-r--r--. 2 root root 9028352000 May 2 17:53 newdata

/rhs/brick2/tinker/:
total 9771904
-rw-r--r--. 2 root root 8914724864 May 2 17:52 newdata
6.3G /rhs/brick1/tinker/newdata
9.4G /rhs/brick2/tinker/newdata
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/tinker/
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x5f00ff1a410f4a9b82a091d5cfae89c3

# file: rhs/brick2/tinker/
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x5f00ff1a410f4a9b82a091d5cfae89c3

[root@dhcp35-191 ~]# ll /rhs/brick*/tinker/;du -sh /rhs/brick*/tinker/*;getfattr -d -m . -e hex /rhs/brick*/tinker/newdata
/rhs/brick1/tinker/:
total 6511580
-rw-r--r--. 2 root root 9028352000 May 2 17:53 newdata

/rhs/brick2/tinker/:
total 9771904
-rw-r--r--. 2 root root 8914724864 May 2 17:52 newdata
6.3G /rhs/brick1/tinker/newdata
9.4G /rhs/brick2/tinker/newdata
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/tinker/newdata
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000010000000000000000
trusted.afr.tinker-client-2=0x000002710000000000000000
trusted.bit-rot.version=0x05000000000000005727467200057c36
trusted.gfid=0xd08eb8a17ae64994bc79c14830b5d7d8

# file: rhs/brick2/tinker/newdata
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.tinker-client-0=0x0000001c0000000000000000
trusted.bit-rot.version=0x030000000000000057273cbd0003f1da
trusted.gfid=0xd08eb8a17ae64994bc79c14830b5d7d8

[root@dhcp35-191 ~]# ll /rhs/brick*/tinker/;du -sh /rhs/brick*/tinker/*;getfattr -d -m . -e hex /rhs/brick*/tinker/newdata; ll /rhs/brick*/tinker/.glusterfs/
/rhs/brick1/tinker/:
total 6511580
-rw-r--r--. 2 root root 9028352000 May 2 17:53 newdata

/rhs/brick2/tinker/:
total 9771904
-rw-r--r--. 2 root root 8914724864 May 2 17:52 newdata
6.3G /rhs/brick1/tinker/newdata
9.4G /rhs/brick2/tinker/newdata
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/tinker/newdata
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000010000000000000000
trusted.afr.tinker-client-2=0x000002710000000000000000
trusted.bit-rot.version=0x05000000000000005727467200057c36
trusted.gfid=0xd08eb8a17ae64994bc79c14830b5d7d8

# file: rhs/brick2/tinker/newdata
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.tinker-client-0=0x0000001c0000000000000000
trusted.bit-rot.version=0x030000000000000057273cbd0003f1da
trusted.gfid=0xd08eb8a17ae64994bc79c14830b5d7d8

/rhs/brick1/tinker/.glusterfs/:
total 64
drwx------. 3 root root 15 May 2 16:57 00
drw-------. 4 root root 30 May 2 16:57 changelogs
drwx------. 3 root root 15 May 2 17:01 d0
-rw-r--r--. 1 root root 19 May 2 17:53 health_check
drw-------. 4 root root 32 May 2 16:57 indices
drwxr-xr-x. 2 root root 6 May 2 17:52 landfill
drw-------. 2 root root 54 May 2 16:57 quanrantine
-rw-r--r--. 1 root root 4096 May 2 16:57 tinker.db
-rw-r--r--. 1 root root 32768 May 2 16:57 tinker.db-shm
-rw-r--r--. 1 root root 20632 May 2 16:57 tinker.db-wal
drw-------. 2 root root 6 May 2 17:52 unlink

/rhs/brick2/tinker/.glusterfs/:
total 64
drwx------. 3 root root 15 May 2 16:57 00
drw-------. 4 root root 30 May 2 16:57 changelogs
drwx------. 3 root root 15 May 2 17:01 d0
-rw-r--r--. 1 root root 19 May 2 17:52 health_check
drw-------. 4 root root 32 May 2 16:57 indices
drwxr-xr-x. 2 root root 6 May 2 17:10 landfill
drw-------. 2 root root 54 May 2 16:57 quanrantine
-rw-r--r--. 1 root root 4096 May 2 16:57 tinker.db
-rw-r--r--. 1 root root 32768 May 2 16:57 tinker.db-shm
-rw-r--r--. 1 root root 20632 May 2 16:57 tinker.db-wal
drw-------. 2 root root 6 May 2 17:10 unlink

[root@dhcp35-191 ~]# ll /rhs/brick*/tinker/;du -sh /rhs/brick*/tinker/*;getfattr -d -m . -e hex /rhs/brick*/tinker/newdata; ll /rhs/brick*/tinker/.glusterfs/indices
/rhs/brick1/tinker/:
total 6511580
-rw-r--r--. 2 root root 9028352000 May 2 17:53 newdata

/rhs/brick2/tinker/:
total 9771904
-rw-r--r--. 2 root root 8914724864 May 2 17:52 newdata
6.3G /rhs/brick1/tinker/newdata
9.4G /rhs/brick2/tinker/newdata
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/tinker/newdata
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000010000000000000000
trusted.afr.tinker-client-2=0x000002710000000000000000
trusted.bit-rot.version=0x05000000000000005727467200057c36
trusted.gfid=0xd08eb8a17ae64994bc79c14830b5d7d8

# file: rhs/brick2/tinker/newdata
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.tinker-client-0=0x0000001c0000000000000000
trusted.bit-rot.version=0x030000000000000057273cbd0003f1da
trusted.gfid=0xd08eb8a17ae64994bc79c14830b5d7d8

/rhs/brick1/tinker/.glusterfs/indices:
total 0
drw-------. 2 root root 147 May 2 17:52 dirty
drw-------. 2 root root 100 May 2 17:52 xattrop

/rhs/brick2/tinker/.glusterfs/indices:
total 0
drw-------. 2 root root 55 May 2 17:52 dirty
drw-------. 2 root root 100 May 2 17:52 xattrop

[root@dhcp35-191 ~]# ll /rhs/brick*/tinker/;du -sh /rhs/brick*/tinker/*;getfattr -d -m . -e hex /rhs/brick*/tinker/newdata; ll /rhs/brick*/tinker/.glusterfs/indices/dirty
/rhs/brick1/tinker/:
total 6511580
-rw-r--r--. 2 root root 9028352000 May 2 17:53 newdata

/rhs/brick2/tinker/:
total 9771904
-rw-r--r--. 2 root root 8914724864 May 2 17:52 newdata
6.3G /rhs/brick1/tinker/newdata
9.4G /rhs/brick2/tinker/newdata
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/tinker/newdata
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000010000000000000000
trusted.afr.tinker-client-2=0x000002710000000000000000
trusted.bit-rot.version=0x05000000000000005727467200057c36
trusted.gfid=0xd08eb8a17ae64994bc79c14830b5d7d8

# file: rhs/brick2/tinker/newdata
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.tinker-client-0=0x0000001c0000000000000000
trusted.bit-rot.version=0x030000000000000057273cbd0003f1da
trusted.gfid=0xd08eb8a17ae64994bc79c14830b5d7d8

/rhs/brick1/tinker/.glusterfs/indices/dirty:
total 0
----------. 2 root root 0 May 2 17:31 d08eb8a1-7ae6-4994-bc79-c14830b5d7d8
----------. 2 root root 0 May 2 17:31 dirty-51e107a3-0073-4167-ad39-a31a6b21ec68
----------. 1 root root 0 May 2 17:52 dirty-b35a1d07-0707-48b4-ba32-41c5c3f94f87

/rhs/brick2/tinker/.glusterfs/indices/dirty:
total 0
----------. 1 root root 0 May 2 17:13 dirty-298b589f-529e-40f3-bf08-898cf246941d

[root@dhcp35-191 ~]# ll /rhs/brick*/tinker/;du -sh /rhs/brick*/tinker/*;getfattr -d -m . -e hex /rhs/brick*/tinker/newdata; ll /rhs/brick*/tinker/.glusterfs/indices/xat*
/rhs/brick1/tinker/:
total 6511580
-rw-r--r--. 2 root root 9028352000 May 2 17:53 newdata

/rhs/brick2/tinker/:
total 9771904
-rw-r--r--. 2 root root 8914724864 May 2 17:52 newdata
6.3G /rhs/brick1/tinker/newdata
9.4G /rhs/brick2/tinker/newdata
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/tinker/newdata
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000010000000000000000
trusted.afr.tinker-client-2=0x000002710000000000000000
trusted.bit-rot.version=0x05000000000000005727467200057c36
trusted.gfid=0xd08eb8a17ae64994bc79c14830b5d7d8

# file: rhs/brick2/tinker/newdata
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.tinker-client-0=0x0000001c0000000000000000
trusted.bit-rot.version=0x030000000000000057273cbd0003f1da
trusted.gfid=0xd08eb8a17ae64994bc79c14830b5d7d8

/rhs/brick1/tinker/.glusterfs/indices/xattrop:
total 0
----------. 2 root root 0 May 2 17:52 d08eb8a1-7ae6-4994-bc79-c14830b5d7d8
----------. 2 root root 0 May 2 17:52 xattrop-b35a1d07-0707-48b4-ba32-41c5c3f94f87

/rhs/brick2/tinker/.glusterfs/indices/xattrop:
total 0
----------. 2 root root 0 May 2 17:10 d08eb8a1-7ae6-4994-bc79-c14830b5d7d8
----------. 2 root root 0 May 2 17:10 xattrop-298b589f-529e-40f3-bf08-898cf246941d

#########no more duplicate entries seen################
[root@dhcp35-191 ~]# ll /rhs/brick*/tinker/;du -sh /rhs/brick*/tinker/*;getfattr -d -m . -e hex /rhs/brick*/tinker/newdata; ll /rhs/brick*/tinker/.glusterfs/indices/xat*;ll /rhs/brick*/tinker/.glusterfs/indices/dirt*;getfattr -d -m . -e hex /rhs/brick*/tinker
/rhs/brick1/tinker/:
total 8816752
-rw-r--r--. 2 root root 9028352000 May 2 17:53 newdata

/rhs/brick2/tinker/:
total 8705788
-rw-r--r--. 2 root root 8914724864 May 2 17:52 newdata
8.5G /rhs/brick1/tinker/newdata
8.4G /rhs/brick2/tinker/newdata
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/tinker/newdata
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.tinker-client-2=0x000002720000000000000000
trusted.bit-rot.version=0x05000000000000005727467200057c36
trusted.gfid=0xd08eb8a17ae64994bc79c14830b5d7d8

# file: rhs/brick2/tinker/newdata
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.tinker-client-0=0x0000001c0000000000000000
trusted.bit-rot.version=0x030000000000000057273cbd0003f1da
trusted.gfid=0xd08eb8a17ae64994bc79c14830b5d7d8

/rhs/brick1/tinker/.glusterfs/indices/xattrop:
total 0
----------. 2 root root 0 May 2 17:52 d08eb8a1-7ae6-4994-bc79-c14830b5d7d8
----------. 2 root root 0 May 2 17:52 xattrop-b35a1d07-0707-48b4-ba32-41c5c3f94f87

/rhs/brick2/tinker/.glusterfs/indices/xattrop:
total 0
----------. 2 root root 0 May 2 17:10 d08eb8a1-7ae6-4994-bc79-c14830b5d7d8
----------. 2 root root 0 May 2 17:10 xattrop-298b589f-529e-40f3-bf08-898cf246941d

/rhs/brick1/tinker/.glusterfs/indices/dirty:
total 0
----------. 1 root root 0 May 2 17:52 dirty-b35a1d07-0707-48b4-ba32-41c5c3f94f87

/rhs/brick2/tinker/.glusterfs/indices/dirty:
total 0
----------. 1 root root 0 May 2 17:13 dirty-298b589f-529e-40f3-bf08-898cf246941d
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/tinker
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x5f00ff1a410f4a9b82a091d5cfae89c3

# file: rhs/brick2/tinker
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x5f00ff1a410f4a9b82a091d5cfae89c3
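
A note for reading the trusted.afr.* values in this report: as far as I understand the AFR changelog format, each value is 12 bytes holding three big-endian 32-bit counters, for pending data, metadata and entry operations in that order. The sketch below decodes a value on that assumption; the function name decode_afr_xattr is made up for illustration and is not part of any gluster tool.

```python
import struct


def decode_afr_xattr(hex_value):
    """Split a trusted.afr.* hex string into pending-operation counters.

    Assumes the value is 12 bytes: three big-endian unsigned 32-bit
    counters for data, metadata and entry operations, in that order.
    """
    text = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(text)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


if __name__ == "__main__":
    # Values copied from the getfattr output in this report.
    for name, value in [
        ("brick1 trusted.afr.tinker-client-2", "0x000002710000000000000000"),
        ("brick2 trusted.afr.tinker-client-0", "0x0000001c0000000000000000"),
    ]:
        print(name, decode_afr_xattr(value))
```

Read this way, brick1 records 0x271 = 625 pending data operations against client-2 while the duplicates were listed, and 0x272 = 626 once they were gone.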
sosreports @
[nchilaka@rhsqe-repo bug.1332194]$ pwd
/home/repo/sosreports/nchilaka/bug.1332194
Anuradha,
Could you check whether this is caused by the known issue where the same index exists in both the dirty and xattrop indices? I haven't looked at the logs.
Pranith
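
One way to check for that condition on a brick is to list the gfid-named entries under .glusterfs/indices/dirty and .glusterfs/indices/xattrop and intersect them. This is only a sketch: the helper names are hypothetical, it assumes gfid entries are plain lowercase UUID file names (as in the listings in the description), and it needs root because the index directories are mode drw-------.

```python
import os
import re

# gfid index entries are named with the bare UUID; base entries such as
# "dirty-<uuid>" or "xattrop-<uuid>" carry a prefix and do not match.
GFID_RE = re.compile(r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}")


def gfids_in_both(dirty_names, xattrop_names):
    """Return the sorted gfids present in both lists of directory entries."""
    dirty = {n for n in dirty_names if GFID_RE.fullmatch(n)}
    xattrop = {n for n in xattrop_names if GFID_RE.fullmatch(n)}
    return sorted(dirty & xattrop)


def list_dir(path):
    """List a directory, treating a missing directory as empty."""
    try:
        return os.listdir(path)
    except FileNotFoundError:
        return []


def gfids_in_both_indices(brick_path):
    """Check one brick's dirty and xattrop index directories for overlap."""
    base = os.path.join(brick_path, ".glusterfs", "indices")
    return gfids_in_both(
        list_dir(os.path.join(base, "dirty")),
        list_dir(os.path.join(base, "xattrop")),
    )


if __name__ == "__main__":
    for brick in ("/rhs/brick1/tinker", "/rhs/brick2/tinker"):
        print(brick, gfids_in_both_indices(brick))
```

Applied to the listings in the description, brick1 has d08eb8a1-7ae6-4994-bc79-c14830b5d7d8 in both directories while the duplicates were being reported, and only in xattrop once they stopped.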
Hi Vijay, can you check whether this issue is reproducible?
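
When re-running the steps, duplicates can be spotted mechanically rather than by eye. The sketch below parses saved `gluster v heal <volname> info` output and reports entries listed more than once under the same brick. It assumes the plain-text layout shown in the description ("Brick ..." header, one entry per line, then "Number of entries:" or "Status:"); the function name is hypothetical.

```python
from collections import Counter


def duplicate_heal_entries(heal_info_text):
    """Map each brick to the entries that appear more than once under it."""
    duplicates = {}
    brick = None
    counts = Counter()

    def flush():
        # Record duplicates for the brick section that just ended.
        if brick is not None:
            repeated = {entry: n for entry, n in counts.items() if n > 1}
            if repeated:
                duplicates[brick] = repeated

    for line in heal_info_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("Brick "):
            flush()
            brick = line[len("Brick "):]
            counts = Counter()
        elif line.startswith(("Number of entries:", "Status:")):
            continue
        elif brick is not None:
            counts[line] += 1
    flush()
    return duplicates


if __name__ == "__main__":
    import sys

    # Usage: gluster v heal tinker info | python this_script.py
    print(duplicate_heal_entries(sys.stdin.read()))
```

An empty result means no brick listed the same entry twice in that run.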