The gfid of an entry differs across subvolumes.

Setup: 2x2 distributed-replicate volume with 2 FUSE clients and 1 NFS client. One FUSE client was untarring a Linux kernel tarball and running find <mount point> | xargs stat; the other FUSE client was running fileop. The NFS client was running the fs-perf test. Volume set operations were going on at the time.

I brought down one brick, slept, and brought it back up. Some time after the brick was brought up, I ran ls on one of the FUSE clients:

ls
ls: cannot access fileop_L1_2: Input/output error
ls: cannot access fileop_L1_5: Input/output error
ls: cannot access fileop_L1_6: Input/output error
ls: cannot access fileop_L1_7: Input/output error
ls: cannot access fileop_L1_9: Input/output error
ls: cannot access fileop_L1_11: Input/output error
a.out fileop_L1_0 fileop_L1_11 fileop_L1_14 fileop_L1_2 fileop_L1_5 fileop_L1_8 in okpa
core.25797 fileop_L1_1 fileop_L1_12 fileop_L1_15 fileop_L1_3 fileop_L1_6 fileop_L1_9 kernel_compile.sh out
dir fileop_L1_10 fileop_L1_13 fileop_L1_17 fileop_L1_4 fileop_L1_7 glusterfs.git linux-2.6.31.1 rdd.c

Log messages:

[2011-10-28 05:05:13.466838] I [afr-common.c:982:afr_launch_self_heal] 0-mirror-replicate-1: background meta-data entry missing-entry self-heal triggered. path: /dir/fileop_L1_13
[2011-10-28 05:05:13.468197] I [afr-self-heal-common.c:1826:afr_sh_post_nb_entrylk_conflicting_sh_cbk] 0-mirror-replicate-1: Non blocking entrylks failed.
[2011-10-28 05:05:13.469708] W [afr-common.c:1065:afr_conflicting_iattrs] 0-mirror-replicate-1: /dir/fileop_L1_13: gfid differs on subvolume 1 (7b4931cc-39c7-4bc6-a259-7983fa802ca2, 101162b2-485f-4e2d-9af6-d04fffb4acb6)
[2011-10-28 05:05:13.469732] E [afr-self-heal-common.c:1310:afr_sh_common_lookup_cbk] 0-mirror-replicate-1: Conflicting entries for /dir/fileop_L1_13
[2011-10-28 05:05:13.474530] W [afr-common.c:1065:afr_conflicting_iattrs] 0-mirror-replicate-1: /dir/fileop_L1_13: gfid differs on subvolume 1 (7b4931cc-39c7-4bc6-a259-7983fa802ca2, 101162b2-485f-4e2d-9af6-d04fffb4acb6)
[2011-10-28 05:05:13.474561] E [afr-self-heal-common.c:1310:afr_sh_common_lookup_cbk] 0-mirror-replicate-1: Conflicting entries for /dir/fileop_L1_13
[2011-10-28 05:05:13.475275] E [afr-self-heal-common.c:2041:afr_self_heal_completion_cbk] 0-mirror-replicate-1: background meta-data entry missing-entry self-heal failed on /dir/fileop_L1_13
[2011-10-28 05:05:13.475299] I [dht-layout.c:581:dht_layout_normalize] 0-mirror-dht: found anomalies in /dir/fileop_L1_13. holes=1 overlaps=0
[2011-10-28 05:05:13.475309] I [dht-selfheal.c:576:dht_selfheal_directory] 0-mirror-dht: 1 subvolumes have unrecoverable errors
[2011-10-28 05:05:13.476253] I [client3_1-fops.c:2228:client3_1_lookup_cbk] 0-mirror-client-2: remote operation failed: Stale NFS file handle
[2011-10-28 05:05:13.483881] W [afr-common.c:1065:afr_conflicting_iattrs] 0-mirror-replicate-1: /dir/fileop_L1_14: gfid differs on subvolume 1 (c9a75a75-fd3f-478b-a5b7-a220a8fa1081, a173bd67-dbfc-489e-a5bd-0add55a0dbe1)
[2011-10-28 05:05:13.483929] W [afr-common.c:1065:afr_conflicting_iattrs] 0-mirror-replicate-1: /dir/fileop_L1_14: gfid differs on subvolume 1 (c9a75a75-fd3f-478b-a5b7-a220a8fa1081, a173bd67-dbfc-489e-a5bd-0add55a0dbe1)
[2011-10-28 05:05:13.483945] W [afr-common.c:826:afr_detect_self_heal_by_iatt] 0-mirror-replicate-1: /dir/fileop_L1_14: gfid different on subvolume
[2011-10-28 05:05:13.483962] I [afr-common.c:982:afr_launch_self_heal] 0-mirror-replicate-1: background meta-data entry missing-entry self-heal triggered. path: /dir/fileop_L1_14
[2011-10-28 05:05:13.484175] I [afr-self-heal-common.c:1826:afr_sh_post_nb_entrylk_conflicting_sh_cbk] 0-mirror-replicate-1: Non blocking entrylks failed.
[2011-10-28 05:05:13.485294] W [afr-common.c:1065:afr_conflicting_iattrs] 0-mirror-replicate-1: /dir/fileop_L1_14: gfid differs on subvolume 1 (c9a75a75-fd3f-478b-a5b7-a220a8fa1081, a173bd67-dbfc-489e-a5bd-0add55a0dbe1)
[2011-10-28 05:05:13.485317] E [afr-self-heal-common.c:1310:afr_sh_common_lookup_cbk] 0-mirror-replicate-1: Conflicting entries for /dir/fileop_L1_14
[2011-10-28 05:05:13.487465] W [afr-common.c:1065:afr_conflicting_iattrs] 0-mirror-replicate-1: /dir/fileop_L1_14: gfid differs on subvolume 1 (c9a75a75-fd3f-478b-a5b7-a220a8fa1081, a173bd67-dbfc-489e-a5bd-0add55a0dbe1)
[2011-10-28 05:05:13.487489] E [afr-self-heal-common.c:1310:afr_sh_common_lookup_cbk] 0-mirror-replicate-1: Conflicting entries for /dir/fileop_L1_14
[2011-10-28 05:05:13.488109] E [afr-self-heal-common.c:2041:afr_self_heal_completion_cbk] 0-mirror-replicate-1: background meta-data entry missing-entry self-heal failed on /dir/fileop_L1_14

getfattr -d -m . -e hex /export/mirror/dir/fileop_L1_13
getfattr: Removing leading '/' from absolute path names
# file: export/mirror/dir/fileop_L1_13
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x101162b2485f4e2d9af6d04fffb4acb6
trusted.glusterfs.dht=0x00000001000000007fffffffffffffff
trusted.glusterfs.quota.a395ef87-af01-4871-ae1a-1577fdc43b71.contri=0x0000000002320000
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.size=0x0000000002320000

getfattr -d -m . -e hex /export/mirror/dir/fileop_L1_13
getfattr: Removing leading '/' from absolute path names
# file: export/mirror/dir/fileop_L1_13
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x101162b2485f4e2d9af6d04fffb4acb6
trusted.glusterfs.dht=0x00000001000000007fffffffffffffff
trusted.glusterfs.quota.a395ef87-af01-4871-ae1a-1577fdc43b71.contri=0x0000000002320000
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.size=0x0000000002320000

getfattr -d -m . -e hex /export/mirror/dir/fileop_L1_13
getfattr: Removing leading '/' from absolute path names
# file: export/mirror/dir/fileop_L1_13
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x7b4931cc39c74bc6a2597983fa802ca2
trusted.glusterfs.dht=0x0000000100000000000000007ffffffe
trusted.glusterfs.quota.a395ef87-af01-4871-ae1a-1577fdc43b71.contri=0x0000000000000000
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.size=0x0000000000000000

getfattr -d -m . -e hex /export/mirror/dir/fileop_L1_13
getfattr: Removing leading '/' from absolute path names
# file: export/mirror/dir/fileop_L1_13
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.mirror-client-2=0x000000000000000100000021
trusted.afr.mirror-client-3=0x000000000000000000000000
trusted.gfid=0x101162b2485f4e2d9af6d04fffb4acb6
trusted.glusterfs.dht=0x0000000100000000000000007ffffffe
trusted.glusterfs.quota.a395ef87-af01-4871-ae1a-1577fdc43b71.contri=0x00000000020f0000
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.size=0x00000000020f000
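
The hex trusted.gfid values in the getfattr output above are the same identifiers that the afr_conflicting_iattrs warnings print in UUID form. As a rough sketch only (the helper names parse_getfattr, gfid_from_hex and gfids_agree are hypothetical and not part of GlusterFS), the comparison could be scripted like this, assuming the getfattr text for each copy of the entry has already been collected:

```python
import uuid


def parse_getfattr(text):
    """Parse `getfattr -d -e hex` output into a dict of xattr name -> hex value."""
    attrs = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, the "# file:" header, and lines without name=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        attrs[name] = value
    return attrs


def gfid_from_hex(value):
    """Turn a 0x-prefixed 16-byte hex string into canonical UUID text."""
    return str(uuid.UUID(hex=value[2:]))


def gfids_agree(outputs):
    """Return (True, gfids) if every getfattr output carries the same gfid."""
    gfids = {gfid_from_hex(parse_getfattr(o)["trusted.gfid"]) for o in outputs}
    return len(gfids) == 1, gfids


if __name__ == "__main__":
    # Values copied from the two conflicting getfattr outputs in this report.
    copy_a = "trusted.gfid=0x7b4931cc39c74bc6a2597983fa802ca2\n"
    copy_b = "trusted.gfid=0x101162b2485f4e2d9af6d04fffb4acb6\n"
    ok, gfids = gfids_agree([copy_a, copy_b])
    print("consistent" if ok else "gfid mismatch", sorted(gfids))
```

For the two conflicting copies of /dir/fileop_L1_13 this reports a mismatch between the same two UUIDs that appear in the log warnings.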
[2011-10-28 05:05:13.483962] I [afr-common.c:982:afr_launch_self_heal] 0-mirror-replicate-1: background meta-data entry missing-entry self-heal triggered. path: /dir/fileop_L1_14
[2011-10-28 05:05:13.484175] I [afr-self-heal-common.c:1826:afr_sh_post_nb_entrylk_conflicting_sh_cbk] 0-mirror-replicate-1: Non blocking entrylks failed.

Johnny,
The logs are not complete. It seems another self-heal is in progress; until it completes, I think the files will keep reporting gfid mismatches. I think we can improve this situation to avoid the error, but I need the full logs to confirm my assumptions.
Pranith
CHANGE: http://review.gluster.com/672 (Change-Id: I8a43b5fbe7a90344f490090df853d47b651bc0ff) merged in release-3.2 by Vijay Bellur (vijay)
CHANGE: http://review.gluster.com/673 (*) removed uuid_generate usage in pump and afr) merged in release-3.2 by Vijay Bellur (vijay)
CHANGE: http://review.gluster.com/678 (Change-Id: I7a8bd3b3f9600ced4a945f07447698876933ade0) merged in master by Vijay Bellur (vijay)
CHANGE: http://review.gluster.com/679 (*) removed uuid_generate usage in pump and afr, self-heald) merged in master by Vijay Bellur (vijay)
CHANGE: http://review.gluster.com/680 (Change-Id: I2319258743e478cc3a932d8ff0b2204a97cd4f8e) merged in master by Vijay Bellur (vijay)