Created attachment 734304 [details]
client side logs: rhs-1_mnt-bz922792_dht.log.gz rhs-2_mnt-bz922792_dht.log.gz

Affected version: glusterfs master/HEAD, built with this as the last commit:

commit ce111f472796d027796b0cc3a4a6f78689f1172d
Author: Anand Avati <avati>
Date:   Fri Apr 5 02:18:06 2013 -0700

The following script is used to reproduce the problem:

#!/bin/bash
echo "starting.."
while :; do
    mkdir -p foo/bar/goo
    mkdir -p foo/bar/gee
    mkdir -p foo/gue/gar
    rm -rf foo
done

The affected volume is a 6-brick distribute:

# gluster volume info bz922792_dht

Volume Name: bz922792_dht
Type: Distribute
Volume ID: 99301415-d889-4d25-8b55-bce17bfdfbce
Status: Started
Number of Bricks: 6
Transport-type: tcp
Bricks:
Brick1: rhs-1:/bricks/bz922792_dht_1
Brick2: rhs-2:/bricks/bz922792_dht_1
Brick3: rhs-1:/bricks/bz922792_dht_2
Brick4: rhs-2:/bricks/bz922792_dht_2
Brick5: rhs-1:/bricks/bz922792_dht_3
Brick6: rhs-2:/bricks/bz922792_dht_3

After running the reproducer script on two glusterfs clients (one on each server), a gfid mismatch occurs relatively soon (mostly within a minute):

rhs-1# getfattr -d -e hex -m trusted.gfid /bricks/bz922792_dht_?/foo 2> /dev/null
# file: bricks/bz922792_dht_1/foo
trusted.gfid=0x05dda1efa857498ebb989eae513ad811

# file: bricks/bz922792_dht_2/foo
trusted.gfid=0x05dda1efa857498ebb989eae513ad811

# file: bricks/bz922792_dht_3/foo
trusted.gfid=0xcd99da3a04d549deb22fd44aef5fa340

rhs-2# getfattr -d -e hex -m trusted.gfid /bricks/bz922792_dht_?/foo 2> /dev/null
# file: bricks/bz922792_dht_1/foo
trusted.gfid=0x05dda1efa857498ebb989eae513ad811

# file: bricks/bz922792_dht_2/foo
trusted.gfid=0x05dda1efa857498ebb989eae513ad811

# file: bricks/bz922792_dht_3/foo
trusted.gfid=0xcd99da3a04d549deb22fd44aef5fa340

0-bz922792_dht-client-0 to 0-bz922792_dht-client-3 have gfid:05dda1ef-a857-498e-bb98-9eae513ad811
0-bz922792_dht-client-4 = rhs-1:/bricks/bz922792_dht_3
0-bz922792_dht-client-5 = rhs-2:/bricks/bz922792_dht_3
  -> gfid:cd99da3a-04d5-49de-b22f-d44aef5fa340

From the client log of rhs-1, I think that this is the start of the problem:

[2013-04-11 14:23:54.328021] W [client-rpc-fops.c:322:client3_3_mkdir_cbk] 0-bz922792_dht-client-4: remote operation failed: File exists. Path: /foo
[2013-04-11 14:23:54.328789] W [client-rpc-fops.c:322:client3_3_mkdir_cbk] 0-bz922792_dht-client-5: remote operation failed: File exists. Path: /foo
...
[2013-04-11 14:25:07.032185] W [client-rpc-fops.c:2604:client3_3_lookup_cbk] 0-bz922792_dht-client-5: remote operation failed: Stale NFS file handle. Path: /foo (05dda1ef-a857-498e-bb98-9eae513ad811)
[2013-04-11 14:25:07.032220] W [client-rpc-fops.c:2604:client3_3_lookup_cbk] 0-bz922792_dht-client-4: remote operation failed: Stale NFS file handle. Path: /foo (05dda1ef-a857-498e-bb98-9eae513ad811)
[2013-04-11 14:25:07.033762] W [dht-common.c:419:dht_lookup_dir_cbk] 0-bz922792_dht-dht: /foo: gfid different on bz922792_dht-client-1
[2013-04-11 14:25:07.033798] W [dht-common.c:419:dht_lookup_dir_cbk] 0-bz922792_dht-dht: /foo: gfid different on bz922792_dht-client-0
[2013-04-11 14:25:07.033823] W [dht-common.c:419:dht_lookup_dir_cbk] 0-bz922792_dht-dht: /foo: gfid different on bz922792_dht-client-3
[2013-04-11 14:25:07.033855] W [dht-common.c:419:dht_lookup_dir_cbk] 0-bz922792_dht-dht: /foo: gfid different on bz922792_dht-client-2
[2013-04-11 14:25:07.035677] W [dht-common.c:419:dht_lookup_dir_cbk] 0-bz922792_dht-dht: /foo: gfid different on bz922792_dht-client-2
[2013-04-11 14:25:07.035721] W [dht-common.c:419:dht_lookup_dir_cbk] 0-bz922792_dht-dht: /foo: gfid different on bz922792_dht-client-1
[2013-04-11 14:25:07.035756] W [dht-common.c:419:dht_lookup_dir_cbk] 0-bz922792_dht-dht: /foo: gfid different on bz922792_dht-client-0
[2013-04-11 14:25:07.035779] W [dht-common.c:419:dht_lookup_dir_cbk] 0-bz922792_dht-dht: /foo: gfid different on bz922792_dht-client-3
...
[2013-04-11 14:25:07.053041] W [client-rpc-fops.c:2523:client3_3_opendir_cbk] 0-bz922792_dht-client-1: remote operation failed: No such file or directory. Path: /foo (cd99da3a-04d5-49de-b22f-d44aef5fa340)
[2013-04-11 14:25:07.053073] W [client-rpc-fops.c:2523:client3_3_opendir_cbk] 0-bz922792_dht-client-0: remote operation failed: No such file or directory. Path: /foo (cd99da3a-04d5-49de-b22f-d44aef5fa340)
[2013-04-11 14:25:07.053102] W [client-rpc-fops.c:2523:client3_3_opendir_cbk] 0-bz922792_dht-client-3: remote operation failed: No such file or directory. Path: /foo (cd99da3a-04d5-49de-b22f-d44aef5fa340)
[2013-04-11 14:25:07.053124] W [client-rpc-fops.c:2523:client3_3_opendir_cbk] 0-bz922792_dht-client-2: remote operation failed: No such file or directory. Path: /foo (cd99da3a-04d5-49de-b22f-d44aef5fa340)

From rhs-2, the first messages concerning the same gfids:

[2013-04-11 14:23:54.357739] W [client-rpc-fops.c:2523:client3_3_opendir_cbk] 0-bz922792_dht-client-0: remote operation failed: No such file or directory. Path: /foo (cd99da3a-04d5-49de-b22f-d44aef5fa340)
[2013-04-11 14:23:54.357804] W [client-rpc-fops.c:2523:client3_3_opendir_cbk] 0-bz922792_dht-client-1: remote operation failed: No such file or directory. Path: /foo (cd99da3a-04d5-49de-b22f-d44aef5fa340)
[2013-04-11 14:23:54.357832] W [client-rpc-fops.c:2523:client3_3_opendir_cbk] 0-bz922792_dht-client-3: remote operation failed: No such file or directory. Path: /foo (cd99da3a-04d5-49de-b22f-d44aef5fa340)
[2013-04-11 14:23:54.357868] W [client-rpc-fops.c:2523:client3_3_opendir_cbk] 0-bz922792_dht-client-2: remote operation failed: No such file or directory. Path: /foo (cd99da3a-04d5-49de-b22f-d44aef5fa340)
...
[2013-04-11 14:25:57.053218] W [client-rpc-fops.c:2523:client3_3_opendir_cbk] 0-bz922792_dht-client-5: remote operation failed: No such file or directory. Path: /foo (05dda1ef-a857-498e-bb98-9eae513ad811)
[2013-04-11 14:25:57.053254] W [client-rpc-fops.c:2523:client3_3_opendir_cbk] 0-bz922792_dht-client-4: remote operation failed: No such file or directory. Path: /foo (05dda1ef-a857-498e-bb98-9eae513ad811)

The first mkdir operation seems to have succeeded for 0-bz922792_dht-client-0 to 0-bz922792_dht-client-3, but failed on the two bricks which have a different gfid.
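The interleaving suggested by those logs can be sketched with plain directories standing in for bricks. This is a toy model, not GlusterFS code: the point is only that when a racing mkdir gets EEXIST on a brick, the gfid chosen by whichever client created the directory first on that brick is the one that sticks.

```shell
#!/bin/bash
# Toy model of the race: fake "bricks" are directories, and a file next to
# each created dir records its "gfid". mkdir_on_brick mirrors the observed
# behaviour that a racing mkdir gets EEXIST and the existing gfid is kept.
# All names here are invented for illustration.
set -u
WORK=$(mktemp -d)
mkdir -p "$WORK/b1" "$WORK/b2" "$WORK/b3"

mkdir_on_brick() {      # $1 = brick, $2 = gfid proposed by the client
    if mkdir "$1/foo" 2>/dev/null; then
        echo "$2" > "$1/foo.gfid"   # first creator wins
    fi                               # else: EEXIST, the old gfid is kept
}

mkdir_on_brick "$WORK/b1" gfid-A   # client A reaches bricks 1 and 2 first
mkdir_on_brick "$WORK/b2" gfid-A
mkdir_on_brick "$WORK/b3" gfid-B   # ...but client B wins on brick 3
mkdir_on_brick "$WORK/b3" gfid-A   # client A: "File exists", gfid-B stays

result=$(echo $(cat "$WORK"/b1/foo.gfid "$WORK"/b2/foo.gfid "$WORK"/b3/foo.gfid))
echo "$result"   # gfid-A gfid-A gfid-B -- the same 4-vs-2 split as above
rm -rf "$WORK"
```

Neither client's mkdir fails as a whole, so neither ever learns that the bricks now disagree.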
REVIEW: http://review.gluster.org/4846 (cluster/dht: xattr on to prevent races in rmdir lookup_heal) posted (#3) for review on master by Shishir Gowda (sgowda)
REVIEW: http://review.gluster.org/4846 (cluster/dht: xattr on to prevent races in rmdir lookup_heal) posted (#4) for review on master by Shishir Gowda (sgowda)
REVIEW: http://review.gluster.org/4846 (cluster/dht: inodelk on hashed to prevent races in rmdir heal) posted (#5) for review on master by Shishir Gowda (sgowda)
REVIEW: http://review.gluster.org/4889 (locks: Added an xdata-based 'cmd' for inodelk count in a given domain) posted (#1) for review on master by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/4889 (locks: Added an xdata-based 'cmd' for inodelk count in a given domain) posted (#2) for review on master by Krishnan Parthasarathi (kparthas)
REVIEW: http://review.gluster.org/4846 (cluster/dht: inodelk on hashed to prevent races in rmdir heal) posted (#6) for review on master by Shishir Gowda (sgowda)
REVIEW: http://review.gluster.org/4889 (locks: Added an xdata-based 'cmd' for inodelk count in a given domain) posted (#4) for review on master by Shishir Gowda (sgowda)
These two patches don't fix this issue for me when I apply them on top of master (last commit 328ea4b). In my first attempt to verify these patches, after stopping the reproducer scripts the output looks like this:

[root@rhs-1 ~]# ls -li /mnt/bz922792_dht/foo/
total 0
12580817571139378177 d--------- 3 root root 80 Jun 13 08:57 bar
12580817571139378177 d--------- 3 root root 80 Jun 13 08:57 bar
10650833170816791630 d--------- 2 root root 76 Jun 13 08:57 gue
10650833170816791630 d--------- 2 root root 76 Jun 13 08:57 gue

[root@rhs-1 ~]# ls -li /mnt/bz922792_dht/foo/bar
total 0
9888433851475164314 drwxr-xr-x 2 root root 30 Jun 13 08:57 goo
9888433851475164314 drwxr-xr-x 2 root root 30 Jun 13 08:57 goo
GFIDs are inconsistent, which likely explains the double listing in 'ls'.

On rhs-1:

# file: bricks/bz922792_dht_1/foo
trusted.gfid=0xea63465236a440d095d0c7047482af7f
# file: bricks/bz922792_dht_2/foo
trusted.gfid=0xea63465236a440d095d0c7047482af7f
# file: bricks/bz922792_dht_3/foo
trusted.gfid=0xea63465236a440d095d0c7047482af7f
  -> /foo: OK on both servers

# file: bricks/bz922792_dht_1/foo/bar
trusted.gfid=0x4051087efb3f45dfae980aecc7c15c01
# file: bricks/bz922792_dht_2/foo/bar
trusted.gfid=0x4051087efb3f45dfae980aecc7c15c01
# file: bricks/bz922792_dht_3/foo/bar
trusted.gfid=0x7347f22b7d3d49d28eea7f635f94c7ed   <-- differs, unique
  -> /foo/bar: 1/6 wrong

# file: bricks/bz922792_dht_1/foo/gue
trusted.gfid=0x2b182bc2d831432193cf5c569c7b744e   <-- matches rhs-2 dht_3
# file: bricks/bz922792_dht_2/foo/gue
trusted.gfid=0x30fe1e3ca0e2416a9143b8fe66b2f032
# file: bricks/bz922792_dht_3/foo/gue
trusted.gfid=0x30fe1e3ca0e2416a9143b8fe66b2f032
  -> /foo/gue: 2/6 wrong

On rhs-2:

# file: bricks/bz922792_dht_1/foo
trusted.gfid=0xea63465236a440d095d0c7047482af7f
# file: bricks/bz922792_dht_2/foo
trusted.gfid=0xea63465236a440d095d0c7047482af7f
# file: bricks/bz922792_dht_3/foo
trusted.gfid=0xea63465236a440d095d0c7047482af7f
  -> /foo: OK on both servers

# file: bricks/bz922792_dht_1/foo/bar
trusted.gfid=0x4051087efb3f45dfae980aecc7c15c01
# file: bricks/bz922792_dht_2/foo/bar
trusted.gfid=0x4051087efb3f45dfae980aecc7c15c01
# file: bricks/bz922792_dht_3/foo/bar
trusted.gfid=0x4051087efb3f45dfae980aecc7c15c01
  -> /foo/bar: 1/6 wrong (on rhs-1)

# file: bricks/bz922792_dht_1/foo/gue
trusted.gfid=0x30fe1e3ca0e2416a9143b8fe66b2f032
# file: bricks/bz922792_dht_2/foo/gue
trusted.gfid=0x30fe1e3ca0e2416a9143b8fe66b2f032
# file: bricks/bz922792_dht_3/foo/gue
trusted.gfid=0x2b182bc2d831432193cf5c569c7b744e   <-- matches rhs-1 dht_1
  -> /foo/gue: 2/6 wrong
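Checking these listings by eye is error-prone; the getfattr output can be reduced mechanically to a count of distinct gfid values. distinct_gfids below is a made-up helper (not a gluster tool), and the sample input reuses the /foo/bar values from this comment:

```shell
# Count distinct trusted.gfid values in `getfattr -d -e hex -m trusted.gfid`
# output read from stdin; anything above 1 means the bricks disagree.
distinct_gfids() {
    awk -F= '/^trusted\.gfid=/ { print $2 }' | sort -u | wc -l | tr -d ' '
}

sample='# file: bricks/bz922792_dht_1/foo/bar
trusted.gfid=0x4051087efb3f45dfae980aecc7c15c01
# file: bricks/bz922792_dht_2/foo/bar
trusted.gfid=0x4051087efb3f45dfae980aecc7c15c01
# file: bricks/bz922792_dht_3/foo/bar
trusted.gfid=0x7347f22b7d3d49d28eea7f635f94c7ed'

n=$(echo "$sample" | distinct_gfids)
echo "$n"   # 2 -> mismatch
```

In practice one would pipe `getfattr -d -e hex -m trusted.gfid /bricks/bz922792_dht_?/foo/bar 2>/dev/null` straight into the helper on each server.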
Ai, going through the logs, I notice that not all glusterfsd processes were running (no idea how that happened). Re-running the tests now, will leave a new update later.
I have not seen the duplicate entries in 'ls' anymore, but the reproducer hangs after a while nevertheless. The gfid mismatches on the directories look a little different:

On rhs-1:

# file: bricks/bz922792_dht_1/foo
trusted.gfid=0x9703ccec339a45708da0aa7a098b23ba
# file: bricks/bz922792_dht_2/foo
trusted.gfid=0x9703ccec339a45708da0aa7a098b23ba
# file: bricks/bz922792_dht_3/foo
trusted.gfid=0x9703ccec339a45708da0aa7a098b23ba
  -> /foo: OK on both servers

# file: bricks/bz922792_dht_1/foo/bar
trusted.gfid=0xee8578f0a69b43ec82889187186a30a3   <-- matches rhs-2:dht_3
# file: bricks/bz922792_dht_2/foo/bar
trusted.gfid=0x852d1dd258c84bccaa7c8575e9c99dda
# file: bricks/bz922792_dht_3/foo/bar
trusted.gfid=0x852d1dd258c84bccaa7c8575e9c99dda
  -> /foo/bar: 2/6 wrong

# file: bricks/bz922792_dht_1/foo/gue
trusted.gfid=0xd7d84b28dd524f10b76386b6f44be101
# file: bricks/bz922792_dht_2/foo/gue
trusted.gfid=0xd7d84b28dd524f10b76386b6f44be101
# file: bricks/bz922792_dht_3/foo/gue
trusted.gfid=0x2516d664966748bc956a54f3a356ad3b   <-- matches rhs-2:dht_2+3
  -> /foo/gue: 3/6 wrong

On rhs-2:

# file: bricks/bz922792_dht_1/foo
trusted.gfid=0x9703ccec339a45708da0aa7a098b23ba
# file: bricks/bz922792_dht_2/foo
trusted.gfid=0x9703ccec339a45708da0aa7a098b23ba
# file: bricks/bz922792_dht_3/foo
trusted.gfid=0x9703ccec339a45708da0aa7a098b23ba
  -> /foo: OK on both servers

# file: bricks/bz922792_dht_1/foo/bar
trusted.gfid=0x852d1dd258c84bccaa7c8575e9c99dda
# file: bricks/bz922792_dht_2/foo/bar
trusted.gfid=0x852d1dd258c84bccaa7c8575e9c99dda
# file: bricks/bz922792_dht_3/foo/bar
trusted.gfid=0xee8578f0a69b43ec82889187186a30a3   <-- matches rhs-1:dht_1
  -> /foo/bar: 2/6 wrong

# file: bricks/bz922792_dht_1/foo/gue
trusted.gfid=0xd7d84b28dd524f10b76386b6f44be101   <-- matches rhs-1:dht_1+2
# file: bricks/bz922792_dht_2/foo/gue
trusted.gfid=0x2516d664966748bc956a54f3a356ad3b
# file: bricks/bz922792_dht_3/foo/gue
trusted.gfid=0x2516d664966748bc956a54f3a356ad3b
  -> /foo/gue: 3/6 wrong

The logs (mountpoint and the bricks from both servers) from the last test-run that resulted in these gfid mismatches are available from http://people.redhat.com/ndevos/bz951195/bz951195_comment12.tar.bz2 (54MB). I have not been able to make a useful diagnosis from these logs yet. Some guidance and suggestions are much appreciated!
Looks like a race between mkdir and lookup setting gfids in the posix xlator. We might have to revert this fix:

commit 97807e75956a2d240282bc64fab1b71762de0546
Author: Pranith K <pranithk>
Date:   Thu Jul 14 06:31:47 2011 +0000

    storage/posix: Remove the interim fix that handles the gfid race

    Signed-off-by: Pranith Kumar K <pranithk>
    Signed-off-by: Anand Avati <avati>
    BUG: 2745 (failure to detect split brain)
    URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2745

Error logs from rhs-1, brick 3:

[2013-06-13 10:51:11.776493] W [posix-helpers.c:485:posix_gfid_set] 0-bz922792_dht-posix: setting GFID on /bricks/bz922792_dht_3/foo/gue/gar failed (File exists)
[2013-06-13 10:51:11.776515] E [posix.c:960:posix_mkdir] 0-bz922792_dht-posix: setting gfid on /bricks/bz922792_dht_3/foo/gue/gar failed
[2013-06-13 11:31:34.485813] W [posix-handle.c:624:posix_handle_soft] 0-bz922792_dht-posix: symlink ../../ee/85/ee8578f0-a69b-43ec-8288-9187186a30a3/goo -> /bricks/bz922792_dht_3/.glusterfs/7b/d2/7bd23cd6-b82b-498f-85f0-c08744b91295 failed (File exists)
[2013-06-13 11:31:34.485838] E [posix.c:960:posix_mkdir] 0-bz922792_dht-posix: setting gfid on /bricks/bz922792_dht_3/foo/bar/goo failed
REVIEW: http://review.gluster.org/4846 (cluster/dht: inodelk on hashed to prevent races in rmdir heal) posted (#7) for review on master by Shishir Gowda (sgowda)
REVIEW: http://review.gluster.org/4889 (locks: Added an xdata-based 'cmd' for inodelk count in a given domain) posted (#5) for review on master by Shishir Gowda (sgowda)
REVIEW: http://review.gluster.org/5240 (Revert "storage/posix: Remove the interim fix that handles the gfid race") posted (#1) for review on master by Shishir Gowda (sgowda)
REVIEW: http://review.gluster.org/4846 (cluster/dht: inodelk on hashed to prevent races in rmdir heal) posted (#8) for review on master by Shishir Gowda (sgowda)
REVIEW: http://review.gluster.org/5240 (Revert "storage/posix: Remove the interim fix that handles the gfid race") posted (#2) for review on master by Shishir Gowda (sgowda)
REVIEW: http://review.gluster.org/4889 (locks: Added an xdata-based 'cmd' for inodelk count in a given domain) posted (#6) for review on master by Shishir Gowda (sgowda)
COMMIT: http://review.gluster.org/4889 committed in master by Vijay Bellur (vbellur)
------
commit 15e11cfa1dec9cafd5a9039da7a43e9c02b19d98
Author: shishir gowda <sgowda>
Date:   Wed Jun 5 15:56:27 2013 +0530

    locks: Added an xdata-based 'cmd' for inodelk count in a given domain

    Following is the semantics of the 'cmd':
    1) If @domain is NULL     - returns no. of locks blocked/granted in all domains
    2) If @domain is non-NULL - returns no. of locks blocked/granted in that domain
    3) If @domain is non-existent - returns '0'; this is important since the
       locks xlator creates a domain in a lazy manner,
    where @domain is a string representing the domain.

    Change-Id: I5e609772343acc157ca650300618c1161efbe72d
    BUG: 951195
    Original-author: Krishnan Parthasarathi <kparthas>
    Signed-off-by: Krishnan Parthasarathi <kparthas>
    Signed-off-by: shishir gowda <sgowda>
    Reviewed-on: http://review.gluster.org/4889
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Amar Tumballi <amarts>
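The three cases from the commit message can be modelled with a bash associative array standing in for the locks xlator's per-domain lock tables. inodelk_count and the domain names below are invented for this sketch, not a real API:

```shell
# Rough model of the 'cmd' semantics: a map from lock domain to the number
# of granted locks in that domain. Domains are created lazily, so a domain
# that was never used simply has no entry.
declare -A GRANTED=( [dht.layout]=2 [afr.heal]=1 )

inodelk_count() {
    local domain=${1-}
    if [ -z "$domain" ]; then            # case 1: NULL domain -> all domains
        local total=0 d
        for d in "${!GRANTED[@]}"; do total=$(( total + ${GRANTED[$d]} )); done
        echo "$total"
    else                                 # cases 2 and 3: a named domain;
        echo "${GRANTED[$domain]:-0}"    # absent (never created) counts as 0
    fi
}

inodelk_count                  # 3: sum over every domain
inodelk_count dht.layout       # 2: just that domain
inodelk_count no.such.domain   # 0: the domain was never created
```

Case 3 is the subtle one: asking about a non-existent domain is not an error, precisely because domains come into being lazily.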
COMMIT: http://review.gluster.org/5240 committed in master by Vijay Bellur (vbellur)
------
commit acf8cfdf698aa3ebe42ed55bba8be4f85b751c29
Author: shishir gowda <sgowda>
Date:   Thu Jun 20 14:06:04 2013 +0530

    Revert "storage/posix: Remove the interim fix that handles the gfid race"

    This reverts commit 97807e75956a2d240282bc64fab1b71762de0546.

    In a distribute or distribute-replica volume, this fix is required to
    prevent gfid mis-match due to race issues.

    The test script bug-767585-gfid.t needs a sleep of 2, because after
    setting the backend gfid directly we try to heal, and with this fix we
    do not allow a setxattr of the gfid within 1 second of creation if the
    directory was not created by the same client.

    Change-Id: Ie3f4b385416889fd5de444638a64a7eaaf24cd60
    BUG: 951195
    Signed-off-by: shishir gowda <sgowda>
    Reviewed-on: http://review.gluster.org/5240
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Amar Tumballi <amarts>
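As a rough picture of the restored heuristic (per the commit message, not the actual posix xlator code): a heal-path setxattr of trusted.gfid is refused while the directory is less than a second old, unless the caller created it itself. may_set_gfid and the self/other flag are invented here, and the directory mtime stands in for the inode's creation time:

```shell
may_set_gfid() {   # $1 = path, $2 = "self" if this client created the dir
    [ "$2" = self ] && return 0           # the creator may always set it
    local age=$(( $(date +%s) - $(stat -c %Y "$1") ))
    [ "$age" -ge 1 ]                      # freshly created by someone else:
}                                         # probably a racing mkdir, refuse

d=$(mktemp -d)
may_set_gfid "$d" self && creator=allowed
touch -m -d "@$(( $(date +%s) - 10 ))" "$d"   # pretend the dir is 10s old
may_set_gfid "$d" other && old=allowed
touch -m -d "@$(( $(date +%s) + 60 ))" "$d"   # "fresher than 1s" (future
may_set_gfid "$d" other || fresh=refused      # stamp keeps the demo stable)
echo "$creator $old $fresh"   # allowed allowed refused
rm -rf "$d"
```

This also explains the sleep of 2 the commit message mentions: a test that plants a gfid on the backend and immediately triggers a heal would otherwise be rejected by this very check.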
REVIEW: http://review.gluster.org/5908 (cluster/dht: inodelk on hashed to prevent races in rmdir deal) posted (#1) for review on master by Shishir Gowda (sgowda)
REVIEW: http://review.gluster.org/4846 (cluster/dht: inodelk on hashed to prevent races in rmdir heal) posted (#11) for review on master by Harshavardhana (harsha)
REVIEW: http://review.gluster.org/4846 (cluster/dht: inodelk on hashed to prevent races in rmdir heal) posted (#12) for review on master by Harshavardhana (harsha)
REVIEW: http://review.gluster.org/4846 (cluster/dht: inodelk on hashed to prevent races in rmdir heal) posted (#13) for review on master by Harshavardhana (harsha)
REVIEW: http://review.gluster.org/4846 (cluster/dht: inodelk on hashed to prevent races in rmdir heal) posted (#14) for review on master by Harshavardhana (harsha)
REVIEW: http://review.gluster.org/4846 (cluster/dht: inodelk on hashed to prevent races in rmdir heal) posted (#15) for review on master by Harshavardhana (harsha)
REVIEW: http://review.gluster.org/4846 (cluster/dht: inodelk on hashed to prevent races in rmdir heal) posted (#16) for review on master by Harshavardhana (harsha)
REVIEW: http://review.gluster.org/4846 (cluster/dht: inodelk on hashed to prevent races in rmdir heal) posted (#17) for review on master by Harshavardhana (harsha)
REVIEW: http://review.gluster.org/4846 (cluster/dht: inodelk on hashed to prevent races in rmdir heal) posted (#18) for review on master by Harshavardhana (harsha)
REVIEW: http://review.gluster.org/4846 (cluster/dht: inodelk on hashed to prevent races in rmdir heal) posted (#19) for review on master by Raghavendra G (rgowdapp)
REVIEW: http://review.gluster.org/4846 (cluster/dht: inodelk on hashed to prevent races in rmdir heal) posted (#20) for review on master by Raghavendra G (rgowdapp)
REVIEW: http://review.gluster.org/4846 (cluster/dht: inodelk on hashed to prevent races in rmdir heal) posted (#21) for review on master by Raghavendra G (rgowdapp)
REVIEW: http://review.gluster.org/4846 (cluster/dht: inodelk on hashed to prevent races in rmdir heal) posted (#22) for review on master by Raghavendra G (rgowdapp)
REVIEW: http://review.gluster.org/4846 (cluster/dht: inodelk on hashed to prevent races in rmdir heal) posted (#23) for review on master by Raghavendra G (rgowdapp)
REVIEW: http://review.gluster.org/4846 (cluster/dht: inodelk on hashed to prevent races in rmdir heal) posted (#24) for review on master by Raghavendra G (rgowdapp)
REVIEW: http://review.gluster.org/7662 (cluster/dht: fail rmdir if hashed subvolume is not found.) posted (#1) for review on master by Raghavendra G (rgowdapp)
REVIEW: http://review.gluster.org/4846 (cluster/dht: inodelk on hashed to prevent races in rmdir heal) posted (#25) for review on master by Raghavendra G (rgowdapp)
REVIEW: http://review.gluster.org/7662 (cluster/dht: fail rmdir if hashed subvolume is not found.) posted (#2) for review on master by Raghavendra G (rgowdapp)
I have tried to backport http://review.gluster.org/5240 to my glusterfs deployments (both 3.3 and 3.4.2), but I still hit the gfid-mismatch issue. My test script is:

#!/bin/bash
mkdir -p /mnt/gluster/test_volume/test_dir
for i in `seq 1 100000`; do
    echo $i
    md5=`echo $i | md5sum | awk '{print $1}'`
    dir=${md5:0:2}/${md5:2:2}/${md5:4:2}
    mkdir -p /mnt/gluster/test_volume/test_dir/$dir/a$i
    mkdir -p /mnt/gluster/test_volume/test_dir/$dir/b$i
    mkdir -p /mnt/gluster/test_volume/test_dir/$dir/c$i
done

I use 10 VMs, each with one client, to run the test script concurrently. /mnt/gluster/test_volume is the mount point of the gluster volume. I found one directory with a gfid mismatch:

clush -g bj-mig -b -q "getfattr -dm - -e hex /data/xfsd/test_volume/test_dir/7d/3e/3e/a46180 | grep gfid"
---------------
10.15.187.150,10.15.187.159,10.15.187.160,10.15.187.164,10.15.187.165,10.15.187.166
---------------
trusted.gfid=0x6f5984f9deee42ab96a1de7de0ac4533
---------------
10.15.187.161,10.15.187.162,10.15.187.163
---------------
trusted.gfid=0x6270d2c9a6de4de38d7890d67ee97536

The volume info is:

gluster volume info test_volume

Volume Name: test_volume
Type: Distributed-Replicate
Volume ID: d28ade83-7394-45fb-bce8-56bdf252194d
Status: Started
Number of Bricks: 3 x 3 = 9
Transport-type: tcp
Bricks:
Brick1: 10.15.187.150:/data/xfsd/test_volume
Brick2: 10.15.187.159:/data/xfsd/test_volume
Brick3: 10.15.187.160:/data/xfsd/test_volume
Brick4: 10.15.187.161:/data/xfsd/test_volume
Brick5: 10.15.187.162:/data/xfsd/test_volume
Brick6: 10.15.187.163:/data/xfsd/test_volume
Brick7: 10.15.187.164:/data/xfsd/test_volume
Brick8: 10.15.187.165:/data/xfsd/test_volume
Brick9: 10.15.187.166:/data/xfsd/test_volume

I can provide more information if you need it.
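Clush output like the above can also be reduced mechanically to show which hosts disagree. gfid_groups is a made-up helper that takes "<host> <gfid>" pairs on stdin and prints one line per distinct gfid; the pairs below mirror the 6-vs-3 split reported here, with shortened gfid values:

```shell
# Group hosts by the trusted.gfid they report; more than one output line
# means a split inside (or across) replica sets.
gfid_groups() {
    awk '{ hosts[$2] = ($2 in hosts) ? hosts[$2] "," $1 : $1 }
         END { for (g in hosts) print g ": " hosts[g] }' | sort
}

groups=$(printf '%s\n' \
    "10.15.187.150 0x6f5984f9" \
    "10.15.187.159 0x6f5984f9" \
    "10.15.187.161 0x6270d2c9" \
    | gfid_groups)
echo "$groups"
# 0x6270d2c9: 10.15.187.161
# 0x6f5984f9: 10.15.187.150,10.15.187.159
```

With a real deployment the input would come from something like `clush -g <group> -N "hostname; getfattr --only-values ..."` massaged into host/gfid pairs.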
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.5.2, please reopen this bug report. glusterfs-3.5.2 has been announced on the Gluster Users mailinglist [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution. [1] http://supercolony.gluster.org/pipermail/gluster-users/2014-July/041217.html [2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user