Bug 1473150 - features/shard:Lookup on shard 18 failed. Base file gfid = b00f5de2-d811-44fe-80e5-1f382908a55a [No data available]
Product: GlusterFS
Classification: Community
Component: sharding
Hardware/OS: x86_64 Linux
Version: unspecified
Severity: urgent
Assigned To: Krutika Dhananjay
: Triaged
Depends On:
Reported: 2017-07-20 02:44 EDT by Jianwei Zhang
Modified: 2017-07-24 05:15 EDT
2 users

Type: Bug

Attachments: None
Description Jianwei Zhang 2017-07-20 02:44:41 EDT
Description of problem:
  While testing a write-behind+shard+dht+ec (2 x (2+1)) cluster volume with postmark, I found many ERROR messages in /var/log/glusterfs/mnt-test-.log,
such as "[shard.c:1720:shard_common_lookup_shards_cbk] 0-test-shard: Lookup on shard 18 failed. Base file gfid = b00f5de2-d811-44fe-80e5-1f382908a55a [No data available]"; [No data available] is errno 61 (ENODATA).
  This errno (61) is not handled in shard_writev(), so shard_writev() fails and unwinds the failure to the upper xlator (op_ret=-1, op_errno=61).
  Normally, when shard_writev() STACK_UNWINDs with op_ret=-1, op_errno=61, the file's md5sum should come out wrong afterwards, but it is correct!! Looking at wb_fulfill_cbk(), write-behind adds the failed wb_request back to its todo list and retries the writev, which is why the md5sum is right rather than wrong.
  ===> With write-behind disabled, I cannot reproduce the issue. <====

  The errno (61) is generated by posix_lookup() in storage/posix: buf.ia_gfid is null, so it sets errno to ENODATA.
        op_ret = -1;
        if (gf_uuid_is_null (loc->pargfid) || (loc->name == NULL)) { /* not taken */
                /* nameless lookup */
                MAKE_INODE_HANDLE (real_path, this, loc, &buf);
        } else {                                                     /* taken */
                MAKE_ENTRY_HANDLE (real_path, par_path, this, loc, &buf); /* <1> fails to fill buf, op_ret=-1 */

                if (gf_uuid_is_null (loc->inode->gfid)) { /* loc->inode->gfid is null */
                        posix_gfid_heal (this, real_path, loc, xdata); /* <2> taken; inside posix_gfid_heal(): */
                              if (!xattr_req)
                                    goto out;

                              if (sys_lstat (path, &stat) != 0)
                                    goto out; /* <3> returns ret=-1, errno=2 (ENOENT): the path does not exist */

                              ret = sys_lgetxattr (path, GFID_XATTR_KEY, uuid_curr, 16);
                              if (ret != 16) {
                                        if (is_fresh_file (&stat)) {
                                                 ret = -1;
                                                 errno = ENOENT;
                                                 goto out;
                                        }
                              }
                              ret = posix_gfid_set (this, path, loc, xattr_req);
                              return ret;

                        /* back in posix_lookup(): */
                        MAKE_ENTRY_HANDLE (real_path, par_path, this, /* <4> BUT this one succeeds: op_ret=0, yet */
                                           loc, &buf);               /* buf.ia_gfid is null - the file exists but has no GFID set */
                }
        }

        if (!op_ret && !gfidless && gf_uuid_is_null (buf.ia_gfid)) { /* <5> taken: op_ret=0, gfidless=0, buf.ia_gfid null */
                gf_msg (this->name, GF_LOG_ERROR, ENODATA, P_MSG_NULL_GFID,
                        "buf->ia_gfid is null for "
                        "%s", (real_path) ? real_path : "");
                op_ret = -1;
                op_errno = ENODATA; /* <6> op_errno=ENODATA */
        }

Version-Release number of selected component (if applicable):
  git url: https://review.gluster.org/glusterfs
  branch: master (commit 11eb8ba870457337c6067284dde4277e09764c0a)

GlusterFS configure summary
FUSE client          : yes
Infiniband verbs     : yes
epoll IO multiplex   : yes
argp-standalone      : no
fusermount           : yes
readline             : yes
georeplication       : yes
Linux-AIO            : yes
Enable Debug         : yes
Block Device xlator  : yes
glupy                : yes
Use syslog           : yes
XML output           : yes
Encryption xlator    : yes
Unit Tests           : no
Track priv ports     : yes
POSIX ACLs           : yes
Data Classification  : yes
SELinux features     : yes
firewalld-config     : no
Experimental xlators : yes
Events               : yes
EC dynamic support   : x64 sse avx
Use memory pools     : yes
Nanosecond m/atimes  : yes
Legacy gNFS server   : no

How reproducible:
write-behind + shard + dht + ec
Three CentOS 7.3 nodes, with 2 disks (bricks) on each node.

Steps to Reproduce:
1. Create a cluster storage service with write-behind + shard + dht + ec (2 x (2+1)):
Volume Name: test
Type: Distributed-Disperse
Volume ID: a110d762-6ce0-4309-b9b2-29c4b9d3dd84
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x (2 + 1) = 6
Transport-type: tcp
Brick1: node-1.test:/mnt/disk1
Brick2: node-2.test:/mnt/disk3
Brick3: node-3.test:/mnt/disk5
Brick4: node-1.test:/mnt/disk2
Brick5: node-2.test:/mnt/disk4
Brick6: node-3.test:/mnt/disk6
Options Reconfigured:
performance.nfs.write-behind: disable
network.ping-timeout: 20
performance.least-prio-threads: 64
performance.high-prio-threads: 64
cluster.readdir-optimize: on
performance.io-thread-count: 64
features.lock-heal: on
nfs.rpc-auth-reject: *.*.*.*
server.allow-insecure: on
cluster.heal-timeout: 60
performance.stat-prefetch: off
performance.low-prio-threads: 64
features.shard: on
transport.address-family: inet
nfs.disable: on

mount.glusterfs /mnt/test
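For reference, a volume with the layout shown in the `gluster volume info` output above could be created roughly like this (a sketch only; hostnames and brick paths are taken from the brick list above, and only the options relevant to the bug are shown):

```shell
# 2 x (2+1) distributed-disperse volume: 6 bricks, redundancy 1 per subvolume
gluster volume create test disperse 3 redundancy 1 \
    node-1.test:/mnt/disk1 node-2.test:/mnt/disk3 node-3.test:/mnt/disk5 \
    node-1.test:/mnt/disk2 node-2.test:/mnt/disk4 node-3.test:/mnt/disk6

# sharding on; write-behind stays at its default (enabled) -
# the combination that triggers the bug
gluster volume set test features.shard on
gluster volume start test

# FUSE-mount the volume on a client
mount -t glusterfs node-1.test:/test /mnt/test
```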

2. Terminal-1# ./postmark conf
set size 104800000 104800000
set number 66
set location /mnt/test
set read 65536
set write 65536
set bias create -1
set bias read 0
set transactions 1
run result.txt
3. Terminal-2# After postmark finishes, check the client log:
grep ' E ' /var/log/glusterfs/mnt-test-.log
[2017-07-20 05:53:06.944290] E [MSGID: 133010] [shard.c:1720:shard_common_lookup_shards_cbk] 0-test-shard: Lookup on shard 18 failed. Base file gfid = b00f5de2-d811-44fe-80e5-1f382908a55a [No data available]
[2017-07-20 05:53:25.095401] E [MSGID: 133010] [shard.c:1720:shard_common_lookup_shards_cbk] 0-test-shard: Lookup on shard 7 failed. Base file gfid = 1cbf5af2-b8a8-4327-a450-7335b7f84c52 [No data available]
[2017-07-20 05:53:38.661017] E [MSGID: 133010] [shard.c:1720:shard_common_lookup_shards_cbk] 0-test-shard: Lookup on shard 4 failed. Base file gfid = 7cf1712c-0576-428f-a5ce-052eb5b6eddb [No data available]
[2017-07-20 05:53:43.018596] E [MSGID: 133010] [shard.c:1720:shard_common_lookup_shards_cbk] 0-test-shard: Lookup on shard 20 failed. Base file gfid = acf750ff-c737-4806-a5e6-d2538579b984 [No data available]
[2017-07-20 05:54:21.841236] E [MSGID: 133010] [shard.c:1720:shard_common_lookup_shards_cbk] 0-test-shard: Lookup on shard 21 failed. Base file gfid = 67e415ea-ba27-4a2c-89d0-c5330a624d9d [No data available]
[2017-07-20 05:54:32.529585] E [MSGID: 133010] [shard.c:1720:shard_common_lookup_shards_cbk] 0-test-shard: Lookup on shard 4 failed. Base file gfid = f437eda1-ddd0-409b-b6d1-32201f478cb0 [No data available]
[2017-07-20 05:54:33.727440] E [MSGID: 133010] [shard.c:1720:shard_common_lookup_shards_cbk] 0-test-shard: Lookup on shard 22 failed. Base file gfid = f437eda1-ddd0-409b-b6d1-32201f478cb0 [No data available]
[2017-07-20 05:54:43.243084] E [MSGID: 133010] [shard.c:1720:shard_common_lookup_shards_cbk] 0-test-shard: Lookup on shard 15 failed. Base file gfid = 4f56cfac-9062-4c37-8c30-e660b807b6ad [No data available]

Actual results:
shard_common_lookup_shards_cbk() receives op_ret=-1, op_errno=ENODATA.

Expected results:
shard_common_lookup_shards_cbk() should handle op_errno=ENODATA, or posix_lookup() should handle the ENODATA case itself. After all, ENODATA does not mean the file does not exist, so shard should still be able to proceed with writev().

Additional info:
Comment 1 Jianwei Zhang 2017-07-20 02:56:48 EDT
# cd <glusterfs-DIR>
./autogen.sh && 
./configure --build=x86_64-unknown-linux-gnu --host=x86_64-unknown-linux-gnu --target=x86_64-redhat-linux-gnu --program-prefix= --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/libexec --localstatedir=/var --sharedstatedir=/var/lib --mandir=/usr/share/man --infodir=/usr/share/info --enable-systemtap=no --enable-bd-xlator --enable-debug && make -j8 && make install
Comment 2 Krutika Dhananjay 2017-07-23 04:09:06 EDT

Thanks again for the bug report. It is interesting that in your case you are seeing this with a distributed-dispersed-sharded volume.
I was able to recreate the issue on my colleague's setup with a plain distribute volume. I am analysing the tcpdump output to find out what went wrong, and will get back to you in a couple of days with the root cause.

Comment 3 Krutika Dhananjay 2017-07-24 05:15:08 EDT

I've sent the patch https://review.gluster.org/#/c/17821 to fix this issue.
I also tested it on a plain distribute-sharded volume (where we were seeing this issue before) and it worked fine.

Do you mind testing it and letting us know whether it fixes your problem, since your setup is ec+dht+shard?

