+++ This bug was initially created as a clone of Bug #1467250 +++

This was a hyper-converged setup with an arbiter volume:

Volume Name: arbvol
Type: Replicate
Volume ID: d7ccfadd-63e1-4fef-b70f-77d0e2cb2bba
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: node1:/gluster/brick1/b1
Brick2: node2:/gluster/brick1/b1
Brick3: node3:/gluster/brick1/b1 (arbiter)
Options Reconfigured:
cluster.self-heal-daemon: off
performance.strict-o-direct: on
cluster.granular-entry-heal: enable
network.ping-timeout: 10
user.cifs: off
features.shard: on
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: on
network.remote-dio: off
performance.low-prio-threads: 32
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
nfs.disable: on
cluster.data-self-heal: off
cluster.metadata-self-heal: off
cluster.entry-self-heal: off

In this hyper-converged setup, writes, truncates and reads were issued from a single client while the bricks were continuously being killed and brought back up, one by one, in a loop. I/O from the mount then hung.

gdb backtrace of the mount process:

#3  0x00007f45a762f5ee in shard_make_block_abspath (block_num=159197526, gfid=0x7f45a805a0a8 "$r\342\366{\260E\032\251\020\254>\312B\245\274", filepath=0x7f45ac9e1f90 "/.shard/2472e2f6-7bb0-451a-a910-ac3eca42a5bc.159197525", len=4096) at shard.c:57
#4  0x00007f45a7632605 in shard_common_resolve_shards (frame=0x7f45a8000b60, this=0x7f45a80121e0, post_res_handler=0x7f45a763a344 <shard_post_resolve_truncate_handler>) at shard.c:635
#5  0x00007f45a763337b in shard_refresh_dot_shard (frame=0x7f45a8000b60, this=0x7f45a80121e0) at shard.c:884
#6  0x00007f45a763b2a4 in shard_truncate_begin (frame=0x7f45a8000b60, this=0x7f45a80121e0) at shard.c:1989
#7  0x00007f45a763c7e3 in shard_post_lookup_truncate_handler (frame=0x7f45a8000b60, this=0x7f45a80121e0) at shard.c:2063
#8  0x00007f45a7634b79 in shard_lookup_base_file_cbk (frame=0x7f45a8000b60, cookie=0x7f45a804e130, this=0x7f45a80121e0, op_ret=0, op_errno=117, inode=0x7f45a805a0a0, buf=0x7f45a804ee68, xdata=0x7f45a8071ad0, postparent=0x7f45a804f098) at shard.c:1149
#9  0x00007f45a7892dd4 in dht_discover_complete (this=0x7f45a8010ab0, discover_frame=0x7f45a8060f60) at dht-common.c:577
#10 0x00007f45a78937f1 in dht_discover_cbk (frame=0x7f45a8060f60, cookie=0x7f45a800de50, this=0x7f45a8010ab0, op_ret=0, op_errno=117, inode=0x7f45a805a0a0, stbuf=0x7f45a807d890, xattr=0x7f45a8071ad0, postparent=0x7f45a807d900) at dht-common.c:700
#11 0x00007f45a7b71ad6 in afr_discover_done (frame=0x7f45a8066d60, this=0x7f45a800de50) at afr-common.c:2624
#12 0x00007f45a7b71e19 in afr_discover_cbk (frame=0x7f45a8066d60, cookie=0x2, this=0x7f45a800de50, op_ret=0, op_errno=0, inode=0x7f45a805a0a0, buf=0x7f45ac9e3900, xdata=0x7f45a806cb50, postparent=0x7f45ac9e3890) at afr-common.c:2669

shard_common_resolve_shards() was stuck in its "while (shard_idx_iter <= local->last_block)" loop because local->last_block was -1. It turns out AFR had served the lookup from a bad copy (the good brick was down), and shard used the iatt values from the lookup cbk to calculate the size and block count, which were therefore incorrect.
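For illustration, here is a minimal standalone C sketch of why a -1 block number makes a loop of that shape unable to terminate. The variable names follow the backtrace, but the types and values are assumptions for the demo, not the actual shard translator definitions:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* Types are assumptions for illustration; the real fields live in
     * the shard translator.  A last-block number derived from garbage
     * iatt values came out as -1. */
    int64_t  bogus_last_block = -1;
    uint32_t last_block       = (uint32_t)bogus_last_block; /* == UINT32_MAX */
    uint32_t shard_idx_iter   = 159197526; /* block_num seen in frame #3 */

    /* Every possible uint32_t value satisfies the condition, so a loop
     * of the form  while (shard_idx_iter <= last_block)  cannot exit:
     * the iterator wraps from UINT32_MAX back to 0 and keeps looping. */
    printf("last_block = %u\n", (unsigned)last_block);             /* 4294967295 */
    printf("condition holds: %d\n", shard_idx_iter <= last_block); /* 1 */
    return 0;
}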
xattr values of the FILE on the bad brick from which the lookup was served:

[root@node2]# g /gluster/brick1/b1/FILE
getfattr: Removing leading '/' from absolute path names
# file: gluster/brick1/b1/FILE
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.arbvol-client-2=0x000003e00000046700000000
trusted.afr.dirty=0x000000020000012e00000000
trusted.gfid=0x0e47074be70b4c25bdf2f5d40a26a049
trusted.glusterfs.shard.block-size=0x0000000000400000
trusted.glusterfs.shard.file-size=0xfffffffffff214000000000000000000fffffffffffff80a0000000000000000

(The shard.file-size value is decoded in the illustrative sketch after the commit message below.)

--- Additional comment from Worker Ant on 2017-07-03 05:03:05 EDT ---

REVIEW: https://review.gluster.org/17673 (afr: add checks for allowing lookups) posted (#1) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Worker Ant on 2017-07-04 22:45:20 EDT ---

REVIEW: https://review.gluster.org/17673 (afr: add checks for allowing lookups) posted (#2) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Worker Ant on 2017-07-06 01:51:02 EDT ---

REVIEW: https://review.gluster.org/17673 (afr: add checks for allowing lookups) posted (#3) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Worker Ant on 2017-08-27 08:47:59 EDT ---

REVIEW: https://review.gluster.org/17673 (afr: add checks for allowing lookups) posted (#4) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Worker Ant on 2017-09-06 12:56:05 EDT ---

REVIEW: https://review.gluster.org/17673 (afr: add checks for allowing lookups) posted (#5) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Worker Ant on 2017-09-13 05:32:19 EDT ---

REVIEW: https://review.gluster.org/17673 (afr: add checks for allowing lookups) posted (#6) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Worker Ant on 2017-10-17 08:17:43 EDT ---

REVIEW: https://review.gluster.org/17673 (afr: add checks for allowing lookups) posted (#7) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Worker Ant on 2017-10-18 02:25:40 EDT ---

REVIEW: https://review.gluster.org/17673 (afr: add checks for allowing lookups) posted (#8) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Worker Ant on 2017-11-17 19:39:24 EST ---

COMMIT: https://review.gluster.org/17673 committed in master by "Ravishankar N" <ravishankar> with a commit message- afr: add checks for allowing lookups

Problem:
In an arbiter volume, lookup was being served from one of the sink bricks (the source brick was down). shard uses the iatt values from the lookup cbk to calculate the size and block count, which in this case were incorrect values. shard_local_t->last_block was thus initialised to -1, resulting in an infinite while loop in shard_common_resolve_shards().

Fix:
Use client quorum logic to allow or fail the lookups from afr if there are no readable subvolumes. So in replica-3 or arbiter volumes, if there is no good copy or if quorum is not met, fail lookup with ENOTCONN. With this fix, we are also removing support for the quorum-reads xlator option. So if quorum is not met, neither read nor write txns are allowed and we fail the fop with ENOTCONN.

Change-Id: Ic65c00c24f77ece007328b421494eee62a505fa0
BUG: 1467250
Signed-off-by: Ravishankar N <ravishankar>
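To make the corruption concrete, the following standalone sketch decodes the trusted.glusterfs.shard.file-size value from the xattr dump above. The offsets used (file size in bytes 0-7, block count in bytes 16-23, network byte order) are an assumption modeled on how shard packs its size attribute; treat this as a sketch, not a reimplementation of shard.c:

#include <stdint.h>
#include <stdio.h>

/* Read a big-endian (network byte order) 64-bit value and reinterpret
 * it as a signed integer. */
static int64_t be64_at(const uint8_t *buf)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | buf[i];
    return (int64_t)v;
}

int main(void)
{
    /* The 32-byte trusted.glusterfs.shard.file-size value from the
     * dump of the bad brick above. */
    const uint8_t xattr[32] = {
        0xff, 0xff, 0xff, 0xff, 0xff, 0xf2, 0x14, 0x00, /* file size   */
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xf8, 0x0a, /* block count */
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    };

    /* Interpreted as signed 64-bit values, both fields are negative,
     * values no healthy file could have, and consistent with shard
     * ending up with last_block = -1. */
    printf("file size   = %lld\n", (long long)be64_at(xattr));      /* -912384 */
    printf("block count = %lld\n", (long long)be64_at(xattr + 16)); /* -2038   */
    return 0;
}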
REVIEW: https://review.gluster.org/18817 (afr: add checks for allowing lookups) posted (#1) for review on release-3.13 by Ravishankar N
COMMIT: https://review.gluster.org/18817 committed in release-3.13 by "Ravishankar N" <ravishankar> with a commit message- afr: add checks for allowing lookups

Problem:
In an arbiter volume, lookup was being served from one of the sink bricks (the source brick was down). shard uses the iatt values from the lookup cbk to calculate the size and block count, which in this case were incorrect values. shard_local_t->last_block was thus initialised to -1, resulting in an infinite while loop in shard_common_resolve_shards().

Fix:
Use client quorum logic to allow or fail the lookups from afr if there are no readable subvolumes. So in replica-3 or arbiter volumes, if there is no good copy or if quorum is not met, fail lookup with ENOTCONN. With this fix, we are also removing support for the quorum-reads xlator option. So if quorum is not met, neither read nor write txns are allowed and we fail the fop with ENOTCONN.

Change-Id: Ic65c00c24f77ece007328b421494eee62a505fa0
BUG: 1515572
Signed-off-by: Ravishankar N <ravishankar>
(cherry picked from commit bd44d59741bb8c0f5d7a62c5b1094179dd0ce8a4)
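As a rough sketch of the gate the Fix describes: fail the lookup with ENOTCONN when AFR has no readable copy or client quorum is lost. The function name, signature and inputs here are hypothetical, for illustration only; the real logic lives in AFR's read-subvolume and client-quorum code:

#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper mirroring the fix: refuse to serve a lookup when
 * there is no readable subvolume or quorum is not met.  readable_count
 * and quorum_met are illustrative inputs, not fields from afr-common.c. */
static int
afr_lookup_allowed(int readable_count, bool quorum_met)
{
    /* No brick holds a known-good copy: answering would hand out iatt
     * values from a sink brick, as happened in this bug. */
    if (readable_count == 0)
        return -ENOTCONN;

    /* Quorum lost: with quorum-reads support removed, read and write
     * txns are failed uniformly rather than special-casing reads. */
    if (!quorum_met)
        return -ENOTCONN;

    return 0; /* safe to serve the lookup from a readable subvolume */
}

int main(void)
{
    /* The scenario from this bug: the only good copy is down. */
    printf("%d\n", afr_lookup_allowed(0, true)); /* -ENOTCONN */
    printf("%d\n", afr_lookup_allowed(2, true)); /* 0: lookup proceeds */
    return 0;
}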
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.13.0, please open a new bug report.

glusterfs-3.13.0 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-December/000087.html
[2] https://www.gluster.org/pipermail/gluster-users/