Created attachment 1219268 [details]
ERROR and warning

Description of problem:
Read a file from one client and write to it from another client. After a while, I/O errors appear on the mount points, and ERRORs and WARNINGs appear in the mount logs.

Version-Release number of selected component (if applicable):
gluster --version
glusterfs 3.8.4 built on Oct 24 2016 11:13:47
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.

How reproducible:
100%

Logs placed at rhsqe-repo.lab.eng.blr.redhat.com:/var/www/html/sosreports/<bug>

Steps to Reproduce:
1. Create an arbiter volume (casey), mount it on two different clients at /mnt/casey, and enable the md-cache options.
2. On one of the clients run "while true; do cat abc; done", and on the other run "while true; do echo 21 > abc; done" (a minimal C equivalent of the reader loop is sketched at the end of this comment).
3. After a few minutes, I/O errors appear on the mount point, and the errors and warnings below appear in the mount log.

[2016-11-10 08:02:54.878847] E [MSGID: 108008] [afr-read-txn.c:89:afr_read_txn_refresh_done] 0-casey-replicate-0: Failing READ on gfid e739b56e-ce19-4056-9795-6b03681654b5: split-brain observed. [Input/output error]
[2016-11-10 08:02:54.881998] W [fuse-bridge.c:2228:fuse_readv_cbk] 0-glusterfs-fuse: 516544: READ => -1 gfid=e739b56e-ce19-4056-9795-6b03681654b5 fd=0x7f3b2430c06c (Input/output error)

Actual results:
I/O errors are observed on the mount point, and the mount log shows the "split-brain observed" ERROR.

Expected results:
I/O should run smoothly and no errors should be observed.

Additional info:
[root@dhcp47-141 ~]# gluster volume info

Volume Name: casey
Type: Replicate
Volume ID: 919084d3-561f-4874-be74-e349ad0b23a5
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: dhcp47-141.lab.eng.blr.redhat.com:/bricks/brick0/casey
Brick2: dhcp47-143.lab.eng.blr.redhat.com:/bricks/brick0/casey
Brick3: dhcp47-144.lab.eng.blr.redhat.com:/bricks/brick0/casey (arbiter)
Options Reconfigured:
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
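For reference, a minimal C equivalent of the reader loop from step 2 is sketched below. It only assumes the mount point /mnt/casey and the file name abc from the steps above; everything else is illustrative. It keeps re-reading the file and reports as soon as read() starts returning the EIO seen in the mount log.

/* Minimal sketch of the reader side of the reproducer (not part of any
 * test suite). Assumes the file abc on the /mnt/casey mount from the
 * steps above. Build with: cc -o reader reader.c */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    char buf[4096];

    for (;;) {
        int fd = open("/mnt/casey/abc", O_RDONLY);
        if (fd < 0) {
            perror("open");
            sleep(1);
            continue;
        }
        ssize_t ret = read(fd, buf, sizeof(buf));
        if (ret < 0 && errno == EIO) {
            /* This is the failure the bug describes: fuse_readv_cbk logs
             * READ => -1 (Input/output error) and the application sees EIO. */
            fprintf(stderr, "read failed with EIO: %s\n", strerror(errno));
        }
        close(fd);
    }
    return 0;
}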
Poornima and Ravi, updating the bug with new findings as asked. This bug also hits a 1x3 replica volume. Tested on:

[root@dhcp47-141 /]# gluster --version
glusterfs 3.8.4 built on Oct 24 2016 11:13:47
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.
It's a good finding.

RCA:
When a brick is brought down, the event generation in AFR changes, and the next read calls afr_inode_refresh() to choose a read subvolume. As a result of the brick going down, a pending xattr is set on the bricks that are up, which results in an upcall to AFR that resets the read subvolume. Now consider a race between afr_inode_refresh_done() and the upcall unsetting the read subvolume to NULL. Below is a sequence of execution that can lead to EIO:

1. CHILD_DOWN - as a result of the brick going down
2. read()
3. afr_read()
4. afr_inode_refresh() - because of the brick-down event, the event generation has changed and the inode needs a refresh before the read fop
5. afr_inode_refresh_done() - picks a read subvolume
6. upcall - resets the read subvolume
7. afr_read_txn_refresh_done() - finds no readable subvolume

In this case the read fails with EIO. The fix may go either in AFR or in md-cache; we need to conclude which.
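To make the window concrete, the following self-contained pthreads sketch models the race generically. It is not the actual AFR code; the names read_subvol, upcall_handler and read_txn are illustrative only. One thread plays steps 5 and 7 (the refresh picks a read subvolume, then the read transaction consults it), the other plays step 6 (the upcall reset). Whenever the reset lands between the two, the read path finds no readable subvolume and returns EIO, matching the afr_read_txn_refresh_done() failure in the mount log.

/* Generic model of the race described above (not AFR code).
 * Build with: cc -o race race.c -lpthread */
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static int read_subvol = -1;                 /* -1 means "no readable subvolume" */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Step 6: the upcall handler invalidates the cached read subvolume. */
static void *upcall_handler(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    read_subvol = -1;
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Step 7: the read transaction consults the cached read subvolume. */
static int read_txn(void)
{
    pthread_mutex_lock(&lock);
    int subvol = read_subvol;
    pthread_mutex_unlock(&lock);

    if (subvol < 0)
        return -EIO;                         /* what the application observes */
    return 0;
}

int main(void)
{
    pthread_t t;

    /* Step 5: the inode refresh picked brick 0 as the read subvolume. */
    pthread_mutex_lock(&lock);
    read_subvol = 0;
    pthread_mutex_unlock(&lock);

    /* Step 6 races with step 7; the outcome depends on scheduling. */
    pthread_create(&t, NULL, upcall_handler, NULL);
    usleep(10);

    int ret = read_txn();
    printf("read_txn returned %d (%s)\n", ret, ret == -EIO ? "EIO" : "ok");

    pthread_join(t, NULL);
    return 0;
}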
Fix posted upstream: http://review.gluster.org/#/c/15892/
Creating deep directories and simultaneously running the ll command on the mount point continuously, I am not seeing any I/O errors with glusterfs-3.8.4-6.
Verified this bug on 3.8.4-6 with the same steps as in the description; it is not reproducible. Hence marking it as verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2017-0486.html