Description of problem:
-----------------------
2x2 gluster volume, 4 clients mounted from the same server via v4.0.

Brought down two bricks and created lots of small files. Brought all the bricks back online and triggered a multithreaded self-heal, while running heal info periodically in the background as well. (A condensed repro sketch follows the Additional info section below.)

Multiple touches and Bonnie runs failed (on 2 different clients mounted from the same server):

<snip>
[root@gqac008 gluster-mount]# touch a
touch: cannot touch ‘a’: Input/output error
[root@gqac008 gluster-mount]# touch a
touch: cannot touch ‘a’: Input/output error
[root@gqac008 gluster-mount]# touch b
touch: cannot touch ‘b’: Input/output error
[root@gqac008 gluster-mount]#
[root@gqac008 gluster-mount]#
[root@gqac008 gluster-mount]# touch c
touch: cannot touch ‘c’: Input/output error
<snip>

AND ..

<snip>
Changing to the specified mountpoint
/gluster-mount/run2220
executing bonnie
Using uid:0, gid:0.
Can't open file ./Bonnie.2247

real    0m1.227s
user    0m0.002s
sys     0m0.001s
bonnie failed

0
Total 0 tests were successful
Switching over to the previous working directory
Removing /gluster-mount/run2220/
[root@gqac008 /]#
<snip>

I did not find anything in the brick logs (around Sunday, July 30, 3 PM IST). tcpdumps, logs etc. will be shared in comments.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
[root@gqas013 glusterfs]# rpm -qa|grep ganes
nfs-ganesha-2.4.4-16.el7rhgs.x86_64
nfs-ganesha-gluster-2.4.4-16.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-36.el7rhgs.x86_64

How reproducible:
-----------------
1/1

Actual results:
---------------
EIO on mount point.

Expected results:
-----------------
No EIO on mount point.

Additional info:
----------------
Volume Name: testvol
Type: Distributed-Replicate
Volume ID: 41c5aa32-ec60-4591-ae6d-f93a0b13b47c
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: gqas013.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick0
Brick2: gqas005.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick1
Brick3: gqas006.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick2
Brick4: gqas008.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick3
Options Reconfigured:
cluster.shd-wait-qlength: 655536
cluster.shd-max-threads: 64
client.event-threads: 4
server.event-threads: 4
cluster.lookup-optimize: on
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
ganesha.enable: on
features.cache-invalidation: on
server.allow-insecure: on
performance.stat-prefetch: off
transport.address-family: inet
nfs.disable: on
nfs-ganesha: enable
cluster.enable-shared-storage: enable
[root@gqas013 glusterfs]#
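Condensed repro sketch of the steps described above. This is an approximation: which two bricks were killed and the exact small-file workload in the original run are not recorded here, the hostnames/paths are taken from the volume info, and the file-creation loop is illustrative only.

# On a gluster server: locate the brick processes and take two of them down.
# (Killing one brick per replica pair keeps both subvolumes writable while degraded.)
gluster volume status testvol
kill -9 "$(pgrep -f bricks/testvol_brick0)" "$(pgrep -f bricks/testvol_brick2)"

# On each client: mount the Ganesha export from the same server via v4.0.
mount -t nfs -o vers=4.0 gqas013.sbu.lab.eng.bos.redhat.com:/testvol /gluster-mount

# Create lots of small files while the bricks are down (illustrative workload;
# the original run used touch/Bonnie-style I/O from multiple clients).
for i in $(seq 1 10000); do echo data > /gluster-mount/file.$i; done

# Bring the killed bricks back online and trigger a multithreaded self-heal.
gluster volume start testvol force
gluster volume set testvol cluster.shd-max-threads 64
gluster volume heal testvol

# Poll heal info in the background while client I/O continues.
while true; do gluster volume heal testvol info; sleep 10; done &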
Proposing as a blocker, since the application was affected.
The test case was tried once on a FUSE mount, where it passed.
[root@gqac019 gluster-mount]# dd if=/dev/zero of=a count=1 bs=100 conv=fdatasync
dd: failed to open ‘a’: Input/output error
[root@gqac019 gluster-mount]#
Updating this: the reaper thread is running, so it's a livelock, not a deadlock. There is a state on the so_state_list that is not in the hashtable. This means state_del_locked() bails out early, and the state is neither deleted nor removed from the so_state_list, causing an infinite loop with cr_mutex held.

I'm wondering if this will fix it, but I'd like Frank to weigh in:

b049eb90e78670d3e17ffe91b5c4048f8d7520d4

There may be one or two more patches needed on top of that.
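As a side note, a minimal sketch of how the livelock can be confirmed on the running daemon, assuming gdb can attach to ganesha.nfsd (debuginfo helps); the backtrace contents described in the comments follow from the analysis above and are not captured output:

# Dump all thread backtraces from the running Ganesha process.
gdb -batch -p "$(pidof ganesha.nfsd)" -ex 'thread apply all bt'

# If the analysis is right, repeated dumps show the reaper thread with the
# same frames each time (walking so_state_list / calling state_del_locked()
# with cr_mutex held) rather than parked in pthread_mutex_lock, i.e. it is
# spinning (livelock), not blocked (deadlock).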
There is a set of somewhat related patches; the ones marked with * are NLM (NFS v3) only, but may be required due to the other patches:

* feb12d2fe13fcdbd5ae80cbe6575af98c4657520 Fix nlm client refcount going up from zero
* acb632c319b3846200bcd718aa8e637bb0f4e1fd Fix nsm client refcount going up from zero
* 52e0e125322fb0cc5c608be4cd43b90a702d88e2 Fix nlm state refcount going up from zero
  b049eb90e78670d3e17ffe91b5c4048f8d7520d4 Convert state_owner hash table to behave like others
* 51d0f6c77d3e0d95be5ea27abe1f8c66db242884 Fix typo in hash table name in get_nlm_client()
  006575d43d77dcd5c3eefd11e9d508a33e2bf459 Fix hashtable_setlatched overwrite parameter
* 84d5ef4003e13a4078fa01d69a67bfe2ae02c61a Use care_t care instead of bool nsm_state_applies in get_nlm_state
  60e20e2e9b531910c2ef1a20ad4036ff595df66f Fix a race in using hashtables leading to crashes

There may be other relevant patches also.
IO is successful when I try it on the same volume accessed via v3.
The livelock is related to the v4 Session ID, so v3 should be unaffected.
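For reference, a quick way to compare the two protocol versions against the same export from one client during the heal window; the mount points here are illustrative and the server/export names are taken from the volume info above:

mkdir -p /mnt/testvol-v3 /mnt/testvol-v40
mount -t nfs -o vers=3   gqas013.sbu.lab.eng.bos.redhat.com:/testvol /mnt/testvol-v3
mount -t nfs -o vers=4.0 gqas013.sbu.lab.eng.bos.redhat.com:/testvol /mnt/testvol-v40

# With this bug, the create over v3 succeeds while the one over v4.0 fails with EIO.
touch /mnt/testvol-v3/probe  && echo "v3: create OK"
touch /mnt/testvol-v40/probe || echo "v4.0: create failed (EIO)"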
POST with rebase to nfs-ganesha-2.5.x
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2610