Description of problem:
This is a breakdown of bug: https://bugzilla.redhat.com/show_bug.cgi?id=1538358

Files become temporarily unavailable because bricks are unable to fully heal.

Version-Release number of selected component (if applicable):
3.3.1

How reproducible:
Frequently

Steps to Reproduce:
1. Induce healing on an EC volume through disconnects
2. Stat files / write new data / append / read
3. EC healing doesn't complete

Actual results:
Bricks are unable to fully heal.

Expected results:
Normal operation.

Additional info:

Volume Name: tezavol
Type: Distributed-Disperse
Volume ID: 4f6279a0-7939-443c-9a4f-4d69df65d722
Status: Started
Snapshot Count: 0
Number of Bricks: 8 x (4 + 2) = 48
Transport-type: tcp
Bricks:
Brick1: chst-gstor-01-san:/bricks/brick1/brick1
Brick2: chst-gstor-03-san:/bricks/brick1/brick1
Brick3: chst-gstor-05-san:/bricks/brick1/brick1
Brick4: chst-gstor-07-san:/bricks/brick1/brick1
Brick5: chst-gstor-09-san:/bricks/brick1/brick1
Brick6: chst-gstor-11-san:/bricks/brick1/brick1
Brick7: chst-gstor-02-san:/bricks/brick1/brick1
Brick8: chst-gstor-04-san:/bricks/brick1/brick1
Brick9: chst-gstor-06-san:/bricks/brick1/brick1
Brick10: chst-gstor-08-san:/bricks/brick1/brick1
Brick11: chst-gstor-10-san:/bricks/brick1/brick1
Brick12: chst-gstor-12-san:/bricks/brick1/brick1
Brick13: chst-gstor-01-san:/bricks/brick2/brick2
Brick14: chst-gstor-03-san:/bricks/brick2/brick2
Brick15: chst-gstor-05-san:/bricks/brick2/brick2
Brick16: chst-gstor-07-san:/bricks/brick2/brick2
Brick17: chst-gstor-09-san:/bricks/brick2/brick2
Brick18: chst-gstor-11-san:/bricks/brick2/brick2
Brick19: chst-gstor-02-san:/bricks/brick2/brick2
Brick20: chst-gstor-04-san:/bricks/brick2/brick2
Brick21: chst-gstor-06-san:/bricks/brick2/brick2
Brick22: chst-gstor-08-san:/bricks/brick2/brick2
Brick23: chst-gstor-10-san:/bricks/brick2/brick2
Brick24: chst-gstor-12-san:/bricks/brick2/brick2
Brick25: chst-gstor-01-san:/bricks/brick3/brick3
Brick26: chst-gstor-03-san:/bricks/brick3/brick3
Brick27: chst-gstor-05-san:/bricks/brick3/brick3
Brick28: chst-gstor-07-san:/bricks/brick3/brick3
Brick29: chst-gstor-09-san:/bricks/brick3/brick3
Brick30: chst-gstor-11-san:/bricks/brick3/brick3
Brick31: chst-gstor-02-san:/bricks/brick3/brick3
Brick32: chst-gstor-04-san:/bricks/brick3/brick3
Brick33: chst-gstor-06-san:/bricks/brick3/brick3
Brick34: chst-gstor-08-san:/bricks/brick3/brick3
Brick35: chst-gstor-10-san:/bricks/brick3/brick3
Brick36: chst-gstor-12-san:/bricks/brick3/brick3
Brick37: chst-gstor-01-san:/bricks/brick4/brick4
Brick38: chst-gstor-03-san:/bricks/brick4/brick4
Brick39: chst-gstor-05-san:/bricks/brick4/brick4
Brick40: chst-gstor-07-san:/bricks/brick4/brick4
Brick41: chst-gstor-09-san:/bricks/brick4/brick4
Brick42: chst-gstor-11-san:/bricks/brick4/brick4
Brick43: chst-gstor-02-san:/bricks/brick4/brick4
Brick44: chst-gstor-04-san:/bricks/brick4/brick4
Brick45: chst-gstor-06-san:/bricks/brick4/brick4
Brick46: chst-gstor-08-san:/bricks/brick4/brick4
Brick47: chst-gstor-10-san:/bricks/brick4/brick4
Brick48: chst-gstor-12-san:/bricks/brick4/brick4
Options Reconfigured:
performance.parallel-readdir: off
performance.io-thread-count: 32
server.event-threads: 4
client.event-threads: 4
transport.address-family: inet
nfs.disable: off
performance.cache-size: 1024MB
cluster.lookup-optimize: on
disperse.eager-lock: off

Status of volume: tezavol
Gluster process                                 TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick chst-gstor-01-san:/bricks/brick1/brick1   49152     0          Y       15013
Brick chst-gstor-03-san:/bricks/brick1/brick1   49152     0          Y       15103
Brick chst-gstor-05-san:/bricks/brick1/brick1   49152     0          Y       15102
Brick chst-gstor-07-san:/bricks/brick1/brick1   49152     0          Y       15295
Brick chst-gstor-09-san:/bricks/brick1/brick1   49152     0          Y       14933
Brick chst-gstor-11-san:/bricks/brick1/brick1   49152     0          Y       14796
Brick chst-gstor-02-san:/bricks/brick1/brick1   49152     0          Y       15303
Brick chst-gstor-04-san:/bricks/brick1/brick1   49152     0          Y       15044
Brick chst-gstor-06-san:/bricks/brick1/brick1   49152     0          Y       15209
Brick chst-gstor-08-san:/bricks/brick1/brick1   49152     0          Y       15045
Brick chst-gstor-10-san:/bricks/brick1/brick1   49152     0          Y       14831
Brick chst-gstor-12-san:/bricks/brick1/brick1   49152     0          Y       14931
Brick chst-gstor-01-san:/bricks/brick2/brick2   49153     0          Y       15039
Brick chst-gstor-03-san:/bricks/brick2/brick2   49153     0          Y       15130
Brick chst-gstor-05-san:/bricks/brick2/brick2   49153     0          Y       15129
Brick chst-gstor-07-san:/bricks/brick2/brick2   49153     0          Y       15321
Brick chst-gstor-09-san:/bricks/brick2/brick2   49153     0          Y       14959
Brick chst-gstor-11-san:/bricks/brick2/brick2   49153     0          Y       14822
Brick chst-gstor-02-san:/bricks/brick2/brick2   49153     0          Y       15329
Brick chst-gstor-04-san:/bricks/brick2/brick2   49153     0          Y       15071
Brick chst-gstor-06-san:/bricks/brick2/brick2   49153     0          Y       15235
Brick chst-gstor-08-san:/bricks/brick2/brick2   49153     0          Y       15072
Brick chst-gstor-10-san:/bricks/brick2/brick2   49153     0          Y       14857
Brick chst-gstor-12-san:/bricks/brick2/brick2   49153     0          Y       14957
Brick chst-gstor-01-san:/bricks/brick3/brick3   49154     0          Y       15065
Brick chst-gstor-03-san:/bricks/brick3/brick3   49154     0          Y       15156
Brick chst-gstor-05-san:/bricks/brick3/brick3   49154     0          Y       15155
Brick chst-gstor-07-san:/bricks/brick3/brick3   49154     0          Y       15347
Brick chst-gstor-09-san:/bricks/brick3/brick3   49154     0          Y       14985
Brick chst-gstor-11-san:/bricks/brick3/brick3   49154     0          Y       14848
Brick chst-gstor-02-san:/bricks/brick3/brick3   49154     0          Y       15355
Brick chst-gstor-04-san:/bricks/brick3/brick3   49154     0          Y       15097
Brick chst-gstor-06-san:/bricks/brick3/brick3   49154     0          Y       15261
Brick chst-gstor-08-san:/bricks/brick3/brick3   49154     0          Y       15098
Brick chst-gstor-10-san:/bricks/brick3/brick3   49154     0          Y       14884
Brick chst-gstor-12-san:/bricks/brick3/brick3   49154     0          Y       14983
Brick chst-gstor-01-san:/bricks/brick4/brick4   49155     0          Y       15092
Brick chst-gstor-03-san:/bricks/brick4/brick4   49155     0          Y       15182
Brick chst-gstor-05-san:/bricks/brick4/brick4   49155     0          Y       15181
Brick chst-gstor-07-san:/bricks/brick4/brick4   49155     0          Y       15373
Brick chst-gstor-09-san:/bricks/brick4/brick4   49155     0          Y       15011
Brick chst-gstor-11-san:/bricks/brick4/brick4   49155     0          Y       14874
Brick chst-gstor-02-san:/bricks/brick4/brick4   49155     0          Y       15381
Brick chst-gstor-04-san:/bricks/brick4/brick4   49155     0          Y       15123
Brick chst-gstor-06-san:/bricks/brick4/brick4   49155     0          Y       15287
Brick chst-gstor-08-san:/bricks/brick4/brick4   49155     0          Y       15124
Brick chst-gstor-10-san:/bricks/brick4/brick4   49155     0          Y       14910
Brick chst-gstor-12-san:/bricks/brick4/brick4   49155     0          Y       15009
NFS Server on localhost                         2049      0          Y       30949
Self-heal Daemon on localhost                   N/A       N/A        Y       15128
NFS Server on chst-gstor-04-san                 2049      0          Y       28021
Self-heal Daemon on chst-gstor-04-san           N/A       N/A        Y       15159
NFS Server on chst-gstor-08-san                 2049      0          Y       34241
Self-heal Daemon on chst-gstor-08-san           N/A       N/A        Y       15160
NFS Server on chst-gstor-02-san                 2049      0          Y       28723
Self-heal Daemon on chst-gstor-02-san           N/A       N/A        Y       28560
NFS Server on chst-gstor-10-san                 2049      0          Y       20566
Self-heal Daemon on chst-gstor-10-san           N/A       N/A        Y       14946
NFS Server on chst-gstor-05-san                 2049      0          Y       5216
Self-heal Daemon on chst-gstor-05-san           N/A       N/A        Y       15217
NFS Server on chst-gstor-03-san                 2049      0          Y       33838
Self-heal Daemon on chst-gstor-03-san           N/A       N/A        Y       15218
NFS Server on chst-gstor-11-san                 2049      0          Y       28833
Self-heal Daemon on chst-gstor-11-san           N/A       N/A        Y       14910
NFS Server on chst-gstor-07-san                 2049      0          Y       34273
Self-heal Daemon on chst-gstor-07-san           N/A       N/A        Y       15409
NFS Server on chst-gstor-12-san                 2049      0          Y       8998
Self-heal Daemon on chst-gstor-12-san           N/A       N/A        Y       15045
NFS Server on chst-gstor-06-san                 2049      0          Y       36119
Self-heal Daemon on chst-gstor-06-san           N/A       N/A        Y       15323
NFS Server on chst-gstor-09-san                 2049      0          Y       7472
Self-heal Daemon on chst-gstor-09-san           N/A       N/A        Y       15047

Task Status of Volume tezavol
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : 4fb32a67-a507-4a31-81a4-d849037156b7
Status               : completed
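To make the "8 x (4 + 2) = 48" layout easier to read against the brick list, here is a minimal Python sketch that rebuilds the 48 brick names and splits them into disperse subvolumes. It assumes that Gluster forms each subvolume from consecutive bricks in the listed order. The helper names (`build_bricks`, `group_subvolumes`) are hypothetical and exist only for this illustration.

```python
from collections import Counter

DATA, REDUNDANCY = 4, 2            # from "8 x (4 + 2) = 48"
SUBVOL_SIZE = DATA + REDUNDANCY    # 6 bricks per disperse subvolume

# Host numbers in the order they repeat in the brick list (Brick1..Brick12).
HOST_ORDER = [1, 3, 5, 7, 9, 11, 2, 4, 6, 8, 10, 12]


def build_bricks():
    """Rebuild the 48 brick names in the order shown by the volume info."""
    bricks = []
    for n in range(1, 5):  # brick1 .. brick4 directories
        for host in HOST_ORDER:
            bricks.append(f"chst-gstor-{host:02d}-san:/bricks/brick{n}/brick{n}")
    return bricks


def group_subvolumes(bricks, size=SUBVOL_SIZE):
    """Split the ordered brick list into consecutive groups of `size`.

    Assumption: subvolumes are made of consecutive bricks in listed order.
    """
    return [bricks[i:i + size] for i in range(0, len(bricks), size)]


if __name__ == "__main__":
    subvols = group_subvolumes(build_bricks())
    print(len(subvols), "subvolumes of", SUBVOL_SIZE, "bricks each")
    for idx, subvol in enumerate(subvols):
        hosts = Counter(brick.split(":")[0] for brick in subvol)
        # Report how many bricks of this subvolume share a single host.
        print(idx, "max bricks per host:", max(hosts.values()))
```

Under that assumption, every subvolume places its six bricks on six different hosts, so a single host disconnect takes at most one brick out of each (4 + 2) set, which is within the redundancy of 2.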
This issue appears to be fixed upstream, but the fix did not make it into the downstream build. The customer is asking for the fix to be backported.
Made several attempts to reproduce this issue on an older build, but it does not reproduce on downstream code. On 3.12.2-16.el7rhgs, followed the steps mentioned in Comment 28 (mounted over NFS as the root user and continued with the remaining steps) and was able to access the file without any issues. Also ran a sanity check and the non-root user scenarios and didn't find any issues. Considering the above, I am moving this BZ to Conditionally verified.
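Since the original symptom was heals that never completed, a verification pass like the one above could also confirm that no entries remain pending heal. The following is only a sketch and is not part of the steps in the comment. It assumes the output of `gluster volume heal tezavol info` has been captured as text and follows the usual per-brick layout: a `Brick ...` line, then the entries, then a `Number of entries: N` line. The function name `pending_heal_entries` and the `SAMPLE` text are made up for illustration and are not output from this cluster.

```python
import re


def pending_heal_entries(heal_info_text):
    """Map each brick to its 'Number of entries' count from heal-info text.

    Bricks whose count is not numeric are left out of the result.
    """
    pending = {}
    brick = None
    for raw_line in heal_info_text.splitlines():
        line = raw_line.strip()
        if line.startswith("Brick "):
            brick = line[len("Brick "):]
            continue
        match = re.match(r"Number of entries:\s*(\d+)$", line)
        if match and brick is not None:
            pending[brick] = int(match.group(1))
    return pending


# Hypothetical sample text, written by hand for this example only.
SAMPLE = """\
Brick chst-gstor-01-san:/bricks/brick1/brick1
Status: Connected
Number of entries: 0

Brick chst-gstor-03-san:/bricks/brick1/brick1
/dir/file-a
Status: Connected
Number of entries: 1
"""

if __name__ == "__main__":
    counts = pending_heal_entries(SAMPLE)
    for brick, count in counts.items():
        print(brick, count)
    print("total pending:", sum(counts.values()))
```

A total that stays above zero across repeated runs would match the "EC healing doesn't complete" symptom in the description.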
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:2607