Description of problem:

When running gluster volume heal <vol> info, several bricks are showing "Transport endpoint not connected".

Version-Release number of selected component (if applicable):

RHEL: 7.2
RHGS:
glusterfs-3.12.2-47.el7rhgs.x86_64
glusterfs-api-3.12.2-47.el7rhgs.x86_64
glusterfs-cli-3.12.2-47.el7rhgs.x86_64
glusterfs-client-xlators-3.12.2-47.el7rhgs.x86_64
glusterfs-fuse-3.12.2-47.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-47.el7rhgs.x86_64
glusterfs-libs-3.12.2-47.el7rhgs.x86_64
glusterfs-rdma-3.12.2-47.el7rhgs.x86_64
glusterfs-server-3.12.2-47.el7rhgs.x86_64
kernel: kernel-3.10.0-327.18.2.el7.x86_64

How reproducible:

Ongoing

Steps to Reproduce:

Run gluster volume heal <vol> info from node 7 or 8; all bricks on node 4 show as "Transport endpoint not connected".

Additional info:

Client VMs running various applications are having trouble connecting to gluster volumes; this is what originally presented as the problem. After sequentially restarting the gluster nodes and checking for healing, the transport endpoint messages were noticed.

During troubleshooting we performed the following (the command sequence is sketched after this comment):

1. Initially noted that several bricks were down. Force-restarted the volumes and most bricks came back online. Afterwards, gluster volume status showed all bricks and self-heal daemons online for the most part; there are a couple of outliers, but most volumes appeared fine.

2. We then tried stopping gluster services with "systemctl stop glusterd; pkill glusterfs; pkill glusterfsd", followed by "systemctl start glusterd", sequentially on each node. Again, gluster volume status showed only a couple of bricks offline, but the transport messages continue on nodes 7 and 8 for all bricks on node 4.

3. We then tried stopping glusterd on all nodes and starting it back up sequentially. No improvement in the transport messages.

4. We noticed that the cluster op-version was set to 30712 (RHGS 3.1 update 3). Had the customer set the op-version to 31305.

5. Asked the customer to check with their end users whether the applications are responding, but at the time this BZ is being opened we do not yet have a response.

6. New gluster node sosreports (post changes) and at least one sosreport from a client having difficulties have been requested. Original sosreports are on collab-shell.

We will continue adding information to this BZ as it becomes available. At this point, we are not sure whether the transport messages and the clients' trouble with the gluster volumes are related.
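For reference, a minimal sketch of the command sequence described above, assuming the standard gluster CLI on RHGS 3.x. <vol> is a placeholder for the affected volume names; the "force start" in step 1 and the op-version check command are inferred from the description rather than quoted from the case, so treat them as assumptions.

  # Check heal status and brick/self-heal daemon status (run from nodes 7 and 8)
  gluster volume heal <vol> info
  gluster volume status <vol>

  # Step 1 (assumed form of the force restart): bring offline bricks back up
  gluster volume start <vol> force

  # Step 2: restart gluster services, one node at a time
  systemctl stop glusterd
  pkill glusterfs
  pkill glusterfsd
  systemctl start glusterd

  # Step 4: check the current cluster op-version (check command assumed), then raise it
  gluster volume get all cluster.op-version
  gluster volume set all cluster.op-version 31305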
https://review.gluster.org/c/glusterfs/+/23606
Kindly verify the updated doc text in the Doc Text field.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0288