Description of problem:

Mount fails even though a single brick (out of 3) is up.

[root@dhcp47-30 ~]# ls /var/lib/origin/openshift.local.volumes/pods/1bc94e09-b7d7-11e9-a78c-005056b20f7c/volumes/kubernetes.io~glusterfs/db
ls: cannot access /var/lib/origin/openshift.local.volumes/pods/1bc94e09-b7d7-11e9-a78c-005056b20f7c/volumes/kubernetes.io~glusterfs/db: Transport endpoint is not connected

[root@dhcp46-151 ~]# oc rsh pod/glusterfs-storage-7l2h2 gluster volume status heketidbstorage
Status of volume: heketidbstorage
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.47.134:/var/lib/heketi/mounts/v
g_fc55eb0aa163a697d4e38c8f8f118d79/brick_49
ffc91a89f142ee40d42b2ccf1ac64d/brick        N/A       N/A        N       N/A
Brick 10.70.47.1:/var/lib/heketi/mounts/vg_
18c96c7188b87e757921ea2688cf4b4c/brick_5c96
e4982dab85eb8944f84856ef0355/brick          49152     0          Y       166
Brick 10.70.46.245:/var/lib/heketi/mounts/v
g_e71f46d991fc4ef534858d2e62912860/brick_a8
18266a862e91a15f216b8436fa1830/brick        N/A       N/A        N       N/A
Self-heal Daemon on localhost               N/A       N/A        Y       63959
Self-heal Daemon on dhcp46-245.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       3086
Self-heal Daemon on 10.70.47.14             N/A       N/A        Y       47398
Self-heal Daemon on 10.70.47.134            N/A       N/A        Y       129961

Task Status of Volume heketidbstorage
------------------------------------------------------------------------------
There are no active volume tasks

Tried to connect another client to the volume and it failed. Logs to be attached.

Version-Release number of selected component (if applicable):

sh-4.2# rpm -qa | grep glusterfs
glusterfs-api-6.0-8.el7rhgs.x86_64
glusterfs-fuse-6.0-8.el7rhgs.x86_64
glusterfs-server-6.0-8.el7rhgs.x86_64
glusterfs-libs-6.0-8.el7rhgs.x86_64
glusterfs-6.0-8.el7rhgs.x86_64
glusterfs-client-xlators-6.0-8.el7rhgs.x86_64
glusterfs-cli-6.0-8.el7rhgs.x86_64
glusterfs-geo-replication-6.0-8.el7rhgs.x86_64

Actual results:
The existing connection reports "Transport endpoint is not connected" and subsequent mounts fail.

Expected results:
Connected but read-only volume.
In the latest 3.5 release we fixed an issue specific to a health-check thread failure; the fix is tracked in https://bugzilla.redhat.com/show_bug.cgi?id=1752713. Can you try to reproduce the same on the latest RHGS-3.5 release?

Thanks,
Mohit Agrawal
I am closing the bug. Please reopen if you face the issue on the latest 3.5 release.