Before you record your issue, ensure you are using the latest version of Gluster.

Provide version-Release number of selected component (if applicable):
glusterfs-6.0-63.el7rhgs.x86_64

Have you searched the Bugzilla archives for same/similar issues reported?
Did you run SoS report with Insights tool?
Have you discovered any workarounds?
If not, read the troubleshooting documentation to help solve your issue.
https://mojo.redhat.com/groups/gss-gluster (Gluster feature and its troubleshooting)
https://access.redhat.com/articles/1365073 (Specific debug data that needs to be collected for GlusterFS to help troubleshooting)

Please provide the below Mandatory Information:
1 - gluster v <volname> info
2 - gluster v <volname> heal info
3 - gluster v <volname> status
4 - Fuse Mount/SMB/nfs-ganesha/OCS ???

Describe the issue: (please be as detailed as possible and provide log snippets)
[Provide TimeStamp when the issue is seen]

From the support case description:
~~~
What are you experiencing? What are you expecting to happen?
The backups that write to the gluster disk error out intermittently with "Failed to mount the disk media in library"

Define the value or impact to you or the business
Backups are impacted

Where are you experiencing this behavior? What environment?
From the Commvault log, we could see that the backup job failed to get gluster media to write the data:
2742 5231 08/12 04:36:40 6943840 [DM_BASE ] 23080429--1 Failed to get a media to write the backup data
2742 5231 08/12 04:36:40 6943840 [DM_RECEIVER] 23080429--1 DataReceiver::InitWriter: DataWriter Init failed for media_group [3957]
2742 5231 08/12 04:36:42 ####### [DSBACKUP ] ERROR: DataReceiver reported Initialization Failure
2742 5231 08/12 04:36:42 ####### [DSBACKUP ] Error During DataMover Initialization Type: 16 SubT'

-Around this timestamp, the gluster logs show "Transport endpoint is not connected" errors.
glusterfs\ws-glus_69.log
*********************************************
The message "W [MSGID: 122053] [ec-common.c:331:ec_check_status] 0-CHBSP1_devid_69-disperse-3: Operation failed on 1 of 6 subvolumes.(up=111111, mask=111110, remaining=000000, good=111110, bad=000001,(Least significant bit represents first client/brick of subvol), FOP : 'STAT' failed on '/Folder_D6XXFP_10.18.2021_17.26/CV_MAGNETIC' with gfid e4695067-407f-473e-8d25-007b0352c9f1. Parent FOP: No Parent)" repeated 12 times between [2023-08-12 04:22:40.956384] and [2023-08-12 04:23:01.069339]
[2023-08-12 04:26:24.283028] E [rpc-clnt.c:183:call_bail] 0-CHBSP1_devid_69-client-18: bailing out frame type(GlusterFS 4.x v1), op(INODELK(29)), xid = 0x1481fd21, unique = 8844720681, sent = 2023-08-12 03:56:22.148967, timeout = 1800 for 10.166.168.149:49159
[2023-08-12 04:26:24.283090] E [MSGID: 114031] [client-rpc-fops_v2.c:1346:client4_0_inodelk_cbk] 0-CHBSP1_devid_69-client-18: remote operation failed [Transport endpoint is not connected]
[2023-08-12 04:28:54.341795] E [rpc-clnt.c:183:call_bail] 0-CHBSP1_devid_69-client-18: bailing out frame type(GlusterFS 4.x v1), op(INODELK(29)), xid = 0x148200c6, unique = 8844741275, sent = 2023-08-12 03:58:52.184769, timeout = 1800 for 10.166.168.149:49159
[2023-08-12 04:28:54.341839] E [MSGID: 114031] [client-rpc-fops_v2.c:1346:client4_0_inodelk_cbk] 0-CHBSP1_devid_69-client-18: remote operation failed [Transport endpoint is not connected]
[2023-08-12 04:28:54.341907] E [rpc-clnt.c:183:call_bail] 0-CHBSP1_devid_69-client-18: bailing out frame type(GlusterFS 4.x v1), op(INODELK(29)), xid = 0x148200bf, unique = 8844741274, sent = 2023-08-12 03:58:51.983738, timeout = 1800 for 10.166.168.149:49159
[2023-08-12 04:28:54.341916] E [MSGID: 114031] [client-rpc-fops_v2.c:1346:client4_0_inodelk_cbk] 0-CHBSP1_devid_69-client-18: remote operation failed [Transport endpoint is not connected]
[2023-08-12 04:56:25.050003] E [rpc-clnt.c:183:call_bail] 0-CHBSP1_devid_69-client-19: bailing out frame type(GlusterFS 4.x v1), op(INODELK(29)), xid = 0x147fbb39, unique = 8844720681, sent = 2023-08-12 04

When does this behavior occur? Frequency? Repeatedly? At certain times?
-Need to find out why the transport endpoint errors are reported, and how to resolve them.
-Validated that there are no brick failures and not many pending heals.
~~~

The logs for the volume mounts ws-glus-69.log on each gluster node show lots of these errors:
0-CHBSP1_devid_69-client-18: remote operation failed [Transport endpoint is not connected]
0-CHBSP1_devid_69-client-19: remote operation failed [Transport endpoint is not connected]
It is client-18 and client-19 on every node.

Is this issue reproducible? If yes, share more details.:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Any Additional info:
It appears the problem is communication with two bricks. The customer wants to know how to prevent the problem in the future. I want to know which bricks client-18 and client-19 point to, and whether anything is wrong with those two bricks.
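
As a starting point for identifying the two bricks, the client numbers in the quoted log can be decoded with a small script. This is only a sketch under stated assumptions, not a confirmed mapping: it assumes that client-N is the (N+1)th brick in the order listed by `gluster v <volname> info`, and that each disperse subvolume is built from 6 consecutive bricks (the 6 is taken from "1 of 6 subvolumes" in the ec message). The helper names `parse_bail`, `locate_brick` and `bad_positions` are made up for illustration. The result should be checked against the actual volume info output before acting on it.

```python
import re
from datetime import datetime

# One call_bail line copied verbatim from the client log quoted in this report.
BAIL_LINE = (
    "[2023-08-12 04:26:24.283028] E [rpc-clnt.c:183:call_bail] "
    "0-CHBSP1_devid_69-client-18: bailing out frame type(GlusterFS 4.x v1), "
    "op(INODELK(29)), xid = 0x1481fd21, unique = 8844720681, "
    "sent = 2023-08-12 03:56:22.148967, timeout = 1800 for 10.166.168.149:49159"
)

BAIL_RE = re.compile(
    r"^\[(?P<logged>[^\]]+)\] E \[rpc-clnt\.c:\d+:call_bail\] "
    r"\d+-(?P<volume>.+)-client-(?P<index>\d+): .*?"
    r"op\((?P<op>\w+)\(\d+\)\).*?"
    r"sent = (?P<sent>[\d-]+ [\d:.]+), timeout = (?P<timeout>\d+) "
    r"for (?P<host>[\d.]+):(?P<port>\d+)"
)


def parse_bail(line):
    """Extract client index, operation, endpoint and wait time from a call_bail line."""
    m = BAIL_RE.search(line)
    if m is None:
        return None  # truncated or unrelated lines are skipped
    fmt = "%Y-%m-%d %H:%M:%S.%f"
    waited = datetime.strptime(m["logged"], fmt) - datetime.strptime(m["sent"], fmt)
    return {
        "client_index": int(m["index"]),
        "op": m["op"],
        "endpoint": f'{m["host"]}:{m["port"]}',
        "timeout_s": int(m["timeout"]),
        "waited_s": waited.total_seconds(),
    }


def locate_brick(client_index, bricks_per_subvol):
    """ASSUMPTION: client-N is brick N+1 in volume-info order, and subvolumes
    are consecutive groups of bricks_per_subvol bricks."""
    return {
        "brick_number": client_index + 1,
        "disperse_subvol": client_index // bricks_per_subvol,
        "position": client_index % bricks_per_subvol,
    }


def bad_positions(mask):
    """Positions set in an ec bitmask; per the log message, the least
    significant (rightmost) bit is the first brick of the subvolume."""
    return [pos for pos, bit in enumerate(reversed(mask)) if bit == "1"]


info = parse_bail(BAIL_LINE)
print(info)
print(locate_brick(info["client_index"], 6))
print(bad_positions("000001"))  # the bad= mask reported for disperse-3
```

Under these assumptions, client-18 is the first brick of disperse-3, which agrees with the bad=000001 mask in the ec warning for disperse-3, and client-19 would be the second brick of the same subvolume. The parsed line also shows that the INODELK frame was sent to 10.166.168.149:49159 and was bailed out only after the 1800-second timeout elapsed. The call_bail line for client-19 is truncated in the report, so its address cannot be read from it.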