Bug 1666974

Summary: [geo-rep]: Errno 107 Transport endpoint is not connected
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Rochelle <rallan>
Component: geo-replication Assignee: Kotresh HR <khiremat>
Status: CLOSED DUPLICATE QA Contact: Rahul Hinduja <rhinduja>
Severity: low Docs Contact:
Priority: unspecified    
Version: rhgs-3.4 CC: avishwan, csaba, nchilaka, pasik, rhs-bugs, storage-qa-internal, sunkumar
Target Milestone: --- Keywords: ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-11-19 05:25:03 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Rochelle 2019-01-17 06:28:55 UTC
Description of problem:
=======================
While running geo-rep automation, a number of tracebacks were seen on both the master and the slave.

On the slave, there were a number of 'Transport endpoint is not connected' tracebacks:

[2019-01-16 11:28:42.438910] E [repce(slave):117:worker] <top>: call failed: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 113, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 602, in entry_ops
    er = entry_purge(op, entry, gfid, e, uid, gid)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 469, in entry_purge
    isinstance(lstat(os.path.join(pfx, gfid)), int):
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 577, in lstat
    return errno_wrap(os.lstat, [e], [ENOENT], [ESTALE, EBUSY])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 559, in errno_wrap
    return call(*arg)
OSError: [Errno 107] Transport endpoint is not connected: '.gfid/3b534583-f7b5-40e9-a5eb-294dbd2c516d'
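
The traceback shows lstat() calling errno_wrap(os.lstat, [e], [ENOENT], [ESTALE, EBUSY]), and entry_purge() checking whether the result is an int. Errno 107 is ENOTCONN on Linux, which is in neither list. The following is a minimal sketch of how such a wrapper would behave, assuming the third argument lists tolerated errnos (returned as an int) and the fourth lists errnos that are retried. It is not the actual syncdutils.py code: errno_wrap_sketch, fake_lstat, probe, MAX_TRIES and RETRY_DELAY are hypothetical names and values used only for illustration.

```python
import errno
import time

# Hypothetical retry settings for this sketch; the real syncdaemon values may differ.
MAX_TRIES = 3
RETRY_DELAY = 0.01


def errno_wrap_sketch(call, arg=(), errnos=(), retry_errnos=()):
    """Call call(*arg); return the errno for tolerated errors, retry
    transient ones, and re-raise everything else."""
    tries = 0
    while True:
        try:
            return call(*arg)
        except OSError as ex:
            if ex.errno in errnos:
                return ex.errno  # tolerated: reported to the caller as an int
            if ex.errno not in retry_errnos:
                raise            # e.g. ENOTCONN: propagates as a traceback
            tries += 1
            if tries >= MAX_TRIES:
                raise            # still failing after the allowed retries
            time.sleep(RETRY_DELAY)


def fake_lstat(path):
    """Stand-in for os.lstat on a disconnected mount (always ENOTCONN)."""
    raise OSError(errno.ENOTCONN, "Transport endpoint is not connected", path)


def probe(path):
    """Return the errno that escapes the wrapper, or None if nothing escapes."""
    try:
        errno_wrap_sketch(fake_lstat, [path], [errno.ENOENT],
                          [errno.ESTALE, errno.EBUSY])
    except OSError as ex:
        return ex.errno
    return None
```

Under these assumptions, probe() on any path returns errno.ENOTCONN: the error is neither tolerated nor retried, so it surfaces in repce's worker as "call failed", which matches the log above.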

Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.12.2-37.el7rhgs.x86_64

How reproducible:
================
Always with automation

Steps to Reproduce:
===================
Run the geo-rep automation cases.


Actual results:
==============
Many 'Transport endpoint is not connected' tracebacks were seen on the slave.

Expected results:
================
No brick-down operations were performed, so the traceback should not be seen.

Additional info:
================
Crashes were seen, and a bug was raised at: https://bugzilla.redhat.com/show_bug.cgi?id=1666969

Not raising this as a blocker for now since there is no functional impact.

Comment 3 Sunny Kumar 2019-02-19 07:30:18 UTC
Similar to https://bugzilla.redhat.com/show_bug.cgi?id=1640573.

Detailed analysis here:

https://bugzilla.redhat.com/show_bug.cgi?id=1640573#c7

This bug should be closed as a duplicate of 1640573.