Bug 1344322

Summary: [geo-rep]: Worker crashed with OSError: [Errno 9] Bad file descriptor
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Rahul Hinduja <rhinduja>
Component: geo-replication Assignee: Aravinda VK <avishwan>
Status: CLOSED ERRATA QA Contact: Rahul Hinduja <rhinduja>
Severity: medium Docs Contact:
Priority: unspecified    
Version: rhgs-3.1 CC: amukherj, asrivast, avishwan, csaba
Target Milestone: --- Keywords: ZStream
Target Release: RHGS 3.2.0   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: glusterfs-3.8.4-1 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-03-23 05:35:19 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1351522    

Description Rahul Hinduja 2016-06-09 12:18:22 UTC
Description of problem:
=======================

While doing stress testing on a fanout setup, I found the following traceback in the geo-replication worker log:

[2016-06-08 18:08:58.969448] E [syncdutils(/rhs/brick2/b4):276:log_raise_exception] <top>: FAIL: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 306, in twrap
    tf(*aa)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 172, in tailer
    l = os.read(fd, 1024)
OSError: [Errno 9] Bad file descriptor
[2016-06-08 18:08:58.971385] I [syncdutils(/rhs/brick2/b4):220:finalize] <top>: exiting.

The worker crashed and restarted; files were still synced to the slave.
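
For readers unfamiliar with this error class: os.read() raises OSError with errno 9 (EBADF) when the descriptor has already been closed, which is consistent with the traceback above if something closed the fd while the tailer thread was still reading from it. The sketch below only reproduces that error in isolation and shows one way a reader could tolerate it. It is not the patch referenced in this bug; read_chunk is a hypothetical helper name, and the pipe stands in for whatever descriptor the tailer was reading.

```python
import errno
import os


def read_chunk(fd, size=1024):
    """Read up to `size` bytes from fd.

    Hypothetical helper (not from syncdaemon): returns None instead of
    raising when the descriptor has already been closed.
    """
    try:
        return os.read(fd, size)
    except OSError as e:
        if e.errno == errno.EBADF:
            # Descriptor was closed underneath us; treat as end of stream.
            return None
        raise


# A pipe stands in for the descriptor the tailer reads from (assumption).
r, w = os.pipe()
os.write(w, b"hello")

first = read_chunk(r)   # data is available, so this returns the bytes written
os.close(r)             # simulate the fd being closed elsewhere
second = read_chunk(r)  # a bare os.read() here would raise OSError: [Errno 9]
os.close(w)
```

Whether swallowing EBADF is appropriate depends on why the descriptor was closed; in a threaded worker it is usually safer to also coordinate shutdown so the reader stops before the close.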


Version-Release number of selected component (if applicable):
=============================================================

glusterfs-geo-replication-3.7.9-9.el7rhgs.x86_64
glusterfs-3.7.9-9.el7rhgs.x86_64


How reproducible:
=================
Not sure how often this reproduces: similar cases on a non-fanout setup have already been tried multiple times without hitting it. On the fanout setup as well, I could not hit it again with a smaller data set.


Additional Info:
================

Found this in the logs; not sure about the exact steps to reproduce.

Comment 3 Aravinda VK 2016-09-01 07:12:27 UTC
The fix for BZ 1340756 fixes this issue too. Patch sent upstream:
http://review.gluster.org/#/c/15379/

Comment 4 Aravinda VK 2016-09-19 08:29:08 UTC
Patch sent for 3.2.0 as part of BZ 1340756:
https://code.engineering.redhat.com/gerrit/#/c/85007/

Comment 5 Atin Mukherjee 2016-09-19 09:02:28 UTC
Upstream mainline : http://review.gluster.org/15379
Upstream 3.8 : http://review.gluster.org/15447
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/85007

Comment 9 Rahul Hinduja 2017-03-07 09:40:35 UTC
Verified with the build: glusterfs-geo-replication-3.8.4-17.el7rhgs.x86_64

Tried following fops:

create, chmod, chown, chgrp, symlink, hardlink, truncate, rename, remove via rsync in changelog, xsync and history crawl. In none of these cases did I see the crash with "Bad file descriptor". Moving this bug to verified state.

Will create a new bug or reopen this one if it can be reproduced by any other steps.

Comment 11 errata-xmlrpc 2017-03-23 05:35:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html