Bug 1344322 - [geo-rep]: Worker crashed with OSError: [Errno 9] Bad file descriptor
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: geo-replication
Version: rhgs-3.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: RHGS 3.2.0
Assignee: Aravinda VK
QA Contact: Rahul Hinduja
URL:
Whiteboard:
Depends On:
Blocks: 1351522
 
Reported: 2016-06-09 12:18 UTC by Rahul Hinduja
Modified: 2017-03-23 05:35 UTC (History)

Fixed In Version: glusterfs-3.8.4-1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-03-23 05:35:19 UTC
Embargoed:




Links
System ID: Red Hat Product Errata RHSA-2017:0486
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: Red Hat Gluster Storage 3.2.0 security, bug fix, and enhancement update
Last Updated: 2017-03-23 09:18:45 UTC

Description Rahul Hinduja 2016-06-09 12:18:22 UTC
Description of problem:
=======================

While stress testing a fanout setup, found the following traceback:

[2016-06-08 18:08:58.969448] E [syncdutils(/rhs/brick2/b4):276:log_raise_exception] <top>: FAIL: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 306, in twrap
    tf(*aa)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 172, in tailer
    l = os.read(fd, 1024)
OSError: [Errno 9] Bad file descriptor
[2016-06-08 18:08:58.971385] I [syncdutils(/rhs/brick2/b4):220:finalize] <top>: exiting.

The worker crashed and restarted; files are synced to the slave.
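
For illustration only: in Python, os.read() raises OSError with errno 9 (EBADF) when the descriptor it is given has already been closed, for example by another thread. The sketch below reproduces that condition in isolation and shows one way a read loop could treat it as an orderly stop instead of crashing. The helper name tail_fd is hypothetical; this is not the gsyncd code and not the patch referenced in this bug, and the report does not establish what closed the descriptor in the worker.

```python
import errno
import os


def tail_fd(fd, chunk_size=1024):
    """Read from fd until EOF and return the chunks collected.

    Hypothetical helper, not the gsyncd tailer: it assumes that EBADF
    means "the descriptor was closed elsewhere" and stops quietly.
    Any other OSError is re-raised.
    """
    chunks = []
    while True:
        try:
            data = os.read(fd, chunk_size)
        except OSError as e:
            if e.errno == errno.EBADF:
                break  # descriptor already closed: stop instead of crashing
            raise
        if not data:
            break  # EOF: the write end was closed
        chunks.append(data)
    return chunks


if __name__ == "__main__":
    r, w = os.pipe()
    os.write(w, b"stderr line\n")
    os.close(w)
    print(tail_fd(r))  # reads the buffered data, then hits EOF
    os.close(r)
    print(tail_fd(r))  # descriptor is closed: EBADF is swallowed
```

Whether swallowing EBADF is appropriate depends on who owns the descriptor; if the close is unexpected, logging it before stopping would make the condition easier to trace.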


Version-Release number of selected component (if applicable):
=============================================================

glusterfs-geo-replication-3.7.9-9.el7rhgs.x86_64
glusterfs-3.7.9-9.el7rhgs.x86_64


How reproducible:
=================
Not sure about the reproduction rate, since similar cases have already been tried multiple times on a non-fanout setup. With fanout as well, I couldn't hit it again with a smaller data set.


Additional Info:
================

Found this in the logs; not sure of the exact steps to reproduce.

Comment 3 Aravinda VK 2016-09-01 07:12:27 UTC
The fix for BZ 1340756 also fixes this issue. Patch sent upstream:
http://review.gluster.org/#/c/15379/

Comment 4 Aravinda VK 2016-09-19 08:29:08 UTC
Patch sent to 3.2.0 as part of BZ 1340756
https://code.engineering.redhat.com/gerrit/#/c/85007/

Comment 5 Atin Mukherjee 2016-09-19 09:02:28 UTC
Upstream mainline : http://review.gluster.org/15379
Upstream 3.8 : http://review.gluster.org/15447
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/85007

Comment 9 Rahul Hinduja 2017-03-07 09:40:35 UTC
Verified with the build: glusterfs-geo-replication-3.8.4-17.el7rhgs.x86_64

Tried the following fops:

create, chmod, chown, chgrp, symlink, hardlink, truncate, rename, and remove, via rsync in changelog, xsync, and history crawl. None of these cases hit the Bad file descriptor crash. Moving this bug to verified state.

Will create a new bug or reopen this one if the crash can be reproduced by any other steps.

Comment 11 errata-xmlrpc 2017-03-23 05:35:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html

