1344322 – [geo-rep]: Worker crashed with OSError: [Errno 9] Bad file descriptor

Summary: [geo-rep]: Worker crashed with OSError: [Errno 9] Bad file descriptor

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	geo-replication
Sub Component:
Version:	rhgs-3.1
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	medium
Target Milestone:	---
Target Release:	RHGS 3.2.0
Assignee:	Aravinda VK
QA Contact:	Rahul Hinduja
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1351522

Reported:	2016-06-09 12:18 UTC by Rahul Hinduja
Modified:	2017-03-23 05:35 UTC
CC List:	4 users
Fixed In Version:	glusterfs-3.8.4-1
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2017-03-23 05:35:19 UTC
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2017:0486	0	normal	SHIPPED_LIVE	Moderate: Red Hat Gluster Storage 3.2.0 security, bug fix, and enhancement update	2017-03-23 09:18:45 UTC

Description Rahul Hinduja 2016-06-09 12:18:22 UTC

Description of problem:
=======================

While stress testing on a fanout setup, I found the following traceback:

[2016-06-08 18:08:58.969448] E [syncdutils(/rhs/brick2/b4):276:log_raise_exception] <top>: FAIL: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 306, in twrap
    tf(*aa)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 172, in tailer
    l = os.read(fd, 1024)
OSError: [Errno 9] Bad file descriptor
[2016-06-08 18:08:58.971385] I [syncdutils(/rhs/brick2/b4):220:finalize] <top>: exiting.

The worker crashed and restarted, and files are synced to the slave.
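
For readers unfamiliar with the error: os.read() raises OSError with errno 9 (EBADF) when the descriptor it is given is no longer open. The sketch below reproduces that condition in isolation and shows one defensive way a reader loop could treat it as end of input. It is not the patch from BZ 1340756, and it assumes (the report does not confirm this) that the descriptor was closed elsewhere while the tailer was still reading. The helper name read_chunk is hypothetical.

```python
import errno
import os


def read_chunk(fd, size=1024):
    """Read up to `size` bytes from fd; return None if fd is not open.

    Hypothetical helper for illustration only, not syncdaemon code.
    """
    try:
        return os.read(fd, size)
    except OSError as e:
        # EBADF (errno 9) means the descriptor is closed or invalid.
        if e.errno == errno.EBADF:
            return None
        raise


r, w = os.pipe()
os.write(w, b"log line\n")
first = read_chunk(r)    # data is available, so the bytes are returned
os.close(r)              # simulate the descriptor being closed elsewhere
second = read_chunk(r)   # EBADF is caught and reported as None
os.close(w)
```

Swallowing EBADF like this only hides the symptom; if the descriptor number is reused by another open file before the next read, the reader would silently consume the wrong stream, so fixing the close ordering is generally the safer design.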


Version-Release number of selected component (if applicable):
=============================================================

glusterfs-geo-replication-3.7.9-9.el7rhgs.x86_64
glusterfs-3.7.9-9.el7rhgs.x86_64


How reproducible:
=================
Not sure about the iteration, since similar cases on a non-fanout setup have already been tried multiple times. Even with fanout, I couldn't hit it again with a smaller data set.


Additional Info:
================

Found this in the logs, not sure about the exact steps.

Comment 3 Aravinda VK 2016-09-01 07:12:27 UTC

BZ 1340756 fixes this issue too. Sent patch upstream:
http://review.gluster.org/#/c/15379/

Comment 4 Aravinda VK 2016-09-19 08:29:08 UTC

Patch sent to 3.2.0 as part of BZ 1340756
https://code.engineering.redhat.com/gerrit/#/c/85007/

Comment 5 Atin Mukherjee 2016-09-19 09:02:28 UTC

Upstream mainline : http://review.gluster.org/15379
Upstream 3.8 : http://review.gluster.org/15447
Downstream patch : https://code.engineering.redhat.com/gerrit/#/c/85007

Comment 9 Rahul Hinduja 2017-03-07 09:40:35 UTC

Verified with the build: glusterfs-geo-replication-3.8.4-17.el7rhgs.x86_64

Tried following fops:

create, chmod, chown, chgrp, symlink, hardlink, truncate, rename, remove via rsync in changelog, xsync and history crawl. In none of these cases did I see the crash with respect to Bad file descriptor. Moving this bug to verified state.

Will create/reopen if I can reproduce it by any other steps.

Comment 11 errata-xmlrpc 2017-03-23 05:35:19 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html
