Bug 905415

Summary: gnfs: nfs process gets killed and throws error
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Saurabh <saujain>
Component: glusterd
Assignee: santosh pradhan <spradhan>
Status: CLOSED ERRATA
QA Contact: Saurabh <saujain>
Severity: high
Docs Contact:
Priority: high
Version: 2.0
CC: kkeithle, mzywusko, rhs-bugs, sdharane, shaines, vagarwal, vbellur
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: glusterfs-3.4.0.4rhs-1
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-09-23 22:39:26 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description: nfs.log
Flags: none

Description Saurabh 2013-01-29 12:02:12 UTC
Created attachment 689754
nfs.log

Description of problem:

Volume type: 2x2 distribute-replicate
Mount type: gluster-nfs
Test executed on the mount point: bonnie++

The NFS process stops and logs errors.

nfs.log snippet,
===============

[client3_1-fops.c:1781:client3_1_fxattrop_cbk] 0-dist-rep-client-0: remote operation failed: Transport endpoint is not connected
[2013-01-28 22:51:57.968945] E [rpc-clnt.c:208:call_bail] 0-dist-rep-client-0: bailing out frame type(GlusterFS 3.1) op(FXATTROP(34)) xid = 0x144676x sent = 2013-01-28 22:21:54.705027. timeout = 1800
[2013-01-28 22:51:57.968975] W [client3_1-fops.c:1781:client3_1_fxattrop_cbk] 0-dist-rep-client-0: remote operation failed: Transport endpoint is not connected
[2013-01-28 22:51:57.969075] E [rpc-clnt.c:208:call_bail] 0-dist-rep-client-0: bailing out frame type(GlusterFS 3.1) op(FXATTROP(34)) xid = 0x144672x sent = 2013-01-28 22:21:54.704157. timeout = 1800
[2013-01-28 22:51:57.969090] W [client3_1-fops.c:1781:client3_1_fxattrop_cbk] 0-dist-rep-client-0: remote operation failed: Transport endpoint is not connected
[2013-01-28 22:51:57.970173] E [rpc-clnt.c:208:call_bail] 0-dist-rep-client-0: bailing out frame type(GlusterFS 3.1) op(FXATTROP(34)) xid = 0x144667x sent = 2013-01-28 22:21:54.701778. timeout = 1800
[2013-01-28 22:51:57.970202] W [client3_1-fops.c:1781:client3_1_fxattrop_cbk] 0-dist-rep-client-0: remote operation failed: Transport endpoint is not connected
[2013-01-28 22:51:57.970709] E [rpc-clnt.c:208:call_bail] 0-dist-rep-client-0: bailing out frame type(GlusterFS 3.1) op(FXATTROP(34)) xid = 0x144665x sent = 2013-01-28 22:21:54.701514. timeout = 1800
[2013-01-28 22:51:57.970737] W [client3_1-fops.c:1781:client3_1_fxattrop_cbk] 0-dist-rep-client-0: remote operation failed: Transport endpoint is not connected
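
Each call_bail entry above carries both the time the frame was sent and the time it was bailed out, so the wait can be checked against the logged timeout. The following is a minimal sketch, not part of the original report: the regular expression is written against the lines quoted here only (it is not a general GlusterFS log grammar), the function name parse_bail is hypothetical, and the timeout value is assumed to be in seconds.

```python
import re
from datetime import datetime

# Two call_bail lines copied verbatim from the nfs.log snippet above.
LOG_LINES = [
    "[2013-01-28 22:51:57.968945] E [rpc-clnt.c:208:call_bail] 0-dist-rep-client-0: "
    "bailing out frame type(GlusterFS 3.1) op(FXATTROP(34)) xid = 0x144676x "
    "sent = 2013-01-28 22:21:54.705027. timeout = 1800",
    "[2013-01-28 22:51:57.969075] E [rpc-clnt.c:208:call_bail] 0-dist-rep-client-0: "
    "bailing out frame type(GlusterFS 3.1) op(FXATTROP(34)) xid = 0x144672x "
    "sent = 2013-01-28 22:21:54.704157. timeout = 1800",
]

# Assumption: this pattern only needs to match the lines quoted in this report.
BAIL_RE = re.compile(
    r"^\[(?P<logged>[\d\- :.]+)\] E \[rpc-clnt\.c:\d+:call_bail\] .*?"
    r"op\((?P<op>\w+)\(\d+\)\) xid = (?P<xid>\S+) "
    r"sent = (?P<sent>[\d\-]+ [\d:.]+)\. timeout = (?P<timeout>\d+)"
)

TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def parse_bail(line):
    """Return details of a call_bail entry, or None for any other line."""
    match = BAIL_RE.search(line)
    if match is None:
        return None
    logged = datetime.strptime(match["logged"], TS_FORMAT)
    sent = datetime.strptime(match["sent"], TS_FORMAT)
    return {
        "op": match["op"],
        "xid": match["xid"],
        "timeout": int(match["timeout"]),
        # Seconds between sending the frame and bailing it out.
        "elapsed": (logged - sent).total_seconds(),
    }


if __name__ == "__main__":
    for line in LOG_LINES:
        info = parse_bail(line)
        if info:
            print(info["xid"], info["op"], info["elapsed"], info["timeout"])
```

For the quoted lines, the elapsed time comes out a few seconds above the logged timeout of 1800, which suggests these FXATTROP requests received no reply for the whole timeout period before being bailed out.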


Version-Release number of selected component (if applicable):
glusterfs-3.3.0.5rhs-37.el6rhs.6.goldman.x86_64

Client: RHEL 6.4
Linux RHEL6.4snap5 2.6.32-356.el6.x86_64 #1 SMP Mon Jan 21 17:56:56 EST 2013 x86_64 x86_64 x86_64 GNU/Linux


How reproducible:
Tried once.

Steps to Reproduce:
1. Execute bonnie++ over a gluster-nfs mount.

Comment 2 Scott Haines 2013-02-06 20:08:04 UTC
Per Feb-06 bug triage meeting, targeting for 2.1.0.

Comment 3 Scott Haines 2013-02-06 20:11:03 UTC
Per Feb-06 bug triage meeting, targeting for 2.1.0.

Comment 4 Rajesh 2013-02-12 11:37:58 UTC
Hi Saurabh, does bonnie++ fail with the latest goldman build?

Comment 5 Saurabh 2013-02-13 07:00:11 UTC
Hi Rajesh,

While executing bonnie++ from a RHEL 6.3 client, the Gluster NFS server goes down without logging any information.

I am currently running bonnie++ from RHEL 6.2.

Comment 6 Saurabh 2013-02-13 08:09:24 UTC
Rajesh, 

With RHEL6.4snap5, the issue mentioned in the description still happens.

Comment 8 santosh pradhan 2013-05-31 09:56:27 UTC
The fix for BZ 961198 (which is the same as the one you mentioned: https://bugzilla.redhat.com/show_bug.cgi?id=961929) has been integrated downstream.

Link to the fix:
https://code.engineering.redhat.com/gerrit/#/c/7503/

The fix is available in glusterfs-3.4.0.8rhs. Moving this bug to ON_QA.

Comment 10 Scott Haines 2013-09-23 22:39:26 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html

Comment 11 Scott Haines 2013-09-23 22:43:44 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html