Bug 959486 - RDMA transport transient bad file descriptor error (dd, iozone, java sequence writer tested)
Summary: RDMA transport transient bad file descriptor error (dd, iozone, java sequence writer tested)
Keywords:
Status: CLOSED EOL
Alias: None
Product: GlusterFS
Classification: Community
Component: rdma
Version: 3.4.0-alpha
Hardware: x86_64
OS: Other
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Raghavendra G
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-05-03 15:30 UTC by Nickolas Wood
Modified: 2015-10-07 14:05 UTC
CC List: 2 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2015-10-07 14:05:11 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments: none

Description Nickolas Wood 2013-05-03 15:30:10 UTC
Description of problem:
When using the RDMA transport type on a distributed replicated volume, there is a chance that a bad file descriptor error is returned. Do note that I have not tested any other volume type. Changing the volume transport type to tcp according to these directions:
http://community.gluster.org/q/how-to-change-transport-type-on-active-volume---glusterfs-3-3/
and retesting using IPoIB yields 100% passing results, with no failure of any kind. This leads me to believe that the OS, my hardware, and the InfiniBand fabric are stable, and that the issue lies in the RDMA libraries provided by and used by Gluster.
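
For reference, switching the transport followed roughly the sequence below. This is only a sketch of the procedure from the linked article; the volume name "testvol", host names, and mount point are placeholders, and config.transport is the volume option that article uses for changing the transport type.

    # stop the volume before changing its transport (names are placeholders)
    gluster volume stop testvol
    # switch the transport from rdma to tcp per the linked directions
    gluster volume set testvol config.transport tcp
    gluster volume start testvol
    # remount the client, now going over IPoIB/tcp
    mount -t glusterfs server1:/testvol /mnt/testvol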


Version-Release number of selected component (if applicable):
OFED 1.5.4.1, GlusterFS 3.4.0 alpha3, Debian Squeeze.

How reproducible:
This issue appears to be transient, as not all test iterations trigger it. I have been able to produce this problem using dd, iozone, and our own Java sequence writer. During all tests, the issue started showing up when the block size was >= 64k. I have seen this issue with the underlying disks formatted as ext4, as XFS, and as a ZFS mirror.


Steps to Reproduce:
1. Create a distributed replicated volume using only the RDMA transport type.
2. Mount
3. Run dd/iozone operations against the mount
    a. I run dd in a loop with 3 iterations of each block size. The dd operation I perform: dd if=/dev/zero of=<mnt>/<file> bs=<16k|32k|64k|128k|256k> count=10000
    b. I run iozone in a loop with 1 iteration of each block size. The iozone operation I perform: iozone -R -l 5 -u 5 -r <16k|32k|64k|128k|256k> -s 100m -F <mnt>/<file1> <mnt>/<file2> <mnt>/<file3> <mnt>/<file4> <mnt>/<file5>

NOTE: naturally, replace the <*> placeholders in the above commands with actual values. A command sketch with example placeholder values follows below.
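
As a concrete sketch of the steps above (the volume name, hosts, bricks, and mount point are placeholders for my actual environment):

    # 1. create a distributed replicated (2x2) volume using only the RDMA transport
    gluster volume create testvol replica 2 transport rdma \
        server1:/export/brick1 server2:/export/brick1 \
        server1:/export/brick2 server2:/export/brick2
    gluster volume start testvol
    # 2. mount the volume on a client
    mount -t glusterfs server1:/testvol /mnt/testvol
    # 3. one dd iteration and one iozone iteration at a block size where failures appear (>= 64k)
    dd if=/dev/zero of=/mnt/testvol/ddtest bs=64k count=10000
    iozone -R -l 5 -u 5 -r 64k -s 100m -F /mnt/testvol/f1 /mnt/testvol/f2 /mnt/testvol/f3 /mnt/testvol/f4 /mnt/testvol/f5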

  
Actual results:
Bad File Descriptor:
[2013-05-02 15:46:23.156343] W [fuse-bridge.c:2127:fuse_writev_cbk] 0-glusterfs-fuse: 1175: WRITE => -1 (Bad file descriptor)
[2013-05-02 15:46:23.156820] W [fuse-bridge.c:1132:fuse_err_cbk] 0-glusterfs-fuse: 1176: FLUSH() ERR => -1 (Bad file descriptor)

In the failure case with dd, you have to kill all glusterfs/glusterd processes before the dd operation will die, as it is stuck in kernel I/O code (see the commands below). Interestingly, iozone will exit gracefully when this error is encountered.
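
What I run on the client to recover in the dd case is roughly the following (process names as they appear on my systems):

    # forcibly stop the gluster processes so the stuck dd can exit
    killall glusterfs glusterd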


Expected results:
No error


Additional info:
My InfiniBand fabric is entirely QDR Mellanox.
Given the nature of my environment and the lack of availability of a recent OFED version for Debian, I can provide the packages that I have built.
I can also provide the Perl programs that I have written to perform the tests.
Connected mode was used during all IPoIB testing.

Comment 1 Raghavendra G 2013-05-17 06:29:48 UTC
Hi Nickolas,

Can you please provide,

1. glusterfs client and server log files.
2. perl scripts you mentioned to reproduce the issues.

regards,
Raghavendra.

Comment 2 Nickolas Wood 2013-05-17 16:46:30 UTC
Since reporting this bug I have upgraded the Debian OS to Wheezy, upgraded the OFED software to 3.5, and upgraded the GlusterFS software to 3.4.0 beta1. I am now unable to reproduce the indicated bug. All testing I have done since these upgrades took place has passed flawlessly. I will continue investigating and testing.

Comment 3 Nickolas Wood 2013-05-24 21:24:02 UTC
I have to reaffirm that this bug does exist; I just hit it again. I have also narrowed down why I wasn't seeing it: there appears to be a difference between a volume created with transport rdma and one created with transport tcp,rdma, even when mounting with <volume>.rdma. When the volume uses the tcp,rdma transport, this error is not hit. When the volume uses the rdma transport only, this error is hit. The two configurations are sketched below.
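
As a sketch of the two configurations being compared (volume, host, brick, and mount names are placeholders; the .rdma mount suffix is the one mentioned above):

    # volume created with transport tcp,rdma: no error, even when mounting the rdma transport explicitly
    gluster volume create testvol replica 2 transport tcp,rdma server1:/export/brick1 server2:/export/brick1
    mount -t glusterfs server1:/testvol.rdma /mnt/testvol

    # volume created with transport rdma only: the bad file descriptor error eventually appears
    gluster volume create testvol replica 2 transport rdma server1:/export/brick1 server2:/export/brick1
    mount -t glusterfs server1:/testvol /mnt/testvol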

Comment 4 Niels de Vos 2015-05-17 22:00:48 UTC
GlusterFS 3.7.0 has been released (http://www.gluster.org/pipermail/gluster-users/2015-May/021901.html), and the Gluster project maintains N-2 supported releases. The last two releases before 3.7 are still maintained, at the moment these are 3.6 and 3.5.

This bug has been filed against the 3.4 release, and will not get fixed in a 3.4 version any more. Please verify whether newer versions are affected by the reported problem. If that is the case, update the bug with a note, and update the version if you can. In case updating the version is not possible, leave a comment in this bug report with the version you tested, and set the "Need additional information the selected bugs from" below the comment box to "bugs".

If there is no response by the end of the month, this bug will get automatically closed.

Comment 5 Kaleb KEITHLEY 2015-10-07 14:05:11 UTC
GlusterFS 3.4.x has reached end-of-life.

If this bug still exists in a later release please reopen this and change the version or open a new bug.

