Bug 999944 - `Transport end point not connected errors' upon creating files on 2.1 clients mounting U5 volume
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: fuse
Version: 2.1
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Amar Tumballi
QA Contact: Sachidananda Urs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-08-22 12:11 UTC by Sachidananda Urs
Modified: 2013-12-19 00:09 UTC

Fixed In Version: glusterfs-3.4.0.32rhs-1
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-09-23 22:38:34 UTC
Embargoed:


Attachments
sosreports for nfs client hang (413.48 KB, application/x-xz), 2013-08-22 12:53 UTC, Sachidananda Urs
sosreports for nfs client hang - 2 (403.25 KB, application/x-xz), 2013-08-22 12:54 UTC, Sachidananda Urs
Replicate logs (30.00 KB, application/x-bzip), 2013-08-22 13:25 UTC, Sachidananda Urs

Description Sachidananda Urs 2013-08-22 12:11:06 UTC
Description of problem:

1. Create a U5 volume.
2. Mount the volume with 2.1 client
3. Try creating some files.
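
Step 3 can be scripted so that the failure is reported by errno rather than read off the terminal. A minimal sketch in Python, assuming the volume is already mounted; the helper names `try_create_files` and `saw_enotconn` and the file names are hypothetical and do not come from this report:

```python
import errno
import os


def try_create_files(directory, count=5, payload=b"x" * 4096):
    """Create `count` small files under `directory`.

    Returns a list of (path, errno_or_None) tuples so that an ENOTCONN
    ("Transport endpoint is not connected") failure is easy to spot.
    """
    results = []
    for i in range(count):
        path = os.path.join(directory, "probe-%d.dat" % i)
        try:
            with open(path, "wb") as fh:
                fh.write(payload)
                fh.flush()
                # Force the data out instead of leaving it buffered.
                os.fsync(fh.fileno())
            results.append((path, None))
        except OSError as exc:
            results.append((path, exc.errno))
    return results


def saw_enotconn(results):
    """True if any attempt failed with ENOTCONN."""
    return any(err == errno.ENOTCONN for _, err in results)
```

Pointing `try_create_files` at the directory where the volume is mounted and passing the result to `saw_enotconn` tells whether the client hit the error described in this report.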

Version-Release number of selected component (if applicable):
glusterfs 3.4.0.21rhs built on Aug 20 2013 12:09:43


How reproducible:
Always

Actual results:
Transport end point not connected

Expected results:
File operations should succeed.

Additional info:
sosreports attached.

Comment 1 Justin Clift 2013-08-22 12:19:21 UTC
As a data point, please include the /var/log/gluster* and /var/lib/gluster* files (tarred) with this report, from the client with the problem and the gluster volume servers.

This will let people look through the configuration, and any errors in the log files.

Comment 2 Sachidananda Urs 2013-08-22 12:36:49 UTC
Jclift, I'm in the process of attaching the sosreports. The upload is failing; give me a moment.

Comment 3 Sachidananda Urs 2013-08-22 12:45:31 UTC
Logs from the client that could not mount the volume (the mount just hangs):

[2013-08-19 05:46:42.100949] I [glusterfsd.c:1970:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.4.0.20rhs (/usr/sbin/glusterfs --volfile-id=/testvol --volfile-server=storage-qe10.lab.eng.rdu2.redhat.com /tmp)
[2013-08-19 05:46:42.103896] I [socket.c:3487:socket_init] 0-glusterfs: SSL support is NOT enabled
[2013-08-19 05:46:42.103940] I [socket.c:3502:socket_init] 0-glusterfs: using system polling thread
[2013-08-19 05:55:45.582236] E [glusterfsd-mgmt.c:1644:mgmt_getspec_cbk] 0-mgmt: failed to fetch volume file (key:/testvol)
[2013-08-19 05:55:45.585232] W [glusterfsd.c:1062:cleanup_and_exit] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x148) [0x7f61a276bf38] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5) [0x7f61a276af35] (-->/usr/sbin/glusterfs(mgmt_getspec_cbk+0x3b8) [0x40bbb8]))) 0-: received signum (0), shutting down
[2013-08-19 05:55:45.585258] I [fuse-bridge.c:6289:fini] 0-fuse: Unmounting '/tmp'.
[2013-08-19 05:55:45.585280] W [glusterfsd.c:1062:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x7f61a1ce690d] (-->/lib64/libpthread.so.0(+0x7851) [0x7f61a2332851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xcd) [0x4052dd]))) 0-: received signum (15), shutting down
[2013-08-19 05:55:45.596353] W [glusterfsd.c:1062:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x7f61a1ce690d] (-->/lib64/libpthread.so.0(+0x7851) [0x7f61a2332851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xcd) [0x4052dd]))) 0-: received signum (15), shutting down
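
The timestamps in this log give the length of the hang: the gap between the last startup message and the failed volfile fetch. A small Python sketch of that arithmetic, assuming the bracketed timestamp is always the first field of a log line (the helper name `log_time` is made up for illustration):

```python
from datetime import datetime


def log_time(line):
    """Parse the leading '[YYYY-MM-DD HH:MM:SS.ffffff]' field of a log line."""
    stamp = line[1:line.index("]")]
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")


# Two lines copied from the client log above.
started = ("[2013-08-19 05:46:42.103940] I [socket.c:3502:socket_init] "
           "0-glusterfs: using system polling thread")
failed = ("[2013-08-19 05:55:45.582236] E [glusterfsd-mgmt.c:1644:mgmt_getspec_cbk] "
          "0-mgmt: failed to fetch volume file (key:/testvol)")

hang = log_time(failed) - log_time(started)
print(hang.total_seconds())
```

The difference comes to roughly 543 seconds, so the mount sat for about nine minutes before giving up.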

Comment 4 Sachidananda Urs 2013-08-22 12:53:53 UTC
Created attachment 789174
sosreports for nfs client hang

Comment 5 Sachidananda Urs 2013-08-22 12:54:20 UTC
Created attachment 789175
sosreports for nfs client hang - 2

Comment 6 Sachidananda Urs 2013-08-22 12:55:28 UTC
Volume info:

Volume Name: hybrid
Type: Distribute
Volume ID: a6cc8192-41c7-441b-8c7c-1e4bd4289f3b
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 10.70.37.155:/data/re-0
Brick2: 10.70.37.193:/data/re-0
Options Reconfigured:
diagnostics.client-log-level: DEBUG

====================================================

Attaching sosreports for the NFS hang. I am still trying to reproduce the ENOTCONN error and will update the bug as I progress.

Comment 7 Sachidananda Urs 2013-08-22 13:17:36 UTC
Volume for distributed-replicate setup:

Volume Name: hybrid-1
Type: Replicate
Volume ID: 2b9b2e81-189c-47e0-bc42-b79d3705d3a5
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.70.37.155:/data/re-1
Brick2: 10.70.37.193:/data/re-1

==================================================

In the distributed-replicate setup, the mount succeeds but writing data fails.

[root@bob-the-minion hy-1]# cp /tmp/linux-3.10.3.tar.xz .
cp: writing `./linux-3.10.3.tar.xz': Transport endpoint is not connected
cp: closing `./linux-3.10.3.tar.xz': Transport endpoint is not connected

============================

[2013-08-22 14:41:34.077513] I [fuse-bridge.c:4706:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel 7.13
[2013-08-22 14:41:34.078794] I [afr-common.c:2121:afr_set_root_inode_on_first_lookup] 0-hybrid-1-replicate-0: added root inode
[2013-08-22 14:41:48.714858] W [client-rpc-fops.c:866:client3_3_writev_cbk] 0-hybrid-1-client-1: remote operation failed: Transport endpoint is not connected
[2013-08-22 14:41:48.717916] W [client-rpc-fops.c:866:client3_3_writev_cbk] 0-hybrid-1-client-0: remote operation failed: Transport endpoint is not connected
[2013-08-22 14:41:48.718036] W [fuse-bridge.c:2695:fuse_writev_cbk] 0-glusterfs-fuse: 32: WRITE => -1 (Transport endpoint is not connected)
[2013-08-22 14:41:48.718721] W [client-rpc-fops.c:866:client3_3_writev_cbk] 0-hybrid-1-client-1: remote operation failed: Transport endpoint is not connected
[2013-08-22 14:41:48.721054] W [client-rpc-fops.c:866:client3_3_writev_cbk] 0-hybrid-1-client-0: remote operation failed: Transport endpoint is not connected
[2013-08-22 14:41:48.721974] W [client-rpc-fops.c:866:client3_3_writev_cbk] 0-hybrid-1-client-1: remote operation failed: Transport endpoint is not connected
[2013-08-22 14:41:48.724112] W [client-rpc-fops.c:866:client3_3_writev_cbk] 0-hybrid-1-client-0: remote operation failed: Transport endpoint is not connected
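
To see at a glance which translators report the error, the client log can be tallied per translator name. A rough Python sketch, assuming every entry sits on a single line in the `[timestamp] LEVEL [file:line:function] translator: message` layout seen above; the regular expression and the function name `count_enotconn` are illustrative and not part of glusterfs:

```python
import re
from collections import Counter

# Layout assumed from the log excerpt in this comment.
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) \[(?P<src>[^\]]+)\] "
    r"(?P<xlator>\S+): (?P<msg>.*)$"
)
ENOTCONN_TEXT = "Transport endpoint is not connected"


def count_enotconn(lines):
    """Count ENOTCONN messages per translator name."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and ENOTCONN_TEXT in match.group("msg"):
            counts[match.group("xlator")] += 1
    return counts


# Lines copied from the log above, each joined onto one line.
sample = [
    "[2013-08-22 14:41:34.077513] I [fuse-bridge.c:4706:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel 7.13",
    "[2013-08-22 14:41:48.714858] W [client-rpc-fops.c:866:client3_3_writev_cbk] 0-hybrid-1-client-1: remote operation failed: Transport endpoint is not connected",
    "[2013-08-22 14:41:48.717916] W [client-rpc-fops.c:866:client3_3_writev_cbk] 0-hybrid-1-client-0: remote operation failed: Transport endpoint is not connected",
    "[2013-08-22 14:41:48.718036] W [fuse-bridge.c:2695:fuse_writev_cbk] 0-glusterfs-fuse: 32: WRITE => -1 (Transport endpoint is not connected)",
]

for xlator, hits in sorted(count_enotconn(sample).items()):
    print(xlator, hits)
```

In this log both `0-hybrid-1-client-0` and `0-hybrid-1-client-1` show up, so the writes fail against both bricks of the replica pair rather than just one.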

Comment 8 Sachidananda Urs 2013-08-22 13:25:54 UTC
Created attachment 789192
Replicate logs

Comment 9 Sudhir D 2013-09-05 06:07:09 UTC
<snip>
 Internal Whiteboard: Big Bend → Big Bend U1
 Flags: rhs-2.1.0+ → rhs-2.1.z?
</snip>

Removing the blocker flag, as this bug now seems to be targeted for Update 1.

Comment 10 Amar Tumballi 2013-09-05 08:47:33 UTC
Most probable fix posted upstream for review @ http://review.gluster.org/#/c/5803/

Comment 11 Sachidananda Urs 2013-09-07 13:57:27 UTC
Verified mounting a U5 volume with 2.1 clients. Both NFS and FUSE mounts were tested and work fine.

Comment 12 Scott Haines 2013-09-23 22:38:34 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html
