Bug 999944 - `Transport end point not connected errors' upon creating files on 2.1 clients mounting U5 volume
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: fuse
Version: 2.1
Hardware: x86_64 Linux
Priority: urgent
Severity: urgent
Assigned To: Amar Tumballi
QA Contact: Sachidananda Urs
Depends On:
Blocks:
Reported: 2013-08-22 08:11 EDT by Sachidananda Urs
Modified: 2013-12-18 19:09 EST

See Also:
Fixed In Version: glusterfs-3.4.0.32rhs-1
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-09-23 18:38:34 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
sosreports for nfs client hang (413.48 KB, application/x-xz)
2013-08-22 08:53 EDT, Sachidananda Urs
sosreports for nfs client hang - 2 (403.25 KB, application/x-xz)
2013-08-22 08:54 EDT, Sachidananda Urs
Replicate logs (30.00 KB, application/x-bzip)
2013-08-22 09:25 EDT, Sachidananda Urs

Description Sachidananda Urs 2013-08-22 08:11:06 EDT
Description of problem:

1. Create a U5 volume.
2. Mount the volume with a 2.1 client.
3. Try creating some files on the mount.
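
Step 3 can be scripted so that the failure is reported as an errno value instead of being read off a `cp` message. This is only a minimal sketch, assuming the volume is already mounted; the names `try_create` and `describe` and the probe file name are hypothetical and do not come from this report.

```python
import errno
import os


def try_create(directory, name="enotconn-probe.txt", payload=b"probe\n"):
    """Create, write and fsync a small file.

    Returns None on success, or the errno of the first failing call.
    """
    path = os.path.join(directory, name)
    try:
        fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o644)
        try:
            os.write(fd, payload)
            os.fsync(fd)  # force the write out to the bricks
        finally:
            os.close(fd)  # close can fail too, as in the cp output
    except OSError as exc:
        return exc.errno
    return None


def describe(result):
    """Turn the return value of try_create into a short readable string."""
    if result is None:
        return "ok"
    return f"{errno.errorcode.get(result, result)}: {os.strerror(result)}"
```

Calling `describe(try_create("/mnt/hybrid"))` against the client mount point (the path here is an assumption) would be expected to report ENOTCONN on an affected client and `ok` on a fixed one.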

Version-Release number of selected component (if applicable):
glusterfs 3.4.0.21rhs built on Aug 20 2013 12:09:43


How reproducible:
Always

Actual results:
File creation fails with "Transport endpoint is not connected" (ENOTCONN).

Expected results:
File operations should succeed.

Additional info:
sosreports attached.
Comment 1 Justin Clift 2013-08-22 08:19:21 EDT
As a data point, please include the /var/log/gluster* and /var/lib/gluster* files (tarred) with this report, from the client with the problem and the gluster volume servers.

This will let people look through the configuration, and any errors in the log files.
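
One way to gather those paths into a single tarball is sketched below. It is only an illustration: the function name `collect_gluster_state` and the output file name are made up here, and reading /var/lib/gluster* will normally need root.

```python
import glob
import os
import tarfile


def collect_gluster_state(output_path,
                          patterns=("/var/log/gluster*", "/var/lib/gluster*")):
    """Bundle every path matching the glob patterns into one gzip tarball.

    Returns the list of top-level paths that were added.
    """
    added = []
    with tarfile.open(output_path, "w:gz") as archive:
        for pattern in patterns:
            for path in sorted(glob.glob(pattern)):
                archive.add(path)  # directories are added recursively
                added.append(path)
    return added
```

Running `collect_gluster_state("gluster-state.tar.gz")` on the client and on each volume server would produce one file per host to attach to the bug.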
Comment 2 Sachidananda Urs 2013-08-22 08:36:49 EDT
Jclift, I'm in the process of attaching sosreports. The upload is failing; give me a moment.
Comment 3 Sachidananda Urs 2013-08-22 08:45:31 EDT
Logs from the client that could not mount the volume (the mount just hangs):

[2013-08-19 05:46:42.100949] I [glusterfsd.c:1970:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.4.0.20rhs (/usr/sbin/glusterfs --volfile-id=/testvol --volfile-server=storage-qe10.lab.eng.rdu2.redhat.com /tmp)
[2013-08-19 05:46:42.103896] I [socket.c:3487:socket_init] 0-glusterfs: SSL support is NOT enabled
[2013-08-19 05:46:42.103940] I [socket.c:3502:socket_init] 0-glusterfs: using system polling thread
[2013-08-19 05:55:45.582236] E [glusterfsd-mgmt.c:1644:mgmt_getspec_cbk] 0-mgmt: failed to fetch volume file (key:/testvol)
[2013-08-19 05:55:45.585232] W [glusterfsd.c:1062:cleanup_and_exit] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x148) [0x7f61a276bf38] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5) [0x7f61a276af35] (-->/usr/sbin/glusterfs(mgmt_getspec_cbk+0x3b8) [0x40bbb8]))) 0-: received signum (0), shutting down
[2013-08-19 05:55:45.585258] I [fuse-bridge.c:6289:fini] 0-fuse: Unmounting '/tmp'.
[2013-08-19 05:55:45.585280] W [glusterfsd.c:1062:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x7f61a1ce690d] (-->/lib64/libpthread.so.0(+0x7851) [0x7f61a2332851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xcd) [0x4052dd]))) 0-: received signum (15), shutting down
[2013-08-19 05:55:45.596353] W [glusterfsd.c:1062:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x7f61a1ce690d] (-->/lib64/libpthread.so.0(+0x7851) [0x7f61a2332851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xcd) [0x4052dd]))) 0-: received signum (15), shutting down
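
The hang shows up in this log as a long silence between the last `socket_init` line and the `failed to fetch volume file` error. A small sketch for measuring such pauses from the bracketed timestamps follows; the helper name `largest_gap` is illustrative, and it assumes every log entry sits on one physical line.

```python
import re
from datetime import datetime

# Matches the leading "[YYYY-MM-DD HH:MM:SS.ffffff]" of a glusterfs log line.
TIMESTAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]")


def largest_gap(lines):
    """Return (seconds, earlier_line, later_line) for the longest pause
    between consecutive timestamped lines, or None if fewer than two exist."""
    stamped = []
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:
            when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
            stamped.append((when, line))
    best = None
    for (t0, line0), (t1, line1) in zip(stamped, stamped[1:]):
        gap = (t1 - t0).total_seconds()
        if best is None or gap > best[0]:
            best = (gap, line0, line1)
    return best
```

Fed the lines of the excerpt above, the longest pause is the one between 05:46:42.103940 and 05:55:45.582236, a little over nine minutes.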
Comment 4 Sachidananda Urs 2013-08-22 08:53:53 EDT
Created attachment 789174
sosreports for nfs client hang
Comment 5 Sachidananda Urs 2013-08-22 08:54:20 EDT
Created attachment 789175
sosreports for nfs client hang - 2
Comment 6 Sachidananda Urs 2013-08-22 08:55:28 EDT
Volume info:

Volume Name: hybrid
Type: Distribute
Volume ID: a6cc8192-41c7-441b-8c7c-1e4bd4289f3b
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 10.70.37.155:/data/re-0
Brick2: 10.70.37.193:/data/re-0
Options Reconfigured:
diagnostics.client-log-level: DEBUG

====================================================

Attaching sosreports for the NFS hang. Trying to reproduce the ENOTCONN error, will update the bug as I progress.
Comment 7 Sachidananda Urs 2013-08-22 09:17:36 EDT
Volume for distributed-replicate setup:

Volume Name: hybrid-1
Type: Replicate
Volume ID: 2b9b2e81-189c-47e0-bc42-b79d3705d3a5
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.70.37.155:/data/re-1
Brick2: 10.70.37.193:/data/re-1

==================================================

In the distributed-replicate setup, the mount succeeds but data creation fails.

[root@bob-the-minion hy-1]# cp /tmp/linux-3.10.3.tar.xz .
cp: writing `./linux-3.10.3.tar.xz': Transport endpoint is not connected
cp: closing `./linux-3.10.3.tar.xz': Transport endpoint is not connected

============================

[2013-08-22 14:41:34.077513] I [fuse-bridge.c:4706:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel 7.13
[2013-08-22 14:41:34.078794] I [afr-common.c:2121:afr_set_root_inode_on_first_lookup] 0-hybrid-1-replicate-0: added root inode
[2013-08-22 14:41:48.714858] W [client-rpc-fops.c:866:client3_3_writev_cbk] 0-hybrid-1-client-1: remote operation failed: Transport endpoint is not connected
[2013-08-22 14:41:48.717916] W [client-rpc-fops.c:866:client3_3_writev_cbk] 0-hybrid-1-client-0: remote operation failed: Transport endpoint is not connected
[2013-08-22 14:41:48.718036] W [fuse-bridge.c:2695:fuse_writev_cbk] 0-glusterfs-fuse: 32: WRITE => -1 (Transport endpoint is not connected)
[2013-08-22 14:41:48.718721] W [client-rpc-fops.c:866:client3_3_writev_cbk] 0-hybrid-1-client-1: remote operation failed: Transport endpoint is not connected
[2013-08-22 14:41:48.721054] W [client-rpc-fops.c:866:client3_3_writev_cbk] 0-hybrid-1-client-0: remote operation failed: Transport endpoint is not connected
[2013-08-22 14:41:48.721974] W [client-rpc-fops.c:866:client3_3_writev_cbk] 0-hybrid-1-client-1: remote operation failed: Transport endpoint is not connected
[2013-08-22 14:41:48.724112] W [client-rpc-fops.c:866:client3_3_writev_cbk] 0-hybrid-1-client-0: remote operation failed: Transport endpoint is not connected
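
To see at a glance which translators report the error, the client log can be tallied per translator name. The sketch below is an assumption-laden illustration: the regular expression is inferred only from the line layout visible in this excerpt, the function name `count_enotconn` is made up, and each entry must be on a single physical line.

```python
import re
from collections import Counter

# Layout inferred from the excerpt:
# [timestamp] LEVEL [file:line:function] translator: message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<xlator>\S+): (?P<msg>.*)$"
)

ENOTCONN_TEXT = "Transport endpoint is not connected"


def count_enotconn(lines):
    """Count log entries mentioning ENOTCONN, keyed by translator name."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and ENOTCONN_TEXT in match.group("msg"):
            counts[match.group("xlator")] += 1
    return counts
```

In the excerpt above, both `0-hybrid-1-client-0` and `0-hybrid-1-client-1` report the failure, so neither replica accepted the write.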
Comment 8 Sachidananda Urs 2013-08-22 09:25:54 EDT
Created attachment 789192
Replicate logs
Comment 9 Sudhir D 2013-09-05 02:07:09 EDT
<snip>
 Internal Whiteboard: Big Bend → Big Bend U1
 Flags: rhs-2.1.0+ → rhs-2.1.z?
</snip>

Removing the blocker flag as it seems to be targeted for Update 1 now.
Comment 10 Amar Tumballi 2013-09-05 04:47:33 EDT
Most probable fix posted upstream for review @ http://review.gluster.org/#/c/5803/
Comment 11 Sachidananda Urs 2013-09-07 09:57:27 EDT
Verified mounting a U5 volume with 2.1 clients. Both NFS and FUSE mounts were tested and work fine.
Comment 12 Scott Haines 2013-09-23 18:38:34 EDT
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html