Bug 786094 - fuse client inaccessible with transport endpoint not connected error
Summary: fuse client inaccessible with transport endpoint not connected error
Keywords:
Status: CLOSED DUPLICATE of bug 810450
Alias: None
Product: GlusterFS
Classification: Community
Component: unclassified
Version: pre-release
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: shishir gowda
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-01-31 12:51 UTC by M S Vishwanath Bhat
Modified: 2016-06-01 01:55 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2012-04-27 05:37:24 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
fuse client log (19.54 KB, text/x-log)
2012-01-31 12:51 UTC, M S Vishwanath Bhat

Description M S Vishwanath Bhat 2012-01-31 12:51:30 UTC
Created attachment 558618
fuse client log

Description of problem:
I was building glusterfs on the mountpoint while running some profile/top operations on the server. The volume was a stripe-replicate volume with stripe-block-size set to 64MB. After make exited successfully with a zero exit status, I took down one brick of a replicate pair, and the mountpoint then became inaccessible.

Version-Release number of selected component (if applicable):
glusterfs-3.3.0qa20

How reproducible:
1/1

Steps to Reproduce:
1. Create and start a stripe replicate volume.
2. Set the stripe-block-size to 64MB and enable profiling.
3. Untar both the Linux kernel source and the glusterfs source, and start building the glusterfs source.
4. Meanwhile, keep running some profile and top operations.
5. After 'make' completes, take one of the glusterfsd processes down.
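
The setup part of these steps could be scripted roughly as follows. This is only a sketch, not the commands the reporter actually ran: the helper name build_repro_commands is hypothetical, the volume name and brick paths are copied from the volume info in this report, and the gluster CLI argument forms are assumed from the 3.3-era command line and may differ between releases. The script only builds the command lists; it does not execute anything.

```python
# Sketch only: builds (but does not run) gluster CLI commands for steps 1, 2 and 4.
# The argument forms are assumptions; verify them with `gluster volume help`.

VOLUME = "hosdu"
BRICKS = [
    "10.1.11.113:/data/brick/hosdu_brick1",
    "10.1.11.114:/data/brick/hosdu_brick2",
    "10.1.11.136:/data/brick/hosdu_brick3",
    "10.1.11.137:/data/brick/hosdu_brick4",
]


def build_repro_commands(volume=VOLUME, bricks=BRICKS):
    """Return (setup, monitor): one-off setup commands and commands to repeat during the build."""
    setup = [
        # Step 1: create and start a stripe 2 x replica 2 volume.
        ["gluster", "volume", "create", volume,
         "stripe", "2", "replica", "2", "transport", "tcp", *bricks],
        ["gluster", "volume", "start", volume],
        # Step 2: set the stripe block size and enable profiling.
        ["gluster", "volume", "set", volume, "cluster.stripe-block-size", "64MB"],
        ["gluster", "volume", "profile", volume, "start"],
    ]
    monitor = [
        # Step 4: profile and top operations to keep running while 'make' is in progress.
        ["gluster", "volume", "profile", volume, "info"],
        ["gluster", "volume", "top", volume, "read"],
    ]
    return setup, monitor


if __name__ == "__main__":
    setup, monitor = build_repro_commands()
    for cmd in setup + monitor:
        print(" ".join(cmd))
```

Steps 3 and 5 (the build itself and stopping one glusterfsd) are left out on purpose, since the report does not say which brick process was taken down or how.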

Actual results:
The mountpoint became inaccessible.

[root@RHEL6 hosa_dir]# ls
ls: reading directory .: Transport endpoint is not connected
[root@RHEL6 hosa_dir]# 



Expected results:
Mountpoint should be accessible.

Additional info:

The following options were set on the volume.

Volume Name: hosdu
Type: Striped-Replicate
Volume ID: 56528124-1918-4923-a1cd-c02ddf22e671
Status: Started
Number of Bricks: 1 x 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.1.11.113:/data/brick/hosdu_brick1
Brick2: 10.1.11.114:/data/brick/hosdu_brick2
Brick3: 10.1.11.136:/data/brick/hosdu_brick3
Brick4: 10.1.11.137:/data/brick/hosdu_brick4
Options Reconfigured:
cluster.stripe-block-size: 64MB
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on


Entries from the client log.


[2012-01-31 07:21:56.632023] W [client3_1-fops.c:1273:client3_1_finodelk_cbk] 0-hosdu-client-1: remote operation failed: Invalid argument
[2012-01-31 07:21:56.632077] E [afr-lk-common.c:567:afr_unlock_inodelk_cbk] 0-hosdu-replicate-0: /hosa_dir/glusterfs-3.3.0qa20/rpc/rpc-lib/src/rpcsvc.loT: unlock failed on 1, reason: Invalid argument
[2012-01-31 07:28:07.133314] W [socket.c:1510:__socket_proto_state_machine] 0-hosdu-client-0: reading from socket failed. Error (Transport endpoint is not connected), peer (10.1.11.113:24009)
[2012-01-31 07:28:07.133443] I [client.c:1885:client_rpc_notify] 0-hosdu-client-0: disconnected
[2012-01-31 07:28:17.351524] E [socket.c:1713:socket_connect_finish] 0-hosdu-client-0: connection to 10.1.11.113:24009 failed (Connection refused)
[2012-01-31 07:28:19.221959] W [fuse-bridge.c:2352:fuse_readdir_cbk] 0-glusterfs-fuse: 1100583: READDIR => -1 (Transport endpoint is not connected)
[2012-01-31 07:28:28.205013] W [fuse-bridge.c:2352:fuse_readdir_cbk] 0-glusterfs-fuse: 1100597: READDIR => -1 (Transport endpoint is not connected)
[2012-01-31 07:28:38.486903] W [fuse-bridge.c:2352:fuse_readdir_cbk] 0-glusterfs-fuse: 1100615: READDIR => -1 (Transport endpoint is not connected)
[2012-01-31 07:28:39.418324] W [fuse-bridge.c:2352:fuse_readdir_cbk] 0-glusterfs-fuse: 1100619: READDIR => -1 (Transport endpoint is not connected)
[2012-01-31 07:28:42.739546] W [fuse-bridge.c:2352:fuse_readdir_cbk] 0-glusterfs-fuse: 1100623: READDIR => -1 (Transport endpoint is not connected)
[2012-01-31 07:29:12.037329] W [fuse-bridge.c:2352:fuse_readdir_cbk] 0-glusterfs-fuse: 1100627: READDIR => -1 (Transport endpoint is not connected)
[2012-01-31 07:34:52.870849] W [fuse-bridge.c:2352:fuse_readdir_cbk] 0-glusterfs-fuse: 1101063: READDIR => -1 (Transport endpoint is not connected)


I have attached the client log.
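
For anyone going through the attached log, a small script along these lines can summarise entries such as the ones quoted above. It is a sketch under assumptions: the line layout ([timestamp] LEVEL [file:line:function] domain: message) is inferred only from the entries in this report, and the names parse_log_line and count_by_function are hypothetical helpers, not part of glusterfs.

```python
import re
from collections import Counter

# Assumed layout, inferred from the entries quoted in this report:
#   [timestamp] LEVEL [file:line:function] domain: message
# The leading bracket is optional so that truncated lines still parse.
LOG_LINE = re.compile(
    r"^\[?(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]\s+"
    r"(?P<level>[A-Z])\s+"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]\s+"
    r"(?P<domain>\S+):\s+(?P<msg>.*)$"
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LOG_LINE.match(line.strip())
    return m.groupdict() if m else None


def count_by_function(lines):
    """Count parsed entries per (level, function) pair; unparsable lines are skipped."""
    counts = Counter()
    for line in lines:
        entry = parse_log_line(line)
        if entry:
            counts[(entry["level"], entry["func"])] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "[2012-01-31 07:28:07.133443] I [client.c:1885:client_rpc_notify] 0-hosdu-client-0: disconnected",
        "[2012-01-31 07:28:19.221959] W [fuse-bridge.c:2352:fuse_readdir_cbk] 0-glusterfs-fuse: 1100583: READDIR => -1 (Transport endpoint is not connected)",
    ]
    for key, n in count_by_function(sample).items():
        print(key, n)
```

Grouping by function makes it easy to see when the READDIR failures start relative to the disconnect of hosdu-client-0.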

Comment 1 M S Vishwanath Bhat 2012-04-12 11:15:58 UTC
I got a core dump this time around with glusterfs-3.3.30qa34.


(gdb) bt
#0  0x00007ffda03c4da1 in stripe_readv_cbk (frame=0x7ffda3c190f4, cookie=<value optimized out>, this=<value optimized out>, op_ret=8070, op_errno=<value optimized out>, vector=<value optimized out>, count=1, stbuf=0x7fff71f33e10, 
    iobref=0x5544190, xdata=0x0) at stripe.c:3271
#1  0x00007ffda05e64f1 in afr_readv_cbk (frame=0x7ffda3db7368, cookie=<value optimized out>, this=<value optimized out>, op_ret=8070, op_errno=2, vector=0x7fff71f33c80, count=1, buf=0x7fff71f33e10, iobref=0x5544190, xdata=0x0)
    at afr-inode-read.c:1298
#2  0x00007ffda085e3fb in client3_1_readv_cbk (req=<value optimized out>, iov=<value optimized out>, count=<value optimized out>, myframe=0x7ffda3da3e58) at client3_1-fops.c:2679
#3  0x00007ffda4d2e515 in rpc_clnt_handle_reply (clnt=0x25fda80, pollin=0x590e6b0) at rpc-clnt.c:797
#4  0x00007ffda4d2ed10 in rpc_clnt_notify (trans=<value optimized out>, mydata=0x25fdab0, event=<value optimized out>, data=<value optimized out>) at rpc-clnt.c:916
#5  0x00007ffda4d29e48 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:498
#6  0x00007ffda1693704 in socket_event_poll_in (this=0x260d4e0) at socket.c:1686
#7  0x00007ffda16937e7 in socket_event_handler (fd=<value optimized out>, idx=1, data=0x260d4e0, poll_in=1, poll_out=0, poll_err=<value optimized out>) at socket.c:1801
#8  0x00007ffda4f75884 in event_dispatch_epoll_handler (event_pool=0x2538db0) at event.c:794
#9  event_dispatch_epoll (event_pool=0x2538db0) at event.c:856
#10 0x0000000000406eda in main (argc=<value optimized out>, argv=0x7fff71f34188) at glusterfsd.c:1650


(gdb) p ((stripe_local_t *)(((stripe_local_t *)(frame->local))->orig_frame->local))->fctx->xl_array[0]
$9 = (xlator_t *) 0x25638e0
(gdb) p ((stripe_local_t *)(((stripe_local_t *)(frame->local))->orig_frame->local))->fctx->xl_array[1]
$10 = (xlator_t *) 0x0
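
The two prints above show a valid xlator pointer in xl_array[0] and a NULL pointer in xl_array[1]. When more slots need checking, pasted gdb output can be scanned for NULL values with a short script like the one below. This is a sketch: null_pointers is a hypothetical helper, and the pattern only covers the "$N = (type *) 0x..." form seen in this comment. It says nothing about why the slot is NULL.

```python
import re

# Matches gdb value-history lines such as: $10 = (xlator_t *) 0x0
GDB_PTR = re.compile(
    r"^\$(?P<num>\d+) = \((?P<type>[^)]+)\) (?P<addr>0x[0-9a-fA-F]+)$"
)


def null_pointers(gdb_output):
    """Return the gdb value-history numbers whose printed pointer is NULL."""
    nulls = []
    for line in gdb_output.splitlines():
        m = GDB_PTR.match(line.strip())
        if m and int(m.group("addr"), 16) == 0:
            nulls.append(int(m.group("num")))
    return nulls


if __name__ == "__main__":
    pasted = "$9 = (xlator_t *) 0x25638e0\n$10 = (xlator_t *) 0x0\n"
    print(null_pointers(pasted))
```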

Comment 2 Vijay Bellur 2012-04-18 07:14:37 UTC
Shishir,

Can you please take a look?

Comment 3 shishir gowda 2012-04-27 05:37:24 UTC

*** This bug has been marked as a duplicate of bug 810450 ***

