Bug 800396 - [228d01916c57d5a5716e1097e39e7aa06f31f3e4]: nfs client reports IO error with transport endpoint not connected
Keywords:
Status: CLOSED EOL
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: pre-release
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Assignee: Pranith Kumar K
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-03-06 12:50 UTC by Rahul C S
Modified: 2015-10-22 15:40 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-10-22 15:40:20 UTC
Regression: ---
Mount Type: nfs
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
glusterfs logs (264.03 KB, application/x-bzip)
2012-03-06 12:50 UTC, Rahul C S
directory hierarchy structure creation script (504 bytes, application/x-shellscript)
2012-03-06 12:51 UTC, Rahul C S

Description Rahul C S 2012-03-06 12:50:12 UTC
Created attachment 567940
glusterfs logs

Description of problem:
1. Created a replicate volume with replica 3 and turned the stat-prefetch/md-cache translator off.
2. On a FUSE client, ran the POSIX compliance tests, then started fileop with a force factor of 30.
3. Set geo-replication.indexing to on.
4. Mounted an NFS client and started the attached script, which creates a directory hierarchy.
5. Brought down 2 bricks, then issued the heal command.
6. Brought the bricks back up, then issued the heal full command.
7. Brought down the 3rd brick. The NFS server then started returning I/O errors with "Transport endpoint is not connected". No crashes were found.
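
For anyone retrying this, the steps above that map onto a gluster CLI call can be listed roughly as below. This is only a sketch: the exact commands used were not recorded in this report, the option name performance.stat-prefetch is assumed to be the one behind "stat-prefetch/md-cache off", and the brick paths are hypothetical placeholders. Steps without an obvious CLI equivalent (the test runs, the NFS mount, and taking bricks down and up) are left out. The function only builds the command lists and does not run anything.

```python
def repro_commands(volume, bricks):
    """Build (but do not run) the gluster CLI calls for the reported steps."""
    return [
        # Step 1: replica 3 volume with stat-prefetch/md-cache turned off.
        ["gluster", "volume", "create", volume, "replica", "3", *bricks],
        ["gluster", "volume", "start", volume],
        ["gluster", "volume", "set", volume, "performance.stat-prefetch", "off"],
        # Step 3: enable geo-replication indexing.
        ["gluster", "volume", "set", volume, "geo-replication.indexing", "on"],
        # Step 5: heal issued while 2 bricks are down.
        ["gluster", "volume", "heal", volume],
        # Step 6: full heal issued after the bricks are back up.
        ["gluster", "volume", "heal", volume, "full"],
    ]


if __name__ == "__main__":
    # Hypothetical brick paths; the volume name "vol" matches the log prefix.
    bricks = ["host1:/bricks/b0", "host1:/bricks/b1", "host1:/bricks/b2"]
    for cmd in repro_commands("vol", bricks):
        print(" ".join(cmd))
```
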

I have attached the whole log directory as well as the script I ran.

NFS server log:
[2012-03-06 17:45:25.366349] E [afr-self-heal-common.c:1007:afr_sh_common_lookup_resp_handler] 0-vol-replicate-0: path /simple/49/85 on subvolume vol-client-0 => -1 (No such file or directory)
[2012-03-06 17:45:25.823846] I [afr-self-heal-common.c:2037:afr_self_heal_completion_cbk] 0-vol-replicate-0: background  meta-data entry self-heal completed on /simple/49
[2012-03-06 17:45:50.881161] W [socket.c:204:__socket_rwv] 0-vol-client-2: readv failed (Connection reset by peer)
[2012-03-06 17:45:50.881192] W [socket.c:1521:__socket_proto_state_machine] 0-vol-client-2: reading from socket failed. Error (Connection reset by peer), peer (127.0.1.1:24011)
[2012-03-06 17:45:51.025457] E [rpc-clnt.c:382:saved_frames_unwind] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x123) [0x7f5fbdb2426b] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x155) [0x7f5fbdb237e0] (-->/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0x1f) [0x7f5fbdb2326e]))) 0-vol-client-2: forced unwinding frame type(GlusterFS 3.1) op(LOOKUP(27)) called at 2012-03-06 17:45:50.852267 (xid=0x97569x)
[2012-03-06 17:45:51.025565] W [client3_1-fops.c:2180:client3_1_lookup_cbk] 0-vol-client-2: remote operation failed: Transport endpoint is not connected. Path: /simple/51/05
[2012-03-06 17:45:51.026190] I [socket.c:2314:socket_submit_request] 0-vol-client-2: not connected (priv->connected = 0)
[2012-03-06 17:45:51.026210] W [rpc-clnt.c:1507:rpc_clnt_submit] 0-vol-client-2: failed to submit rpc-request (XID: 0x97570x Program: GlusterFS 3.1, ProgVers: 330, Proc: 31) to rpc-transport (vol-client-2)
[2012-03-06 17:45:51.026227] W [client3_1-fops.c:1305:client3_1_entrylk_cbk] 0-vol-client-2: remote operation failed: Transport endpoint is not connected
[2012-03-06 17:45:51.026272] W [client.c:2011:client_rpc_notify] 0-vol-client-2: Registering a grace timer
[2012-03-06 17:45:51.026287] I [client.c:2024:client_rpc_notify] 0-vol-client-2: disconnected
[2012-03-06 17:45:51.026424] E [socket.c:1724:socket_connect_finish] 0-vol-client-2: connection to 127.0.1.1:24011 failed (Connection refused)
[2012-03-06 17:45:51.027251] W [client3_1-fops.c:4789:client3_1_entrylk] 0-vol-client-2: failed to send the fop: Transport endpoint is not connected
[2012-03-06 17:45:52.070098] W [nfs3.c:1479:nfs3svc_access_cbk] 0-nfs: 7bb538f9: / => -1 (Transport endpoint is not connected)
[2012-03-06 17:45:52.106642] W [nfs3-helpers.c:3392:nfs3_log_common_res] 0-nfs-nfsv3: XID: 7bb538f9, ACCESS: NFS: 5(I/O error), POSIX: 107(Transport endpoint is not connected)
[2012-03-06 17:45:52.111343] W [nfs3.c:1479:nfs3svc_access_cbk] 0-nfs: 80b538f9: / => -1 (Transport endpoint is not connected)
[2012-03-06 17:45:52.111383] W [nfs3-helpers.c:3392:nfs3_log_common_res] 0-nfs-nfsv3: XID: 80b538f9, ACCESS: NFS: 5(I/O error), POSIX: 107(Transport endpoint is not connected)
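
To see which functions report the disconnect when going through the full attached logs, something like the following could be used. It is a minimal sketch, not a gluster tool: it assumes every entry sits on a single line in the form "[timestamp] LEVEL [file:line:function] message" (as in the excerpt above once the wrapping is undone), and the names LOG_LINE, parse and enotconn_by_function are made up for this example. The sample lines are copied from the excerpt.

```python
import re
from collections import Counter

# Assumed entry layout: [timestamp] LEVEL [file:line:function] rest-of-message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<src>[^:\]]+):(?P<lineno>\d+):(?P<func>[^\]]+)\] (?P<rest>.*)$"
)

ENOTCONN_TEXT = "Transport endpoint is not connected"


def parse(line):
    """Split one log entry into its fields, or return None if it does not match."""
    m = LOG_LINE.match(line.strip())
    return m.groupdict() if m else None


def enotconn_by_function(lines):
    """Count entries mentioning the ENOTCONN message, keyed by logging function."""
    counts = Counter()
    for line in lines:
        entry = parse(line)
        if entry and ENOTCONN_TEXT in entry["rest"]:
            counts[entry["func"]] += 1
    return counts


SAMPLE = [
    "[2012-03-06 17:45:50.881161] W [socket.c:204:__socket_rwv] 0-vol-client-2: readv failed (Connection reset by peer)",
    "[2012-03-06 17:45:51.026227] W [client3_1-fops.c:1305:client3_1_entrylk_cbk] 0-vol-client-2: remote operation failed: Transport endpoint is not connected",
    "[2012-03-06 17:45:52.070098] W [nfs3.c:1479:nfs3svc_access_cbk] 0-nfs: 7bb538f9: / => -1 (Transport endpoint is not connected)",
    "[2012-03-06 17:45:52.111343] W [nfs3.c:1479:nfs3svc_access_cbk] 0-nfs: 80b538f9: / => -1 (Transport endpoint is not connected)",
]

if __name__ == "__main__":
    for func, n in enotconn_by_function(SAMPLE).most_common():
        print(func, n)
```

Lines that do not match the assumed layout are skipped rather than raising, so wrapped or truncated entries are simply not counted.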

Comment 1 Rahul C S 2012-03-06 12:51:54 UTC
Created attachment 567941
directory hierarchy structure creation script

Comment 2 Rajesh 2012-03-15 06:20:42 UTC
This looks like an AFR issue; reassigning to Pranith.

Comment 4 Kaleb KEITHLEY 2015-10-22 15:40:20 UTC
The pre-release version is ambiguous and is about to be removed as a choice.

If you believe this is still a bug, please change the status back to NEW and choose the appropriate, applicable version for it.
