Created attachment 567940
Description of problem:
1. Created a replicate volume with replica 3 and turned the stat-prefetch/md-cache translator off.
2. On a FUSE client, ran the POSIX compliance tests and then started fileop with a force factor of 30.
3. Set geo-replication.indexing to on.
4. Mounted an NFS client and started the attached script, which creates a directory hierarchy.
5. Brought down 2 bricks, then issued the heal command.
6. Brought the bricks back up, then issued the heal full command.
7. Brought down the 3rd brick. The NFS server started returning I/O errors with "Transport endpoint is not connected". No crashes were found.
I have attached the whole log directory as well as the script I ran.
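For anyone trying to reproduce this, the CLI side of steps 1, 3, 5 and 6 might look roughly like the sketch below. It only builds and prints the command lines; it does not run them. The volume name `vol` is taken from the log prefixes (`0-vol-replicate-0`), the brick addresses are hypothetical placeholders, and the `volume start` step is assumed rather than stated in the report. How the bricks were brought down is also not stated, so that part is left out.

```python
import shlex


def repro_commands(volume="vol",
                   bricks=("server1:/bricks/b1",   # hypothetical brick paths
                           "server1:/bricks/b2",
                           "server1:/bricks/b3")):
    """Return the gluster CLI invocations for steps 1, 3, 5 and 6 as argv lists."""
    return [
        # Step 1: replica 3 volume with stat-prefetch (md-cache) turned off.
        ["gluster", "volume", "create", volume, "replica", "3", *bricks],
        ["gluster", "volume", "start", volume],  # assumed, not stated in the report
        ["gluster", "volume", "set", volume, "performance.stat-prefetch", "off"],
        # Step 3: enable geo-replication indexing.
        ["gluster", "volume", "set", volume, "geo-replication.indexing", "on"],
        # Step 5: heal issued while 2 bricks are down.
        ["gluster", "volume", "heal", volume],
        # Step 6: full heal after the bricks are back up.
        ["gluster", "volume", "heal", volume, "full"],
    ]


if __name__ == "__main__":
    for argv in repro_commands():
        print(shlex.join(argv))
```

The FUSE and NFS mounts, the fileop run, and the attached script would still have to be started by hand between these commands.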
NFS server log:
[2012-03-06 17:45:25.366349] E [afr-self-heal-common.c:1007:afr_sh_common_lookup_resp_handler] 0-vol-replicate-0: path /simple/49/85 on subvolume vol-client-0 => -1 (No such file or directory)
[2012-03-06 17:45:25.823846] I [afr-self-heal-common.c:2037:afr_self_heal_completion_cbk] 0-vol-replicate-0: background meta-data entry self-heal completed on /simple/49
[2012-03-06 17:45:50.881161] W [socket.c:204:__socket_rwv] 0-vol-client-2: readv failed (Connection reset by peer)
[2012-03-06 17:45:50.881192] W [socket.c:1521:__socket_proto_state_machine] 0-vol-client-2: reading from socket failed. Error (Connection reset by peer), peer (127.0.1.1:24011)
[2012-03-06 17:45:51.025457] E [rpc-clnt.c:382:saved_frames_unwind] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x123) [0x7f5fbdb2426b] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x155) [0x7f5fbdb237e0] (-->/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0x1f) [0x7f5fbdb2326e]))) 0-vol-client-2: forced unwinding frame type(GlusterFS 3.1) op(LOOKUP(27)) called at 2012-03-06 17:45:50.852267 (xid=0
[2012-03-06 17:45:51.025565] W [client3_1-fops.c:2180:client3_1_lookup_cbk] 0-vol-client-2: remote operation failed: Transport endpoint is not connected. Path: /simple/51/05
[2012-03-06 17:45:51.026190] I [socket.c:2314:socket_submit_request] 0-vol-client-2: not connected (priv->connected = 0)
[2012-03-06 17:45:51.026210] W [rpc-clnt.c:1507:rpc_clnt_submit] 0-vol-client-2: failed to submit rpc-request (XID: 0x97570x Program: GlusterFS 3.1, ProgVers: 330, Proc: 31) to rpc-transport (vol-client-2)
[2012-03-06 17:45:51.026227] W [client3_1-fops.c:1305:client3_1_entrylk_cbk] 0-vol-client-2: remote operation failed: Transport endpoint is n
[2012-03-06 17:45:51.026272] W [client.c:2011:client_rpc_notify] 0-vol-client-2: Registering a grace timer
[2012-03-06 17:45:51.026287] I [client.c:2024:client_rpc_notify] 0-vol-client-2: disconnected
[2012-03-06 17:45:51.026424] E [socket.c:1724:socket_connect_finish] 0-vol-client-2: connection to 127.0.1.1:24011 failed (Connection refused
[2012-03-06 17:45:51.027251] W [client3_1-fops.c:4789:client3_1_entrylk] 0-vol-client-2: failed to send the fop: Transport endpoint is not co
[2012-03-06 17:45:52.070098] W [nfs3.c:1479:nfs3svc_access_cbk] 0-nfs: 7bb538f9: / => -1 (Transport endpoint is not connected)
[2012-03-06 17:45:52.106642] W [nfs3-helpers.c:3392:nfs3_log_common_res] 0-nfs-nfsv3: XID: 7bb538f9, ACCESS: NFS: 5(I/O error), POSIX: 107(Transport endpoint is not connected)
[2012-03-06 17:45:52.111343] W [nfs3.c:1479:nfs3svc_access_cbk] 0-nfs: 80b538f9: / => -1 (Transport endpoint is not connected)
[2012-03-06 17:45:52.111383] W [nfs3-helpers.c:3392:nfs3_log_common_res] 0-nfs-nfsv3: XID: 80b538f9, ACCESS: NFS: 5(I/O error), POSIX: 107(Transport endpoint is not connected)
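Since the full log directory is attached, it may help to tally messages per translator to see which subvolume the errors come from. The following is a minimal sketch, assuming each entry is on a single line in the layout seen above (`[timestamp] LEVEL [file:line:function] xlator: message`). The regex and helper names are illustrative, not part of GlusterFS, and entries that carry a backtrace before the translator name (such as the `saved_frames_unwind` one) are deliberately not matched.

```python
import re
from collections import Counter

# Layout assumed from the excerpt above:
# [timestamp] LEVEL [file:line:function] xlator: message
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] "
    r"(?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<lineno>\d+):(?P<function>[^\]]+)\] "
    r"(?P<xlator>\S+): "
    r"(?P<message>.*)$"
)

# Three unwrapped entries copied from the log in this report.
SAMPLE = [
    "[2012-03-06 17:45:50.881161] W [socket.c:204:__socket_rwv] "
    "0-vol-client-2: readv failed (Connection reset by peer)",
    "[2012-03-06 17:45:51.026287] I [client.c:2024:client_rpc_notify] "
    "0-vol-client-2: disconnected",
    "[2012-03-06 17:45:52.070098] W [nfs3.c:1479:nfs3svc_access_cbk] "
    "0-nfs: 7bb538f9: / => -1 (Transport endpoint is not connected)",
]


def parse_line(line):
    """Return a dict of fields for one log entry, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def count_by_xlator(lines):
    """Count matching entries per translator name, skipping unparsable lines."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None:
            counts[record["xlator"]] += 1
    return counts


if __name__ == "__main__":
    for xlator, n in count_by_xlator(SAMPLE).most_common():
        print(xlator, n)
```

Filtering the records on `level == "E"` or on the text "Transport endpoint is not connected" would narrow this down to the failures seen after the third brick went down.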
Created attachment 567941
Directory hierarchy structure creation script
This looks like an AFR issue; reassigning to Pranith.
The "pre-release" version is ambiguous and is about to be removed as a choice.
If you believe this is still a bug, please change the status back to NEW and choose the appropriate, applicable version for it.