Bug 764325 (GLUSTER-2593)

Summary: Connection Failed / Refused
Product: [Community] GlusterFS
Reporter: Tech Suport @bidorbuy <tech>
Component: transport
Assignee: Raghavendra G <raghavendra>
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: urgent
Version: 3.1.3
CC: amarts, gluster-bugs, prasanth, vijay
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Mount Type: nfs
Attachments (Description: Flags):
tar ball of brick01 log: none
tar ball of brick02 log: none

Description Tech Suport @bidorbuy 2011-03-28 03:07:10 UTC
Created attachment 464

Comment 1 Tech Suport @bidorbuy 2011-03-28 03:07:43 UTC
Created attachment 465

Comment 2 Tech Suport @bidorbuy 2011-03-28 03:09:55 UTC
The error is as follows:

[2011-03-27 13:07:26.308799] E [afr-common.c:110:afr_set_split_brain] 0-shared_images-replicate-0: invalid argument: inode
[2011-03-27 13:07:26.309835] I [afr-self-heal-common.c:1527:afr_self_heal_completion_cbk] 0-shared_images-replicate-0: background  entry self-heal completed on /prodza/user_images/800
[2011-03-27 14:31:12.544537] E [rpc-clnt.c:340:saved_frames_unwind] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(rpc_clnt_notify+0xb9) [0x2ac49f4a9389] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e) [0x2ac49f4a8b2e] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2ac49f4a8a9e]))) 0-rpc-clnt: forced unwinding frame type(GlusterFS 3.1) op(READDIRP(40)) called at 2011-03-27 14:31:10.894860
[2011-03-27 14:31:12.564103] E [rpc-clnt.c:340:saved_frames_unwind] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(rpc_clnt_notify+0xb9) [0x2ac49f4a9389] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e) [0x2ac49f4a8b2e] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2ac49f4a8a9e]))) 0-rpc-clnt: forced unwinding frame type(GlusterFS 3.1) op(READDIRP(40)) called at 2011-03-27 14:31:11.239745
[2011-03-27 14:31:12.564168] E [rpc-clnt.c:340:saved_frames_unwind] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(rpc_clnt_notify+0xb9) [0x2ac49f4a9389] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e) [0x2ac49f4a8b2e] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2ac49f4a8a9e]))) 0-rpc-clnt: forced unwinding frame type(GlusterFS 3.1) op(FINODELK(30)) called at 2011-03-27 14:31:11.312936
[2011-03-27 14:31:12.564231] E [rpc-clnt.c:340:saved_frames_unwind] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(rpc_clnt_notify+0xb9) [0x2ac49f4a9389] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e) [0x2ac49f4a8b2e] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2ac49f4a8a9e]))) 0-rpc-clnt: forced unwinding frame type(GlusterFS 3.1) op(SETATTR(38)) called at 2011-03-27 14:31:11.313011
[2011-03-27 14:31:12.584182] E [rpc-clnt.c:340:saved_frames_unwind] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(rpc_clnt_notify+0xb9) [0x2ac49f4a9389] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e) [0x2ac49f4a8b2e] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2ac49f4a8a9e]))) 0-rpc-clnt: forced unwinding frame type(GlusterFS 3.1) op(READDIRP(40)) called at 2011-03-27 14:31:11.313092
[2011-03-27 14:31:12.584269] E [rpc-clnt.c:340:saved_frames_unwind] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(rpc_clnt_notify+0xb9) [0x2ac49f4a9389] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e) [0x2ac49f4a8b2e] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2ac49f4a8a9e]))) 0-rpc-clnt: forced unwinding frame type(GlusterFS 3.1) op(READDIRP(40)) called at 2011-03-27 14:31:11.313168
[2011-03-27 14:31:12.584370] E [rpc-clnt.c:340:saved_frames_unwind] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(rpc_clnt_notify+0xb9) [0x2ac49f4a9389] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e) [0x2ac49f4a8b2e] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2ac49f4a8a9e]))) 0-rpc-clnt: forced unwinding frame type(GlusterFS 3.1) op(STAT(1)) called at 2011-03-27 14:31:11.443245
[2011-03-27 14:31:12.584465] E [rpc-clnt.c:340:saved_frames_unwind] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(rpc_clnt_notify+0xb9) [0x2ac49f4a9389] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e) [0x2ac49f4a8b2e] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2ac49f4a8a9e]))) 0-rpc-clnt: forced unwinding frame type(GlusterFS 3.1) op(STAT(1)) called at 2011-03-27 14:31:11.736443
[2011-03-27 14:31:12.584540] E [rpc-clnt.c:340:saved_frames_unwind] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(rpc_clnt_notify+0xb9) [0x2ac49f4a9389] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e) [0x2ac49f4a8b2e] (-->/opt/glusterfs/3.1.3/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2ac49f4a8a9e]))) 0-rpc-clnt: forced unwinding frame type(GlusterFS 3.1) op(STAT(1)) called at 2011-03-27 14:31:12.206095
[2011-03-27 14:31:12.584709] I [client.c:1601:client_rpc_notify] 0-shared_images-client-0: disconnected
[2011-03-27 14:31:12.584770] E [socket.c:1661:socket_connect_finish] 0-shared_images-client-0: connection to 10.0.0.210:24009 failed (Connection refused)
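
To see at a glance which operations were in flight when the client lost its connection, the "forced unwinding" lines above can be tallied by operation name. The following is a minimal sketch, not a GlusterFS tool: the function name `count_unwound_ops` and the regular expression are assumptions based only on the line format visible in this log, and the sample lines are shortened copies of the lines above.

```python
import re
from collections import Counter

# Matches the "type(<program>) op(<NAME>(<number>))" part of a
# saved_frames_unwind log line, as it appears in the client log above.
UNWIND_OP = re.compile(
    r"forced unwinding frame type\((?P<prog>[^)]*)\) "
    r"op\((?P<op>\w+)\((?P<num>\d+)\)\)"
)


def count_unwound_ops(log_lines):
    """Return a Counter mapping operation name to number of unwound frames."""
    counts = Counter()
    for line in log_lines:
        match = UNWIND_OP.search(line)
        if match:
            counts[match.group("op")] += 1
    return counts


# Tails of log lines from the report; the leading timestamp and call
# chain are omitted here for brevity.
sample = [
    "0-rpc-clnt: forced unwinding frame type(GlusterFS 3.1) op(READDIRP(40)) called at 2011-03-27 14:31:10.894860",
    "0-rpc-clnt: forced unwinding frame type(GlusterFS 3.1) op(READDIRP(40)) called at 2011-03-27 14:31:11.239745",
    "0-rpc-clnt: forced unwinding frame type(GlusterFS 3.1) op(FINODELK(30)) called at 2011-03-27 14:31:11.312936",
    "0-rpc-clnt: forced unwinding frame type(GlusterFS 3.1) op(STAT(1)) called at 2011-03-27 14:31:11.443245",
    "0-shared_images-client-0: disconnected",
]

summary = count_unwound_ops(sample)
for op_name, count in summary.most_common():
    print(op_name, count)
```

All unwound frames in the excerpt were called within roughly two seconds before the disconnect at 14:31:12, which is consistent with the brick process going away while requests were pending.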

Comment 3 Tech Suport @bidorbuy 2011-03-28 05:27:58 UTC
Additional error from brick01:

[2011-03-27 14:30:13.813265] E [posix.c:699:posix_setattr] 0-shared_images-posix: setattr (lstat) on /data/prodza/user_images/900/.1419921_101128225333_25592-q5800-black-1.jpg.sYdU9w failed: No such file or directory
pending frames:

patchset: v3.1.3
signal received: 6
time of crash: 2011-03-27 14:31:11
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.1.3
/lib64/libc.so.6[0x394ac302d0]
/lib64/libc.so.6(gsignal+0x35)[0x394ac30265]
/lib64/libc.so.6(abort+0x110)[0x394ac31d10]
/lib64/libc.so.6[0x394ac6a84b]
/lib64/libc.so.6[0x394ac72fae]
/lib64/libc.so.6(__libc_calloc+0xcd)[0x394ac7495d]
/opt/glusterfs/3.1.3/lib64/libglusterfs.so.0(__gf_calloc+0x3b)[0x2b07f6bb11eb]
/opt/glusterfs/3.1.3/lib64/libglusterfs.so.0(gf_dirent_for_name+0x26)[0x2b07f6bb1436]
/opt/glusterfs/3.1.3/lib64/glusterfs/3.1.3/xlator/storage/posix.so(posix_do_readdir+0x2fd)[0x2aaaab19ef6d]
/opt/glusterfs/3.1.3/lib64/glusterfs/3.1.3/xlator/storage/posix.so(posix_readdirp+0xf)[0x2aaaab19f6cf]
/opt/glusterfs/3.1.3/lib64/libglusterfs.so.0(default_readdirp+0xe9)[0x2b07f6b935c9]
/opt/glusterfs/3.1.3/lib64/libglusterfs.so.0(default_readdirp+0xe9)[0x2b07f6b935c9]
/opt/glusterfs/3.1.3/lib64/glusterfs/3.1.3/xlator/performance/io-threads.so(iot_readdirp_wrapper+0xe9)[0x2aaaab7d2e79]
/opt/glusterfs/3.1.3/lib64/libglusterfs.so.0(call_resume+0xd66)[0x2b07f6ba3366]
/opt/glusterfs/3.1.3/lib64/glusterfs/3.1.3/xlator/performance/io-threads.so(iot_worker+0x119)[0x2aaaab7d9119]
/lib64/libpthread.so.0[0x394b40673d]
/lib64/libc.so.6(clone+0x6d)[0x394acd3f6d]
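
The backtrace above shows signal 6 (SIGABRT) raised through abort() from inside calloc(), reached via gf_dirent_for_name during a READDIRP call. That pattern is commonly associated with glibc detecting an inconsistent heap, although nothing in this report confirms the cause. As a reading aid, here is a minimal sketch that splits such frames into module, symbol, offset, and address and picks out the first frame outside libc and libpthread. The helper names `parse_frame` and `first_non_libc_symbol` are hypothetical, and the pattern is inferred only from the frame format shown in this report.

```python
import os
import re

# Frame format seen in the crash dump:
#   /path/to/module(symbol+0xOFFSET)[0xADDRESS]
# The "(symbol+0xOFFSET)" part is absent for unresolved frames.
FRAME = re.compile(
    r"^(?P<module>[^\s(\[]+)"
    r"(?:\((?P<symbol>[^+)]*)(?:\+(?P<offset>0x[0-9a-fA-F]+))?\))?"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]$"
)


def parse_frame(line):
    """Return a dict with module, symbol, offset, address, or None."""
    match = FRAME.match(line.strip())
    return match.groupdict() if match else None


def first_non_libc_symbol(lines):
    """Return the symbol of the innermost frame outside libc/libpthread."""
    for line in lines:
        frame = parse_frame(line)
        if frame is None or not frame["symbol"]:
            continue
        module_name = os.path.basename(frame["module"])
        if module_name.startswith(("libc.so", "libpthread.so")):
            continue
        return frame["symbol"]
    return None


# A few frames copied from the backtrace above, innermost first.
frames = [
    "/lib64/libc.so.6[0x394ac302d0]",
    "/lib64/libc.so.6(gsignal+0x35)[0x394ac30265]",
    "/lib64/libc.so.6(abort+0x110)[0x394ac31d10]",
    "/lib64/libc.so.6(__libc_calloc+0xcd)[0x394ac7495d]",
    "/opt/glusterfs/3.1.3/lib64/libglusterfs.so.0(__gf_calloc+0x3b)[0x2b07f6bb11eb]",
    "/opt/glusterfs/3.1.3/lib64/libglusterfs.so.0(gf_dirent_for_name+0x26)[0x2b07f6bb1436]",
]

print(first_non_libc_symbol(frames))
```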

Comment 4 Tech Suport @bidorbuy 2011-03-28 06:06:08 UTC
We have two gluster bricks on two nodes that replicate to each other. While I was transferring data to the NFS client share, we experienced a connection failed / refused error on both bricks. I'm attaching the log files for this occurrence.

Comment 5 Raghavendra G 2011-03-28 07:24:27 UTC
Hi,

The attached log files contain some garbage. Can you re-attach them? Did you see any "Out of memory" messages in the dmesg output? If not, is it possible for you to run the glusterfs servers under valgrind and attach its output?

regards,
Raghavendra.
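
The dmesg check requested above can be scripted. The following is a minimal sketch, assuming the kernel log has already been captured as text; the function name `find_oom_lines`, the search pattern, and the sample kernel lines are all illustrative assumptions and do not come from the reporter's machines.

```python
import re

# Case-insensitive patterns that commonly appear in Linux kernel
# out-of-memory messages. This list is an assumption, not exhaustive.
OOM_PATTERN = re.compile(r"out of memory|oom-killer|oom_kill", re.IGNORECASE)


def find_oom_lines(dmesg_text):
    """Return the lines of dmesg_text that look like OOM messages."""
    return [line for line in dmesg_text.splitlines() if OOM_PATTERN.search(line)]


# Hypothetical dmesg excerpt, made up for illustration only.
sample_dmesg = "\n".join(
    [
        "eth0: link up, 1000Mbps, full-duplex",
        "Out of memory: Kill process 1234 (glusterfsd) score 900 or sacrifice child",
        "EXT3-fs: mounted filesystem with ordered data mode.",
    ]
)

for hit in find_oom_lines(sample_dmesg):
    print(hit)
```

On a live system the text could be obtained with `subprocess.run(["dmesg"], capture_output=True, text=True).stdout` (Python 3.7 or later), which may need elevated privileges on some distributions. An empty result does not rule out memory problems; it only means no matching message is in the current kernel ring buffer.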

Comment 6 Tech Suport @bidorbuy 2011-03-29 06:56:45 UTC
Hi, the log files are too big to attach to this bug report, so I have uploaded them to your FTP server, ndiavoip.gluster.com. Could you have a look at them there?

Regards
Emile

Comment 7 Raghavendra G 2011-06-09 02:40:14 UTC
(In reply to comment #6)
Can you paste the glusterfs configuration of the two bricks? Is it server-side replication?

regards,

Comment 8 Amar Tumballi 2011-09-28 04:49:33 UTC
The crash report here is very much the same as bug 764903, and a patch to fix it has been committed: http://patches.gluster.com/patch/7917

Marking the issue as resolved. If it still persists with the latest version of glusterfs, please re-open.