Bug 888235 - GlusterFS crashes when graph changes are done while IO is huge
Summary: GlusterFS crashes when graph changes are done while IO is huge
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterfs
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: vpshastry
QA Contact: Sachidananda Urs
URL:
Whiteboard:
Depends On: 979861
Blocks:
Reported: 2012-12-18 10:57 UTC by Sachidananda Urs
Modified: 2014-08-11 23:22 UTC (History)
CC List: 6 users

Fixed In Version: glusterfs-3.4.0qa8
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-09-23 22:34:43 UTC
Embargoed:


Attachments
Core (738.44 KB, application/x-bzip)
2012-12-18 11:04 UTC, Sachidananda Urs

Description Sachidananda Urs 2012-12-18 10:57:54 UTC
Run compilebench on the mountpoint, for example:

./compilebench -D /mnt/replicate/ -i 25 -r 75

While it is running, repeatedly change the graph from the servers:

for i in `seq 10`; do gluster volume set <VOLNAME> quick-read on; gluster volume set <VOLNAME> quick-read off; done

The glusterfs client crashes. On subsequent mounts, mkdir and rmdir also crash the client.
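
The graph-change step can be wrapped in a small helper so that the volume name and cycle count are explicit. This is only a sketch under assumptions: the function name toggle_quick_read and the GLUSTER override variable are hypothetical, the volume name must be supplied by the caller, and performance.quick-read is used as the full key of the quick-read option.

```shell
# Hypothetical helper: toggle performance.quick-read on and off to force
# repeated client graph changes.
#   $1 = volume name (required)
#   $2 = number of on/off cycles (default 10)
# Set GLUSTER="echo gluster" to print the commands instead of running them.
toggle_quick_read() {
    vol="$1"
    cycles="${2:-10}"
    i=0
    while [ "$i" -lt "$cycles" ]; do
        ${GLUSTER:-gluster} volume set "$vol" performance.quick-read on
        ${GLUSTER:-gluster} volume set "$vol" performance.quick-read off
        i=$((i + 1))
    done
}
```

Run it on a server, for example as `toggle_quick_read <VOLNAME> 10`, while compilebench is still writing to the mountpoint on the client.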

==========================


[2012-12-18 10:45:03.903009] W [dht-diskusage.c:45:dht_du_info_cbk] 0-rep-dht: failed to get disk info from rep-replicate-1
[2012-12-18 10:45:03.903019] W [dht-layout.c:179:dht_layout_search] 0-rep-dht: no subvolume for hash (value) = 3441719530
[2012-12-18 10:45:03.903032] W [fuse-bridge.c:417:fuse_entry_cbk] 0-glusterfs-fuse: 33: MKDIR() /jon => -1 (Invalid argument)
[2012-12-18 10:45:06.154077] W [common-utils.c:2296:gf_ports_reserved] 0-glusterfs-socket:  is not a valid port identifier
[2012-12-18 10:45:06.154301] E [socket.c:2695:socket_connect] 0-rep-client-1: connection attempt failed (Connection refused)
[2012-12-18 10:45:06.158675] W [common-utils.c:2296:gf_ports_reserved] 0-glusterfs-socket:  is not a valid port identifier
[2012-12-18 10:45:06.158860] E [socket.c:2695:socket_connect] 0-rep-client-2: connection attempt failed (Connection refused)
[2012-12-18 10:45:06.163262] W [common-utils.c:2296:gf_ports_reserved] 0-glusterfs-socket:  is not a valid port identifier
[2012-12-18 10:45:06.163478] E [socket.c:2695:socket_connect] 0-rep-client-3: connection attempt failed (Connection refused)
[2012-12-18 10:45:09.168510] W [common-utils.c:2296:gf_ports_reserved] 0-glusterfs-socket:  is not a valid port identifier
[2012-12-18 10:45:09.168784] E [socket.c:2695:socket_connect] 0-rep-client-1: connection attempt failed (Connection refused)
[2012-12-18 10:45:09.173340] W [common-utils.c:2296:gf_ports_reserved] 0-glusterfs-socket:  is not a valid port identifier
[2012-12-18 10:45:09.173687] E [socket.c:2695:socket_connect] 0-rep-client-2: connection attempt failed (Connection refused)
[2012-12-18 10:45:09.178247] W [common-utils.c:2296:gf_ports_reserved] 0-glusterfs-socket:  is not a valid port identifier
[2012-12-18 10:45:09.178532] E [socket.c:2695:socket_connect] 0-rep-client-3: connection attempt failed (Connection refused)
[2012-12-18 10:45:10.822603] I [afr-common.c:3874:afr_local_init] 0-rep-replicate-1: no subvolumes up
pending frames:
frame : type(1) op(LOOKUP)
frame : type(0) op(0)

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2012-12-18 10:45:10
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.4.0qa5
/lib64/libc.so.6[0x358ba32900]
/usr/lib64/glusterfs/3.4.0qa5/xlator/cluster/distribute.so(dht_inode_ctx_time_update+0x4e)[0x7fc7821aec9e]
/usr/lib64/glusterfs/3.4.0qa5/xlator/cluster/distribute.so(dht_revalidate_cbk+0x1b1)[0x7fc7821c9291]
/usr/lib64/glusterfs/3.4.0qa5/xlator/cluster/replicate.so(afr_lookup+0x20c)[0x7fc7824373fc]
/usr/lib64/glusterfs/3.4.0qa5/xlator/cluster/distribute.so(dht_lookup+0x275)[0x7fc7821c7635]
/usr/lib64/libglusterfs.so.0(default_lookup+0x6d)[0x7fc786e4ec3d]
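
When comparing this backtrace with other crash reports, it can help to reduce each frame to its library and symbol+offset. The following is a minimal sketch, assuming the frame format shown above; the function name bt_frames is hypothetical, and frames without a symbol (such as the libc one) are skipped.

```shell
# Hypothetical helper: read a glusterfs crash backtrace on stdin and print
# "library symbol+offset" for every frame that carries a symbol name.
bt_frames() {
    sed -n 's|^.*/\([^/(]*\)(\([^)]*\))\[0x[0-9a-f]*\]$|\1 \2|p'
}
```

For example, `bt_frames < client.log` (with a log file name of your choosing) lists the symbolized frames in the order they appear in the log.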

Comment 1 Sachidananda Urs 2012-12-18 11:04:56 UTC
Created attachment 665443
Core

Comment 3 Amar Tumballi 2012-12-18 11:35:58 UTC
Looks similar to bug 881013. Can you please work on this?

Comment 4 vpshastry 2012-12-21 06:55:46 UTC
Patch is under review at http://review.gluster.com/#change,4338.

Comment 5 Sachidananda Urs 2013-07-23 18:25:18 UTC
Tried with TC https://tcms.engineering.redhat.com/case/243282/?from_plan=7656; unable to reproduce.

Comment 6 Scott Haines 2013-09-23 22:34:43 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html

