Bug 1226254 - Glusterd crash [NEEDINFO]
Summary: Glusterd crash
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: 3.6.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Atin Mukherjee
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-05-29 10:04 UTC by Felix
Modified: 2015-08-12 05:11 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-08-12 05:11:44 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
amukherj: needinfo? (felix.delelisdd)


Attachments
sosreport (10.00 MB, application/x-xz), 2015-05-29 10:30 UTC, Felix
File1 (10.00 MB, application/x-xz), 2015-05-29 10:50 UTC, Felix
File2 (10.00 MB, application/octet-stream), 2015-05-29 10:59 UTC, Felix
File 3 (10.00 MB, application/octet-stream), 2015-05-29 11:04 UTC, Felix
File 4 (10.00 MB, application/octet-stream), 2015-05-29 11:10 UTC, Felix
File 5 (10.00 MB, application/octet-stream), 2015-05-29 11:17 UTC, Felix
File 6 (10.00 MB, application/octet-stream), 2015-05-29 11:18 UTC, Felix
File 7 (10.00 MB, application/octet-stream), 2015-05-29 11:20 UTC, Felix
File 8 (10.00 MB, application/octet-stream), 2015-05-29 11:25 UTC, Felix
File 9 (10.00 MB, application/octet-stream), 2015-05-29 11:30 UTC, Felix
File 10 (10.00 MB, application/octet-stream), 2015-05-29 11:35 UTC, Felix
File 11 (10.00 MB, application/octet-stream), 2015-05-29 11:39 UTC, Felix
File 12 (10.00 MB, application/octet-stream), 2015-05-29 11:43 UTC, Felix
File 13 (10.00 MB, application/octet-stream), 2015-05-29 11:46 UTC, Felix
File 14 (10.00 MB, application/octet-stream), 2015-05-29 11:54 UTC, Felix
File 15 (10.00 MB, application/octet-stream), 2015-05-29 11:58 UTC, Felix
Glusterd log (4.50 MB, application/x-gzip), 2015-06-01 09:34 UTC, Felix
Cli log (3.20 MB, application/x-gzip), 2015-06-01 09:37 UTC, Felix
Glustershd log (78.75 KB, text/plain), 2015-06-01 10:25 UTC, Felix
cmd history (5.28 MB, text/plain), 2015-06-01 10:30 UTC, Felix

Description Felix 2015-05-29 10:04:33 UTC
Description of problem:

Hi,

I have a 3-node cluster in pre-production. Yesterday, one node went down. This is the error I saw in the log:


[2015-05-28 19:04:27.305560] E [glusterd-syncop.c:1578:gd_sync_task_begin] 0-management: Unable to acquire lock for cfe-gv1
The message "I [MSGID: 106006] [glusterd-handler.c:4257:__glusterd_nodesvc_rpc_notify] 0-management: nfs has disconnected from glusterd." repeated 5 times between [2015-05-28 19:04:09.346088] and [2015-05-28 19:04:24.349191]
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2015-05-28 19:04:27
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.6.1
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb2)[0x7fd86e2f1232]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x32d)[0x7fd86e30871d]
/usr/lib64/libc.so.6(+0x35640)[0x7fd86d30c640]
/usr/lib64/glusterfs/3.6.1/xlator/mgmt/glusterd.so(glusterd_remove_pending_entry+0x2c)[0x7fd85f52450c]
/usr/lib64/glusterfs/3.6.1/xlator/mgmt/glusterd.so(+0x5ae28)[0x7fd85f511e28]
/usr/lib64/glusterfs/3.6.1/xlator/mgmt/glusterd.so(glusterd_op_sm+0x237)[0x7fd85f50f027]
/usr/lib64/glusterfs/3.6.1/xlator/mgmt/glusterd.so(__glusterd_brick_op_cbk+0x2fe)[0x7fd85f53be5e]
/usr/lib64/glusterfs/3.6.1/xlator/mgmt/glusterd.so(glusterd_big_locked_cbk+0x4c)[0x7fd85f53d48c]
/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0x90)[0x7fd86e0c50b0]
/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x171)[0x7fd86e0c5321]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fd86e0c1273]
/usr/lib64/glusterfs/3.6.1/rpc-transport/socket.so(+0x8530)[0x7fd85d17d530]
/usr/lib64/glusterfs/3.6.1/rpc-transport/socket.so(+0xace4)[0x7fd85d17fce4]
/usr/lib64/libglusterfs.so.0(+0x76322)[0x7fd86e346322]
/usr/sbin/glusterd(main+0x502)[0x7fd86e79afb2]
/usr/lib64/libc.so.6(__libc_start_main+0xf5)[0x7fd86d2f8af5]
/usr/sbin/glusterd(+0x6351)[0x7fd86e79b351]
---------

Version-Release number of selected component (if applicable):

3.6.1

How reproducible:



Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Felix 2015-05-29 10:30:56 UTC
Created attachment 1031880 [details]
sosreport

Comment 2 Felix 2015-05-29 10:50:24 UTC
Created attachment 1031897 [details]
File1

Comment 3 Felix 2015-05-29 10:59:38 UTC
Created attachment 1031909 [details]
File2

Comment 4 Felix 2015-05-29 11:04:35 UTC
Created attachment 1031921 [details]
File 3

Comment 5 Felix 2015-05-29 11:10:22 UTC
Created attachment 1031935 [details]
File 4

Comment 6 Felix 2015-05-29 11:17:15 UTC
Created attachment 1031949 [details]
File 5

Comment 7 Felix 2015-05-29 11:18:56 UTC
Created attachment 1031952 [details]
File 6

Comment 8 Felix 2015-05-29 11:20:58 UTC
Created attachment 1031953 [details]
File 7

Comment 9 Felix 2015-05-29 11:25:57 UTC
Created attachment 1031982 [details]
File 8

Comment 10 Felix 2015-05-29 11:30:35 UTC
Created attachment 1032015 [details]
File 9

Comment 11 Felix 2015-05-29 11:35:28 UTC
Created attachment 1032017 [details]
File 10

Comment 12 Felix 2015-05-29 11:39:59 UTC
Created attachment 1032018 [details]
File 11

Comment 13 Felix 2015-05-29 11:43:22 UTC
Created attachment 1032019 [details]
File 12

Comment 14 Felix 2015-05-29 11:46:59 UTC
Created attachment 1032022 [details]
File 13

Comment 15 Felix 2015-05-29 11:54:07 UTC
Created attachment 1032034 [details]
File 14

Comment 16 Felix 2015-05-29 11:58:58 UTC
Created attachment 1032036 [details]
File 15

Comment 17 Felix 2015-06-01 09:34:49 UTC
Created attachment 1033229 [details]
Glusterd log

Comment 18 Felix 2015-06-01 09:37:34 UTC
Created attachment 1033231 [details]
Cli log

Comment 19 Felix 2015-06-01 10:25:26 UTC
Created attachment 1033249 [details]
Glustershd log

Comment 20 Atin Mukherjee 2015-06-01 10:28:33 UTC
Please attach the core file and mention the steps performed to hit the crash.

Comment 21 Felix 2015-06-01 10:30:08 UTC
Created attachment 1033252 [details]
cmd history

Comment 22 Atin Mukherjee 2015-06-01 11:48:35 UTC
The problem I see here is that concurrent volume status transactions were running at the same time. 3.6.1 is missing some fixes for issues identified in this area. If you upgrade your cluster to the 3.6.3 beta version, the problem will go away. However, 3.6.3 still lacks one more fix, http://review.gluster.org/#/c/10023/, which will be released in 3.6.4.

I would request that you upgrade your cluster to 3.6.3, if not 3.7.
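
Until the cluster is upgraded, it may help to keep monitoring scripts from issuing overlapping volume status commands. What follows is only a minimal sketch under stated assumptions: it assumes the status checks come from scripts on one node, and the lock file path and the function names run_serialized and volume_status are hypothetical, not part of GlusterFS. It serializes calls on a single machine only; commands issued from other nodes can still overlap, and it is not a substitute for the fixes mentioned above.

```python
import fcntl
import subprocess

# Hypothetical lock file location; any path writable by the calling scripts works.
LOCK_PATH = "/tmp/gluster-status.lock"


def run_serialized(cmd, lock_path=LOCK_PATH):
    """Run cmd while holding an exclusive file lock so local calls never overlap."""
    with open(lock_path, "w") as lock_file:
        # Blocks until any other process on this machine releases the lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            return subprocess.run(cmd, capture_output=True, text=True)
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def volume_status(volname):
    """Query the status of one volume, one call at a time on this node."""
    return run_serialized(["gluster", "volume", "status", volname])
```

A script would then call volume_status("cfe-gv1") instead of invoking the gluster CLI directly, and check the returncode and stdout of the result.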

Comment 23 Atin Mukherjee 2015-06-02 04:06:53 UTC
Could you upgrade your cluster and check whether this problem goes away? If it does, would you mind closing this bug?

Comment 24 Atin Mukherjee 2015-08-12 05:11:44 UTC
Since the reporter hasn't gotten back with updates, I am closing this bug. Feel free to reopen it if the problem persists.

