Description of problem:

Hi, I have a cluster with 3 nodes in pre-production. Yesterday, one node went down. The error I have seen is this:

[2015-05-28 19:04:27.305560] E [glusterd-syncop.c:1578:gd_sync_task_begin] 0-management: Unable to acquire lock for cfe-gv1
The message "I [MSGID: 106006] [glusterd-handler.c:4257:__glusterd_nodesvc_rpc_notify] 0-management: nfs has disconnected from glusterd." repeated 5 times between [2015-05-28 19:04:09.346088] and [2015-05-28 19:04:24.349191]

pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2015-05-28 19:04:27
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.6.1
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb2)[0x7fd86e2f1232]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x32d)[0x7fd86e30871d]
/usr/lib64/libc.so.6(+0x35640)[0x7fd86d30c640]
/usr/lib64/glusterfs/3.6.1/xlator/mgmt/glusterd.so(glusterd_remove_pending_entry+0x2c)[0x7fd85f52450c]
/usr/lib64/glusterfs/3.6.1/xlator/mgmt/glusterd.so(+0x5ae28)[0x7fd85f511e28]
/usr/lib64/glusterfs/3.6.1/xlator/mgmt/glusterd.so(glusterd_op_sm+0x237)[0x7fd85f50f027]
/usr/lib64/glusterfs/3.6.1/xlator/mgmt/glusterd.so(__glusterd_brick_op_cbk+0x2fe)[0x7fd85f53be5e]
/usr/lib64/glusterfs/3.6.1/xlator/mgmt/glusterd.so(glusterd_big_locked_cbk+0x4c)[0x7fd85f53d48c]
/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0x90)[0x7fd86e0c50b0]
/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x171)[0x7fd86e0c5321]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fd86e0c1273]
/usr/lib64/glusterfs/3.6.1/rpc-transport/socket.so(+0x8530)[0x7fd85d17d530]
/usr/lib64/glusterfs/3.6.1/rpc-transport/socket.so(+0xace4)[0x7fd85d17fce4]
/usr/lib64/libglusterfs.so.0(+0x76322)[0x7fd86e346322]
/usr/sbin/glusterd(main+0x502)[0x7fd86e79afb2]
/usr/lib64/libc.so.6(__libc_start_main+0xf5)[0x7fd86d2f8af5]
/usr/sbin/glusterd(+0x6351)[0x7fd86e79b351]
---------

Version-Release number of selected component (if applicable):
glusterfs 3.6.1

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
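For anyone reading the trace: the crash is a SIGSEGV (signal 11) inside glusterd_remove_pending_entry() while the op state machine handles a brick-op callback. The small C sketch below is purely illustrative; the struct and function names are invented and are not taken from the glusterfs sources. It only shows the kind of pending-entry list walk such a helper performs, with a comment marking where a stale or already-freed entry would turn the traversal into an invalid dereference of exactly this kind.

/* Illustrative sketch only: a simplified "pending entry" list and remove
 * helper, loosely modelled on what a management daemon does while tracking
 * in-flight brick operations.  None of these names come from the glusterfs
 * sources.  Compile with: cc -o pending pending.c */
#include <stdio.h>
#include <stdlib.h>

struct pending_node {
        void                *node;   /* peer/brick this op is pending on */
        struct pending_node *next;
};

/* Remove the entry whose 'node' pointer matches.  Returns 0 on success,
 * -1 if no such entry exists.  If another code path has already freed an
 * element of this list (for example a racing transaction cleaning up),
 * the 'cur->node' read below touches freed memory and the process dies
 * with SIGSEGV - the failure mode shown in the backtrace above. */
static int
remove_pending_entry (struct pending_node **head, void *node)
{
        struct pending_node *cur = *head;
        struct pending_node *prev = NULL;

        while (cur) {
                if (cur->node == node) {
                        if (prev)
                                prev->next = cur->next;
                        else
                                *head = cur->next;
                        free (cur);
                        return 0;
                }
                prev = cur;
                cur = cur->next;
        }
        return -1;
}

int
main (void)
{
        struct pending_node *head = NULL;
        int brick1 = 1, brick2 = 2;

        struct pending_node *a = calloc (1, sizeof (*a));
        struct pending_node *b = calloc (1, sizeof (*b));
        if (!a || !b)
                return 1;
        a->node = &brick1; a->next = b;
        b->node = &brick2; b->next = NULL;
        head = a;

        printf ("remove brick2: %d\n", remove_pending_entry (&head, &brick2));
        printf ("remove brick2 again: %d\n", remove_pending_entry (&head, &brick2));
        printf ("remove brick1: %d\n", remove_pending_entry (&head, &brick1));
        /* list is empty at this point */
        return 0;
}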
Created attachment 1031880 [details] sosreport
Created attachment 1031897 [details] File1
Created attachment 1031909 [details] File2
Created attachment 1031921 [details] File 3
Created attachment 1031935 [details] File 4
Created attachment 1031949 [details] File 5
Created attachment 1031952 [details] File 6
Created attachment 1031953 [details] File 7
Created attachment 1031982 [details] File 8
Created attachment 1032015 [details] File 9
Created attachment 1032017 [details] File 10
Created attachment 1032018 [details] File 11
Created attachment 1032019 [details] File 12
Created attachment 1032022 [details] File 13
Created attachment 1032034 [details] File 14
Created attachment 1032036 [details] File 15
Created attachment 1033229 [details] Glusterd log
Created attachment 1033231 [details] Cli log
Created attachment 1033249 [details] Glustershd log
Please attach the core file and mention the steps performed to hit the crash.
Created attachment 1033252 [details] cmd history
The problem I see here is that concurrent volume status transactions were run at the same point in time. 3.6.1 is missing some fixes that take care of the issues identified along these lines. If you upgrade your cluster to the 3.6.3 beta, the problem will go away. However, 3.6.3 still misses one more fix, http://review.gluster.org/#/c/10023/, which will be released in 3.6.4. I would request that you upgrade your cluster to 3.6.3, if not 3.7.
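To make that a bit more concrete: the general class of fix in the later releases is to serialize glusterd transactions so that two of them cannot race on shared state. The C sketch below is only a generic illustration of that idea under my own assumptions; all names are invented, and it is not the actual patch linked above.

/* Illustrative sketch only: the general shape of the fix is to serialize
 * management transactions so two of them never manipulate shared state
 * at the same time.  This is NOT the patch from review.gluster.org.
 * Compile with: cc -pthread -o serialize serialize.c */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t txn_lock = PTHREAD_MUTEX_INITIALIZER;
static int pending_ops;   /* stands in for shared op-state-machine data */

/* Each "volume status"-style transaction touches the shared state only
 * while holding txn_lock, so concurrent transactions can no longer free
 * or unlink entries out from under each other. */
static void *
run_transaction (void *arg)
{
        (void) arg;
        for (int i = 0; i < 100000; i++) {
                pthread_mutex_lock (&txn_lock);
                pending_ops++;    /* enqueue a pending entry            */
                pending_ops--;    /* drop it when the brick-op cb fires */
                pthread_mutex_unlock (&txn_lock);
        }
        return NULL;
}

int
main (void)
{
        pthread_t t1, t2;

        /* two concurrent "volume status" transactions, as in this report */
        pthread_create (&t1, NULL, run_transaction, NULL);
        pthread_create (&t2, NULL, run_transaction, NULL);
        pthread_join (t1, NULL);
        pthread_join (t2, NULL);

        printf ("pending_ops at exit: %d (expected 0)\n", pending_ops);
        return 0;
}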
Could you upgrade your cluster and check whether this problem goes away? If so, would you mind closing this bug?
Since the reporter hasn't gotten back with updates, I am closing this bug; feel free to reopen it if the problem persists.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days