Bug 796119 - [glusterfs-3.3.0qa23]: glusterfs server crashed when statedump was issued
Summary: [glusterfs-3.3.0qa23]: glusterfs server crashed when statedump was issued
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: protocol
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: ---
Assignee: Amar Tumballi
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 817967
 
Reported: 2012-02-22 10:56 UTC by Raghavendra Bhat
Modified: 2013-12-19 00:07 UTC

Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-07-24 17:53:05 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions: glusterfs-3.3.0qa40
Embargoed:



Description Raghavendra Bhat 2012-02-22 10:56:46 UTC
Description of problem:
2x2 distributed replicate volume with 1 fuse client. Ran the sanity script on the fuse mount and, after the tests were over, took the statedump of the volume (bricks).

The glusterfs servers running on all the peers of the cluster segfaulted.

This is the backtrace of the core.

Core was generated by `/usr/local/sbin/glusterfsd -s localhost --volfile-id mirror.10.1.11.130.export-'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f8834cee99e in server_priv (this=0x118df40) at ../../../../../xlators/protocol/server/src/server.c:336
336                     total_read  += xprt->total_bytes_read;
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.25.el6_1.3.x86_64 libgcc-4.4.5-6.el6.x86_64
(gdb) bt
#0  0x00007f8834cee99e in server_priv (this=0x118df40) at ../../../../../xlators/protocol/server/src/server.c:336
#1  0x00007f883bbfc1bf in gf_proc_dump_xlator_info (top=0x118df40) at ../../../libglusterfs/src/statedump.c:403
#2  0x00007f883bbfcc15 in gf_proc_dump_info (signum=10) at ../../../libglusterfs/src/statedump.c:655
#3  0x000000000040738d in glusterfs_sigwaiter (arg=0x7fff6c0e2a10) at ../../../glusterfsd/src/glusterfsd.c:1342
#4  0x00000034c2a077e1 in start_thread () from /lib64/libpthread.so.0
#5  0x00000034c22e577d in clone () from /lib64/libc.so.6
(gdb) f 0
#0  0x00007f8834cee99e in server_priv (this=0x118df40) at ../../../../../xlators/protocol/server/src/server.c:336
336                     total_read  += xprt->total_bytes_read;
(gdb) p xprt
$1 = (rpc_transport_t *) 0xcafebabe00007ce8
(gdb) l
331             conf = this->private;
332             if (!conf)
333                     return 0;
334
335             list_for_each_entry (xprt, &conf->xprt_list, list) {
336                     total_read  += xprt->total_bytes_read;
337                     total_write += xprt->total_bytes_write;
338             }
339
340             gf_proc_dump_build_key(key, "server", "total-bytes-read");
(gdb) p conf->xprt_list
$2 = {next = 0x11808a0, prev = 0x11c27a0}
(gdb) p *conf
$3 = {rpc = 0x11913b0, rpc_conf = {max_block_size = 4194304}, inode_lru_limit = 1024, verify_volfile = _gf_true, trace = _gf_false, 
  conf_dir = 0x7f8834d100a8 "/usr/local/etc/glusterfs", volfile = 0x0, grace_tv = {tv_sec = 10, tv_usec = 0}, auth_modules = 0x118e9f0, 
  mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
    __size = '\000' <repeats 39 times>, __align = 0}, conns = {next = 0x11bee90, prev = 0x11a25a0}, xprt_list = {next = 0x11808a0, 
    prev = 0x11c27a0}}
(gdb) p *xprt
Cannot access memory at address 0xcafebabe00007ce8
(gdb) info thr
  8 Thread 0x7f883455f700 (LWP 13920)  0x00000034c2a0b3cc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  7 Thread 0x7f88367da700 (LWP 13910)  0x00000034c2a0ecbd in nanosleep () from /lib64/libpthread.so.0
  6 Thread 0x7f882f3fc700 (LWP 14060)  0x00000034c2a0b74b in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  5 Thread 0x7f8838c74700 (LWP 13906)  0x00000034c2a0b3cc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  4 Thread 0x7f883b764700 (LWP 13904)  0x00000034c22e5d73 in epoll_wait () from /lib64/libc.so.6
  3 Thread 0x7f882ffff700 (LWP 13922)  0x00000034c2a0b74b in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  2 Thread 0x7f8838273700 (LWP 13907)  0x00000034c2a0b3cc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
* 1 Thread 0x7f8839675700 (LWP 13905)  0x00007f8834cee99e in server_priv (this=0x118df40)
    at ../../../../../xlators/protocol/server/src/server.c:336
(gdb) 



Version-Release number of selected component (if applicable):


How reproducible:

Steps to Reproduce:
1. Run the sanity script on the fuse mount.
2. Take the statedump of the volume (bricks).
  
Actual results:

glusterfs servers crashed

Expected results:

glusterfs servers should not crash.

Additional info:

logs of the crashed server.

[2012-02-22 02:42:31.019360] I [server3_1-fops.c:277:server_finodelk_cbk] 0-mirror-server: 4167147: FINODELK 113 (daa452b7-447f-4e7f-bae7-084c5fb4435a) ==> -1 (Invalid argument)
[2012-02-22 02:42:31.584372] I [server3_1-fops.c:277:server_finodelk_cbk] 0-mirror-server: 4167884: FINODELK 113 (f3855d49-c316-45eb-a6f1-02b320a5860b) ==> -1 (Invalid argument)
[2012-02-22 02:42:40.034678] I [server3_1-fops.c:277:server_finodelk_cbk] 0-mirror-server: 4176969: FINODELK 43 (714e946a-26d1-4c4a-ae7d-abb98666b0bb) ==> -1 (Invalid argument)
[2012-02-22 02:43:40.739659] I [server3_1-fops.c:277:server_finodelk_cbk] 0-mirror-server: 4213829: FINODELK 113 (a573ad13-4ae7-45e9-a2b4-ebf5c2f87a4f) ==> -1 (Invalid argument)
[2012-02-22 02:44:36.537695] I [server3_1-fops.c:277:server_finodelk_cbk] 0-mirror-server: 4250861: FINODELK 43 (b442c196-c9c4-44e8-a6f3-c9afe0713c07) ==> -1 (Invalid argument)
[2012-02-22 02:45:25.625239] I [server3_1-fops.c:277:server_finodelk_cbk] 0-mirror-server: 4293977: FINODELK 113 (4f20a9cd-29f2-46cc-b321-92e187c3394b) ==> -1 (Invalid argument)
[2012-02-22 02:46:01.620192] I [server3_1-fops.c:277:server_finodelk_cbk] 0-mirror-server: 4323819: FINODELK 43 (aff78366-4122-4570-89e8-95b81dcb30d0) ==> -1 (Invalid argument)
[2012-02-22 02:47:02.022653] I [server3_1-fops.c:277:server_finodelk_cbk] 0-mirror-server: 4362653: FINODELK 113 (93bf7adf-174f-40dc-ade0-0f07d42c4f88) ==> -1 (Invalid argument)
[2012-02-22 02:47:07.644661] I [server3_1-fops.c:277:server_finodelk_cbk] 0-mirror-server: 4367012: FINODELK 113 (ed78b83b-e388-4118-9983-bf06b5584b81) ==> -1 (Invalid argument)
pending frames:

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2012-02-22 04:20:11
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.3.0qa23
/lib64/libc.so.6[0x34c2232980]
/usr/local/lib/glusterfs/3.3.0qa23/xlator/protocol/server.so(server_priv+0xfa)[0x7f8834cee99e]
/usr/local/lib/libglusterfs.so.0(gf_proc_dump_xlator_info+0x151)[0x7f883bbfc1bf]
/usr/local/lib/libglusterfs.so.0(gf_proc_dump_info+0x23e)[0x7f883bbfcc15]
/usr/local/sbin/glusterfsd(glusterfs_sigwaiter+0x11a)[0x40738d]
/lib64/libpthread.so.0[0x34c2a077e1]
/lib64/libc.so.6(clone+0x6d)[0x34c22e577d]
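
The gdb listing in the description shows server_priv walking conf->xprt_list with no lock held, and the loop variable ends up at an address that cannot be read. The report does not state the root cause, so the following is only a minimal sketch, assuming the list was modified by another thread while the statedump thread was iterating it. The types and helper names (struct transport, struct server_conf, conf_add, conf_del, conf_totals) are hypothetical stand-ins, not the GlusterFS definitions, and this is not claimed to be what the patches referenced in the comments actually do.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the structures seen in the gdb session;
 * only the fields relevant to the traversal are modelled. */
struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

struct transport {
    struct list_head list;
    uint64_t total_bytes_read;
    uint64_t total_bytes_write;
};

struct server_conf {
    pthread_mutex_t mutex;
    struct list_head xprt_list;
};

/* Recover the enclosing structure from a pointer to its list member. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

void conf_init(struct server_conf *conf)
{
    pthread_mutex_init(&conf->mutex, NULL);
    conf->xprt_list.next = &conf->xprt_list;
    conf->xprt_list.prev = &conf->xprt_list;
}

/* Append a transport to the tail of the list under the mutex. */
void conf_add(struct server_conf *conf, struct transport *xprt)
{
    pthread_mutex_lock(&conf->mutex);
    xprt->list.prev = conf->xprt_list.prev;
    xprt->list.next = &conf->xprt_list;
    conf->xprt_list.prev->next = &xprt->list;
    conf->xprt_list.prev = &xprt->list;
    pthread_mutex_unlock(&conf->mutex);
}

/* Unlink a transport under the same mutex, so a concurrent reader
 * never follows a pointer into an entry that is being removed. */
void conf_del(struct server_conf *conf, struct transport *xprt)
{
    pthread_mutex_lock(&conf->mutex);
    xprt->list.prev->next = xprt->list.next;
    xprt->list.next->prev = xprt->list.prev;
    xprt->list.next = &xprt->list;
    xprt->list.prev = &xprt->list;
    pthread_mutex_unlock(&conf->mutex);
}

/* Sum the byte counters while holding the mutex for the whole walk. */
void conf_totals(struct server_conf *conf,
                 uint64_t *total_read, uint64_t *total_write)
{
    struct list_head *pos;

    *total_read = 0;
    *total_write = 0;

    pthread_mutex_lock(&conf->mutex);
    for (pos = conf->xprt_list.next; pos != &conf->xprt_list; pos = pos->next) {
        struct transport *xprt = container_of(pos, struct transport, list);
        *total_read += xprt->total_bytes_read;
        *total_write += xprt->total_bytes_write;
    }
    pthread_mutex_unlock(&conf->mutex);
}
```

The point of the sketch is that the reader and every writer take the same mutex, so the dump path sees either the list before a removal or after it, never a half-unlinked or already-freed entry.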

Comment 1 Amar Tumballi 2012-03-02 04:22:33 UTC
Should be fixed by http://review.gluster.com/2806

Please check the behavior now with 3.3.0qa25.

Comment 2 Amar Tumballi 2012-03-14 10:29:34 UTC
After the patch http://review.gluster.com/2911, this should no longer happen.

Comment 3 Raghavendra Bhat 2012-05-09 09:54:55 UTC
Checked with glusterfs-3.3.0qa40. The server processes no longer crash after statedumps.

