Bug 822067
Summary: | [27ae1677eb2a6ed4a04bda0df5cc92f2780c11ed]: glusterfs server crashed since loc->gfid was NULL
---|---
Product: | [Community] GlusterFS
Reporter: | Raghavendra Bhat <rabhat>
Component: | quota
Assignee: | Raghavendra Bhat <rabhat>
Status: | CLOSED CURRENTRELEASE
Severity: | high
Priority: | high
Version: | mainline
CC: | amarts, gluster-bugs, rfortier, vbellur
Keywords: | Triaged
Hardware: | Unspecified
OS: | Unspecified
Fixed In Version: | glusterfs-3.4.0
Doc Type: | Bug Fix
: | 848318
Last Closed: | 2013-07-24 17:31:22 UTC
Type: | Bug
Bug Blocks: | 848318
Created attachment 584902
Header file for the attached program.

Fixed in patch http://review.gluster.com/3567
Created attachment 584900
Multi-threaded program running on one of the fuse clients

Description of problem:
3x2 distributed-replicate volume with 2 fuse clients: one client running threaded-io and the other running dbench. While volume set operations were running, a brick from each replica pair was brought down at intervals. After some time, volume start force was issued, followed by volume heal (and also find | xargs stat). The glusterfs brick crashed with the following backtrace.

Core was generated by `/usr/local/sbin/glusterfsd -s localhost --volfile-id mirror.hyperspace.mnt-sda8'.
Program terminated with signal 6, Aborted.
#0 0x00007fb15b802d05 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
in ../nptl/sysdeps/unix/sysv/linux/raise.c
(gdb) bt
#0 0x00007fb15b802d05 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x00007fb15b806ab6 in abort () at abort.c:92
#2 0x00007fb15b7fb7c5 in __assert_fail (assertion=0x7fb1571e22b1 "!\"uuid null\"", file=<value optimized out>, line=1790, function=<value optimized out>) at assert.c:81
#3 0x00007fb1571dc2a9 in mq_fetch_child_size_and_contri (frame=0x7fb15a601100, cookie=0x7fb15a807e88, this=0x186ccd0, op_ret=0, op_errno=0, xdata=0x0) at ../../../../../xlators/features/marker/src/marker-quota.c:1790
#4 0x00007fb15c56529a in default_setxattr_cbk (frame=0x7fb15a807e88, cookie=0x7fb15a807724, this=0x186baa0, op_ret=0, op_errno=0, xdata=0x0) at ../../../libglusterfs/src/defaults.c:284
#5 0x00007fb157602bb9 in iot_setxattr_cbk (frame=0x7fb15a807724, cookie=0x7fb15a8100e0, this=0x186a810, op_ret=0, op_errno=0, xdata=0x0) at ../../../../../xlators/performance/io-threads/src/io-threads.c:1627
#6 0x00007fb15c56529a in default_setxattr_cbk (frame=0x7fb15a8100e0, cookie=0x7fb15a807bd8, this=0x18696b0, op_ret=0, op_errno=0, xdata=0x0) at ../../../libglusterfs/src/defaults.c:284
#7 0x00007fb157a390eb in posix_acl_setxattr_cbk (frame=0x7fb15a807bd8, cookie=0x7fb15a80a784, this=0x18684d0, op_ret=0, op_errno=0, xdata=0x0) at ../../../../../xlators/system/posix-acl/src/posix-acl.c:1802
#8 0x00007fb157c54ca3 in posix_setxattr (frame=0x7fb15a80a784, this=0x1867170, loc=0x7fb15a4d1074, dict=0x7fb15a47415c, flags=0, xdata=0x0) at ../../../../../xlators/storage/posix/src/posix.c:2417
#9 0x00007fb157a39391 in posix_acl_setxattr (frame=0x7fb15a807bd8, this=0x18684d0, loc=0x7fb15a4d1074, xattr=0x7fb15a47415c, flags=0, xdata=0x0) at ../../../../../xlators/system/posix-acl/src/posix-acl.c:1821
#10 0x00007fb15c56d32e in default_setxattr (frame=0x7fb15a8100e0, this=0x18696b0, loc=0x7fb15a4d1074, dict=0x7fb15a47415c, flags=0, xdata=0x0) at ../../../libglusterfs/src/defaults.c:889
#11 0x00007fb157602e15 in iot_setxattr_wrapper (frame=0x7fb15a807724, this=0x186a810, loc=0x7fb15a4d1074, dict=0x7fb15a47415c, flags=0, xdata=0x0) at ../../../../../xlators/performance/io-threads/src/io-threads.c:1636
#12 0x00007fb15c58748e in call_resume_wind (stub=0x7fb15a4d1034) at ../../../libglusterfs/src/call-stub.c:2531
#13 0x00007fb15c58ed9b in call_resume (stub=0x7fb15a4d1034) at ../../../libglusterfs/src/call-stub.c:4151
#14 0x00007fb1575f890d in iot_worker (data=0x18814f0) at ../../../../../xlators/performance/io-threads/src/io-threads.c:131
#15 0x00007fb15bef8d8c in start_thread (arg=0x7fb154d47700) at pthread_create.c:304
#16 0x00007fb15b8b504d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#17 0x0000000000000000 in ?? ()
(gdb) f 3
#3 0x00007fb1571dc2a9 in mq_fetch_child_size_and_contri (frame=0x7fb15a601100, cookie=0x7fb15a807e88, this=0x186ccd0, op_ret=0, op_errno=0, xdata=0x0) at ../../../../../xlators/features/marker/src/marker-quota.c:1790
1790 GF_UUID_ASSERT (local->loc.gfid);
(gdb) l
1785 mq_set_ctx_updation_status (local->ctx, _gf_false);
1786
1787 if (uuid_is_null (local->loc.gfid))
1788 uuid_copy (local->loc.gfid, local->loc.inode->gfid);
1789
1790 GF_UUID_ASSERT (local->loc.gfid);
1791
1792 STACK_WIND (frame, mq_update_inode_contribution, FIRST_CHILD(this),
1793 FIRST_CHILD(this)->fops->lookup, &local->loc, newdict);
1794
(gdb) p local->loc
$1 = {path = 0x1d89f10 "/clients/client12/~dmtmp/COREL/GRAPHIC1.CDR", name = 0x1d89f2f "GRAPHIC1.CDR", inode = 0x7fb155a08bec, parent = 0x7fb1559f8980, gfid = '\000' <repeats 15 times>, pargfid = "\aZ\354\345\343\aF\243\243\065\025\006\275 \234\215"}
(gdb) p *local->loc.inode
$2 = {table = 0x188ecb0, gfid = '\000' <repeats 15 times>, lock = 1, nlookup = 0, ref = 1, ia_type = IA_INVAL, fd_list = { next = 0x7fb155a08c1c, prev = 0x7fb155a08c1c}, dentry_list = {next = 0x7fb155a08c2c, prev = 0x7fb155a08c2c}, hash = { next = 0x7fb155a08c3c, prev = 0x7fb155a08c3c}, list = {next = 0x7fb155a0777c, prev = 0x188ed10}, _ctx = 0x7fb14cda6cc0}
(gdb) info thr
23 Thread 31833 0x00007fb15bf0139d in fsync () at ../sysdeps/unix/syscall-template.S:82
22 Thread 31864 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
21 Thread 31868 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
20 Thread 31838 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
19 Thread 31817 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
18 Thread 31865 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
17 Thread 31869 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
16 Thread 31839 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
15 Thread 31840 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
14 Thread 31835 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
13 Thread 31867 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
12 Thread 31837 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
11 Thread 31866 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
10 Thread 31836 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
9 Thread 31773 0x00007fb15bf014bd in nanosleep () at ../sysdeps/unix/syscall-template.S:82
8 Thread 31834 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
7 Thread 31768 0x00007fb15b8b56a3 in epoll_wait () at ../sysdeps/unix/syscall-template.S:82
6 Thread 31771 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
5 Thread 31816 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
4 Thread 31770 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
3 Thread 31769 do_sigwait (set=<value optimized out>, sig=0x7fb15a28eeb8) at ../nptl/sysdeps/unix/sysv/linux/../../../../../sysdeps/unix/sysv/linux/sigwait.c:65
2 Thread 31820 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
* 1 Thread 31863 0x00007fb15b802d05 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
(gdb) t 23
[Switching to thread 23 (Thread 31833)]#0 0x00007fb15bf0139d in fsync () at ../sysdeps/unix/syscall-template.S:82
82 ../sysdeps/unix/syscall-template.S: No such file or directory.
in ../sysdeps/unix/syscall-template.S
(gdb) bt
#0 0x00007fb15bf0139d in fsync () at ../sysdeps/unix/syscall-template.S:82
#1 0x00007fb157c543a4 in posix_fsync (frame=0x7fb15a80889c, this=0x1867170, fd=0x1d8e29c, datasync=0, xdata=0x0) at ../../../../../xlators/storage/posix/src/posix.c:2346
#2 0x00007fb15c56deab in default_fsync (frame=0x7fb15a80d890, this=0x18684d0, fd=0x1d8e29c, flags=0, xdata=0x0) at ../../../libglusterfs/src/defaults.c:929
#3 0x00007fb15c56deab in default_fsync (frame=0x7fb15a81c668, this=0x18696b0, fd=0x1d8e29c, flags=0, xdata=0x0) at ../../../libglusterfs/src/defaults.c:929
#4 0x00007fb1575fe583 in iot_fsync_wrapper (frame=0x7fb15a81e958, this=0x186a810, fd=0x1d8e29c, datasync=0, xdata=0x0) at ../../../../../xlators/performance/io-threads/src/io-threads.c:1020
#5 0x00007fb15c587436 in call_resume_wind (stub=0x7fb15a4de810) at ../../../libglusterfs/src/call-stub.c:2522
#6 0x00007fb15c58ed9b in call_resume (stub=0x7fb15a4de810) at ../../../libglusterfs/src/call-stub.c:4151
#7 0x00007fb1575f890d in iot_worker (data=0x18814f0) at ../../../../../xlators/performance/io-threads/src/io-threads.c:131
#8 0x00007fb15bef8d8c in start_thread (arg=0x7fb155765700) at pthread_create.c:304
#9 0x00007fb15b8b504d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#10 0x0000000000000000 in ?? ()
(gdb) f 1
#1 0x00007fb157c543a4 in posix_fsync (frame=0x7fb15a80889c, this=0x1867170, fd=0x1d8e29c, datasync=0, xdata=0x0) at ../../../../../xlators/storage/posix/src/posix.c:2346
2346 op_ret = fsync (_fd);
(gdb)

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Create a 3x2 distributed-replicate volume, start it and mount it via 2 fuse clients.
2. Run a multi-threaded application (attached) on one fuse client and dbench on the other client.
3. Perform volume set operations in parallel.
4. Bring down a brick from each replica pair at regular intervals (300 seconds), sleep for some time and do volume start force.
5. Trigger volume heal both from the gluster CLI and via find | xargs stat.

Actual results:
glusterfs brick crashed

Expected results:
glusterfs brick should not crash

Additional info:

gluster volume info

Volume Name: mirror
Type: Distributed-Replicate
Volume ID: c15b0415-46ec-485d-a1c6-989783bb154a
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: hyperspace:/mnt/sda7/export4
Brick2: hyperspace:/mnt/sda8/export4
Brick3: hyperspace:/mnt/sda7/export5
Brick4: hyperspace:/mnt/sda8/export5
Brick5: hyperspace:/mnt/sda7/export6
Brick6: hyperspace:/mnt/sda8/export6
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
features.quota: on
performance.quick-read: on
performance.read-ahead: on
performance.stat-prefetch: off
features.limit-usage: /:250GB

FIL (security.capability) ==> -1 (No data available)
[2012-05-16 13:33:37.766524] I [server3_1-fops.c:823:server_getxattr_cbk] 0-mirror-server: 66632: GETXATTR /clients/client7/~dmtmp/SEED/LARGE.FIL (security.capability) ==> -1 (No data available)
[2012-05-16 13:33:37.771436] I [server3_1-fops.c:823:server_getxattr_cbk] 0-mirror-server: 66641: GETXATTR /clients/client8/~dmtmp/SEED/LARGE.FIL (security.capability) ==> -1 (No data available)
[2012-05-16 13:33:37.994634] W [marker-quota.c:2047:mq_inspect_directory_xattr] 0-mirror-marker: cannot add a new contribution node
[2012-05-16 13:33:37.996264] I [server3_1-fops.c:823:server_getxattr_cbk] 0-mirror-server: 66929: GETXATTR /clients/client5/~dmtmp/ACCESS/FASTENER.MDB (security.capability) ==> -1 (No data available)
[2012-05-16 13:33:38.079332] I [server3_1-fops.c:823:server_getxattr_cbk] 0-mirror-server: 67020: GETXATTR /clients/client13/~dmtmp/ACCESS/FASTENER.MDB (security.capability) ==> -1 (No data available)
[2012-05-16 13:33:38.123713] W [marker-quota.c:2047:mq_inspect_directory_xattr] 0-mirror-marker: cannot add a new contribution node
[2012-05-16 13:33:38.226573] I [server3_1-fops.c:823:server_getxattr_cbk] 0-mirror-server: 67172: GETXATTR /clients/client21/~dmtmp/ACCESS/SALES.PRN (security.capability) ==> -1 (No data available)
[2012-05-16 13:33:38.290289] I [server3_1-fops.c:823:server_getxattr_cbk] 0-mirror-server: 67245: GETXATTR /clients/client21/~dmtmp/ACCESS/SALES.PRN (security.capability) ==> -1 (No data available)
[2012-05-16 13:33:38.406110] I [server3_1-fops.c:823:server_getxattr_cbk] 0-mirror-server: 67404: GETXATTR /clients/client18/~dmtmp/WORDPRO/LWPSAV0.TMP (security.capability) ==> -1 (No data available)
[2012-05-16 13:33:38.440258] W [marker-quota.c:2047:mq_inspect_directory_xattr] 0-mirror-marker: cannot add a new contribution node
[2012-05-16 13:33:38.476349] I [server3_1-fops.c:823:server_getxattr_cbk] 0-mirror-server: 67497: GETXATTR /clients/client18/~dmtmp/WORDPRO/LWPSAV0.TMP (security.capability) ==> -1 (No data available)
[2012-05-16 13:33:38.528169] I [server3_1-fops.c:823:server_getxattr_cbk] 0-mirror-server: 67562: GETXATTR /clients/client20/~dmtmp/SEED/SMALL.FIL (security.capability) ==> -1 (No data available)
[2012-05-16 13:33:38.540610] I [server3_1-fops.c:823:server_getxattr_cbk] 0-mirror-server: 67580: GETXATTR /clients/client20/~dmtmp/SEED/SMALL.FIL (security.capability) ==> -1 (No data available)
pending frames:

patchset: git://git.gluster.com/glusterfs.git
signal received: 6
time of crash: 2012-05-16 13:33:38
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3git
/lib/x86_64-linux-gnu/libc.so.6(+0x33d80)[0x7fb15b802d80]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35)[0x7fb15b802d05]
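
Note on the failing check: in frame 3 of the crashing thread, both local->loc.gfid and local->loc.inode->gfid are all zeroes, so the uuid_copy fallback at line 1788 copies a null gfid and GF_UUID_ASSERT at line 1790 aborts the brick. The following is a minimal sketch, not the patch from review 3567, of one way such a state could be reported as an error instead of an abort. The types sketch_inode and sketch_loc and the functions gfid_is_null and loc_fill_gfid are hypothetical stand-ins that model only the fields seen in the gdb dump; they are not GlusterFS APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define GFID_LEN 16

/* Hypothetical stand-ins for the structures printed in the gdb session;
 * only the gfid fields relevant to the crash are modelled. */
struct sketch_inode {
    unsigned char gfid[GFID_LEN];
};

struct sketch_loc {
    struct sketch_inode *inode;
    unsigned char gfid[GFID_LEN];
};

/* Returns true when all GFID_LEN bytes of the gfid are zero. */
static bool gfid_is_null(const unsigned char *gfid)
{
    static const unsigned char zero[GFID_LEN] = {0};
    return memcmp(gfid, zero, GFID_LEN) == 0;
}

/* Mirrors the fallback at lines 1787-1788 of the listing: if loc->gfid is
 * null, copy the gfid from the inode. A null result is treated as runtime
 * state and reported with -1 rather than asserted on. The assert below only
 * guards the caller contract (a non-NULL loc). Returns 0 when loc->gfid is
 * usable afterwards. */
static int loc_fill_gfid(struct sketch_loc *loc)
{
    assert(loc != NULL);

    if (gfid_is_null(loc->gfid) && loc->inode != NULL)
        memcpy(loc->gfid, loc->inode->gfid, GFID_LEN);

    return gfid_is_null(loc->gfid) ? -1 : 0;
}
```

With a loc whose own gfid and inode gfid are both zero, as in the dump above, loc_fill_gfid returns -1, and a caller could fail the operation instead of winding the lookup. Whether the actual fix takes this approach is not stated in this report.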