Bug 822067

Summary: [27ae1677eb2a6ed4a04bda0df5cc92f2780c11ed]: glusterfs server crashed since loc->gfid was NULL
Product: [Community] GlusterFS
Reporter: Raghavendra Bhat <rabhat>
Component: quota
Assignee: Raghavendra Bhat <rabhat>
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: high
Version: mainline
CC: amarts, gluster-bugs, rfortier, vbellur
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Last Closed: 2013-07-24 17:31:22 UTC
Type: Bug
Bug Blocks: 848318
Attachments:
multi threaded program running on one of the fuse clients (flags: none)
header file for the program attached (flags: none)

Description Raghavendra Bhat 2012-05-16 09:00:43 UTC
Created attachment 584900 [details]
multi threaded program running on one of the fuse clients

Description of problem:
Setup: a 3x2 distributed-replicate volume with 2 fuse clients, one running threaded-io and the other running dbench. While volume set operations were running, one brick from each replica pair was brought down at intervals. After some time, volume start force was issued, followed by volume heal (and also find | xargs stat).

The glusterfs brick process crashed with the following backtrace.

Core was generated by `/usr/local/sbin/glusterfsd -s localhost --volfile-id mirror.hyperspace.mnt-sda8'.
Program terminated with signal 6, Aborted.
#0  0x00007fb15b802d05 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64	../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
	in ../nptl/sysdeps/unix/sysv/linux/raise.c
(gdb) bt
#0  0x00007fb15b802d05 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007fb15b806ab6 in abort () at abort.c:92
#2  0x00007fb15b7fb7c5 in __assert_fail (assertion=0x7fb1571e22b1 "!\"uuid null\"", file=<value optimized out>, line=1790, 
    function=<value optimized out>) at assert.c:81
#3  0x00007fb1571dc2a9 in mq_fetch_child_size_and_contri (frame=0x7fb15a601100, cookie=0x7fb15a807e88, this=0x186ccd0, op_ret=0, op_errno=0, 
    xdata=0x0) at ../../../../../xlators/features/marker/src/marker-quota.c:1790
#4  0x00007fb15c56529a in default_setxattr_cbk (frame=0x7fb15a807e88, cookie=0x7fb15a807724, this=0x186baa0, op_ret=0, op_errno=0, xdata=0x0)
    at ../../../libglusterfs/src/defaults.c:284
#5  0x00007fb157602bb9 in iot_setxattr_cbk (frame=0x7fb15a807724, cookie=0x7fb15a8100e0, this=0x186a810, op_ret=0, op_errno=0, xdata=0x0)
    at ../../../../../xlators/performance/io-threads/src/io-threads.c:1627
#6  0x00007fb15c56529a in default_setxattr_cbk (frame=0x7fb15a8100e0, cookie=0x7fb15a807bd8, this=0x18696b0, op_ret=0, op_errno=0, xdata=0x0)
    at ../../../libglusterfs/src/defaults.c:284
#7  0x00007fb157a390eb in posix_acl_setxattr_cbk (frame=0x7fb15a807bd8, cookie=0x7fb15a80a784, this=0x18684d0, op_ret=0, op_errno=0, 
    xdata=0x0) at ../../../../../xlators/system/posix-acl/src/posix-acl.c:1802
#8  0x00007fb157c54ca3 in posix_setxattr (frame=0x7fb15a80a784, this=0x1867170, loc=0x7fb15a4d1074, dict=0x7fb15a47415c, flags=0, xdata=0x0)
    at ../../../../../xlators/storage/posix/src/posix.c:2417
#9  0x00007fb157a39391 in posix_acl_setxattr (frame=0x7fb15a807bd8, this=0x18684d0, loc=0x7fb15a4d1074, xattr=0x7fb15a47415c, flags=0, 
    xdata=0x0) at ../../../../../xlators/system/posix-acl/src/posix-acl.c:1821
#10 0x00007fb15c56d32e in default_setxattr (frame=0x7fb15a8100e0, this=0x18696b0, loc=0x7fb15a4d1074, dict=0x7fb15a47415c, flags=0, 
    xdata=0x0) at ../../../libglusterfs/src/defaults.c:889
#11 0x00007fb157602e15 in iot_setxattr_wrapper (frame=0x7fb15a807724, this=0x186a810, loc=0x7fb15a4d1074, dict=0x7fb15a47415c, flags=0, 
    xdata=0x0) at ../../../../../xlators/performance/io-threads/src/io-threads.c:1636
#12 0x00007fb15c58748e in call_resume_wind (stub=0x7fb15a4d1034) at ../../../libglusterfs/src/call-stub.c:2531
#13 0x00007fb15c58ed9b in call_resume (stub=0x7fb15a4d1034) at ../../../libglusterfs/src/call-stub.c:4151
#14 0x00007fb1575f890d in iot_worker (data=0x18814f0) at ../../../../../xlators/performance/io-threads/src/io-threads.c:131
#15 0x00007fb15bef8d8c in start_thread (arg=0x7fb154d47700) at pthread_create.c:304
#16 0x00007fb15b8b504d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#17 0x0000000000000000 in ?? ()
(gdb) f 3
#3  0x00007fb1571dc2a9 in mq_fetch_child_size_and_contri (frame=0x7fb15a601100, cookie=0x7fb15a807e88, this=0x186ccd0, op_ret=0, op_errno=0, 
    xdata=0x0) at ../../../../../xlators/features/marker/src/marker-quota.c:1790
1790	        GF_UUID_ASSERT (local->loc.gfid);
(gdb) l
1785	        mq_set_ctx_updation_status (local->ctx, _gf_false);
1786	
1787	        if (uuid_is_null (local->loc.gfid))
1788	                uuid_copy (local->loc.gfid, local->loc.inode->gfid);
1789	
1790	        GF_UUID_ASSERT (local->loc.gfid);
1791	
1792	        STACK_WIND (frame, mq_update_inode_contribution, FIRST_CHILD(this),
1793	                    FIRST_CHILD(this)->fops->lookup, &local->loc, newdict);
1794	
(gdb) p local->loc
$1 = {path = 0x1d89f10 "/clients/client12/~dmtmp/COREL/GRAPHIC1.CDR", name = 0x1d89f2f "GRAPHIC1.CDR", inode = 0x7fb155a08bec, 
  parent = 0x7fb1559f8980, gfid = '\000' <repeats 15 times>, pargfid = "\aZ\354\345\343\aF\243\243\065\025\006\275 \234\215"}
(gdb) p *local->loc.inode
$2 = {table = 0x188ecb0, gfid = '\000' <repeats 15 times>, lock = 1, nlookup = 0, ref = 1, ia_type = IA_INVAL, fd_list = {
    next = 0x7fb155a08c1c, prev = 0x7fb155a08c1c}, dentry_list = {next = 0x7fb155a08c2c, prev = 0x7fb155a08c2c}, hash = {
    next = 0x7fb155a08c3c, prev = 0x7fb155a08c3c}, list = {next = 0x7fb155a0777c, prev = 0x188ed10}, _ctx = 0x7fb14cda6cc0}
(gdb) info thr
  23 Thread 31833  0x00007fb15bf0139d in fsync () at ../sysdeps/unix/syscall-template.S:82
  22 Thread 31864  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
  21 Thread 31868  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
  20 Thread 31838  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
  19 Thread 31817  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
  18 Thread 31865  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
  17 Thread 31869  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
  16 Thread 31839  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
  15 Thread 31840  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
  14 Thread 31835  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
  13 Thread 31867  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
  12 Thread 31837  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
  11 Thread 31866  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
  10 Thread 31836  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
  9 Thread 31773  0x00007fb15bf014bd in nanosleep () at ../sysdeps/unix/syscall-template.S:82
  8 Thread 31834  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
  7 Thread 31768  0x00007fb15b8b56a3 in epoll_wait () at ../sysdeps/unix/syscall-template.S:82
  6 Thread 31771  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
  5 Thread 31816  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
  4 Thread 31770  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
  3 Thread 31769  do_sigwait (set=<value optimized out>, sig=0x7fb15a28eeb8)
    at ../nptl/sysdeps/unix/sysv/linux/../../../../../sysdeps/unix/sysv/linux/sigwait.c:65
  2 Thread 31820  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
* 1 Thread 31863  0x00007fb15b802d05 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
(gdb) t 23
[Switching to thread 23 (Thread 31833)]#0  0x00007fb15bf0139d in fsync () at ../sysdeps/unix/syscall-template.S:82
82	../sysdeps/unix/syscall-template.S: No such file or directory.
	in ../sysdeps/unix/syscall-template.S
(gdb) bt
#0  0x00007fb15bf0139d in fsync () at ../sysdeps/unix/syscall-template.S:82
#1  0x00007fb157c543a4 in posix_fsync (frame=0x7fb15a80889c, this=0x1867170, fd=0x1d8e29c, datasync=0, xdata=0x0)
    at ../../../../../xlators/storage/posix/src/posix.c:2346
#2  0x00007fb15c56deab in default_fsync (frame=0x7fb15a80d890, this=0x18684d0, fd=0x1d8e29c, flags=0, xdata=0x0)
    at ../../../libglusterfs/src/defaults.c:929
#3  0x00007fb15c56deab in default_fsync (frame=0x7fb15a81c668, this=0x18696b0, fd=0x1d8e29c, flags=0, xdata=0x0)
    at ../../../libglusterfs/src/defaults.c:929
#4  0x00007fb1575fe583 in iot_fsync_wrapper (frame=0x7fb15a81e958, this=0x186a810, fd=0x1d8e29c, datasync=0, xdata=0x0)
    at ../../../../../xlators/performance/io-threads/src/io-threads.c:1020
#5  0x00007fb15c587436 in call_resume_wind (stub=0x7fb15a4de810) at ../../../libglusterfs/src/call-stub.c:2522
#6  0x00007fb15c58ed9b in call_resume (stub=0x7fb15a4de810) at ../../../libglusterfs/src/call-stub.c:4151
#7  0x00007fb1575f890d in iot_worker (data=0x18814f0) at ../../../../../xlators/performance/io-threads/src/io-threads.c:131
#8  0x00007fb15bef8d8c in start_thread (arg=0x7fb155765700) at pthread_create.c:304
#9  0x00007fb15b8b504d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#10 0x0000000000000000 in ?? ()
(gdb) f 1
#1  0x00007fb157c543a4 in posix_fsync (frame=0x7fb15a80889c, this=0x1867170, fd=0x1d8e29c, datasync=0, xdata=0x0)
    at ../../../../../xlators/storage/posix/src/posix.c:2346
2346	                op_ret = fsync (_fd);
(gdb) 
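
The listing around marker-quota.c:1787-1790 suggests why the assert fires: both local->loc.gfid and local->loc.inode->gfid are null (see $1 and $2), so the uuid_copy has nothing valid to copy and GF_UUID_ASSERT aborts the brick. Below is a minimal sketch in C of a guard that reports this condition to the caller instead of asserting. The types and helper names (sketch_inode, sketch_loc, gfid_is_null, loc_fill_gfid) are hypothetical stand-ins written for this illustration; they are not GlusterFS APIs, and this is not necessarily what the actual fix does.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the structures seen in the gdb output.
 * Only the fields needed for the illustration are modelled. */
typedef unsigned char gfid_t[16];

struct sketch_inode {
        gfid_t gfid;
};

struct sketch_loc {
        const char          *path;
        struct sketch_inode *inode;
        gfid_t               gfid;
};

/* Returns 1 when all 16 bytes are zero (same meaning as uuid_is_null). */
static int
gfid_is_null (const gfid_t gfid)
{
        static const gfid_t zero = {0};

        return memcmp (gfid, zero, sizeof (gfid_t)) == 0;
}

/* Fill loc->gfid from the inode when it is missing.
 * Returns 0 if loc->gfid is usable afterwards, -1 if neither the loc
 * nor its inode carries a gfid (the state seen in the core dump). */
static int
loc_fill_gfid (struct sketch_loc *loc)
{
        if (!gfid_is_null (loc->gfid))
                return 0;

        if (loc->inode == NULL || gfid_is_null (loc->inode->gfid))
                return -1;      /* nothing to copy; caller should bail out */

        memcpy (loc->gfid, loc->inode->gfid, sizeof (gfid_t));
        return 0;
}
```

With a check of this shape, a caller in the position of mq_fetch_child_size_and_contri could unwind with an error when loc_fill_gfid returns -1, rather than winding a lookup with a null gfid.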





Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Create a 3x2 distributed-replicate volume, start it, and mount it via 2 fuse clients.
2. Run a multi-threaded application (attached) on one fuse client and dbench on the other.
3. Run volume set operations in parallel.
4. Bring down a brick from each replica pair at regular intervals (300 seconds), sleep for some time, and do volume start force.
5. Trigger volume heal from the gluster CLI and also run find | xargs stat.
  
Actual results:

glusterfs brick crashed

Expected results:

glusterfs brick should not crash

Additional info:
gluster volume info
 
Volume Name: mirror
Type: Distributed-Replicate
Volume ID: c15b0415-46ec-485d-a1c6-989783bb154a
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: hyperspace:/mnt/sda7/export4
Brick2: hyperspace:/mnt/sda8/export4
Brick3: hyperspace:/mnt/sda7/export5
Brick4: hyperspace:/mnt/sda8/export5
Brick5: hyperspace:/mnt/sda7/export6
Brick6: hyperspace:/mnt/sda8/export6
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
features.quota: on
performance.quick-read: on
performance.read-ahead: on
performance.stat-prefetch: off
features.limit-usage: /:250GB


FIL (security.capability) ==> -1 (No data available)
[2012-05-16 13:33:37.766524] I [server3_1-fops.c:823:server_getxattr_cbk] 0-mirror-server: 66632: GETXATTR /clients/client7/~dmtmp/SEED/LARGE.FIL (security.capability) ==> -1 (No data available)
[2012-05-16 13:33:37.771436] I [server3_1-fops.c:823:server_getxattr_cbk] 0-mirror-server: 66641: GETXATTR /clients/client8/~dmtmp/SEED/LARGE.FIL (security.capability) ==> -1 (No data available)
[2012-05-16 13:33:37.994634] W [marker-quota.c:2047:mq_inspect_directory_xattr] 0-mirror-marker: cannot add a new contribution node
[2012-05-16 13:33:37.996264] I [server3_1-fops.c:823:server_getxattr_cbk] 0-mirror-server: 66929: GETXATTR /clients/client5/~dmtmp/ACCESS/FASTENER.MDB (security.capability) ==> -1 (No data available)
[2012-05-16 13:33:38.079332] I [server3_1-fops.c:823:server_getxattr_cbk] 0-mirror-server: 67020: GETXATTR /clients/client13/~dmtmp/ACCESS/FASTENER.MDB (security.capability) ==> -1 (No data available)
[2012-05-16 13:33:38.123713] W [marker-quota.c:2047:mq_inspect_directory_xattr] 0-mirror-marker: cannot add a new contribution node
[2012-05-16 13:33:38.226573] I [server3_1-fops.c:823:server_getxattr_cbk] 0-mirror-server: 67172: GETXATTR /clients/client21/~dmtmp/ACCESS/SALES.PRN (security.capability) ==> -1 (No data available)
[2012-05-16 13:33:38.290289] I [server3_1-fops.c:823:server_getxattr_cbk] 0-mirror-server: 67245: GETXATTR /clients/client21/~dmtmp/ACCESS/SALES.PRN (security.capability) ==> -1 (No data available)
[2012-05-16 13:33:38.406110] I [server3_1-fops.c:823:server_getxattr_cbk] 0-mirror-server: 67404: GETXATTR /clients/client18/~dmtmp/WORDPRO/LWPSAV0.TMP (security.capability) ==> -1 (No data available)
[2012-05-16 13:33:38.440258] W [marker-quota.c:2047:mq_inspect_directory_xattr] 0-mirror-marker: cannot add a new contribution node
[2012-05-16 13:33:38.476349] I [server3_1-fops.c:823:server_getxattr_cbk] 0-mirror-server: 67497: GETXATTR /clients/client18/~dmtmp/WORDPRO/LWPSAV0.TMP (security.capability) ==> -1 (No data available)
[2012-05-16 13:33:38.528169] I [server3_1-fops.c:823:server_getxattr_cbk] 0-mirror-server: 67562: GETXATTR /clients/client20/~dmtmp/SEED/SMALL.FIL (security.capability) ==> -1 (No data available)
[2012-05-16 13:33:38.540610] I [server3_1-fops.c:823:server_getxattr_cbk] 0-mirror-server: 67580: GETXATTR /clients/client20/~dmtmp/SEED/SMALL.FIL (security.capability) ==> -1 (No data available)
pending frames:

patchset: git://git.gluster.com/glusterfs.git
signal received: 6
time of crash: 2012-05-16 13:33:38
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3git
/lib/x86_64-linux-gnu/libc.so.6(+0x33d80)[0x7fb15b802d80]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35)[0x7fb15b802d05]

Comment 1 Raghavendra Bhat 2012-05-16 09:01:17 UTC
Created attachment 584902 [details]
header file for the program attached

Comment 2 Amar Tumballi 2012-07-11 10:16:02 UTC
Fixed in patch http://review.gluster.com/3567