Bug 798179 - [728de5be7ce2975efb59bb5928fd7261d5ec7760]: client crashed in mdc_lookup with assert during unref
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: stat-prefetch
Version: pre-release
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Assignee: Amar Tumballi
QA Contact:
URL:
Whiteboard:
Duplicates: 798508
Depends On:
Blocks: 817967
Reported: 2012-02-28 09:04 UTC by Rahul C S
Modified: 2013-12-19 00:07 UTC (History)

Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-07-24 17:39:11 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Rahul C S 2012-02-28 09:04:17 UTC
Description of problem:
[2012-02-27 05:22:09.751140] E [afr-lk-common.c:568:afr_unlock_inodelk_cbk] 0-vol-replicate-0: /system_light/run29351/test/file97138: unlock failed on 0, reason: Invalid argument
[2012-02-27 05:23:47.720289] W [client3_1-fops.c:1266:client3_1_finodelk_cbk] 0-vol-client-0: remote operation failed: Invalid argument
[2012-02-27 05:23:47.720356] E [afr-lk-common.c:568:afr_unlock_inodelk_cbk] 0-vol-replicate-0: /system_light/run29351/test/file99922: unlock failed on 0, reason: Invalid argument
[2012-02-27 06:01:52.562271] W [client3_1-fops.c:1228:client3_1_inodelk_cbk] 0-vol-client-0: remote operation failed: Invalid argument
[2012-02-27 06:01:52.607917] E [afr-lk-common.c:568:afr_unlock_inodelk_cbk] 0-vol-replicate-0: /system_light/run29351/sbench.5044: unlock failed on 0, reason: Invalid argument
[2012-02-27 06:09:02.871766] W [fd-lk.c:407:print_lock_list] 0-fd-lk: lock list:
[2012-02-27 06:09:05.273563] W [fd-lk.c:407:print_lock_list] 0-fd-lk: lock list:
[2012-02-27 06:09:06.380844] W [fd-lk.c:407:print_lock_list] 0-fd-lk: lock list:
[2012-02-27 06:09:07.893275] W [fd-lk.c:407:print_lock_list] 0-fd-lk: lock list:
[2012-02-27 06:09:09.346986] W [fd-lk.c:407:print_lock_list] 0-fd-lk: lock list:
pending frames:
frame : type(1) op(LOOKUP) 
frame : type(1) op(LOOKUP) 
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)

patchset: git://git.gluster.com/glusterfs.git
signal received: 6
time of crash: 2012-02-27 06:09:10
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3git
/lib64/libc.so.6[0x3e5d632ac0]
/lib64/libc.so.6(gsignal+0x35)[0x3e5d632a45]
/lib64/libc.so.6(abort+0x175)[0x3e5d634225]
/lib64/libc.so.6(__assert_fail+0xf5)[0x3e5d62b9d5]
/usr/local/lib/libglusterfs.so.0(__gf_free+0xa3)[0x7fb181b830ef]
/usr/local/lib/libglusterfs.so.0(dict_destroy+0xbf)[0x7fb181b48d39]
/usr/local/lib/libglusterfs.so.0(dict_unref+0xb3)[0x7fb181b48e73]
/usr/local/lib/glusterfs/3git/xlator/performance/md-cache.so(mdc_lookup+0x340)[0x7fb17c45c73b]
/usr/local/lib/glusterfs/3git/xlator/debug/io-stats.so(io_stats_lookup+0x28c)[0x7fb17c24ca02]
/usr/local/lib/glusterfs/3git/xlator/mount/fuse.so(+0xb501)[0x7fb18042b501]
/usr/local/lib/glusterfs/3git/xlator/mount/fuse.so(+0x1e4e0)[0x7fb18043e4e0]
/lib64/libpthread.so.0[0x3e5da077e1]
/lib64/libc.so.6(clone+0x6d)[0x3e5d6e68ed]
---------

Core backtrace:
Core was generated by `/usr/local/sbin/glusterfs --volfile-id=vol --volfile-server=10.1.11.152 mount/'.
Program terminated with signal 6, Aborted.
#0  0x0000003e5d632a45 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.25.el6.x86_64 libgcc-4.4.5-6.el6.x86_64
(gdb) bt
#0  0x0000003e5d632a45 in raise () from /lib64/libc.so.6
#1  0x0000003e5d634225 in abort () from /lib64/libc.so.6
#2  0x0000003e5d62b9d5 in __assert_fail () from /lib64/libc.so.6
#3  0x00007fb181b830ef in __gf_free (free_ptr=0xc198420) at mem-pool.c:273
#4  0x00007fb181b48d39 in dict_destroy (this=0xc02b460) at dict.c:418
#5  0x00007fb181b48e73 in dict_unref (this=0xc02b460) at dict.c:454
#6  0x00007fb17c45c73b in mdc_lookup (frame=0x7fb1809954dc, this=0xac5f60, loc=0x7fb159446740, xattr_req=0x7fb1594c0df0) at md-cache.c:634
#7  0x00007fb17c24ca02 in io_stats_lookup (frame=0x7fb1809aaad8, this=0xac7160, loc=0x7fb159446740, xattr_req=0x7fb1594c0df0)
    at io-stats.c:1855
#8  0x00007fb18042b501 in fuse_getattr (this=0xaa5720, finh=0x7fb1593cc3b0, msg=0x7fb1593cc3d8) at fuse-bridge.c:571
#9  0x00007fb18043e4e0 in fuse_thread_proc (data=0xaa5720) at fuse-bridge.c:3970
#10 0x0000003e5da077e1 in start_thread () from /lib64/libpthread.so.0
#11 0x0000003e5d6e68ed in clone () from /lib64/libc.so.6
(gdb) f 6
#6  0x00007fb17c45c73b in mdc_lookup (frame=0x7fb1809954dc, this=0xac5f60, loc=0x7fb159446740, xattr_req=0x7fb1594c0df0) at md-cache.c:634
634                     dict_unref (xattr_rsp);
(gdb) l
629   
630             MDC_STACK_UNWIND (lookup, frame, 0, 0, loc->inode, &stbuf,
631                               xattr_rsp, &postparent);
632   
633             if (xattr_rsp)
634                     dict_unref (xattr_rsp);
635   
636             return 0;
637   
638     uncached:
(gdb) p *xattr_rsp
$12 = {is_static = 0 '\000', hash_size = 1, count = 5, refcount = 0, members = 0xc08f80, members_list = 0x984e0c0,
  extra_free = 0xb710730 "", extra_stdfree = 0x0, lock = 1}
(gdb) f 5
#5  0x00007fb181b48e73 in dict_unref (this=0xc02b460) at dict.c:454
454                     dict_destroy (this);
(gdb) f 4
#4  0x00007fb181b48d39 in dict_destroy (this=0xc02b460) at dict.c:418
418                     GF_FREE (prev->key);
(gdb) f 3
#3  0x00007fb181b830ef in __gf_free (free_ptr=0xc198420) at mem-pool.c:273
273                     GF_ASSERT (0);
(gdb) l
268
269             ptr = (char *)free_ptr - 8 - 4;
270
271             if (GF_MEM_HEADER_MAGIC != *(uint32_t *)ptr) {
272                     //Possible corruption, assert here
273                     GF_ASSERT (0);
274             }
275
276             *(uint32_t *)ptr = 0;
277
(gdb)

Setup: a distributed-replicate volume with a FUSE client running the sanity tests and an NFS client running rdd. Bricks were brought down and up with stat-prefetch both on and off.
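
For readers unfamiliar with the check that fired in frame 3, here is a minimal sketch of a magic-word guard in front of each allocation. It is not the mem-pool.c implementation: the names sketch_alloc and sketch_release, the marker value and the 16-byte header are all assumptions made for illustration. It only mirrors the two steps visible in the listing above (compare the word in front of the pointer against a magic value, then zero it). With such a guard, releasing the same pointer twice, or releasing one whose header was overwritten, fails the comparison. That would be consistent with refcount = 0 on xattr_rsp in frame 6, although the trace alone does not say which of the two happened.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAGIC  0xCAFEBABEu  /* hypothetical marker value */
#define SKETCH_HEADER 16           /* hypothetical header size; keeps user data aligned */

/* Allocate size bytes preceded by a header that starts with the magic word. */
static void *sketch_alloc(size_t size)
{
    unsigned char *base = malloc(SKETCH_HEADER + size);
    uint32_t magic = SKETCH_MAGIC;

    if (base == NULL)
        return NULL;
    memcpy(base, &magic, sizeof(magic));
    return base + SKETCH_HEADER;
}

/*
 * Validate and clear the magic word in front of ptr.
 * Returns the base pointer to pass to free(), or NULL when the magic does
 * not match (header overwritten, pointer already released, or not ours).
 */
static void *sketch_release(void *ptr)
{
    unsigned char *base;
    uint32_t magic;

    assert(ptr != NULL);
    base = (unsigned char *)ptr - SKETCH_HEADER;

    memcpy(&magic, base, sizeof(magic));
    if (magic != SKETCH_MAGIC)
        return NULL;                      /* the listing above asserts here */

    magic = 0;
    memcpy(base, &magic, sizeof(magic));  /* so a second release is caught */
    return base;
}
```

The sketch returns NULL instead of aborting so that a caller can observe the second release being rejected before the memory is actually handed back to free().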

Comment 1 Anand Avati 2012-03-01 03:45:56 UTC
CHANGE: http://review.gluster.com/2834 (perf/md-cache: hold lock on modification of md_cache structure) merged in master by Vijay Bellur (vijay)
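
The change title says the fix holds a lock while the md_cache structure is modified. The patch itself is not quoted in this report, so the following is only a sketch of that general pattern, not the merged code: sketch_obj stands in for a refcounted dict, sketch_cache for the per-inode cache state, and every name here is hypothetical. The point it illustrates is that a reader must take its reference while holding the same lock the writer holds when swapping the cached pointer. Otherwise the writer can drop the last reference between the reader's load and its ref.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical refcounted object standing in for a dict. */
struct sketch_obj {
    atomic_int refcount;
    int value;
};

/* Hypothetical cache entry; the cache owns one reference to xattr, or NULL. */
struct sketch_cache {
    pthread_mutex_t lock;
    struct sketch_obj *xattr;
};

static struct sketch_obj *sketch_obj_new(int value)
{
    struct sketch_obj *obj = malloc(sizeof(*obj));

    if (obj == NULL)
        return NULL;
    atomic_init(&obj->refcount, 1);  /* the creator holds the first reference */
    obj->value = value;
    return obj;
}

static void sketch_obj_unref(struct sketch_obj *obj)
{
    int old;

    if (obj == NULL)
        return;
    old = atomic_fetch_sub(&obj->refcount, 1);
    assert(old > 0);                 /* catches an unref with no reference left */
    if (old == 1)
        free(obj);
}

/* Writer: swap the cached pointer under the lock; the cache takes over the
 * caller's reference to fresh. */
static void sketch_cache_set(struct sketch_cache *c, struct sketch_obj *fresh)
{
    struct sketch_obj *old;

    pthread_mutex_lock(&c->lock);
    old = c->xattr;
    c->xattr = fresh;
    pthread_mutex_unlock(&c->lock);

    sketch_obj_unref(old);           /* drop the cache's old reference */
}

/* Reader: take a reference while still holding the lock, so the writer
 * cannot free the object before the reader owns a reference to it. */
static struct sketch_obj *sketch_cache_get(struct sketch_cache *c)
{
    struct sketch_obj *obj;

    pthread_mutex_lock(&c->lock);
    obj = c->xattr;
    if (obj != NULL)
        atomic_fetch_add(&obj->refcount, 1);
    pthread_mutex_unlock(&c->lock);
    return obj;
}
```

A caller that receives a non-NULL pointer from sketch_cache_get owns one reference and releases it with sketch_obj_unref once it has finished with the object, even if the cache has replaced its entry in the meantime.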

Comment 2 Amar Tumballi 2012-03-01 06:29:43 UTC
*** Bug 798508 has been marked as a duplicate of this bug. ***

Comment 4 Anush Shetty 2012-05-19 05:45:54 UTC
Verified with 3.3.0qa41

