Created attachment 1040398 [details] Sample program to reproduce the crash Description of problem: I have a binary file (a PNG picture). I read the first four bytes which works as expected. During glfs_fini() which frees resources, the program crashes. Workaround: This crash is not observed when io-cache xlator is turned off # gluster volume set test io-cache off Always reproducible with attached program and binary file given below. Setup: [root@fed22 glusterfs]# uname -a Linux fed22 4.0.4-303.fc22.x86_64 #1 SMP Thu May 28 12:37:06 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux [root@fed22 glusterfs]# gcc --version gcc (GCC) 5.1.1 20150422 (Red Hat 5.1.1-1) Copyright (C) 2015 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. [root@fed22 glusterfs]# gluster volume info Volume Name: test Type: Distributed-Replicate Volume ID: e52a4cfe-9af8-40cf-ad20-71522bfc26cb Status: Started Number of Bricks: 2 x 2 = 4 Transport-type: tcp Bricks: Brick1: fed22:/export/brick1/b Brick2: fed22:/export/brick2/b Brick3: fed22:/export/brick3/b Brick4: fed22:/export/brick4/b Options Reconfigured: performance.quick-read: off performance.read-ahead: off performance.write-behind: off performance.open-behind: off performance.io-cache: on performance.readdir-ahead: off server.allow-insecure: on [root@fed22 glusterfs]# Steps to Reproduce: # mount -t glusterfs localhost:test /mnt # cd /mnt # wget https://upload.wikimedia.org/wikipedia/en/2/24/Lenna.png # umount /mnt # gcc crash.c -I /usr/local/include/glusterfs/ -lgfapi -O0 -ggdb -o ./crash # ./crash Actual results: [root@fed22 ~]# ./crash read returned: 4 crash: iobuf.c:128: __iobuf_arena_destroy_iobufs: Assertion `iobuf->ref == 0' failed. Aborted (core dumped) Expected results: The client program should not crash. [root@fed22 ~]# ./crash read returned: 4 glfs_fini returned: 0 Debug core: [root@fed22 ~]# gdb crash GNU gdb (GDB) Fedora 7.9.1-13.fc22 Copyright (C) 2015 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-redhat-linux-gnu". Type "show configuration" for configuration details. For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>. Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>. For help, type "help". Type "apropos word" to search for commands related to "word"... Reading symbols from crash...done. (gdb) r Starting program: /root/crash Missing separate debuginfos, use: dnf debuginfo-install glibc-2.21-5.fc22.x86_64 [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib64/libthread_db.so.1". [New Thread 0x7fffedbe2700 (LWP 26296)] [New Thread 0x7ffff4a9d700 (LWP 26295)] [New Thread 0x7ffff529e700 (LWP 26294)] [New Thread 0x7fffeb12a700 (LWP 26301)] read returned: 4 [Thread 0x7ffff529e700 (LWP 26294) exited] [Thread 0x7ffff4a9d700 (LWP 26295) exited] [Thread 0x7fffeb12a700 (LWP 26301) exited] [Thread 0x7fffedbe2700 (LWP 26296) exited] crash: iobuf.c:128: __iobuf_arena_destroy_iobufs: Assertion `iobuf->ref == 0' failed. Program received signal SIGABRT, Aborted. 0x00007ffff782fa98 in raise () from /lib64/libc.so.6 Missing separate debuginfos, use: dnf debuginfo-install keyutils-libs-1.5.9-4.fc22.x86_64 krb5-libs-1.13.1-3.fc22.x86_64 libacl-2.2.52-7.fc22.x86_64 libattr-2.4.47-9.fc22.x86_64 libcom_err-1.42.12-4.fc22.x86_64 libselinux-2.3-10.fc22.x86_64 openssl-libs-1.0.1k-8.fc22.x86_64 pcre-8.37-1.fc22.x86_64 zlib-1.2.8-7.fc22.x86_64 (gdb) bt #0 0x00007ffff782fa98 in raise () from /lib64/libc.so.6 #1 0x00007ffff783172a in abort () from /lib64/libc.so.6 #2 0x00007ffff7828247 in __assert_fail_base () from /lib64/libc.so.6 #3 0x00007ffff78282f2 in __assert_fail () from /lib64/libc.so.6 #4 0x00007ffff6f3ce3f in __iobuf_arena_destroy_iobufs (iobuf_arena=0x6201b0) at iobuf.c:128 #5 0x00007ffff6f3cf2f in __iobuf_arena_destroy (iobuf_pool=0x605060, iobuf_arena=0x6201b0) at iobuf.c:152 #6 0x00007ffff6f3d660 in iobuf_pool_destroy (iobuf_pool=0x605060) at iobuf.c:321 #7 0x00007ffff7bc33a0 in glusterfs_ctx_destroy (ctx=0x602190) at glfs.c:973 #8 0x00007ffff7bc38e9 in pub_glfs_fini (fs=0x602010) at glfs.c:1150 #9 0x0000000000400a89 in main () at crash.c:27 (gdb) f 4 #4 0x00007ffff6f3ce3f in __iobuf_arena_destroy_iobufs (iobuf_arena=0x6201b0) at iobuf.c:128 128 GF_ASSERT (iobuf->ref == 0); (gdb) p *iobuf $1 = {{list = {next = 0x620208, prev = 0x620208}, {next = 0x620208, prev = 0x620208}}, iobuf_arena = 0x6201b0, lock = 1, ref = 1, ptr = 0x7ffff5a5f000, free_ptr = 0x0} (gdb) Possible reason: io-cache xlator is taking a ref somewhere and does not unref.
Raghavendra - can you please take a look?
Milind has seen similar crash while trying to take backup from gluster volumes using baroeas. And we have observed that there is no issue when io-cache xlator is disabled. After adding few debug statements in iobref_ref & iobref_unref (attached the log file), just before the crash --> [2015-09-29 08:45:47.623190] E [MSGID: 101039] [iobuf.c:536:__iobuf_ref] 0-sun-client-0: $$$$$$$$$$$$$$$$$ in iobuf_ref iobuf(0x7fbb9402bae0), ref(1) /usr/local/lib/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xf2)[0x7fbb9d4f0c52] /usr/local/lib/libglusterfs.so.0(__iobuf_ref+0x8b)[0x7fbb9d53527b] /usr/local/lib/libglusterfs.so.0(iobuf_ref+0x87)[0x7fbb9d535e54] /usr/local/lib/libglusterfs.so.0(iobuf_get2+0x12b)[0x7fbb9d53577e] /usr/local/lib/libgfrpc.so.0(rpc_clnt_record_build_record+0x1f1)[0x7fbb9d9d9422] /usr/local/lib/libgfrpc.so.0(rpc_clnt_record+0x2b3)[0x7fbb9d9d97ba] /usr/local/lib/libgfrpc.so.0(rpc_clnt_submit+0x1e5)[0x7fbb9d9d9c68] /usr/local/lib/glusterfs/3.8dev/xlator/protocol/client.so(client_submit_request+0x540)[0x7fbb86dc0ef3] /usr/local/lib/glusterfs/3.8dev/xlator/protocol/client.so(client3_3_getxattr+0x53a)[0x7fbb86de4e10] /usr/local/lib/glusterfs/3.8dev/xlator/protocol/client.so(client_getxattr+0xee)[0x7fbb86dc5d1f] /usr/local/lib/glusterfs/3.8dev/xlator/cluster/distribute.so(dht_getxattr+0x165f)[0x7fbb86b6ee09] /usr/local/lib/libglusterfs.so.0(default_getxattr+0xa5)[0x7fbb9d5089d0] /usr/local/lib/libglusterfs.so.0(default_getxattr+0xa5)[0x7fbb9d5089d0] /usr/local/lib/libglusterfs.so.0(default_getxattr+0xa5)[0x7fbb9d5089d0] /usr/local/lib/libglusterfs.so.0(default_getxattr+0xa5)[0x7fbb9d5089d0] /usr/local/lib/libglusterfs.so.0(default_getxattr+0xa5)[0x7fbb9d5089d0] /usr/local/lib/libglusterfs.so.0(default_getxattr+0xa5)[0x7fbb9d5089d0] /usr/local/lib/glusterfs/3.8dev/xlator/performance/md-cache.so(mdc_getxattr+0x462)[0x7fbb85cd7956] /usr/local/lib/glusterfs/3.8dev/xlator/debug/io-stats.so(io_stats_getxattr+0x23a)[0x7fbb85ac01cc] /usr/local/lib/libglusterfs.so.0(default_getxattr+0xa5)[0x7fbb9d5089d0] /usr/local/lib/libglusterfs.so.0(syncop_getxattr+0x2fb)[0x7fbb9d54aab7] /usr/local/lib/libgfapi.so.0(+0x12de7)[0x7fbb9dbf8de7] /usr/local/lib/libgfapi.so.0(glfs_llistxattr+0x36)[0x7fbb9dbf8f0d] /usr/lib64/bareos/plugins/gfapi-fd.so(+0x6a13)[0x7fbb9de0ea13] /usr/sbin/bareos-fd[0x417828] /usr/sbin/bareos-fd[0x40d1d8] /usr/sbin/bareos-fd[0x415f6f] /usr/lib64/libbareosfind-15.3.0.so(_Z10find_filesP3JCRP6FF_PKTPFiS0_S2_bES4_+0x385)[0x7fbba8478905] /usr/sbin/bareos-fd[0x40b435] /usr/sbin/bareos-fd[0x41200f] /usr/sbin/bareos-fd[0x413665] /usr/sbin/bareos-fd[0x41cefc] /usr/lib64/libbareos-15.3.0.so(workq_server+0x1f5)[0x7fbba7e35ec5] /usr/lib64/libbareos-15.3.0.so(lmgr_thread_launcher+0x5f)[0x7fbba7e1f73f] /usr/lib64/libpthread.so.0(+0x7555)[0x7fbba7369555] /usr/lib64/libc.so.6(clone+0x6d)[0x7fbba65ffb9d] [2015-09-29 08:45:47.623944] T [rpc-clnt.c:1256:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 96, payload: 28, rpc hdr: 68 [2015-09-29 08:45:47.623964] E [MSGID: 101039] [iobuf.c:536:__iobuf_ref] 0-sun-client-0: $$$$$$$$$$$$$$$$$ in iobuf_ref iobuf(0x7fbb9402bae0), ref(2) /usr/local/lib/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xf2)[0x7fbb9d4f0c52] /usr/local/lib/libglusterfs.so.0(__iobuf_ref+0x8b)[0x7fbb9d53527b] /usr/local/lib/libglusterfs.so.0(iobuf_ref+0x87)[0x7fbb9d535e54] /usr/local/lib/libglusterfs.so.0(__iobref_add+0x164)[0x7fbb9d53645d] /usr/local/lib/libglusterfs.so.0(iobref_add+0xf7)[0x7fbb9d53658a] /usr/local/lib/libgfrpc.so.0(rpc_clnt_submit+0x244)[0x7fbb9d9d9cc7] /usr/local/lib/glusterfs/3.8dev/xlator/protocol/client.so(client_submit_request+0x540)[0x7fbb86dc0ef3] /usr/local/lib/glusterfs/3.8dev/xlator/protocol/client.so(client3_3_getxattr+0x53a)[0x7fbb86de4e10] /usr/local/lib/glusterfs/3.8dev/xlator/protocol/client.so(client_getxattr+0xee)[0x7fbb86dc5d1f] /usr/local/lib/glusterfs/3.8dev/xlator/cluster/distribute.so(dht_getxattr+0x165f)[0x7fbb86b6ee09] /usr/local/lib/libglusterfs.so.0(default_getxattr+0xa5)[0x7fbb9d5089d0] /usr/local/lib/libglusterfs.so.0(default_getxattr+0xa5)[0x7fbb9d5089d0] /usr/local/lib/libglusterfs.so.0(default_getxattr+0xa5)[0x7fbb9d5089d0] /usr/local/lib/libglusterfs.so.0(default_getxattr+0xa5)[0x7fbb9d5089d0] /usr/local/lib/libglusterfs.so.0(default_getxattr+0xa5)[0x7fbb9d5089d0] /usr/local/lib/libglusterfs.so.0(default_getxattr+0xa5)[0x7fbb9d5089d0] /usr/local/lib/glusterfs/3.8dev/xlator/performance/md-cache.so(mdc_getxattr+0x462)[0x7fbb85cd7956] /usr/local/lib/glusterfs/3.8dev/xlator/debug/io-stats.so(io_stats_getxattr+0x23a)[0x7fbb85ac01cc] /usr/local/lib/libglusterfs.so.0(default_getxattr+0xa5)[0x7fbb9d5089d0] /usr/local/lib/libglusterfs.so.0(syncop_getxattr+0x2fb)[0x7fbb9d54aab7] /usr/local/lib/libgfapi.so.0(+0x12de7)[0x7fbb9dbf8de7] /usr/local/lib/libgfapi.so.0(glfs_llistxattr+0x36)[0x7fbb9dbf8f0d] /usr/lib64/bareos/plugins/gfapi-fd.so(+0x6a13)[0x7fbb9de0ea13] /usr/sbin/bareos-fd[0x417828] /usr/sbin/bareos-fd[0x40d1d8] /usr/sbin/bareos-fd[0x415f6f] /usr/lib64/libbareosfind-15.3.0.so(_Z10find_filesP3JCRP6FF_PKTPFiS0_S2_bES4_+0x385)[0x7fbba8478905] /usr/sbin/bareos-fd[0x40b435] /usr/sbin/bareos-fd[0x41200f] /usr/sbin/bareos-fd[0x413665] /usr/sbin/bareos-fd[0x41cefc] /usr/lib64/libbareos-15.3.0.so(workq_server+0x1f5)[0x7fbba7e35ec5] /usr/lib64/libbareos-15.3.0.so(lmgr_thread_launcher+0x5f)[0x7fbba7e1f73f] /usr/lib64/libpthread.so.0(+0x7555)[0x7fbba7369555] /usr/lib64/libc.so.6(clone+0x6d)[0x7fbba65ffb9d] [2015-09-29 08:45:47.624264] T [rpc-clnt.c:1593:rpc_clnt_submit] 0-rpc-clnt: submitted request (XID: 0x1e Program: GlusterFS 3.3, ProgVers: 330, Proc: 18) to rpc-transport (sun-client-0) [2015-09-29 08:45:47.624271] E [MSGID: 101039] [iobuf.c:549:__iobuf_unref] 0-sun-client-0: $$$$$$$$$$$$$$$$$ in iobuf_unref iobuf(0x7fbb9402bae0), ref(1) /usr/local/lib/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xf2)[0x7fbb9d4f0c52] /usr/local/lib/libglusterfs.so.0(__iobuf_unref+0x8b)[0x7fbb9d535310] /usr/local/lib/libglusterfs.so.0(iobuf_unref+0x8e)[0x7fbb9d535d9d] /usr/local/lib/libgfrpc.so.0(rpc_clnt_submit+0x493)[0x7fbb9d9d9f16] /usr/local/lib/glusterfs/3.8dev/xlator/protocol/client.so(client_submit_request+0x540)[0x7fbb86dc0ef3] /usr/local/lib/glusterfs/3.8dev/xlator/protocol/client.so(client3_3_getxattr+0x53a)[0x7fbb86de4e10] /usr/local/lib/glusterfs/3.8dev/xlator/protocol/client.so(client_getxattr+0xee)[0x7fbb86dc5d1f] /usr/local/lib/glusterfs/3.8dev/xlator/cluster/distribute.so(dht_getxattr+0x165f)[0x7fbb86b6ee09] /usr/local/lib/libglusterfs.so.0(default_getxattr+0xa5)[0x7fbb9d5089d0] /usr/local/lib/libglusterfs.so.0(default_getxattr+0xa5)[0x7fbb9d5089d0] /usr/local/lib/libglusterfs.so.0(default_getxattr+0xa5)[0x7fbb9d5089d0] /usr/local/lib/libglusterfs.so.0(default_getxattr+0xa5)[0x7fbb9d5089d0] /usr/local/lib/libglusterfs.so.0(default_getxattr+0xa5)[0x7fbb9d5089d0] /usr/local/lib/libglusterfs.so.0(default_getxattr+0xa5)[0x7fbb9d5089d0] /usr/local/lib/glusterfs/3.8dev/xlator/performance/md-cache.so(mdc_getxattr+0x462)[0x7fbb85cd7956] /usr/local/lib/glusterfs/3.8dev/xlator/debug/io-stats.so(io_stats_getxattr+0x23a)[0x7fbb85ac01cc] /usr/local/lib/libglusterfs.so.0(default_getxattr+0xa5)[0x7fbb9d5089d0] /usr/local/lib/libglusterfs.so.0(syncop_getxattr+0x2fb)[0x7fbb9d54aab7] /usr/local/lib/libgfapi.so.0(+0x12de7)[0x7fbb9dbf8de7] /usr/local/lib/libgfapi.so.0(glfs_llistxattr+0x36)[0x7fbb9dbf8f0d] /usr/lib64/bareos/plugins/gfapi-fd.so(+0x6a13)[0x7fbb9de0ea13] /usr/sbin/bareos-fd[0x417828] /usr/sbin/bareos-fd[0x40d1d8] /usr/sbin/bareos-fd[0x415f6f] /usr/lib64/libbareosfind-15.3.0.so(_Z10find_filesP3JCRP6FF_PKTPFiS0_S2_bES4_+0x385)[0x7fbba8478905] /usr/sbin/bareos-fd[0x40b435] /usr/sbin/bareos-fd[0x41200f] /usr/sbin/bareos-fd[0x413665] /usr/sbin/bareos-fd[0x41cefc] /usr/lib64/libbareos-15.3.0.so(workq_server+0x1f5)[0x7fbba7e35ec5] /usr/lib64/libbareos-15.3.0.so(lmgr_thread_launcher+0x5f)[0x7fbba7e1f73f] /usr/lib64/libpthread.so.0(+0x7555)[0x7fbba7369555] /usr/lib64/libc.so.6(clone+0x6d)[0x7fbba65ffb9d] There is a ref taken on iobuf(0x7fbb9402bae0) twice in rpc_clnt_submit. But it has got unref'ed only once which led to the crash.
Created attachment 1078321 [details] bareos-glfs.log
As of current master (420c2d03543a35f1f5483f200541577f4a606b22), I can't reproduce this. I'll reopen this bug or file a new one if I ever bump into this again. [root@gd2-1 1233136]# gcc crash.c -I /usr/local/include/glusterfs/ -lgfapi -O0 -ggdb -o ./crash [root@gd2-1 1233136]# ./crash read returned: 4 glfs_fini returned: 0