I created a 2x2 hot tier over a 2x(4+2) EC (disperse) cold tier, set the promote frequency to 1000 seconds and the demote frequency to 100 seconds, and mounted the tiered volume over FUSE. I then wrote a number of directories containing mp3 files (about 15 directories, each holding 5-6 files on average) and left the volume idle so that all the files would get demoted. After about 5 hours, this crash was hit:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/glusterfs -s localhost --volfile-id rebalance/gold --xlator-option *d'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007ff8f9f82af0 in syncop_ipc (subvol=0x7ff7a2de5700, op=op@entry=1, xdata_in=xdata_in@entry=0x0, xdata_out=xdata_out@entry=0x0) at syncop.c:2823
2823            SYNCOP (subvol, (&args), syncop_ipc_cbk, subvol->fops->ipc,
Missing separate debuginfos, use: debuginfo-install glibc-2.17-78.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.12.2-14.el7.x86_64 libcom_err-1.42.9-7.el7.x86_64 libgcc-4.8.3-9.el7.x86_64 libselinux-2.2.2-6.el7.x86_64 libuuid-2.23.2-22.el7_1.1.x86_64 openssl-libs-1.0.1e-42.el7_1.9.x86_64 pcre-8.32-14.el7.x86_64 sqlite-3.7.17-6.el7_1.1.x86_64 sssd-client-1.12.2-58.el7_1.14.x86_64 xz-libs-5.1.2-9alpha.el7.x86_64 zlib-1.2.7-13.el7.x86_64
(gdb) bt
#0  0x00007ff8f9f82af0 in syncop_ipc (subvol=0x7ff7a2de5700, op=op@entry=1, xdata_in=xdata_in@entry=0x0, xdata_out=xdata_out@entry=0x0) at syncop.c:2823
#1  0x00007ff8e7919fb9 in tier_process_brick_cbk (args=<synthetic pointer>, local_brick=0x7ff8db43bc60) at tier.c:548
#2  tier_build_migration_qfile (is_promotion=_gf_false, query_cbk_args=0x7ff79fddee70, args=0x7ff8db43bcc0) at tier.c:608
#3  tier_demote (args=0x7ff8db43bcc0) at tier.c:668
#4  0x00007ff8f8d99df5 in start_thread () from /lib64/libpthread.so.0
#5  0x00007ff8f86e01ad in clone () from /lib64/libc.so.6
(gdb) quit
[root@zod /]# ll core.32719

[root@zod glusterfs]# tail -n 30 gold-tier.log
[2015-09-16 07:00:00.348553] W [MSGID: 101105] [gfdb_sqlite3.c:379:apply_sql_params_db] 0-sqlite3: Failed to retrieve sql-db-autovacuum from params. Assigning default value: none
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2015-09-16 07:00:00
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.4
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb2)[0x7ff8f9f30fd2]
/lib64/libglusterfs.so.0(gf_print_trace+0x31d)[0x7ff8f9f4d45d]
/lib64/libc.so.6(+0x35650)[0x7ff8f861f650]
/lib64/libglusterfs.so.0(syncop_ipc+0x140)[0x7ff8f9f82af0]
/usr/lib64/glusterfs/3.7.4/xlator/cluster/tier.so(+0x55fb9)[0x7ff8e7919fb9]
/lib64/libpthread.so.0(+0x7df5)[0x7ff8f8d99df5]
/lib64/libc.so.6(clone+0x6d)[0x7ff8f86e01ad]
---------

Version-Release number of selected component (if applicable):
=============================================================
[root@zod glusterfs]# gluster --version
glusterfs 3.7.4 built on Sep 12 2015 01:35:35
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.
[root@zod glusterfs]# rpm -qa | grep gluster
glusterfs-client-xlators-3.7.4-0.33.git1d02d4b.el7.centos.x86_64
glusterfs-api-3.7.4-0.33.git1d02d4b.el7.centos.x86_64
glusterfs-fuse-3.7.4-0.33.git1d02d4b.el7.centos.x86_64
glusterfs-debuginfo-3.7.4-0.33.git1d02d4b.el7.centos.x86_64
glusterfs-3.7.4-0.33.git1d02d4b.el7.centos.x86_64
glusterfs-server-3.7.4-0.33.git1d02d4b.el7.centos.x86_64
glusterfs-cli-3.7.4-0.33.git1d02d4b.el7.centos.x86_64
glusterfs-libs-3.7.4-0.33.git1d02d4b.el7.centos.x86_64

[root@zod glusterfs]# gluster v info gold; gluster v status gold; gluster v rebal gold status; gluster v tier gold status

Volume Name: gold
Type: Tier
Volume ID: ba50bc3f-5b0f-4707-bf4f-cf5a2f5a192b
Status: Started
Number of Bricks: 16
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: yarrow:/rhs/brick6/gold_hot
Brick2: zod:/rhs/brick6/gold_hot
Brick3: yarrow:/rhs/brick7/gold_hot
Brick4: zod:/rhs/brick7/gold_hot
Cold Tier:
Cold Tier Type : Distributed-Disperse
Number of Bricks: 2 x (4 + 2) = 12
Brick5: zod:/rhs/brick1/gold
Brick6: yarrow:/rhs/brick1/gold
Brick7: zod:/rhs/brick2/gold
Brick8: yarrow:/rhs/brick2/gold
Brick9: zod:/rhs/brick3/gold
Brick10: yarrow:/rhs/brick3/gold
Brick11: zod:/rhs/brick4/gold
Brick12: yarrow:/rhs/brick4/gold
Brick13: zod:/rhs/brick5/gold
Brick14: yarrow:/rhs/brick5/gold
Brick15: yarrow:/rhs/brick6/gold
Brick16: zod:/rhs/brick6/gold
Options Reconfigured:
cluster.tier-demote-frequency: 100
cluster.tier-promote-frequency: 1000
features.ctr-enabled: on
performance.io-cache: off
performance.quick-read: off
performance.readdir-ahead: on

Status of volume: gold
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Hot Bricks:
Brick yarrow:/rhs/brick6/gold_hot           49227     0          Y       739
Brick zod:/rhs/brick6/gold_hot              49234     0          Y       32646
Brick yarrow:/rhs/brick7/gold_hot           49226     0          Y       713
Brick zod:/rhs/brick7/gold_hot              49233     0          Y       32622
Cold Bricks:
Brick zod:/rhs/brick1/gold                  49227     0          Y       32303
Brick yarrow:/rhs/brick1/gold               49220     0          Y       32644
Brick zod:/rhs/brick2/gold                  49228     0          Y       32321
Brick yarrow:/rhs/brick2/gold               49221     0          Y       32668
Brick zod:/rhs/brick3/gold                  49229     0          Y       32339
Brick yarrow:/rhs/brick3/gold               49222     0          Y       32686
Brick zod:/rhs/brick4/gold                  49230     0          Y       32357
Brick yarrow:/rhs/brick4/gold               49223     0          Y       32704
Brick zod:/rhs/brick5/gold                  49231     0          Y       32375
Brick yarrow:/rhs/brick5/gold               49224     0          Y       32723
Brick yarrow:/rhs/brick6/gold               49225     0          Y       32747
Brick zod:/rhs/brick6/gold                  49232     0          Y       32393
NFS Server on localhost                     2049      0          Y       11286
NFS Server on yarrow                        2049      0          Y       23678

Task Status of Volume gold
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : 8e8a40b4-be43-4593-a192-3f2b7411bd97
Status               : in progress

     Node   Rebalanced-files       size    scanned   failures   skipped        status   run time in secs
---------   ----------------   --------   --------   --------   -------   -----------   ----------------
localhost                  0     0Bytes        108          0         0   in progress             765.00
   yarrow                  0     0Bytes       7344          0         0   in progress           68476.00
volume rebalance: gold: success:

     Node   Promoted files   Demoted files        Status
---------   --------------   -------------   -----------
localhost                0               0   in progress
   yarrow                0               0   in progress
volume rebalance: gold: success:
[root@zod glusterfs]#
Cores and sosreports are available on rhsqe-repo.lab.eng.blr.redhat.com under /home/repo/sosreports/bug.1263585.
Analysis:

The bricklists have an extra entry at the end. I have verified this on a running tier process for a newly created volume as well.

Core was generated by `/usr/sbin/glusterfs -s localhost --volfile-id rebalance/gold --xlator-option *d'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007ff8f9f82af0 in syncop_ipc (subvol=0x7ff7a2de5700, op=op@entry=1, xdata_in=xdata_in@entry=0x0, xdata_out=xdata_out@entry=0x0) at syncop.c:2823
2823            SYNCOP (subvol, (&args), syncop_ipc_cbk, subvol->fops->ipc,
(gdb) bt
#0  0x00007ff8f9f82af0 in syncop_ipc (subvol=0x7ff7a2de5700, op=op@entry=1, xdata_in=xdata_in@entry=0x0, xdata_out=xdata_out@entry=0x0) at syncop.c:2823
#1  0x00007ff8e7919fb9 in tier_process_brick_cbk (args=<synthetic pointer>, local_brick=0x7ff8db43bc60) at tier.c:548
#2  tier_build_migration_qfile (is_promotion=_gf_false, query_cbk_args=0x7ff79fddee70, args=0x7ff8db43bcc0) at tier.c:608
#3  tier_demote (args=0x7ff8db43bcc0) at tier.c:668
#4  0x00007ff8f8d99df5 in start_thread () from /lib64/libpthread.so.0
#5  0x00007ff8f86e01ad in clone () from /lib64/libc.so.6
(gdb) p *subvol
$1 = {name = 0x7ff7a2de5700 "", type = 0x7ff8cc037460 "\002", next = 0x7ff7a2de5700, prev = 0x1, parents = 0x0,
  children = 0x5fd8b3b6b4645700, options = 0x86286dc0e2942214, dlhandle = 0x0, fops = 0x0, cbks = 0x0, dumpops = 0x0,
  volume_options = {next = 0x0, prev = 0x0}, fini = 0x0, init = 0x0, reconfigure = 0x0, mem_acct_init = 0x0, notify = 0x0,
  loglevel = GF_LOG_NONE, latencies = {{min = 0, max = 0, total = 0, std = 0, mean = 0, count = 0} <repeats 11 times>,
    {min = 0, max = 0, total = 0, std = 6.9515818434295051e-310, mean = 6.9515814287748329e-310, count = 140527034957824},
    {min = 140701566130656, max = 140701566130656, total = -nan(0xfffffffffffe0), std = 0, mean = 0, count = 140701566127856},
    {min = 16, max = 1, total = 0, std = 0, mean = 0, count = 1},
    {min = 0, max = 0, total = 0, std = 0, mean = 0, count = 0}, {min = 0, max = 0, total = 0, std = 0, mean = 0, count = 0},
    {min = 0, max = 0, total = 0, std = 0, mean = 0, count = 0}, {min = 0, max = 0, total = 0, std = 0, mean = 0, count = 0},
    {min = 0, max = 0, total = 0, std = 0, mean = 0, count = 0}, {min = 0, max = 0, total = 0, std = 0, mean = 0, count = 0},
    {min = 0, max = 0, total = 0, std = 0, mean = 0, count = 0}, {min = 0, max = 0, total = 0, std = 0, mean = 0, count = 0},
    {min = 0, max = 0, total = 0, std = 0, mean = 0, count = 0},
    {min = 0, max = 0, total = 0, std = 0, mean = 0, count = 140701566130704},
    {min = 0, max = 0, total = 0, std = 0, mean = 0, count = 0}, {min = 0, max = 0, total = 0, std = 0, mean = 0, count = 0},
    {min = 0, max = 0, total = 0, std = 0, mean = 0, count = 0}, {min = 0, max = 0, total = 0, std = 0, mean = 0, count = 0},
    {min = 0, max = 0, total = 0, std = 0, mean = 0, count = 0},
    {min = 0, max = 0, total = 0, std = 1.4308692302641265e-308, mean = 0, count = 0},
    {min = 0, max = 140707013696768, total = 6.9518399606381148e-310, std = 0, mean = 0, count = 0},
    {min = 0, max = 0, total = 0, std = 0, mean = 0, count = 140701557739520},
    {min = 8392704, max = 4096, total = 2.0236928853657458e-320, std = 0, mean = 0, count = 0},
    {min = 0, max = 0, total = 0, std = 0, mean = 0, count = 0} <repeats 17 times>},
  history = 0x0, ctx = 0x0, graph = 0x0, itable = 0x0, init_succeeded = 0 '\000', private = 0x0, mem_acct = 0x0, winds = 0,
  switched = 0 '\000', local_pool = 0x0, is_autoloaded = _gf_false}
(gdb)

The subvol is not a valid xlator. Dereferencing it causes the crash.
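For reference, the faulting line dereferences the xlator's fops table before dispatching the call. The following is a minimal, hedged sketch of that access pattern (not the actual syncop.c macro or the real xlator_t definition, just the shape of the dereference chain); with the garbage xlator shown above, fops is NULL, so the read through subvol->fops faults:

/* Stand-ins for the real glusterfs types; only the field relevant to the
 * faulting dereference is modelled here. */
typedef struct xlator xlator_t;
typedef int (*fop_ipc_t) (xlator_t *subvol, int op);

struct xlator_fops { fop_ipc_t ipc; };
struct xlator      { struct xlator_fops *fops; };

/* Shape of what SYNCOP (subvol, ..., subvol->fops->ipc, ...) boils down to:
 * the fop pointer is fetched through subvol->fops.  When "subvol" is not a
 * valid xlator_t (here fops == NULL), this read is the segfault. */
static int
dispatch_ipc (xlator_t *subvol, int op)
{
        fop_ipc_t fn = subvol->fops->ipc;   /* faults if fops is NULL/garbage */
        return fn (subvol, op);
}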
(gdb) f 2
#2  tier_build_migration_qfile (is_promotion=_gf_false, query_cbk_args=0x7ff79fddee70, args=0x7ff8db43bcc0) at tier.c:608
608                     ret = tier_process_brick_cbk (local_brick,
(gdb) l
603                     gfdb_brick_dict_info.time_stamp = &time_in_past;
604                     gfdb_brick_dict_info._gfdb_promote = is_promotion;
605                     gfdb_brick_dict_info._query_cbk_args = query_cbk_args;
606
607                     list_for_each_entry (local_brick, args->brick_list, list) {
608                             ret = tier_process_brick_cbk (local_brick,
609                                                           &gfdb_brick_dict_info);
610                             if (ret) {
611                                     gf_msg (args->this->name, GF_LOG_ERROR, 0,
612                                             DHT_MSG_BRICK_QUERY_FAILED,
(gdb)
613                                             "Brick query failed\n");
614                                     goto out;
615                             }
616                     }
(gdb) p *args
$2 = {this = 0x0, defrag = 0x0, brick_list = 0x0, freq_time = 0, return_value = 0}

args has been cleared (all fields zeroed) at this point because of the same issue as in BZ 1263200. However, it looks like the demote thread had already started processing the brick list before the args were cleared.

On switching to the tier_start thread and traversing bricklist_hot:

(gdb) bt
#0  0x00007ff8f8da099d in nanosleep () from /lib64/libpthread.so.0
#1  0x00007ff8f9f50914 in gf_timer_proc (ctx=0x7ff8fb371010) at timer.c:205
#2  0x00007ff8f8d99df5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007ff8f86e01ad in clone () from /lib64/libc.so.6
(gdb) t 3
[Switching to thread 3 (Thread 0x7ff8ef549700 (LWP 32723))]
#0  0x00007ff8f86a748d in nanosleep () from /lib64/libc.so.6
(gdb) bt
#0  0x00007ff8f86a748d in nanosleep () from /lib64/libc.so.6
#1  0x00007ff8f86a7324 in sleep () from /lib64/libc.so.6
#2  0x00007ff8e791af2a in tier_start (this=0x7ff8e8020920, defrag=0x7ff8e8028c10) at tier.c:860
#3  0x00007ff8e7dafd34 in gf_defrag_start_crawl (data=0x7ff8e8020920) at dht-rebalance.c:2841
#4  0x00007ff8f9f72d72 in synctask_wrap (old_task=<optimized out>) at syncop.c:380
#5  0x00007ff8f86310f0 in ?? () from /lib64/libc.so.6
#6  0x0000000000000000 in ?? ()
(gdb) f 2
#2  0x00007ff8e791af2a in tier_start (this=0x7ff8e8020920, defrag=0x7ff8e8028c10) at tier.c:860
860                     sleep(1);
(gdb) p *((brick_list_t *)((char *)(bricklist_hot->next)-(unsigned long)(&((brick_list_t *)0)->list)))
$3 = {xlator = 0x7ff8e8009c70, brick_db_path = 0x7ff8cc00c260 "/rhs/brick6/gold_hot/.glusterfs/gold_hot.db", list = {
    next = 0x7ff8cc000db0, prev = 0x7ff8db43bc70}}
(gdb) p *((brick_list_t *)((char *)(bricklist_hot->next->next)-(unsigned long)(&((brick_list_t *)0)->list)))
$4 = {xlator = 0x7ff8e800be70, brick_db_path = 0x7ff8cc00d2b0 "/rhs/brick7/gold_hot/.glusterfs/gold_hot.db", list = {
    next = 0x7ff8db43bc70, prev = 0x7ff8cc000d40}}
(gdb) p *((brick_list_t *)((char *)(bricklist_hot->next->next->next)-(unsigned long)(&((brick_list_t *)0)->list)))
$5 = {xlator = 0x7ff7a2de5700, brick_db_path = 0x7ff79fddf700 "", list = {next = 0x7ff8cc000d40, prev = 0x7ff8cc000db0}}
=====> Invalid xlator
(gdb) p *((brick_list_t *)((char *)(bricklist_hot->next->next->next->next)-(unsigned long)(&((brick_list_t *)0)->list)))
$6 = {xlator = 0x7ff8e8009c70, brick_db_path = 0x7ff8cc00c260 "/rhs/brick6/gold_hot/.glusterfs/gold_hot.db", list = {
    next = 0x7ff8cc000db0, prev = 0x7ff8db43bc70}}
==> First entry
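The gdb expressions above are container_of()/list_entry() spelled out by hand: subtract the offset of the embedded list member from the list-node pointer to get back to the enclosing brick_list_t. A hedged sketch of the equivalent C follows (the field names match the gdb output, but the struct layout is assumed here rather than copied from the tier sources):

#include <stddef.h>

/* Kernel-style intrusive list node, as used by glusterfs. */
struct list_head { struct list_head *next, *prev; };

/* Assumed shape of the brick list entry, matching the fields printed above. */
typedef struct brick_list {
        void             *xlator;           /* stand-in for xlator_t *       */
        char             *brick_db_path;
        struct list_head  list;             /* embedded node, not a pointer  */
} brick_list_t;

/* What each gdb expression computes:
 *   (brick_list_t *)((char *)ptr - offsetof (brick_list_t, list))
 * i.e. step back from the embedded "list" member to the start of the entry. */
#define list_entry(ptr, type, member) \
        ((type *)((char *)(ptr) - offsetof (type, member)))

/* Applying list_entry() to the list HEAD itself, which is not embedded in a
 * brick_list_t, yields exactly the kind of junk "entry" seen in $5 above. */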
The extra entry in the list is actually the head of the list, which is an empty brick entry. Note that when we traverse the list using list_for_each_entry, we start from head->next and continue until we reach the head again, so this "extra" entry is skipped; that behaviour is built into the API and is normal (a minimal sketch of the traversal is included at the end of this comment). I am fairly confident this is a duplicate of BZ 1263200. If we see this again we should reopen it, but that is the likeliest explanation.

*** This bug has been marked as a duplicate of bug 1263200 ***
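To illustrate the head-skipping behaviour mentioned above, here is a minimal, hedged sketch of the traversal macro (simplified, not the glusterfs macro verbatim):

#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

#define list_entry(ptr, type, member) \
        ((type *)((char *)(ptr) - offsetof (type, member)))

/* Simplified list_for_each_entry(): start at head->next and stop as soon as
 * the iterator's embedded member address equals the head.  The head node
 * (the apparent "extra" brick in the gdb walk above) therefore never reaches
 * the loop body as an entry. */
#define list_for_each_entry(pos, head, member)                               \
        for (pos = list_entry ((head)->next, __typeof__(*pos), member);      \
             &pos->member != (head);                                         \
             pos = list_entry (pos->member.next, __typeof__(*pos), member))

/* Usage matching tier_build_migration_qfile():
 *
 *     brick_list_t *local_brick = NULL;
 *     list_for_each_entry (local_brick, args->brick_list, list) {
 *             ret = tier_process_brick_cbk (local_brick, ...);
 *     }
 *
 * So the crash is attributed to args being cleared while the demote thread
 * was still using them (BZ 1263200), not to the head showing up as an entry.
 */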