Description of problem:

(gdb) bt full
#0  0x00000032f1e32885 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x00000032f1e34065 in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x00000032f1e2b9fe in __assert_fail_base () from /lib64/libc.so.6
No symbol table info available.
#3  0x00000032f1e2bac0 in __assert_fail () from /lib64/libc.so.6
No symbol table info available.
#4  0x00007fe315a0e6ab in __gf_free (free_ptr=0x1443400) at mem-pool.c:278
        req_size = 0
        ptr = 0x14433f4 ""
        type = 0
        xl = 0x0
        __PRETTY_FUNCTION__ = "__gf_free"
#5  0x00007fe310ff32e3 in gf_defrag_start_crawl (data=0x1425b00) at dht-rebalance.c:1485
        this = 0x1425b00
        conf = 0x14432f0
        defrag = 0x1443400
        ret = -1
        loc = {path = 0x7fe311032eaf "/", name = 0x0, inode = 0x7fe2d161a04c, parent = 0x0,
          gfid = '\000' <repeats 15 times>, "\001", pargfid = '\000' <repeats 15 times>}
        iatt = {ia_ino = 0, ia_gfid = '\000' <repeats 15 times>, ia_dev = 0, ia_type = IA_INVAL,
          ia_prot = {suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000',
            owner = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'},
            group = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'},
            other = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}},
          ia_nlink = 0, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 0, ia_blksize = 0,
          ia_blocks = 0, ia_atime = 0, ia_atime_nsec = 0, ia_mtime = 0, ia_mtime_nsec = 0,
          ia_ctime = 0, ia_ctime_nsec = 0}
        parent = {ia_ino = 0, ia_gfid = '\000' <repeats 15 times>, ia_dev = 0, ia_type = IA_INVAL,
          ia_prot = {suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000',
            owner = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'},
            group = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'},
            other = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}},
          ia_nlink = 0, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 0, ia_blksize = 0,
          ia_blocks = 0, ia_atime = 0, ia_atime_nsec = 0, ia_mtime = 0, ia_mtime_nsec = 0,
          ia_ctime = 0, ia_ctime_nsec = 0}
        fix_layout = 0x0
        migrate_data = 0x0
        __FUNCTION__ = "gf_defrag_start_crawl"
#6  0x00007fe315a1f286 in synctask_wrap (old_task=0x7fe2c8200d70) at syncop.c:128
        task = 0x7fe2c8200d70
#7  0x00000032f1e43610 in ?? () from /lib64/libc.so.6
No symbol table info available.

Version-Release number of selected component (if applicable): 3.3.0qa33

gfsc1.sh:-
----------
#!/bin/bash
mountpoint=`pwd`

for i in {1..10}
do
    level1_dir=$mountpoint/fuse2.$i
    mkdir $level1_dir
    cd $level1_dir

    for j in {1..20}
    do
        level2_dir=dir.$j
        mkdir $level2_dir
        cd $level2_dir

        for k in {1..100}
        do
            echo "Creating File: $level1_dir/$level2_dir/file.$k"
            dd if=/dev/zero of=file.$k bs=1M count=$k
        done
        cd $level1_dir
    done
    cd $mountpoint
done

nfsc1.sh:-
----------
#!/bin/bash
mountpoint=`pwd`

for i in {1..5}
do
    level1_dir=$mountpoint/nfs2.$i
    mkdir $level1_dir
    cd $level1_dir

    for j in {1..20}
    do
        level2_dir=dir.$j
        mkdir $level2_dir
        cd $level2_dir

        for k in {1..100}
        do
            echo "Creating File: $level1_dir/$level2_dir/file.$k"
            dd if=/dev/zero of=file.$k bs=1M count=$k
        done
        cd $level1_dir
    done
    cd $mountpoint
done

Steps to Reproduce:
1. Create a distribute-replicate volume (2x3) and start the volume.
2. Create FUSE and NFS mounts.
3. Run gfsc1.sh from the FUSE mount.
4. Run nfsc1.sh from the NFS mount.
5. Add-brick to the volume.
6. Start rebalance.
7. Query rebalance status.
8. Stop rebalance.
9. Bring down 2 bricks from each replica set, so that one brick remains online in each replica set.
10. Bring the bricks back online.
11. Start rebalance with force.
12. Query rebalance status.
13. Stop rebalance.
Repeat steps 9 to 13 another 3-4 times. (A rough sketch of the corresponding CLI commands is given after the Actual results below.)

Actual results:
rebalance process crashed.
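For reference, the gluster CLI sequence behind the reproduction steps looks roughly like the sketch below. The volume name, hostnames, brick paths, and mount points are assumed placeholders, not values from the original setup:

# Assumed names: volume "testvol" on server1/server2/server3, bricks under /bricks.
gluster volume create testvol replica 3 \
    server1:/bricks/b1 server2:/bricks/b1 server3:/bricks/b1 \
    server1:/bricks/b2 server2:/bricks/b2 server3:/bricks/b2
gluster volume start testvol

# FUSE and NFS mounts; run gfsc1.sh from /mnt/fuse and nfsc1.sh from /mnt/nfs.
mount -t glusterfs server1:/testvol /mnt/fuse
mount -t nfs -o vers=3 server1:/testvol /mnt/nfs

# Expand the volume and run a normal rebalance cycle (steps 5-8).
gluster volume add-brick testvol \
    server1:/bricks/b3 server2:/bricks/b3 server3:/bricks/b3
gluster volume rebalance testvol start
gluster volume rebalance testvol status
gluster volume rebalance testvol stop

# Steps 9-13, repeated 3-4 times: kill two glusterfsd brick processes per
# replica set, bring the bricks back with "volume start force", then run a
# forced rebalance cycle.
gluster volume start testvol force
gluster volume rebalance testvol start force
gluster volume rebalance testvol status
gluster volume rebalance testvol stop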
Can you please provide the rebalance logs? Also, a gdb o/p of the frame in question (frame 5, I believe). If possible, can the setup information be made available for me to access (it can be mailed across)?
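In case it speeds things up, frame 5 can be dumped straight from the core with a batch gdb invocation along these lines; the binary and core file paths are placeholders and need to point at the actual rebalance process core:

# Adjust the binary and core file paths to the actual crash.
gdb -batch -ex 'frame 5' -ex 'info locals' -ex 'print *defrag' \
    /usr/sbin/glusterfs /path/to/core.<pid>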
Created attachment 575383: rebalance log
Able to recreate the problem with the above-mentioned steps. Attaching the rebalance logs.
This seems to be a case where AFR background self-heal is in progress, and rebalance has called cleanup_and_exit. Sending parent_down to the xlators does not seem to fix this issue.
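If this gets re-tested, it may help to record whether background self-heals were still pending at the point rebalance was stopped; a rough way to check from the CLI (the volume name is a placeholder):

gluster volume heal <VOLNAME> info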
Closing this bug as we have switched off self-healing from the rebalance process. Please re-open the bug if you are able to reproduce it.

*** This bug has been marked as a duplicate of bug 808997 ***
Sorry, marked it as a duplicate of the wrong bug.

*** This bug has been marked as a duplicate of bug 808977 ***
Unable to re-create the same issue.