Description of problem:
=======================

(gdb) bt
#0  0x00007fc668299469 in __inode_get_xl_index (xlator=0x7fc64c03ed20, inode=0x7fc638015a28) at inode.c:549
#1  __inode_unref (inode=0x7fc638015a28, clear=_gf_false) at inode.c:589
#2  0x00007fc668299ef3 in inode_unref (inode=0x7fc638015a28) at inode.c:670
#3  0x00007fc668287a2c in loc_wipe (loc=loc@entry=0x7fc6544e7bc0) at xlator.c:777
#4  0x00007fc6599f1fb9 in dht_heal_path (this=this@entry=0x7fc64c03ed20, path=0x7fc64c1ef170 "/thread5/level00", itable=itable@entry=0x7fc654099260) at dht-helper.c:2019
#5  0x00007fc6599f2318 in dht_heal_full_path (data=<optimized out>) at dht-helper.c:2067
#6  0x00007fc6682c3800 in synctask_wrap () at syncop.c:375
#7  0x00007fc6668d8010 in ?? () from /lib64/libc.so.6
#8  0x0000000000000000 in ?? ()

(gdb) bt
#0  __check_cycle (data=<optimized out>, a_dentry=<optimized out>) at inode.c:292
#1  __foreach_ancestor_dentry (dentry=dentry@entry=0x2, data=data@entry=0x7f1cc4032518, per_dentry_fn=0x7f1cf4035680 <__check_cycle>) at inode.c:259
#2  0x00007f1cf4035fa3 in __foreach_ancestor_dentry (dentry=dentry@entry=0x7f1cd8097138, data=data@entry=0x7f1cc4032518, per_dentry_fn=0x7f1cf4035680 <__check_cycle>) at inode.c:276
#3  0x00007f1cf4035fa3 in __foreach_ancestor_dentry (dentry=dentry@entry=0x7f1cc803cd38, data=data@entry=0x7f1cc4032518, per_dentry_fn=0x7f1cf4035680 <__check_cycle>) at inode.c:276
#4  0x00007f1cf4035fa3 in __foreach_ancestor_dentry (dentry=dentry@entry=0x7f1cc403f538, data=data@entry=0x7f1cc4032518, per_dentry_fn=0x7f1cf4035680 <__check_cycle>) at inode.c:276
#5  0x00007f1cf403750e in __is_dentry_cyclic (dentry=0x7f1cc403f538) at inode.c:306
#6  __inode_link (inode=inode@entry=0x7f1cc803d748, parent=parent@entry=0x7f1cc803f248, name=name@entry=0x7f1cc803f019 "level24", iatt=iatt@entry=0x7f1cd8545f70) at inode.c:1174
#7  0x00007f1cf4037a09 in inode_link (inode=0x7f1cc803d748, parent=0x7f1cc803f248, name=name@entry=0x7f1cc803f019 "level24", iatt=iatt@entry=0x7f1cd8545f70) at inode.c:1207
#8  0x00007f1ce578f134 in dht_heal_path (this=this@entry=0x7f1ce00dc100, path=0x7f1cd814bec0 "/thread2/level04/level14/level24/level34/level44/level54", itable=itable@entry=0x7f1cd811b660) at dht-helper.c:2005
#9  0x00007f1ce578f318 in dht_heal_full_path (data=<optimized out>) at dht-helper.c:2067
#10 0x00007f1cf4060800 in synctask_wrap () at syncop.c:375
#11 0x00007f1cf2675010 in ?? () from /lib64/libc.so.6
#12 0x0000000000000000 in ?? ()

Version-Release number of selected component (if applicable):
=============================================================

[root@dhcp42-46 master]# rpm -qa | grep gluster
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
glusterfs-rdma-3.12.2-43.el7rhgs.x86_64
python2-gluster-3.12.2-43.el7rhgs.x86_64
glusterfs-server-3.12.2-43.el7rhgs.x86_64
glusterfs-fuse-3.12.2-43.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-43.el7rhgs.x86_64
glusterfs-api-3.12.2-43.el7rhgs.x86_64
glusterfs-events-3.12.2-43.el7rhgs.x86_64
vdsm-gluster-4.19.43-2.3.el7rhgs.noarch
tendrl-gluster-integration-1.6.3-10.el7rhgs.noarch
glusterfs-client-xlators-3.12.2-43.el7rhgs.x86_64
glusterfs-cli-3.12.2-43.el7rhgs.x86_64
gluster-nagios-addons-0.2.10-2.el7rhgs.x86_64
glusterfs-libs-3.12.2-43.el7rhgs.x86_64
glusterfs-3.12.2-43.el7rhgs.x86_64
libvirt-daemon-driver-storage-gluster-4.5.0-10.el7_6.3.x86_64

How reproducible:
=================
1/1

Steps to Reproduce:
===================
1. Set up a geo-rep session between the master and slave volumes.
2. for i in {create,chmod,symlink,chown,chmod,rename,create,chmod,chgrp,create,truncate,hardlink,create,chmod,chown,symlink,chgrp,create,create}; do crefi --multi -n 10 -b 10 -d 10 --max=10K --min=500 --random -T 10 -t text --fop=$i /mnt/master/ ; sleep 10 ; done
3. Add bricks to the master and slave volumes, and start rebalance while geo-rep is syncing files to the slave.
4. Wait for all files to sync to the slave.
5. Check the arequal-checksum; it currently matches.
Actual results:
===============
Cores seen on the slave.

Expected results:
=================
There should be no cores.

Additional info:
================
No functional impact.
Update: Could reproduce this today on fuse. The crash happened in md-cache. Here is the bt.

Missing separate debuginfos, use: dnf debuginfo-install glibc-2.26-28.fc27.x86_64 openssl-libs-1.1.0h-3.fc27.x86_64 sssd-client-1.16.2-4.fc27.x86_64
(gdb) bt
#0  0x00007fb76d2ba660 in raise () from /lib64/libc.so.6
#1  0x00007fb76d2bbc41 in abort () from /lib64/libc.so.6
#2  0x00007fb76d2b2f7a in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007fb76d2b2ff2 in __assert_fail () from /lib64/libc.so.6
#4  0x00007fb76ecfce71 in __inode_unref (inode=0x7fb75400c4f8, clear=_gf_false) at inode.c:585
#5  0x00007fb76ecfd108 in inode_unref (inode=0x7fb75400c4f8) at inode.c:670
#6  0x00007fb76ece8855 in loc_wipe (loc=0x7fb750009720) at xlator.c:777
#7  0x00007fb76073481c in mdc_local_wipe (this=0x7fb75c017fc0, local=0x7fb750009720) at md-cache.c:310
#8  0x00007fb7607375e6 in mdc_lookup_cbk (frame=0x7fb7500062b8, cookie=0x7fb750001538, this=0x7fb75c017fc0, op_ret=0, op_errno=0, inode=0x7fb748001ed8, stbuf=0x7fb75c0533d0, dict=0x7fb754016278, postparent=0x7fb75c053670) at md-cache.c:1248
#9  0x00007fb760b5e36a in qr_lookup_cbk (frame=0x7fb750001538, cookie=0x7fb750001838, this=0x7fb75c015230, op_ret=0, op_errno=0, inode_ret=0x7fb748001ed8, buf=0x7fb75c0533d0, xdata=0x7fb754016278, postparent=0x7fb75c053670) at quick-read.c:449
#10 0x00007fb760d69da1 in ioc_lookup_cbk (frame=0x7fb750001838, cookie=0x7fb75000f548, this=0x7fb75c013c00, op_ret=0, op_errno=0, inode=0x7fb748001ed8, stbuf=0x7fb75c0533d0, xdata=0x7fb754016278, postparent=0x7fb75c053670) at io-cache.c:267
#11 0x00007fb76119c0ab in wb_lookup_cbk (frame=0x7fb75000f548, cookie=0x7fb750001f18, this=0x7fb75c010f40, op_ret=0, op_errno=0, inode=0x7fb748001ed8, buf=0x7fb75c0533d0, xdata=0x7fb754016278, postparent=0x7fb75c053670) at write-behind.c:2390
#12 0x00007fb7613b7a10 in dht_heal_full_path_done (op_ret=0, heal_frame=0x7fb75c052248, data=0x7fb75c052248) at dht-helper.c:2113
#13 0x00007fb76ed34c41 in synctask_wrap () at syncop.c:377
#14 0x00007fb76d2cfd00 in ?? () from /lib64/libc.so.6
#15 0x0000000000000000 in ?? ()

This is the same inode_unref issue: the extra unref that happened in dht_heal_path led to this crash. After applying the patch https://review.gluster.org/#/c/glusterfs/+/21998/, it is resolved.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0658
*** Bug 1712871 has been marked as a duplicate of this bug. ***