Bug 1133383 - [AFR-V2] - Self-heal daemon crashes with SIGABRT when heal full is executed on a volume that has file with no gfid
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Assignee: Karthik U S
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-08-25 05:16 UTC by Krutika Dhananjay
Modified: 2019-07-02 04:00 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-07-02 04:00:12 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Krutika Dhananjay 2014-08-25 05:16:28 UTC
Description of problem:
The self-heal daemon (glustershd) crashed with SIGABRT when heal full was executed on an AFR-V2 replicate volume that contained a file with no gfid. The crash was seen with GlusterFS compiled with the -DDEBUG option.

Credits to Pranith for discovering the bug.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Create, start and mount a plain replicate volume.
2. Create a file (say, 'foo') directly in the root of every brick directory, so that the file has no gfid.
3. Execute gluster volume heal <volname> full.
4. Check the volume's status.

Actual results:
Volume status indicates glustershd is down.

Expected results:
Heal full should not crash glustershd.

Additional info:


(gdb) bt
#0  0x0000003d7a0359e9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x0000003d7a0370f8 in __GI_abort () at abort.c:90
#2  0x0000003d7a02e956 in __assert_fail_base (fmt=0x3d7a17ddc8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x7f2ef4e3ca1e "0", file=file@entry=0x7f2ef4e3c8ff "client-rpc-fops.c", line=line@entry=4832, 
    function=function@entry=0x7f2ef4e3d870 <__PRETTY_FUNCTION__.15942> "client3_3_getxattr") at assert.c:92
#3  0x0000003d7a02ea02 in __GI___assert_fail (assertion=0x7f2ef4e3ca1e "0", file=0x7f2ef4e3c8ff "client-rpc-fops.c", line=4832, function=0x7f2ef4e3d870 <__PRETTY_FUNCTION__.15942> "client3_3_getxattr") at assert.c:101
#4  0x00007f2ef4e2f092 in client3_3_getxattr (frame=0x7f2ed40014ac, this=0x16655d0, data=0x7f2eef5fc400) at client-rpc-fops.c:4830
#5  0x00007f2ef4e129b8 in client_getxattr (frame=0x7f2ed40014ac, this=0x16655d0, loc=0x7f2eef5fcd00, name=0x7f2ef4bf3383 "glusterfs.gfid2path", xdata=0x0) at client.c:1467
#6  0x00007f2efeb42b0b in syncop_getxattr (subvol=0x16655d0, loc=0x7f2eef5fcd00, dict=0x7f2eef5fccf0, key=0x7f2ef4bf3383 "glusterfs.gfid2path") at syncop.c:1393
#7  0x00007f2ef4bdc670 in afr_shd_gfid_to_path (this=0x16684d0, subvol=0x16655d0, gfid=0x167cd60 "", path_p=0x7f2eef5fcd88) at afr-self-heald.c:875
#8  0x00007f2ef4bdaef0 in afr_shd_selfheal (healer=0x1670b80, child=0, gfid=0x167cd60 "") at afr-self-heald.c:305
#9  0x00007f2ef4bdb612 in afr_shd_full_sweep (healer=0x1670b80, inode=0x167088c) at afr-self-heald.c:504
#10 0x00007f2ef4bdb991 in afr_shd_full_healer (data=0x1670b80) at afr-self-heald.c:604
#11 0x0000003d7a807c53 in start_thread (arg=0x7f2eef5fd700) at pthread_create.c:308
#12 0x0000003d7a0f5dbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
(gdb) f 4
#4  0x00007f2ef4e2f092 in client3_3_getxattr (frame=0x7f2ed40014ac, this=0x16655d0, data=0x7f2eef5fc400) at client-rpc-fops.c:4830
4830	        GF_ASSERT_AND_GOTO_WITH_ERROR (this->name,
(gdb) l
4825	        if (args->loc->inode && !uuid_is_null (args->loc->inode->gfid))
4826	                memcpy (req.gfid,  args->loc->inode->gfid, 16);
4827	        else
4828	                memcpy (req.gfid, args->loc->gfid, 16);
4829	
4830	        GF_ASSERT_AND_GOTO_WITH_ERROR (this->name,
4831	                                       !uuid_is_null (*((uuid_t*)req.gfid)),
4832	                                       unwind, op_errno, EINVAL);
4833	        req.namelen = 1; /* Use it as a flag */
4834	
(gdb) p req.gfid
$1 = '\000' <repeats 15 times>
(gdb) f 9
#9  0x00007f2ef4bdb612 in afr_shd_full_sweep (healer=0x1670b80, inode=0x167088c) at afr-self-heald.c:504
504				afr_shd_selfheal (healer, healer->subvol,
(gdb) l
499					continue;
500	
501				afr_shd_selfheal_name (healer, healer->subvol,
502						       inode->gfid, entry->d_name);
503	
504				afr_shd_selfheal (healer, healer->subvol,
505						  entry->d_stat.ia_gfid);
506	
507				if (entry->d_stat.ia_type == IA_IFDIR) {
508					ret = afr_shd_full_sweep (healer, entry->inode);
(gdb) p entry->d_stat.ia_gfid
$3 = '\000' <repeats 15 times>
(gdb)
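
The backtrace shows an all-zero gfid taken from entry->d_stat.ia_gfid in afr_shd_full_sweep and passed down until the debug-build assertion in client3_3_getxattr fires. One possible way to avoid the abort is to check for a null gfid before attempting the heal. The following is only a minimal, standalone sketch of that idea, not the actual fix: gfid_t, gfid_is_null, heal_entry and sweep_entry are hypothetical stand-ins and not GlusterFS APIs.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the 16-byte gfid (uuid_t) seen in the trace. */
typedef unsigned char gfid_t[16];

/* Returns 1 when all 16 bytes are zero, i.e. the entry has no gfid. */
static int gfid_is_null(const gfid_t gfid)
{
    static const gfid_t zero = {0};
    return memcmp(gfid, zero, sizeof(gfid_t)) == 0;
}

/* Hypothetical placeholder for the per-entry heal. Like the debug-build
 * check in the client layer, it asserts that the gfid is not null. */
static int heal_entry(const gfid_t gfid)
{
    assert(!gfid_is_null(gfid));
    return 0;
}

/* Hypothetical sweep step: skip entries without a gfid instead of
 * letting the assertion further down abort the process. */
static int sweep_entry(const char *name, const gfid_t gfid)
{
    if (gfid_is_null(gfid)) {
        fprintf(stderr, "skipping '%s': entry has no gfid\n", name);
        return -EINVAL;
    }
    return heal_entry(gfid);
}
```

With a guard of this kind, an entry such as 'foo' from the reproduction steps would be skipped with EINVAL rather than reaching the assertion. Whether skipping is the right behaviour here, as opposed to assigning a gfid first, is an open question that this sketch does not answer.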

Comment 1 Amar Tumballi 2019-07-02 04:00:12 UTC
Considering there has been no activity in the last 4 years, marking as DEFERRED. Feel free to update if this is considered critical.

