Bug 796171

Summary: [2b52b096a7db3124fdd97554e63792f36e889af9]: Source brick crashed with core when replace-brick is issued.
Product: [Community] GlusterFS Reporter: Rahul C S <rahulcs>
Component: glusterd Assignee: krishnan parthasarathi <kparthas>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: urgent Docs Contact:
Priority: unspecified    
Version: pre-release CC: gluster-bugs, nsathyan, rfortier
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.4.0 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-07-24 17:30:02 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: 1f3a0dd4742a2fcd3215aee4a5e22125d7ea4f4d Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 817967    

Description Rahul C S 2012-02-22 12:30:17 UTC
Description of problem:

Issuing a replace-brick operation at the above git head crashes the source brick.

Core was generated by `/usr/local/sbin/glusterfsd -s localhost --volfile-id vol.dagobah.data-export1 -'.
Program terminated with signal 6, Aborted.
#0  0x00007fd2853af3a5 in __GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64	../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
	in ../nptl/sysdeps/unix/sysv/linux/raise.c
(gdb) bt
#0  0x00007fd2853af3a5 in __GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007fd2853b2b0b in __GI_abort () at abort.c:92
#2  0x00007fd2853a7d4d in __GI___assert_fail (assertion=0x7fd27f729f7e "0", file=<optimized out>, line=363, function=<optimized out>)
    at assert.c:81
#3  0x00007fd27f725615 in index_add (this=0x205f8e0, gfid=0x7fd27941f0e8 "", subdir=0x7fd27f729f02 "xattrop")
    at ../../../../../xlators/features/index/src/index.c:362
#4  0x00007fd27f725c9e in _xattrop_index_action (this=0x205f8e0, inode=0x7fd27941f0e0, xattr=0x7fd274000d70)
    at ../../../../../xlators/features/index/src/index.c:476
#5  0x00007fd27f725ce8 in fop_xattrop_index_action (this=0x205f8e0, inode=0x7fd27941f0e0, xattr=0x7fd274000d70)
    at ../../../../../xlators/features/index/src/index.c:487
#6  0x00007fd27f72646d in index_xattrop_cbk (frame=0x7fd2843c8178, cookie=0x7fd2843c8224, this=0x205f8e0, op_ret=0, op_errno=61, 
    xattr=0x7fd274000d70) at ../../../../../xlators/features/index/src/index.c:629
#7  0x00007fd27f93ff31 in iot_xattrop_cbk (frame=0x7fd2843c8224, cookie=0x7fd2843c82d0, this=0x205e610, op_ret=0, op_errno=61, 
    xattr=0x7fd274000d70) at ../../../../../xlators/performance/io-threads/src/io-threads.c:2172
#8  0x00007fd285d8b462 in default_xattrop_cbk (frame=0x7fd2843c82d0, cookie=0x7fd2843c837c, this=0x205d3e0, op_ret=0, op_errno=61, 
    dict=0x7fd274000d70) at ../../../libglusterfs/src/defaults.c:309
#9  0x00007fd285d8b462 in default_xattrop_cbk (frame=0x7fd2843c837c, cookie=0x7fd2843c8428, this=0x205c190, op_ret=0, op_errno=61, 
    dict=0x7fd274000d70) at ../../../libglusterfs/src/defaults.c:309
#10 0x00007fd27ff91aa2 in do_xattrop (frame=0x7fd2843c8428, this=0x205ac10, loc=0x7fd28324ad1c, fd=0x0, optype=GF_XATTROP_ADD_ARRAY, 
    xattr=0x7fd274000d70) at ../../../../../xlators/storage/posix/src/posix.c:3136
#11 0x00007fd27ff91aff in posix_xattrop (frame=0x7fd2843c8428, this=0x205ac10, loc=0x7fd28324ad1c, optype=GF_XATTROP_ADD_ARRAY, 
    xattr=0x7fd274000d70) at ../../../../../xlators/storage/posix/src/posix.c:3145
#12 0x00007fd285d95fb9 in default_xattrop (frame=0x7fd2843c837c, this=0x205c190, loc=0x7fd28324ad1c, flags=GF_XATTROP_ADD_ARRAY, 
    dict=0x7fd274000d70) at ../../../libglusterfs/src/defaults.c:1040
#13 0x00007fd285d95fb9 in default_xattrop (frame=0x7fd2843c82d0, this=0x205d3e0, loc=0x7fd28324ad1c, flags=GF_XATTROP_ADD_ARRAY, 
    dict=0x7fd274000d70) at ../../../libglusterfs/src/defaults.c:1040
#14 0x00007fd27f940181 in iot_xattrop_wrapper (frame=0x7fd2843c8224, this=0x205e610, loc=0x7fd28324ad1c, optype=GF_XATTROP_ADD_ARRAY, 
    xattr=0x7fd274000d70) at ../../../../../xlators/performance/io-threads/src/io-threads.c:2181
#15 0x00007fd285dac8c3 in call_resume_wind (stub=0x7fd28324ace4) at ../../../libglusterfs/src/call-stub.c:2507
#16 0x00007fd285db38a9 in call_resume (stub=0x7fd28324ace4) at ../../../libglusterfs/src/call-stub.c:3938
#17 0x00007fd27f9318ef in iot_worker (data=0x2077610) at ../../../../../xlators/performance/io-threads/src/io-threads.c:138
#18 0x00007fd28571fefc in start_thread (arg=0x7fd279f93700) at pthread_create.c:304
#19 0x00007fd28545a89d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#20 0x0000000000000000 in ?? ()
(gdb) f 3
#3  0x00007fd27f725615 in index_add (this=0x205f8e0, gfid=0x7fd27941f0e8 "", subdir=0x7fd27f729f02 "xattrop")
    at ../../../../../xlators/features/index/src/index.c:362
362	        GF_ASSERT_AND_GOTO_WITH_ERROR (this->name, !uuid_is_null (gfid),
(gdb) l
357	        index_priv_t      *priv = NULL;
358	        struct stat       st = {0};
359	        int               fd = 0;
360	
361	        priv = this->private;
362	        GF_ASSERT_AND_GOTO_WITH_ERROR (this->name, !uuid_is_null (gfid),
363	                                       out, op_errno, EINVAL);
364	
365	        make_gfid_path (priv->index_basepath, subdir, gfid,
366	                        gfid_path, sizeof (gfid_path));
(gdb) p gfid
$1 = (unsigned char *) 0x7fd27941f0e8 ""
(gdb) p *gfid
$2 = 0 '\000'
(gdb) p *this
$3 = {name = 0x205efd0 "vol-index", type = 0x205f780 "features/index", next = 0x205e610, prev = 0x2060c10, parents = 0x2063300, 
  children = 0x2060b80, options = 0x205f070, dlhandle = 0x20602f0, fops = 0x7fd27f92c800, cbks = 0x7fd27f92caa0, dumpops = 0x7fd27f92cae0, 
  volume_options = {next = 0x20609a0, prev = 0x20609a0}, fini = 0x7fd27f729bd3 <fini>, init = 0x7fd27f7297ef <init>, reconfigure = 0, 
  mem_acct_init = 0x7fd27f7297c3 <mem_acct_init>, notify = 0x7fd27f729d5a <notify>, loglevel = GF_LOG_NONE, latencies = {{min = 0, max = 0, 
      total = 0, std = 0, mean = 0, count = 0} <repeats 46 times>}, history = 0x0, ctx = 0x2034010, graph = 0x2056840, itable = 0x0, 
  init_succeeded = 1 '\001', private = 0x2076860, mem_acct = {num_types = 94, rec = 0x2075c70}, winds = 0, switched = 0 '\000', 
  local_pool = 0x0}

Comment 1 krishnan parthasarathi 2012-02-22 13:57:05 UTC
The issue is that the index xlator needs a resolved inode in the incoming xattrop request to perform its indexing. During replace-brick, the pump xlator, unlike the server xlator, doesn't resolve the inode, so a null gfid is sent to the index xlator.
Positioning the index xlator above pump in the graph, so that pump's xattrop requests do not pass through it, avoids this problem in the first place. This solution is justified since pump, as opposed to afr, doesn't 'need' the services of the index xlator.
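
The failure mode described in this comment can be illustrated with a minimal sketch in C. It assumes a gfid is a 16-byte identifier that is all zeros while the inode is unresolved. The names gfid_is_null and index_add_sketch are hypothetical stand-ins, not GlusterFS functions, and returning EINVAL instead of asserting is chosen only for illustration.

```c
#include <errno.h>
#include <string.h>

#define GFID_LEN 16 /* assumption: a gfid is a 16-byte UUID */

/* Hypothetical stand-in for uuid_is_null(): returns 1 when every
 * byte of the gfid is zero, which is what an unresolved inode carries. */
static int gfid_is_null(const unsigned char gfid[GFID_LEN])
{
    static const unsigned char zero[GFID_LEN] = {0};
    return memcmp(gfid, zero, GFID_LEN) == 0;
}

/* Hypothetical stand-in for index_add(): rejects a null gfid with
 * -EINVAL rather than aborting. A real indexer would go on to build
 * the index path from the gfid; that step is omitted here. */
static int index_add_sketch(const unsigned char gfid[GFID_LEN])
{
    if (gfid_is_null(gfid))
        return -EINVAL;
    return 0;
}
```

Under these assumptions, calling index_add_sketch with a zero-filled buffer returns -EINVAL, while any buffer with at least one non-zero byte returns 0. In the trace above, the same condition is enforced by an assertion, which is why the brick aborts instead of returning an error.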

Comment 2 Anand Avati 2012-02-23 09:57:25 UTC
CHANGE: http://review.gluster.com/2801 (glusterd: Modified server graph to have index xl above pump) merged in master by Vijay Bellur (vijay)

Comment 3 Rahul C S 2012-04-05 07:58:57 UTC
The crash no longer reproduces with the same procedure. replace-brick proceeds properly according to its status output.