| Summary: | Assertion Failed Error messages in rebalance logs during rebalance | |||
|---|---|---|---|---|
| Product: | Red Hat Gluster Storage | Reporter: | Prasad Desala <tdesala> | |
| Component: | distribute | Assignee: | Susant Kumar Palai <spalai> | |
| Status: | CLOSED DEFERRED | QA Contact: | Sayalee <saraut> | |
| Severity: | high | Docs Contact: | ||
| Priority: | medium | |||
| Version: | rhgs-3.2 | CC: | amukherj, aspandey, nchilaka, pkarampu, pprakash, rhinduja, rhs-bugs, sabose, saraut, sasundar, spalai, storage-qa-internal | |
| Target Milestone: | --- | Keywords: | ZStream | |
| Target Release: | --- | Flags: | nbalacha:
needinfo?
(spalai) amukherj: needinfo? (nbalacha) sabose: needinfo? (spalai) |
|
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1402538 (view as bug list) | Environment: | ||
| Last Closed: | 2020-01-02 05:35:04 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Bug Depends On: | ||||
| Bug Blocks: | 1402538 | |||
|
Description
Prasad Desala
2016-11-24 09:48:36 UTC
While trying to reproduce this issue, I hit with the crash -
(gdb) bt
#0 0x00007f7503611765 in raise () from /lib64/libc.so.6
#1 0x00007f750361336a in abort () from /lib64/libc.so.6
#2 0x00007f7503609f97 in __assert_fail_base () from /lib64/libc.so.6
#3 0x00007f750360a042 in __assert_fail () from /lib64/libc.so.6
#4 0x00007f74f77bafd4 in ec_manager_setattr (fop=0x7f74ec0b491c, state=4) at ec-inode-write.c:394
#5 0x00007f74f779c405 in __ec_manager (fop=0x7f74ec0b491c, error=0) at ec-common.c:2283
#6 0x00007f74f7797d90 in ec_resume (fop=0x7f74ec0b491c, error=0) at ec-common.c:289
#7 0x00007f74f7797fba in ec_complete (fop=0x7f74ec0b491c) at ec-common.c:362
#8 0x00007f74f77b979a in ec_inode_write_cbk (frame=0x7f74ec2d82bc, this=0x7f74f002e770, cookie=0x3, op_ret=0, op_errno=0, prestat=0x7f74f848b980, poststat=0x7f74f848b910, xdata=0x7f74f01c6a3c) at ec-inode-write.c:65
#9 0x00007f74f77baacc in ec_setattr_cbk (frame=0x7f74ec2d82bc, cookie=0x3, this=0x7f74f002e770, op_ret=0, op_errno=0, prestat=0x7f74f848b980, poststat=0x7f74f848b910, xdata=0x7f74f01c6a3c) at ec-inode-write.c:349
#10 0x00007f74f7a5563d in client3_3_setattr_cbk (req=0x7f74ec0033ec, iov=0x7f74ec00342c, count=1, myframe=0x7f74ec16fabc) at client-rpc-fops.c:2264
#11 0x00007f7504d8026a in rpc_clnt_handle_reply (clnt=0x7f74f005fe70, pollin=0x7f74f0321c60) at rpc-clnt.c:790
#12 0x00007f7504d807c3 in rpc_clnt_notify (trans=0x7f74f0060300, mydata=0x7f74f005fec8, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7f74f0321c60) at rpc-clnt.c:970
#13 0x00007f7504d7ca37 in rpc_transport_notify (this=0x7f74f0060300, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7f74f0321c60) at rpc-transport.c:537
#14 0x00007f74f9f53ca1 in socket_event_poll_in (this=0x7f74f0060300) at socket.c:2265
#15 0x00007f74f9f541f2 in socket_event_handler (fd=29, idx=19, data=0x7f74f0060300, poll_in=1, poll_out=0, poll_err=0) at socket.c:2395
#16 0x00007f7505024dbf in event_dispatch_epoll_handler (event_pool=0x1268010, event=0x7f74f848bea0) at event-epoll.c:571
#17 0x00007f750502519e in event_dispatch_epoll_worker (data=0x12b1a60) at event-epoll.c:674
#18 0x00007f7503e065ca in start_thread () from /lib64/libpthread.so.0
#19 0x00007f75036e00ed in clone () from /lib64/libc.so.6
------------------------------
It is crashing in ec_manager_setattr at -
if (cbk->iatt[0].ia_type == IA_IFREG) {
ec_iatt_rebuild(fop->xl->private, cbk->iatt, 2,
cbk->count);
/* This shouldn't fail because we have the inode locked. */
GF_ASSERT(ec_get_inode_size(fop,
fop->locks[0].lock->loc.inode,
&cbk->iatt[0].ia_size));
cbk->iatt[1].ia_size = cbk->iatt[0].ia_size;
}
--------------------------------
Reason behind this is that, a file which is a link "IA_IFLNK" is coming as regular file "IA_IFREG" in callback.
(gdb) p *cbk
$6 = {list = {next = 0x7f74ec0b495c, prev = 0x7f74ec0b495c}, answer_list = {next = 0x7f74ec0b496c, prev = 0x7f74f04271ec}, fop = 0x7f74ec0b491c, next = 0x7f74f04271dc, idx = 3, op_ret = 0, op_errno = 0, count = 6, mask = 63, dirty = {0,
0}, xdata = 0x7f74f01c6a3c, dict = 0x0, int32 = 0, uintptr = {0, 0, 0}, size = 0, version = {0, 0}, inode = 0x0, fd = 0x0, statvfs = {f_bsize = 0, f_frsize = 0, f_blocks = 0, f_bfree = 0, f_bavail = 0, f_files = 0, f_ffree = 0,
f_favail = 0, f_fsid = 0, f_flag = 0, f_namemax = 0, __f_spare = {0, 0, 0, 0, 0, 0}}, iatt = {{ia_ino = 13705394905313093545, ia_gfid = "\200\032\325\370.\310D\256\276\063X\rU\242\247\251", ia_dev = 64771, ia_type = IA_IFREG,
ia_prot = {suid = 0 '\000', sgid = 0 '\000', sticky = 1 '\001', owner = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, group = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, other = {read = 0 '\000',
write = 0 '\000', exec = 0 '\000'}}, ia_nlink = 1, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 0, ia_blksize = 4096, ia_blocks = 0, ia_atime = 1481113459, ia_atime_nsec = 489040926, ia_mtime = 1481113459,
ia_mtime_nsec = 489040926, ia_ctime = 1481113459, ia_ctime_nsec = 492040918}, {ia_ino = 13705394905313093545, ia_gfid = "\200\032\325\370.\310D\256\276\063X\rU\242\247\251", ia_dev = 64771, ia_type = IA_IFREG, ia_prot = {
suid = 0 '\000', sgid = 0 '\000', sticky = 1 '\001', owner = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, group = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, other = {read = 0 '\000', write = 0 '\000',
exec = 0 '\000'}}, ia_nlink = 1, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 0, ia_blksize = 4096, ia_blocks = 0, ia_atime = 1481113459, ia_atime_nsec = 489040926, ia_mtime = 1481113459, ia_mtime_nsec = 489040926,
ia_ctime = 1481113459, ia_ctime_nsec = 494040912}, {ia_ino = 0, ia_gfid = '\000' <repeats 15 times>, ia_dev = 0, ia_type = IA_INVAL, ia_prot = {suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000', owner = {read = 0 '\000',
write = 0 '\000', exec = 0 '\000'}, group = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, other = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}}, ia_nlink = 0, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 0,
ia_blksize = 0, ia_blocks = 0, ia_atime = 0, ia_atime_nsec = 0, ia_mtime = 0, ia_mtime_nsec = 0, ia_ctime = 0, ia_ctime_nsec = 0}, {ia_ino = 0, ia_gfid = '\000' <repeats 15 times>, ia_dev = 0, ia_type = IA_INVAL, ia_prot = {
suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000', owner = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, group = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, other = {read = 0 '\000', write = 0 '\000',
exec = 0 '\000'}}, ia_nlink = 0, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 0, ia_blksize = 0, ia_blocks = 0, ia_atime = 0, ia_atime_nsec = 0, ia_mtime = 0, ia_mtime_nsec = 0, ia_ctime = 0, ia_ctime_nsec = 0}, {ia_ino = 0,
ia_gfid = '\000' <repeats 15 times>, ia_dev = 0, ia_type = IA_INVAL, ia_prot = {suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000', owner = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, group = {read = 0 '\000',
write = 0 '\000', exec = 0 '\000'}, other = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}}, ia_nlink = 0, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 0, ia_blksize = 0, ia_blocks = 0, ia_atime = 0, ia_atime_nsec = 0,
ia_mtime = 0, ia_mtime_nsec = 0, ia_ctime = 0, ia_ctime_nsec = 0}}, flock = {l_type = 0, l_whence = 0, l_start = 0, l_len = 0, l_pid = 0, l_owner = {len = 0, data = '\000' <repeats 1023 times>}}, vector = 0x0, buffers = 0x0,
str = 0x0, entries = {{list = {next = 0x7f74f01b4884, prev = 0x7f74f01b4884}, {next = 0x7f74f01b4884, prev = 0x7f74f01b4884}}, d_ino = 0, d_off = 0, d_len = 0, d_type = 0, d_stat = {ia_ino = 0, ia_gfid = '\000' <repeats 15 times>,
ia_dev = 0, ia_type = IA_INVAL, ia_prot = {suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000', owner = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, group = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, other = {
read = 0 '\000', write = 0 '\000', exec = 0 '\000'}}, ia_nlink = 0, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 0, ia_blksize = 0, ia_blocks = 0, ia_atime = 0, ia_atime_nsec = 0, ia_mtime = 0, ia_mtime_nsec = 0,
ia_ctime = 0, ia_ctime_nsec = 0}, dict = 0x0, inode = 0x0, d_name = 0x7f74f01b492c ""}, offset = 0, what = GF_SEEK_DATA}
(gdb)
(gdb)
(gdb) p *cbk->fop
$7 = {id = 38, refs = 3, state = 4, minimum = 4, expected = 6, winds = 0, jobs = 1, error = 0, parent = 0x0, xl = 0x7f74f002e770, req_frame = 0x7f74ec218d8c, frame = 0x7f74ec2d82bc, cbk_list = {next = 0x7f74f01b40fc,
prev = 0x7f74f01b40fc}, answer_list = {next = 0x7f74ec12ef2c, prev = 0x7f74f01b410c}, pending_list = {next = 0x7f74e0017adc, prev = 0x7f74ec154f6c}, answer = 0x7f74f01b40fc, lock_count = 1, locked = 1, locks = {{
lock = 0x7f74ec07b3bc, fop = 0x7f74ec0b491c, owner_list = {next = 0x7f74ec07b3cc, prev = 0x7f74ec07b3cc}, wait_list = {next = 0x7f74ec0b49bc, prev = 0x7f74ec0b49bc}, update = {_gf_false, _gf_true}, base = 0x0, size = 0}, {
lock = 0x0, fop = 0x0, owner_list = {next = 0x7f74ec0b49f4, prev = 0x7f74ec0b49f4}, wait_list = {next = 0x7f74ec0b4a04, prev = 0x7f74ec0b4a04}, update = {_gf_false, _gf_false}, base = 0x0, size = 0}}, first_lock = 0, lock = {
spinlock = 1, mutex = {__data = {__lock = 1, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = "\001", '\000' <repeats 38 times>, __align = 1}},
flags = 0, first = 0, mask = 63, healing = 0, remaining = 0, received = 63, good = 63, uid = 0, gid = 0, wind = 0x7f74f77baad2 <ec_wind_setattr>, handler = 0x7f74f77bae65 <ec_manager_setattr>, resume = 0x0, cbks = {
access = 0x7f750505f400 <default_setattr_cbk>, create = 0x7f750505f400 <default_setattr_cbk>, discard = 0x7f750505f400 <default_setattr_cbk>, entrylk = 0x7f750505f400 <default_setattr_cbk>,
fentrylk = 0x7f750505f400 <default_setattr_cbk>, fallocate = 0x7f750505f400 <default_setattr_cbk>, flush = 0x7f750505f400 <default_setattr_cbk>, fsync = 0x7f750505f400 <default_setattr_cbk>,
fsyncdir = 0x7f750505f400 <default_setattr_cbk>, getxattr = 0x7f750505f400 <default_setattr_cbk>, fgetxattr = 0x7f750505f400 <default_setattr_cbk>, heal = 0x7f750505f400 <default_setattr_cbk>,
fheal = 0x7f750505f400 <default_setattr_cbk>, inodelk = 0x7f750505f400 <default_setattr_cbk>, finodelk = 0x7f750505f400 <default_setattr_cbk>, link = 0x7f750505f400 <default_setattr_cbk>, lk = 0x7f750505f400 <default_setattr_cbk>,
lookup = 0x7f750505f400 <default_setattr_cbk>, mkdir = 0x7f750505f400 <default_setattr_cbk>, mknod = 0x7f750505f400 <default_setattr_cbk>, open = 0x7f750505f400 <default_setattr_cbk>, opendir = 0x7f750505f400 <default_setattr_cbk>,
readdir = 0x7f750505f400 <default_setattr_cbk>, readdirp = 0x7f750505f400 <default_setattr_cbk>, readlink = 0x7f750505f400 <default_setattr_cbk>, readv = 0x7f750505f400 <default_setattr_cbk>,
removexattr = 0x7f750505f400 <default_setattr_cbk>, fremovexattr = 0x7f750505f400 <default_setattr_cbk>, rename = 0x7f750505f400 <default_setattr_cbk>, rmdir = 0x7f750505f400 <default_setattr_cbk>,
setattr = 0x7f750505f400 <default_setattr_cbk>, fsetattr = 0x7f750505f400 <default_setattr_cbk>, setxattr = 0x7f750505f400 <default_setattr_cbk>, fsetxattr = 0x7f750505f400 <default_setattr_cbk>,
stat = 0x7f750505f400 <default_setattr_cbk>, fstat = 0x7f750505f400 <default_setattr_cbk>, statfs = 0x7f750505f400 <default_setattr_cbk>, symlink = 0x7f750505f400 <default_setattr_cbk>,
truncate = 0x7f750505f400 <default_setattr_cbk>, ftruncate = 0x7f750505f400 <default_setattr_cbk>, unlink = 0x7f750505f400 <default_setattr_cbk>, writev = 0x7f750505f400 <default_setattr_cbk>,
xattrop = 0x7f750505f400 <default_setattr_cbk>, fxattrop = 0x7f750505f400 <default_setattr_cbk>, zerofill = 0x7f750505f400 <default_setattr_cbk>, seek = 0x7f750505f400 <default_setattr_cbk>,
ipc = 0x7f750505f400 <default_setattr_cbk>}, data = 0x0, heal = 0x0, healer = {next = 0x7f74ec0b4ac4, prev = 0x7f74ec0b4ac4}, user_size = 0, head = 0, use_fd = 0, xdata = 0x7f74ec1ca3fc, dict = 0x0, int32 = 6, uint32 = 0, size = 0,
offset = 0, mode = {0, 0}, entrylk_cmd = ENTRYLK_LOCK, entrylk_type = ENTRYLK_RDLCK, xattrop_flags = GF_XATTROP_ADD_ARRAY, dev = 0, inode = 0x0, fd = 0x0, iatt = {ia_ino = 13705394905313093545,
ia_gfid = "\200\032\325\370.\310D\256\276\063X\rU\242\247\251", ia_dev = 64771, ia_type = IA_IFLNK, ia_prot = {suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000', owner = {read = 1 '\001', write = 1 '\001', exec = 1 '\001'},
group = {read = 1 '\001', write = 1 '\001', exec = 1 '\001'}, other = {read = 1 '\001', write = 1 '\001', exec = 1 '\001'}}, ia_nlink = 1, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 23, ia_blksize = 4096, ia_blocks = 4,
ia_atime = 1481113215, ia_atime_nsec = 659685788, ia_mtime = 1474704618, ia_mtime_nsec = 0, ia_ctime = 1481111709, ia_ctime_nsec = 633516169}, str = {0x0, 0x0}, loc = {{
path = 0x7f74ec2f0b80 "/linux-4.7.5/arch/arm/boot/dts/sun8i-a33-ippo-q8h-v1.2.dts", name = 0x7f74ec2f0b9f "sun8i-a33-ippo-q8h-v1.2.dts", inode = 0x7f74cc02756c, parent = 0x7f74e40085dc,
gfid = "\200\032\325\370.\310D\256\276\063X\rU\242\247\251", pargfid = "\224ϩ\346~UGZ\225\067\003\064\003J", <incomplete sequence \362>}, {path = 0x0, name = 0x0, inode = 0x0, parent = 0x0, gfid = '\000' <repeats 15 times>,
pargfid = '\000' <repeats 15 times>}}, flock = {l_type = 0, l_whence = 0, l_start = 0, l_len = 0, l_pid = 0, l_owner = {len = 0, data = '\000' <repeats 1023 times>}}, vector = 0x0, buffers = 0x0, seek = GF_SEEK_DATA}
(gdb)
The RC for this is that, when we do setattr the link file was actually a link.
However, the moment it reached to backend, it has been migrated to some other subvolume. At this point when posix_pstat get called, it is the "T" file on which stat happens. This T file is a simply regular file.
root@apandey glusterfs]#
[root@apandey glusterfs]# ll /brick/gluster/vol*/linux-4.7.5/arch/arm/boot/dts/sun8i-a33-ippo-q8h-v1.2.dts
lrwxrwxrwx. 2 root root 23 Sep 24 13:40 /brick/gluster/vol-10/linux-4.7.5/arch/arm/boot/dts/sun8i-a33-ippo-q8h-v1.2.dts -> sun8i-a33-q8-tablet.dts
lrwxrwxrwx. 2 root root 23 Sep 24 13:40 /brick/gluster/vol-11/linux-4.7.5/arch/arm/boot/dts/sun8i-a33-ippo-q8h-v1.2.dts -> sun8i-a33-q8-tablet.dts
lrwxrwxrwx. 2 root root 23 Sep 24 13:40 /brick/gluster/vol-12/linux-4.7.5/arch/arm/boot/dts/sun8i-a33-ippo-q8h-v1.2.dts -> sun8i-a33-q8-tablet.dts
---------T. 2 root root 0 Dec 7 17:54 /brick/gluster/vol-13/linux-4.7.5/arch/arm/boot/dts/sun8i-a33-ippo-q8h-v1.2.dts
---------T. 2 root root 0 Dec 7 17:54 /brick/gluster/vol-14/linux-4.7.5/arch/arm/boot/dts/sun8i-a33-ippo-q8h-v1.2.dts
---------T. 2 root root 0 Dec 7 17:54 /brick/gluster/vol-15/linux-4.7.5/arch/arm/boot/dts/sun8i-a33-ippo-q8h-v1.2.dts
---------T. 2 root root 0 Dec 7 17:54 /brick/gluster/vol-16/linux-4.7.5/arch/arm/boot/dts/sun8i-a33-ippo-q8h-v1.2.dts
---------T. 2 root root 0 Dec 7 17:54 /brick/gluster/vol-17/linux-4.7.5/arch/arm/boot/dts/sun8i-a33-ippo-q8h-v1.2.dts
---------T. 2 root root 0 Dec 7 17:54 /brick/gluster/vol-18/linux-4.7.5/arch/arm/boot/dts/sun8i-a33-ippo-q8h-v1.2.dts
lrwxrwxrwx. 2 root root 23 Sep 24 13:40 /brick/gluster/vol-7/linux-4.7.5/arch/arm/boot/dts/sun8i-a33-ippo-q8h-v1.2.dts -> sun8i-a33-q8-tablet.dts
lrwxrwxrwx. 2 root root 23 Sep 24 13:40 /brick/gluster/vol-8/linux-4.7.5/arch/arm/boot/dts/sun8i-a33-ippo-q8h-v1.2.dts -> sun8i-a33-q8-tablet.dts
lrwxrwxrwx. 2 root root 23 Sep 24 13:40 /brick/gluster/vol-9/linux-4.7.5/arch/arm/boot/dts/sun8i-a33-ippo-q8h-v1.2.dts -> sun8i-a33-q8-tablet.dts
[root@apandey glusterfs]#
[root@apandey glusterfs]#
[root@apandey glusterfs]#
[root@apandey glusterfs]# stat /brick/gluster/vol-13/linux-4.7.5/arch/arm/boot/dts/sun8i-a33-ippo-q8h-v1.2.dts
File: '/brick/gluster/vol-13/linux-4.7.5/arch/arm/boot/dts/sun8i-a33-ippo-q8h-v1.2.dts'
Size: 0 Blocks: 8 IO Block: 4096 regular empty file
Device: fd03h/64771d Inode: 4233378 Links: 2
Access: (1000/---------T) Uid: ( 0/ root) Gid: ( 0/ root)
Context: unconfined_u:object_r:default_t:s0
Access: 2016-12-07 17:54:19.488040928 +0530
Modify: 2016-12-07 17:54:19.488040928 +0530
Change: 2016-12-07 17:56:05.634760199 +0530
Birth: -
[root@apandey glusterfs]# stat /brick/gluster/vol-10/linux-4.7.5/arch/arm/boot/dts/sun8i-a33-ippo-q8h-v1.2.dts
File: '/brick/gluster/vol-10/linux-4.7.5/arch/arm/boot/dts/sun8i-a33-ippo-q8h-v1.2.dts' -> 'sun8i-a33-q8-tablet.dts'
Size: 23 Blocks: 8 IO Block: 4096 symbolic link
Device: fd03h/64771d Inode: 4208551 Links: 2
Access: (0777/lrwxrwxrwx) Uid: ( 0/ root) Gid: ( 0/ root)
Context: unconfined_u:object_r:default_t:s0
Access: 2016-12-07 17:56:09.681749496 +0530
Modify: 2016-09-24 13:40:18.000000000 +0530
Change: 2016-12-07 17:25:09.628516181 +0530
This is the log I placed in posix -
[2016-12-07 12:24:19.494117] E [MSGID: 113018] [posix.c:506:posix_setattr] 0-vol-posix: Getting REG ofr LINK : /brick/gluster/vol-18/.glusterfs/80/1a/801ad5f8-2ec8-44ae-be33-580d55a2a7a9 statpost.ia_type = 1statpre.ia_type = 1 LOC_IS_LNK(loc) = 0
------------------------
Now, when the call back reaches to EC, the condition
if (cbk->iatt[0].ia_type == IA_IFREG) {
becomes true and it checks for the size, which it will not have.
That leads to assertion failure.
I think we have to handle this case in EC. I talked to Prashant and he was saying that had it been given to dht, it would have handled in second phase of setattr.
upstream mainline patch http://review.gluster.org/16103 posted for review. I tried the same scenario with "--enable-debug" flag OFF to not to throw any assert.
I added bricks and started re-balance. The re-balance "completed" without fail and any crash.
I confirmed that files and links were migrated to newly added bricks too.
I also tried rename of 100 files from mount point and did not see any core.
I think this bug does not cause a functionality break for now.
volume rebalance: vol: success
[root@apandey glusterfs]# gluster v rebalance vol status
Node Rebalanced-files size scanned failures skipped status run time in h:m:s
--------- ----------- ----------- ----------- ----------- ----------- ------------ --------------
localhost 891 3.9KB 3169 0 681 completed 0:0:28
volume rebalance: vol: success
Reproducible on 3.3.0 as well. From rebalance ogs during 3.3.0 Scale tests : [root@gqas007 glusterfs]# grep -R "Assertion failed" butcher-rebalance.log:[2017-07-19 14:32:00.813871] E [ec-inode-write.c:395:ec_manager_setattr] (-->/usr/lib64/glusterfs/3.8.4/xlator/cluster/disperse.so(+0xdb98) [0x7fc5e97f5b98] -->/usr/lib64/glusterfs/3.8.4/xlator/cluster/disperse.so(+0xd9bb) [0x7fc5e97f59bb] -->/usr/lib64/glusterfs/3.8.4/xlator/cluster/disperse.so(+0x2a063) [0x7fc5e9812063] ) 0-: Assertion failed: ec_get_inode_size(fop, fop->locks[0].lock->loc.inode, &cbk->iatt[0].ia_size) butcher-rebalance.log:[2017-07-19 14:34:50.753306] E [ec-inode-write.c:395:ec_manager_setattr] (-->/usr/lib64/glusterfs/3.8.4/xlator/cluster/disperse.so(+0xdb98) [0x7fc5e97f5b98] -->/usr/lib64/glusterfs/3.8.4/xlator/cluster/disperse.so(+0xd9bb) [0x7fc5e97f59bb] -->/usr/lib64/glusterfs/3.8.4/xlator/cluster/disperse.so(+0x2a063) [0x7fc5e9812063] ) 0-: Assertion failed: ec_get_inode_size(fop, fop->locks[0].lock->loc.inode, &cbk->iatt[0].ia_size) butcher-rebalance.log:[2017-07-19 14:35:10.113387] E [ec-inode-write.c:395:ec_manager_setattr] (-->/usr/lib64/glusterfs/3.8.4/xlator/cluster/disperse.so(+0xdb98) [0x7fc5e97f5b98] -->/usr/lib64/glusterfs/3.8.4/xlator/cluster/disperse.so(+0xd9bb) [0x7fc5e97f59bb] -->/usr/lib64/glusterfs/3.8.4/xlator/cluster/disperse.so(+0x2a063) [0x7fc5e9812063] ) 0-: Assertion failed: ec_get_inode_size(fop, fop->locks[0].lock->loc.inode, &cbk->iatt[0].ia_size) [root@gqas007 glusterfs]# sosreports will be updated. *** Bug 1566417 has been marked as a duplicate of this bug. *** Status? Tested with RHGS 3.5.0 interim build ( glusterfs-6.0-15.el7rhgs ) and I see the same error messages in excess when there is active I/O during rebalance of the distributed-disperse volume [2019-10-10 12:42:50.855875] E [ec-inode-write.c:483:ec_manager_setattr] (-->/usr/lib64/glusterfs/6.0/xlator/cluster/disperse.so(+0x12378) [0x7fbdb18e7378] -->/usr/lib64/glusterfs/6.0/xlator/cluster/disperse.so(+0x1219b) [0x7fbdb18e719b] -->/usr/lib64/glusterfs/6.0/xlator/cluster/disperse.so(+0x32273) [0x7fbdb1907273] ) 0-: Assertion failed: ec_get_inode_size(fop, fop->locks[0].lock->loc.inode, &cbk->iatt[0].ia_size) [2019-10-10 12:42:50.868382] E [ec-inode-write.c:483:ec_manager_setattr] (-->/usr/lib64/glusterfs/6.0/xlator/cluster/disperse.so(+0x12378) [0x7fbdb18e7378] -->/usr/lib64/glusterfs/6.0/xlator/cluster/disperse.so(+0x1219b) [0x7fbdb18e719b] -->/usr/lib64/glusterfs/6.0/xlator/cluster/disperse.so(+0x32273) [0x7fbdb1907273] ) 0-: Assertion failed: ec_get_inode_size(fop, fop->locks[0].lock->loc.inode, &cbk->iatt[0].ia_size) [root@ ~]# grep ' E ' /var/log/glusterfs/backupvol-rebalance.log-20191011 | grep -i 'assertion failed' | wc -l 186 |