Bug 1343695

Summary: [Disperse] : Assertion Failed Error messages in rebalance log post add-brick/rebalance.
Product: Red Hat Gluster Storage
Component: disperse
Version: rhgs-3.1
Status: CLOSED ERRATA
Severity: high
Priority: unspecified
Reporter: Ambarish <asoman>
Assignee: Pranith Kumar K <pkarampu>
QA Contact: Ambarish <asoman>
Docs Contact:
CC: amukherj, aspandey, pkarampu, rcyriac, rhinduja, rhs-bugs
Target Milestone: ---
Target Release: RHGS 3.2.0
Hardware: x86_64
OS: Linux
Fixed In Version: glusterfs-3.8.4-1
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-03-23 05:35:12 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Bug Depends On: 1343906
Bug Blocks: 1351522

Description Ambarish 2016-06-07 17:28:48 UTC
Description of problem:

Started with a 1*(4+2) disperse volume. Added bricks. Rebalanced.

The rebalance log had the following setattr assertion errors:

[2016-06-06 15:10:53.999653] E [ec-inode-write.c:395:ec_manager_setattr] (-->/usr/lib64/glusterfs/3.7.9/xlator/cluster/disperse.so(ec_resume+0x91) [0x7fd263b5e621] -->/usr/lib64/glusterfs/3.7.9/xlator/cluster/disperse.so(__ec_manager+0x57) [0x7fd263b5e807] -->/usr/lib64/glusterfs/3.7.9/xlator/cluster/disperse.so(ec_manager_setattr+0x2c6) [0x7fd263b7be76] ) 0-: Assertion failed: ec_get_inode_size(fop, fop->locks[0].lock->loc.inode, &cbk->iatt[0].ia_size)
[2016-06-06 15:10:54.003509] E [ec-inode-write.c:395:ec_manager_setattr] (-->/usr/lib64/glusterfs/3.7.9/xlator/cluster/disperse.so(ec_resume+0x91) [0x7fd263b5e621] -->/usr/lib64/glusterfs/3.7.9/xlator/cluster/disperse.so(__ec_manager+0x57) [0x7fd263b5e807] -->/usr/lib64/glusterfs/3.7.9/xlator/cluster/disperse.so(ec_manager_setattr+0x2c6) [0x7fd263b7be76] ) 0-: Assertion failed: ec_get_inode_size(fop, fop->locks[0].lock->loc.inode, &cbk->iatt[0].ia_size)
[2016-06-06 15:10:54.012540] E [ec-inode-write.c:395:ec_manager_setattr] (-->/usr/lib64/glusterfs/3.7.9/xlator/cluster/disperse.so(ec_resume+0x91) [0x7fd263b5e621] -->/usr/lib64/glusterfs/3.7.9/xlator/cluster/disperse.so(__ec_manager+0x57) [0x7fd263b5e807] -->/usr/lib64/glusterfs/3.7.9/xlator/cluster/disperse.so(ec_manager_setattr+0x2c6) [0x7fd263b7be76] ) 0-: Assertion failed: ec_get_inode_size(fop, fop->locks[0].lock->loc.inode, &cbk->iatt[0].ia_size)
[2016-06-06 15:13:15.333800] E [MSGID: 109023] 

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:

1. Create a 1*(4+2) EC volume. Mount it via gNFS.

2. Add bricks and rebalance. Run I/O from various mounts while this happens.

3. Check the rebalance logs periodically.
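
For step 3, a minimal sketch of a log scan, assuming Python 3 is available on the node and that the log lines follow the format quoted above. The function names (find_assertions, summarize, scan_file) are hypothetical helpers written for this illustration, not part of GlusterFS, and the log path mentioned in the comment is an assumption that may differ per installation.

```python
import re
from collections import Counter

# Header of a log line in the format quoted in this report:
# [timestamp] LEVEL [source-file:line:function] ...
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"\[(?P<src>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]"
)

MARKER = "Assertion failed:"


def find_assertions(lines):
    """Return (timestamp, function, asserted expression) for each assertion line."""
    hits = []
    for raw in lines:
        if MARKER not in raw:
            continue
        match = LINE_RE.match(raw)
        if match is None:
            # Line mentions an assertion but does not have the expected header.
            continue
        expression = raw.split(MARKER, 1)[1].strip()
        hits.append((match["ts"], match["func"], expression))
    return hits


def summarize(hits):
    """Count assertion failures per reporting function."""
    return Counter(func for _, func, _ in hits)


def scan_file(path):
    """Scan one log file; the path is supplied by the caller.

    Assumption: the rebalance log usually lives under /var/log/glusterfs/
    with a name like <volname>-rebalance.log.
    """
    with open(path, errors="replace") as handle:
        return find_assertions(handle)
```

Calling summarize(scan_file(path)) at intervals during the rebalance gives a per-function count, so a growing number for ec_manager_setattr shows the issue is being hit.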

Actual results:

Assertion Failure error message in logs

Expected results:

Add-brick/rebal should succeed without any problems

Additional info:
Ashish raised a bug for the same issue (tracked via https://bugzilla.redhat.com/show_bug.cgi?id=1339465), which was later marked as a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1330997. The fix version given in that BZ is 3.7.9-7. I am able to hit this issue on 3.7.9-8 as well.

Comment 5 Atin Mukherjee 2016-08-30 05:00:31 UTC
The fix, http://review.gluster.org/15008, has made it into the release-3.8 branch in gluster upstream, and it should be available in rhgs-3.2.0 as part of the rebase.

Comment 6 Atin Mukherjee 2016-09-17 13:39:18 UTC
Upstream mainline : http://review.gluster.org/14669
Upstream 3.8 : http://review.gluster.org/15008

The fix is available in rhgs-3.2.0 as part of the rebase to GlusterFS 3.8.4.

Comment 9 Ambarish 2016-10-24 08:57:22 UTC
Verified on 3.8.4-2.

Did a couple of add-brick + rebalance and remove-brick runs with continuous I/O from different mounts over gNFS.
Could not reproduce the reported issue.

Comment 11 errata-xmlrpc 2017-03-23 05:35:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.