Bug 1223634

Summary: glusterd could crash in remove-brick-status when local remove-brick process has just completed
Product: [Community] GlusterFS Reporter: krishnan parthasarathi <kparthas>
Component: glusterd Assignee: krishnan parthasarathi <kparthas>
Status: CLOSED DUPLICATE QA Contact:
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 3.7.0 CC: amukherj, bugs, gluster-bugs, ndevos, nsathyan
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.7.1 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1223338 Environment:
Last Closed: 2015-07-06 14:00:31 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1223338, 1225318, 1227168    
Bug Blocks: 1233025    

Description krishnan parthasarathi 2015-05-21 05:35:02 UTC
+++ This bug was initially created as a clone of Bug #1223338 +++

Description of problem:
The glusterd process could crash while executing the remove-brick-status command around the time the local remove-brick process (i.e., the rebalance process) finishes migrating data.

Version-Release number of selected component (if applicable):
mainline

How reproducible:
Intermittent

Steps to Reproduce:
1. Create and start a volume.
2. Add files/directories as required.
3. Remove one or more bricks using the remove-brick-start command.
4. Issue the remove-brick-status command around the time the local remove-brick
   process completes.
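
For anyone who wants to script the steps above, here is a rough sketch in Python. It is not taken from the regression test. The volume name, brick paths, and the helper names repro_commands and hammer_status are all hypothetical. Step 2 (adding files) is left out because it needs a client mount. Because the crash depends on timing, the sketch simply issues the status command repeatedly, and nothing guarantees it will hit the window.

```python
import subprocess


def repro_commands(volume, bricks, victim):
    """Return the gluster CLI invocations for steps 1, 3 and 4, in order.

    All arguments are placeholders chosen by the caller (hypothetical
    names); step 2 (populating the volume) is not covered here.
    """
    return [
        ["gluster", "volume", "create", volume, *bricks],
        ["gluster", "volume", "start", volume],
        ["gluster", "volume", "remove-brick", volume, victim, "start"],
        ["gluster", "volume", "remove-brick", volume, victim, "status"],
    ]


def hammer_status(volume, victim, attempts=100, run=subprocess.run):
    """Issue remove-brick status repeatedly and collect the exit codes.

    Repeating the call only raises the chance of landing near the moment
    the local rebalance process finishes; it cannot force the race.
    """
    status_cmd = ["gluster", "volume", "remove-brick", volume, victim, "status"]
    codes = []
    for _ in range(attempts):
        result = run(status_cmd, capture_output=True, text=True)
        codes.append(result.returncode)
    return codes
```

A caller would run the first three commands from repro_commands() with subprocess.run, populate the volume through a mount, and then call hammer_status() while checking separately whether glusterd is still alive.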

Actual results:
glusterd process crashes.

Expected results:
glusterd shouldn't crash. It would be helpful if the remove-brick-status command failed with a message saying that the rebalance process may have just finished
migrating data from the bricks being removed.

Additional info:
The steps above are representative of when the issue can be seen, but they are not much help if you wish to automate this. The following link leads to the regression test, part of the GlusterFS regression test suite, that has hit this problem more often. It could help those interested in automation.

https://github.com/gluster/glusterfs/blob/master/tests/bugs/glusterd/bug-974007.t

Comment 1 Niels de Vos 2015-06-02 08:20:20 UTC
The required changes to fix this bug have not made it into glusterfs-3.7.1. This bug is now getting tracked for glusterfs-3.7.2.

Comment 2 Atin Mukherjee 2015-06-19 05:18:12 UTC
http://review.gluster.org/#/c/10932/ has been merged

Comment 3 Niels de Vos 2015-06-20 10:08:23 UTC
Unfortunately glusterfs-3.7.2 did not contain a code change that was associated with this bug report. This bug is now proposed to be a blocker for glusterfs-3.7.3.

Comment 4 Atin Mukherjee 2015-06-22 04:46:28 UTC
Niels,

I think this bug has already been fixed in 3.7.1. Somehow the release notes didn't capture it, not sure why?

Comment 5 Niels de Vos 2015-07-06 14:00:31 UTC
(In reply to Atin Mukherjee from comment #4)
> Niels,
> 
> I think this bug has already been fixed in 3.7.1. Somehow the release notes
> didn't capture it, not sure why?

Because the BUG: tag in the patch refers to bug 1225318 and not this one.

*** This bug has been marked as a duplicate of bug 1225318 ***