1423406 – Need to improve remove-brick failure message when the brick process is down.

Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	GlusterFS
Classification:	Community
Component:	glusterd
Version:	3.10
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	high
Target Milestone:	---
Assignee:	Gaurav Yadav
Depends On:	1339054 1422624 1438325

Reported:	2017-02-17 09:22 UTC by Gaurav Yadav
Modified:	2017-04-03 05:14 UTC
CC List:	6 users
Fixed In Version:	glusterfs-3.10.0
Clone Of:	1422624
Last Closed:	2017-03-06 17:46:38 UTC
Regression:	---
Mount Type:	---
Documentation:	---

Description Gaurav Yadav 2017-02-17 09:22:04 UTC

+++ This bug was initially created as a clone of Bug #1422624 +++

+++ This bug was initially created as a clone of Bug #1339054 +++

Description of problem:
======================
If we try to remove an offline brick, the operation fails with the error message "volume remove-brick start: failed: Found stopped brick <hostname>:/bricks/brick1/a1". This check was newly added in the 3.7.9-6 build.

Currently the force option has to be used to remove an offline brick, so the failure message should tell the user to use the force option.

With this, users will know how to remove the offline brick.


Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.7.9-6


How reproducible:
=================
Always


Steps to Reproduce:
===================
1. Create a simple volume of any type and start it
2. Kill one of the volume's brick processes
3. Try to remove the killed brick (offline brick)
4. Check the remove-brick failure message // the message does not convey how to remove the brick.
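
The steps above can be sketched as a list of gluster CLI invocations. This is only an illustrative dry run, not part of the original report: the helper name reproduction_commands, the host name host1, and the second brick path a2 are assumptions, and a given volume type may need additional arguments at create time. Step 2 is not automated, because the brick PID has to be read from the status output on the node that hosts the brick.

```python
import shlex


def reproduction_commands(volume, bricks, victim):
    """Build the CLI invocations for the reproduction steps as argument lists.

    Hypothetical helper for illustration only. Nothing is executed here;
    the caller decides whether to print or run the commands.
    """
    if victim not in bricks:
        raise ValueError("victim must be one of the volume's bricks")
    return [
        # Step 1: create and start a simple volume.
        ["gluster", "volume", "create", volume, *bricks],
        ["gluster", "volume", "start", volume],
        # Step 2: look up the brick PID in the status output, then kill
        # that process by hand on the node hosting the brick.
        ["gluster", "volume", "status", volume],
        # Step 3: try to remove the now-offline brick.
        ["gluster", "volume", "remove-brick", volume, victim, "start"],
    ]


if __name__ == "__main__":
    # host1 and the a2 brick path are placeholders, not from the report.
    bricks = ["host1:/bricks/brick1/a1", "host1:/bricks/brick1/a2"]
    for argv in reproduction_commands("Dis", bricks, bricks[0]):
        print(shlex.join(argv))
```

The last command is the one expected to fail with the "Found stopped brick" message once the brick process has been killed.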


Failure message received:
=========================
# gluster volume remove-brick Dis <hostname>:/bricks/brick1/a1 start
volume remove-brick start: failed: Found stopped brick <hostname>:/bricks/brick1/a1


Actual results:
================
The failure message does not say how to remove the offline brick.

Expected results:
=================
The failure message needs to include a hint about using the force option to remove the offline brick.
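
As a rough illustration of the requested behavior, the sketch below builds a failure message that keeps the existing "Found stopped brick" text and appends a hint about the force option. It is a model only: the real check lives in glusterd's C code, the function name remove_brick_failure_message is hypothetical, and the exact hint wording is an assumption rather than the text of the committed fix.

```python
def remove_brick_failure_message(brick, online):
    """Return the failure text for an offline brick, or None if it is online.

    Hypothetical model of the expected result; the hint wording is assumed.
    """
    if online:
        # An online brick does not trigger the stopped-brick failure.
        return None
    return (
        "volume remove-brick start: failed: "
        f"Found stopped brick {brick}. "
        "Use force option to remove the offline brick"
    )


if __name__ == "__main__":
    print(remove_brick_failure_message("<hostname>:/bricks/brick1/a1", online=False))
```

Keeping the original prefix unchanged means anything that matches on "Found stopped brick" would continue to work, while the user now sees what to do next.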

Additional info:

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-05-24 00:35:31 EDT ---

This bug is automatically being proposed for the current z-stream release of Red Hat Gluster Storage 3 by setting the release flag 'rhgs-3.1.z' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-07-04 04:53:19 EDT ---

This bug is automatically being proposed for the current z-stream release of Red Hat Gluster Storage 3 by setting the release flag 'rhgs-3.1.z' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from Rejy M Cyriac on 2016-09-17 12:33:32 EDT ---

Moving BZ to a transitional component in preparation for removing the 'glusterd' sub-component at the 'glusterd' component

--- Additional comment from Rejy M Cyriac on 2016-09-17 12:46:25 EDT ---

Moving BZ back to the 'glusterd' component after removal of the 'glusterd' sub-component

--- Additional comment from Worker Ant on 2017-02-16 02:47:07 EST ---

REVIEW: https://review.gluster.org/16630 (glusterd : Fix for error message while removing brick) posted (#1) for review on master by Gaurav Yadav (gyadav)

--- Additional comment from Worker Ant on 2017-02-16 03:49:53 EST ---

REVIEW: https://review.gluster.org/16630 (glusterd : Fix for error message while removing brick) posted (#2) for review on master by Gaurav Yadav (gyadav)

Comment 1 Worker Ant 2017-02-17 09:31:39 UTC

REVIEW: https://review.gluster.org/16645 (glusterd : Fix for error mesage while detaching peers) posted (#2) for review on release-3.10 by Gaurav Yadav (gyadav)

Comment 2 Worker Ant 2017-02-17 15:18:53 UTC

COMMIT: https://review.gluster.org/16645 committed in release-3.10 by Shyamsundar Ranganathan (srangana) 
------
commit 324268b61a197389012304ee4223629965c0261c
Author: Gaurav Yadav <gyadav>
Date:   Mon Feb 13 15:46:24 2017 +0530

    glusterd : Fix for error mesage while detaching peers
    
    When peer is detached from a cluster, an error log is being
    generated in glusterd.log -"Failed to reconfigure all daemon
    services". This log is seen in the originator node where the
    detach is issued.
    
    This happens in two cases.
    Case 1: Detach peer with no volume been created in cluster.
    Case 2: Detach peer after deleting all the volumes which were
    created but never started.
    In any one of the above two cases, in glusterd_check_files_identical()
    GlusterD fails to retrieve nfs-server.vol file from /var/lib/glusterd/nfs
    which gets created only when a volume is in place and is started.
    
    With this fix both the above cases have been handled by added
    validation to skip reconfigure if there is no volume in started
    state.
    
    > Reviewed-on: https://review.gluster.org/16607
    > Smoke: Gluster Build System <jenkins.org>
    > Tested-by: Atin Mukherjee <amukherj>
    > NetBSD-regression: NetBSD Build System <jenkins.org>
    > CentOS-regression: Gluster Build System <jenkins.org>
    > Reviewed-by: Atin Mukherjee <amukherj>
    
    (cherry picked from commit be44a1bd519af69b21acf682b0908d4d695f868e)
    
    Change-Id: I039c0840e3d61ab54575e1e00c6a6a00874d84c0
    BUG: 1423406
    Signed-off-by: Gaurav Yadav <gyadav>
    Reviewed-on: https://review.gluster.org/16645
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Samikshan Bairagya <samikshan>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Atin Mukherjee <amukherj>

Comment 3 Shyamsundar 2017-03-06 17:46:38 UTC

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.0, please open a new bug report.

glusterfs-3.10.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html
[2] https://www.gluster.org/pipermail/gluster-users/
