Description of problem:
A brick kill followed by a replace-brick operation shows incorrect brick status in RHGS Web Administration. Here the brick that was killed earlier has been replaced by a new brick.

Version-Release number of selected component (if applicable):
# rpm -qa | grep tendrl
tendrl-collectd-selinux-1.5.4-1.el7rhgs.noarch
tendrl-gluster-integration-1.5.4-6.el7rhgs.noarch
tendrl-node-agent-1.5.4-8.el7rhgs.noarch
tendrl-commons-1.5.4-5.el7rhgs.noarch
tendrl-selinux-1.5.4-1.el7rhgs.noarch

How reproducible:

Steps to Reproduce:
1. Create a 4x3 Distributed-Replicate volume.
2. Kill a brick in that volume and wait for Web Admin to reflect the correct status; it shows that particular brick as down in the Web Admin (tendrl) UI. (A scripted sketch of steps 2-4 is included below, after the status output.)
3. Replace the killed brick with another brick:
# gluster v replace-brick ManiVol dhcp42-129.lab.eng.blr.redhat.com:/gluster/brick9/ms3 dhcp42-129.lab.eng.blr.redhat.com:/gluster/brick10/new commit force
volume replace-brick: success: replace-brick commit force operation successful
4. Check the brick status on RHGS WA.

Actual results:
After the replace-brick is performed, the killed/replaced brick is still shown in "Brick Status" in red. The new brick is also reflected in "Brick Status". The Total, Up, and Down counts in the "Bricks" layout show the correct information, i.e. 12 bricks in total.

Expected results:
After the replace-brick is performed, the killed/replaced brick should no longer be shown in "Brick Status".

Additional info:

------------
Volume status post brick kill (# kill -9 4860)
------------
# gluster v status ManiVol
Status of volume: ManiVol
Gluster process                                               TCP Port  RDMA Port  Online  Pid
-----------------------------------------------------------------------------------------------
Brick dhcp42-119.lab.eng.blr.redhat.com:/gluster/brick7/ms1   49161     0          Y       29591
Brick dhcp42-127.lab.eng.blr.redhat.com:/gluster/brick7/ms1   49160     0          Y       7374
Brick dhcp42-129.lab.eng.blr.redhat.com:/gluster/brick7/msn1  49160     0          Y       7050
Brick dhcp42-129.lab.eng.blr.redhat.com:/gluster/brick8/ms2   49161     0          Y       4839
Brick dhcp42-119.lab.eng.blr.redhat.com:/gluster/brick8/ms2   49162     0          Y       29610
Brick dhcp42-125.lab.eng.blr.redhat.com:/gluster/brick7/msn1  49160     0          Y       27715
Brick dhcp42-125.lab.eng.blr.redhat.com:/gluster/brick8/ms2   49161     0          Y       27817
Brick dhcp42-129.lab.eng.blr.redhat.com:/gluster/brick9/ms3   N/A       N/A        N       N/A
Brick dhcp42-127.lab.eng.blr.redhat.com:/gluster/brick8/msn2  49161     0          Y       8328
Brick dhcp42-127.lab.eng.blr.redhat.com:/gluster/brick9/ms3   49162     0          Y       7413
Brick dhcp42-125.lab.eng.blr.redhat.com:/gluster/brick9/ms3   49162     0          Y       27839
Brick dhcp42-119.lab.eng.blr.redhat.com:/gluster/brick9/msn3  49163     0          Y       32325
NFS Server on localhost                                       2049      0          Y       27264
Self-heal Daemon on localhost                                 N/A       N/A        Y       7215
NFS Server on dhcp42-119.lab.eng.blr.redhat.com               2049      0          Y       20728
Self-heal Daemon on dhcp42-119.lab.eng.blr.redhat.com         N/A       N/A        Y       32498
NFS Server on dhcp42-125.lab.eng.blr.redhat.com               2049      0          Y       16362
Self-heal Daemon on dhcp42-125.lab.eng.blr.redhat.com         N/A       N/A        Y       27906
NFS Server on dhcp42-127.lab.eng.blr.redhat.com               2049      0          Y       28639
Self-heal Daemon on dhcp42-127.lab.eng.blr.redhat.com         N/A       N/A        Y       8453

Task Status of Volume ManiVol
------------------------------------------------------------------------------
There are no active volume tasks

----------
Post replace-brick volume status
----------
# gluster v status ManiVol
Status of volume: ManiVol
Gluster process                                               TCP Port  RDMA Port  Online  Pid
-----------------------------------------------------------------------------------------------
Brick dhcp42-119.lab.eng.blr.redhat.com:/gluster/brick7/ms1   49161     0          Y       29591
Brick dhcp42-127.lab.eng.blr.redhat.com:/gluster/brick7/ms1   49160     0          Y       7374
Brick dhcp42-129.lab.eng.blr.redhat.com:/gluster/brick7/msn1  49160     0          Y       7050
Brick dhcp42-129.lab.eng.blr.redhat.com:/gluster/brick8/ms2   49161     0          Y       4839
Brick dhcp42-119.lab.eng.blr.redhat.com:/gluster/brick8/ms2   49162     0          Y       29610
Brick dhcp42-125.lab.eng.blr.redhat.com:/gluster/brick7/msn1  49160     0          Y       27715
Brick dhcp42-125.lab.eng.blr.redhat.com:/gluster/brick8/ms2   49161     0          Y       27817
Brick dhcp42-129.lab.eng.blr.redhat.com:/gluster/brick10/new  49164     0          Y       16649
Brick dhcp42-127.lab.eng.blr.redhat.com:/gluster/brick8/msn2  49161     0          Y       8328
Brick dhcp42-127.lab.eng.blr.redhat.com:/gluster/brick9/ms3   49162     0          Y       7413
Brick dhcp42-125.lab.eng.blr.redhat.com:/gluster/brick9/ms3   49162     0          Y       27839
Brick dhcp42-119.lab.eng.blr.redhat.com:/gluster/brick9/msn3  49163     0          Y       32325
NFS Server on localhost                                       2049      0          Y       9869
Self-heal Daemon on localhost                                 N/A       N/A        Y       9881
NFS Server on dhcp42-129.lab.eng.blr.redhat.com               2049      0          Y       16658
Self-heal Daemon on dhcp42-129.lab.eng.blr.redhat.com         N/A       N/A        Y       16668
NFS Server on dhcp42-125.lab.eng.blr.redhat.com               2049      0          Y       5713
Self-heal Daemon on dhcp42-125.lab.eng.blr.redhat.com         N/A       N/A        Y       5721
NFS Server on dhcp42-127.lab.eng.blr.redhat.com               2049      0          Y       17428
Self-heal Daemon on dhcp42-127.lab.eng.blr.redhat.com         N/A       N/A        Y       17439

Task Status of Volume ManiVol
------------------------------------------------------------------------------
There are no active volume tasks
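For reference, a minimal shell sketch of steps 2-4 under the same assumptions as this report (volume name ManiVol, the brick paths above, and brick PID 4860 read manually from the Pid column). It only checks the gluster layer; the "Brick Status" panel still has to be checked in the Web Admin UI:

VOL=ManiVol
OLD=dhcp42-129.lab.eng.blr.redhat.com:/gluster/brick9/ms3
NEW=dhcp42-129.lab.eng.blr.redhat.com:/gluster/brick10/new

# Step 2: note the brick PID in the Pid column and kill it to take the brick down.
gluster volume status "$VOL"
kill -9 4860

# Step 3: once Web Admin shows the brick as down, replace it with the new brick.
gluster volume replace-brick "$VOL" "$OLD" "$NEW" commit force

# Step 4: confirm the old brick is gone from the volume at the gluster layer,
# then compare against the "Brick Status" panel in Web Admin.
gluster volume info "$VOL" | grep -i brick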
Created attachment 1360856 [details]
Status after the killed brick has been replaced by a new brick

This status is from 20-25 minutes after the replace-brick operation.
I have been unable to reproduce this; it seems fixed. I used the commands you provided. For a while the health of the volume was `Unknown`, but after a few seconds the status changed to `Up`, the new brick was correctly shown in the brick list on the `Volumes` dashboard, and it is listed in the navigation for the `Bricks` dashboard.

msaini, do you still see the issue?

Tested with:
tendrl-commons-1.5.4-9.el7rhgs.noarch
tendrl-api-1.5.4-4.el7rhgs.noarch
tendrl-monitoring-integration-1.5.4-14.el7rhgs.noarch
tendrl-ansible-1.5.4-7.el7rhgs.noarch
tendrl-node-agent-1.5.4-16.el7rhgs.noarch
tendrl-ui-1.5.4-6.el7rhgs.noarch
tendrl-grafana-plugins-1.5.4-14.el7rhgs.noarch
tendrl-notifier-1.5.4-6.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
tendrl-api-httpd-1.5.4-4.el7rhgs.noarch
tendrl-grafana-selinux-1.5.4-2.el7rhgs.noarch
tendrl-gluster-integration-1.5.4-14.el7rhgs.noarch
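For anyone re-testing, one way to wait for the replacement brick to come up at the gluster layer before judging the dashboards; a minimal sketch, assuming the volume name and brick path from this report (it does not query Web Admin itself):

VOL=ManiVol
NEW=dhcp42-129.lab.eng.blr.redhat.com:/gluster/brick10/new

# Poll the detailed status of the new brick until it reports Online : Y.
until gluster volume status "$VOL" "$NEW" detail | grep -q '^Online.*Y'; do
    sleep 10
done
echo "New brick is online at the gluster layer; now check Brick Status in the Volumes/Bricks dashboards."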
Since this bug is no longer seen, moving it to ON_QA.
Looks OK. `Brick Status` in the `Host` and `Volume` dashboards shows the correct bricks after brick replacement, and navigation in the `Brick` dashboard looks OK too. --> VERIFIED

Tested with:
tendrl-ansible-1.6.3-3.el7rhgs.noarch
tendrl-api-1.6.3-3.el7rhgs.noarch
tendrl-api-httpd-1.6.3-3.el7rhgs.noarch
tendrl-commons-1.6.3-4.el7rhgs.noarch
tendrl-grafana-plugins-1.6.3-2.el7rhgs.noarch
tendrl-grafana-selinux-1.5.4-2.el7rhgs.noarch
tendrl-gluster-integration-1.6.3-2.el7rhgs.noarch
tendrl-monitoring-integration-1.6.3-2.el7rhgs.noarch
tendrl-node-agent-1.6.3-4.el7rhgs.noarch
tendrl-notifier-1.6.3-2.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
tendrl-ui-1.6.3-1.el7rhgs.noarch
glusterfs-3.12.2-9.el7rhgs.x86_64
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2616