Bug 1548829 - [BMux] : Stale brick processes on the nodes after vol deletion.
Summary: [BMux] : Stale brick processes on the nodes after vol deletion.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: core
Version: rhgs-3.4
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHGS 3.4.0
Assignee: Mohit Agrawal
QA Contact: Manisha Saini
URL:
Whiteboard: brick-multiplexing
Duplicates: 1563640 (view as bug list)
Depends On:
Blocks: 1503137 1549023 1549996
 
Reported: 2018-02-25 11:03 UTC by Ambarish
Modified: 2018-09-24 13:01 UTC (History)
CC List: 8 users

Fixed In Version: glusterfs-3.12.2-8
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1549996 (view as bug list)
Environment:
Last Closed: 2018-09-04 06:42:45 UTC


Attachments (Terms of Use)


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2018:2607 None None None 2018-09-04 06:43:52 UTC

Description Ambarish 2018-02-25 11:03:41 UTC
Description of problem:
------------------------

Create a multi-brick EC volume.

Start it.

Delete it.

Check for running glusterfsd on all the nodes in the Trusted Storage Pool.

Nodes will still have stale glusterfsd processes after the volume stop/delete; these should ideally be cleaned up when a volume is deleted.


Example :

I created two volumes, drogon and drogon2, and then deleted them.

<snip>

[root@gqas007 ~]# gluster v list
No volumes present in cluster
[root@gqas007 ~]# 

[root@gqas007 ~]# 
[root@gqas007 ~]# ps -ef|grep fsd
root     21148     1  0 05:43 ?        00:00:01 /usr/sbin/glusterfsd -s gqas007 --volfile-id drogon.gqas007.bricks1-A1 -p /var/run/gluster/vols/drogon/gqas007-bricks1-A1.pid -S /var/run/gluster/e5102c357100a19ec60edec10a566e61.socket --brick-name /bricks1/A1 -l /var/log/glusterfs/bricks/bricks1-A1.log --xlator-option *-posix.glusterd-uuid=e72fdebf-3130-4d05-8cf5-966f4c4926c4 --brick-port 49152 --xlator-option drogon-server.listen-port=49152
root     21639     1  0 05:55 ?        00:00:01 /usr/sbin/glusterfsd -s gqas007 --volfile-id drogon2.gqas007.bricks1-A1 -p /var/run/gluster/vols/drogon2/gqas007-bricks1-A1.pid -S /var/run/gluster/2bd4d8669b0cdb67f9e15f99776d1e36.socket --brick-name /bricks1/A1 -l /var/log/glusterfs/bricks/bricks1-A1.log --xlator-option *-posix.glusterd-uuid=e72fdebf-3130-4d05-8cf5-966f4c4926c4 --brick-port 49153 --xlator-option drogon2-server.listen-port=49153
root     22042 19908  0 06:00 pts/0    00:00:00 grep --color=auto fsd
[root@gqas007 ~]# 

</snip>


I suspect brick multiplexing to be the problem, and hence am raising it against glusterd. Feel free to change the component if that's not the case.


Version-Release number of selected component (if applicable):
--------------------------------------------------------------

glusterfs-3.12.2-4.el7rhgs.x86_64

How reproducible:
-----------------

2/2, on the same machines.

Steps to Reproduce:
--------------------

As in the description.
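The steps above can be sketched as a check for leftover brick processes. This is a minimal illustration, not part of the original report: the helper `volname_from_cmdline` is a hypothetical name, and it assumes the glusterfsd command-line format shown in the ps output above (volume name is the first dot-separated component of the `--volfile-id` argument).

```shell
# Extract the volume name from a glusterfsd command line, e.g.
# "--volfile-id drogon.gqas007.bricks1-A1" -> "drogon".
volname_from_cmdline() {
    printf '%s\n' "$1" | sed -n 's/.*--volfile-id \([^. ]*\)\..*/\1/p'
}

# Sketch of the check on a node (requires the gluster CLI; not run here):
#   gluster v list > /tmp/vols
#   ps -eo args | grep '[g]lusterfsd' | while read -r cmd; do
#       vol=$(volname_from_cmdline "$cmd")
#       grep -qx "$vol" /tmp/vols || echo "stale brick process for volume: $vol"
#   done
```

After a volume delete, any glusterfsd whose volume is absent from `gluster v list` is stale, which is exactly the state shown in the ps output above for drogon and drogon2.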

Comment 2 Ambarish 2018-02-25 11:08:56 UTC
For some reason I see "brick not found" errors in the brick logs:

[2018-02-25 10:56:40.535257] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks1/A1 - not found
[2018-02-25 10:56:40.535358] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks2/A1 - not found
[2018-02-25 10:56:40.535487] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks3/A1 - not found
[2018-02-25 10:56:40.535595] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks4/A1 - not found
[2018-02-25 10:56:40.536902] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks5/A1 - not found
[2018-02-25 10:56:40.538366] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks6/A1 - not found
[2018-02-25 10:56:40.538566] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks7/A1 - not found
[2018-02-25 10:56:40.538807] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks8/A1 - not found
[2018-02-25 10:56:40.539040] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks9/A1 - not found
[2018-02-25 10:56:40.539261] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks10/A1 - not found

Comment 8 Atin Mukherjee 2018-04-04 13:51:42 UTC
*** Bug 1563640 has been marked as a duplicate of this bug. ***

Comment 10 Atin Mukherjee 2018-04-20 06:08:24 UTC
https://code.engineering.redhat.com/gerrit/136232 is now merged. Moving it to MODIFIED.

Comment 14 errata-xmlrpc 2018-09-04 06:42:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607

