Bug 1548829

Summary: [BMux] : Stale brick processes on the nodes after vol deletion.
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Ambarish <asoman>
Component: core
Assignee: Mohit Agrawal <moagrawa>
Status: CLOSED ERRATA
QA Contact: Manisha Saini <msaini>
Severity: high
Docs Contact:
Priority: unspecified
Version: rhgs-3.4
CC: amukherj, moagrawa, msaini, nchilaka, rhinduja, rhs-bugs, storage-qa-internal, vbellur
Target Milestone: ---
Keywords: Regression
Target Release: RHGS 3.4.0
Hardware: x86_64
OS: Linux
Whiteboard: brick-multiplexing
Fixed In Version: glusterfs-3.12.2-8
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 1549996 (view as bug list)
Environment:
Last Closed: 2018-09-04 06:42:45 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1503137, 1549023, 1549996

Description Ambarish 2018-02-25 11:03:41 UTC
Description of problem:
------------------------

Create a multi-brick EC volume.

Start it.

Stop it, then delete it.

Check for running glusterfsd on all the nodes in the Trusted Storage Pool.

Some nodes still have stale glusterfsd processes after the volume stop/delete; these processes should ideally be reaped when a volume is deleted.
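
For reference, a minimal scripted form of the steps above. The hostname, brick paths, and the 6-brick disperse geometry are illustrative assumptions, not the exact layout used here:

<snip>

# Reproduction sketch. Hostname, brick paths, and the disperse geometry
# (6 bricks = 4 data + 2 redundancy) are assumptions for illustration.
gluster volume create drogon disperse 6 redundancy 2 \
    gqas007:/bricks{1..6}/A1 force
gluster volume start drogon

# --mode=script suppresses the interactive confirmation prompts.
gluster --mode=script volume stop drogon
gluster --mode=script volume delete drogon

# No volumes should remain, and no brick process should survive:
gluster volume list
ps -ef | grep '[g]lusterfsd'   # bracket trick: don't match the grep itself

</snip>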


Example :

I created two volumes, drogon and drogon2, and then deleted them.

<snip>

[root@gqas007 ~]# gluster v list
No volumes present in cluster
[root@gqas007 ~]# ps -ef|grep fsd
root     21148     1  0 05:43 ?        00:00:01 /usr/sbin/glusterfsd -s gqas007 --volfile-id drogon.gqas007.bricks1-A1 -p /var/run/gluster/vols/drogon/gqas007-bricks1-A1.pid -S /var/run/gluster/e5102c357100a19ec60edec10a566e61.socket --brick-name /bricks1/A1 -l /var/log/glusterfs/bricks/bricks1-A1.log --xlator-option *-posix.glusterd-uuid=e72fdebf-3130-4d05-8cf5-966f4c4926c4 --brick-port 49152 --xlator-option drogon-server.listen-port=49152
root     21639     1  0 05:55 ?        00:00:01 /usr/sbin/glusterfsd -s gqas007 --volfile-id drogon2.gqas007.bricks1-A1 -p /var/run/gluster/vols/drogon2/gqas007-bricks1-A1.pid -S /var/run/gluster/2bd4d8669b0cdb67f9e15f99776d1e36.socket --brick-name /bricks1/A1 -l /var/log/glusterfs/bricks/bricks1-A1.log --xlator-option *-posix.glusterd-uuid=e72fdebf-3130-4d05-8cf5-966f4c4926c4 --brick-port 49153 --xlator-option drogon2-server.listen-port=49153
root     22042 19908  0 06:00 pts/0    00:00:00 grep --color=auto fsd
[root@gqas007 ~]# 

</snip>
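
To sweep the whole Trusted Storage Pool rather than one node at a time, a hedged helper (assumes passwordless ssh to every peer listed by 'gluster pool list', including the local entry):

<snip>

# List peers (skip the header line) and check each for leftover
# glusterfsd processes.
for node in $(gluster pool list | awk 'NR > 1 {print $2}'); do
    echo "== $node =="
    ssh "$node" 'pgrep -af glusterfsd || echo "no stale glusterfsd"'
done

</snip>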


I suspect brick multiplexing is the problem, and hence I am raising this against glusterd. Feel free to change the component if that's not the case.
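
Whether multiplexing is actually in play can be confirmed from the cluster-wide option; this is the standard query, though I have not pasted its output here:

<snip>

# Confirm brick multiplexing is enabled cluster-wide.
gluster volume get all cluster.brick-multiplex

</snip>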


Version-Release number of selected component (if applicable):
--------------------------------------------------------------

glusterfs-3.12.2-4.el7rhgs.x86_64

How reproducible:
-----------------

2/2, on the same machines.

Steps to Reproduce:
--------------------

As in the description.

Comment 2 Ambarish 2018-02-25 11:08:56 UTC
For some reason I see "brick not found" errors in the brick logs:

[2018-02-25 10:56:40.535257] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks1/A1 - not found
[2018-02-25 10:56:40.535358] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks2/A1 - not found
[2018-02-25 10:56:40.535487] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks3/A1 - not found
[2018-02-25 10:56:40.535595] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks4/A1 - not found
[2018-02-25 10:56:40.536902] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks5/A1 - not found
[2018-02-25 10:56:40.538366] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks6/A1 - not found
[2018-02-25 10:56:40.538566] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks7/A1 - not found
[2018-02-25 10:56:40.538807] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks8/A1 - not found
[2018-02-25 10:56:40.539040] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks9/A1 - not found
[2018-02-25 10:56:40.539261] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks10/A1 - not found
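
These errors suggest the terminate request did reach the multiplexed glusterfsd, but the named brick path was no longer among its attached bricks, so the process was never told to exit. A hedged way to map the surviving pids to the volfile-ids they were started with, plus a manual cleanup (a workaround sketch, not the fix):

<snip>

# Print the volfile-id each leftover glusterfsd was started with
# (/proc cmdline arguments are NUL-separated).
for pid in $(pgrep -x glusterfsd); do
    printf '== pid %s: ' "$pid"
    tr '\0' '\n' < /proc/"$pid"/cmdline | grep -A1 -- '--volfile-id' | tail -1
done

# Once 'gluster volume list' confirms no volumes remain, reap them.
gluster volume list | grep -q 'No volumes' && pkill -x glusterfsd

</snip>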

Comment 8 Atin Mukherjee 2018-04-04 13:51:42 UTC
*** Bug 1563640 has been marked as a duplicate of this bug. ***

Comment 10 Atin Mukherjee 2018-04-20 06:08:24 UTC
https://code.engineering.redhat.com/gerrit/136232 is now merged. Moving it to MODIFIED.

Comment 14 errata-xmlrpc 2018-09-04 06:42:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607