Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1549996

Summary: [BMux] : Stale brick processes on the nodes after vol deletion.
Product: [Community] GlusterFS
Reporter: Mohit Agrawal <moagrawa>
Component: core
Assignee: Mohit Agrawal <moagrawa>
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: unspecified
Version: mainline
CC: amukherj, atumball, bugs, moagrawa, rhinduja, rhs-bugs, storage-qa-internal, vbellur
Keywords: Regression
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard: brick-multiplexing
Fixed In Version: glusterfs-5.0
Clone Of: 1548829
Last Closed: 2018-10-24 12:21:12 UTC
Type: Bug
Bug Depends On: 1548829
Bug Blocks: 1549023

Description Mohit Agrawal 2018-02-28 09:15:22 UTC
+++ This bug was initially created as a clone of Bug #1548829 +++

Description of problem:
------------------------

Create a multi-brick EC volume.

Start it.

Delete it.

Check for running glusterfsd on all the nodes in the Trusted Storage Pool.

Nodes still have stale glusterfsd processes after the volume stop/delete; these should ideally be removed when a volume is deleted.


Example :

I created two volumes, drogon and drogon2, and then deleted them.

<snip>

[root@gqas007 ~]# gluster v list
No volumes present in cluster
[root@gqas007 ~]# 

[root@gqas007 ~]# 
[root@gqas007 ~]# ps -ef|grep fsd
root     21148     1  0 05:43 ?        00:00:01 /usr/sbin/glusterfsd -s gqas007 --volfile-id drogon.gqas007.bricks1-A1 -p /var/run/gluster/vols/drogon/gqas007-bricks1-A1.pid -S /var/run/gluster/e5102c357100a19ec60edec10a566e61.socket --brick-name /bricks1/A1 -l /var/log/glusterfs/bricks/bricks1-A1.log --xlator-option *-posix.glusterd-uuid=e72fdebf-3130-4d05-8cf5-966f4c4926c4 --brick-port 49152 --xlator-option drogon-server.listen-port=49152
root     21639     1  0 05:55 ?        00:00:01 /usr/sbin/glusterfsd -s gqas007 --volfile-id drogon2.gqas007.bricks1-A1 -p /var/run/gluster/vols/drogon2/gqas007-bricks1-A1.pid -S /var/run/gluster/2bd4d8669b0cdb67f9e15f99776d1e36.socket --brick-name /bricks1/A1 -l /var/log/glusterfs/bricks/bricks1-A1.log --xlator-option *-posix.glusterd-uuid=e72fdebf-3130-4d05-8cf5-966f4c4926c4 --brick-port 49153 --xlator-option drogon2-server.listen-port=49153
root     22042 19908  0 06:00 pts/0    00:00:00 grep --color=auto fsd
[root@gqas007 ~]# 

</snip>
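The stale processes above can be flagged mechanically on each node. A minimal sketch, assuming the standard `--volfile-id VOLNAME.HOST.BRICK` argument format shown in the ps output (the `stale_bricks` helper is hypothetical, not part of the gluster tooling):

```shell
# Hypothetical helper: report glusterfsd brick processes whose volume no
# longer exists. Reads ps output lines on stdin; $1 is a newline-separated
# list of existing volumes (e.g. the output of `gluster v list`).
stale_bricks() {
    volumes=$1
    while read -r line; do
        # The volume name is the part of --volfile-id before the first dot,
        # e.g. "drogon" in "drogon.gqas007.bricks1-A1".
        vol=$(printf '%s\n' "$line" | sed -n 's/.*--volfile-id \([^ .]*\)\..*/\1/p')
        [ -n "$vol" ] || continue
        printf '%s\n' "$volumes" | grep -qx "$vol" || printf 'stale: %s\n' "$vol"
    done
}

# Usage on a node:
#   ps -ef | grep '[g]lusterfsd' | stale_bricks "$(gluster v list)"
```

In the scenario above, both leftover glusterfsd processes would be reported, since `gluster v list` returns no volumes.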


I suspect brick multiplexing to be the problem, and hence I am raising it against glusterd. Feel free to change the component if that's not the case.


Version-Release number of selected component (if applicable):
--------------------------------------------------------------

glusterfs-3.12.2-4.el7rhgs.x86_64

How reproducible:
-----------------

2/2, on the same machines.

Steps to Reproduce:
--------------------

As in description.
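The description's steps can be sketched as a CLI sequence. Volume name, hosts, and brick paths below are placeholders, and a live trusted storage pool is assumed, so this is illustrative rather than directly runnable:

```shell
# Enable brick multiplexing cluster-wide (precondition for this bug)
gluster volume set all cluster.brick-multiplex on

# Create and start a multi-brick EC (disperse) volume; hosts and paths
# are placeholders for the nodes in the trusted storage pool
gluster volume create drogon disperse-data 4 redundancy 2 \
    server{1..6}:/bricks1/A1 force
gluster volume start drogon

# Stop and delete it (--mode=script skips the interactive confirmation)
gluster --mode=script volume stop drogon
gluster --mode=script volume delete drogon

# On every node: any glusterfsd still running is a stale brick process
ps -ef | grep '[g]lusterfsd'
```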

--- Additional comment from Red Hat Bugzilla Rules Engine on 2018-02-25 06:03:46 EST ---

This bug is automatically being proposed for the release of Red Hat Gluster Storage 3 under active development and open for bug fixes, by setting the release flag 'rhgs-3.4.0' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from Ambarish on 2018-02-25 06:08:56 EST ---

For some reason I see brick not-found errors in the brick logs:

[2018-02-25 10:56:40.535257] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks1/A1 - not found
[2018-02-25 10:56:40.535358] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks2/A1 - not found
[2018-02-25 10:56:40.535487] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks3/A1 - not found
[2018-02-25 10:56:40.535595] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks4/A1 - not found
[2018-02-25 10:56:40.536902] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks5/A1 - not found
[2018-02-25 10:56:40.538366] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks6/A1 - not found
[2018-02-25 10:56:40.538566] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks7/A1 - not found
[2018-02-25 10:56:40.538807] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks8/A1 - not found
[2018-02-25 10:56:40.539040] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks9/A1 - not found
[2018-02-25 10:56:40.539261] E [glusterfsd-mgmt.c:232:glusterfs_handle_terminate] 0-glusterfs: can't terminate /bricks10/A1 - not found

--- Additional comment from Ambarish on 2018-02-25 06:13:04 EST ---

sosreports : http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1548829/

--- Additional comment from Ambarish on 2018-02-26 02:16:55 EST ---

I could not reproduce on 3.8.4-54.

--- Additional comment from Red Hat Bugzilla Rules Engine on 2018-02-26 02:17:02 EST ---

This bug report has Keywords: Regression or TestBlocker.

Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release.

Please resolve ASAP.

--- Additional comment from Red Hat Bugzilla Rules Engine on 2018-02-28 03:20:19 EST ---

This bug is automatically being provided 'pm_ack+' for the release flag 'rhgs-3.4.0', having been appropriately marked for the release, and having been provided ACK from Development and QE.

--- Additional comment from Red Hat Bugzilla Rules Engine on 2018-02-28 04:14:31 EST ---

Since this bug has been approved for the RHGS 3.4.0 release of Red Hat Gluster Storage 3, through release flag 'rhgs-3.4.0+', and through the Internal Whiteboard entry of '3.4.0', the Target Release is being automatically set to 'RHGS 3.4.0'.

Comment 1 Amar Tumballi 2018-10-24 12:21:12 UTC
https://review.gluster.org/#/c/19734/