Bug 1443843 - Brick Multiplexing: resetting a brick brings down other bricks with the same PID
Summary: Brick Multiplexing: resetting a brick brings down other bricks with the same PID
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: core
Version: rhgs-3.3
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHGS 3.3.0
Assignee: Samikshan Bairagya
QA Contact: Nag Pavan Chilakam
URL:
Whiteboard: brick-multiplexing
Depends On: 1446172 1449933 1449934
Blocks: 1417151
 
Reported: 2017-04-20 06:42 UTC by Karan Sandha
Modified: 2017-09-21 04:37 UTC
CC: 3 users

Fixed In Version: glusterfs-3.8.4-26
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1446172
Environment:
Last Closed: 2017-09-21 04:37:54 UTC
Embargoed:


Attachments


Links
Red Hat Product Errata RHBA-2017:2774 (Private: 0, Priority: normal, Status: SHIPPED_LIVE): glusterfs bug fix and enhancement update, last updated 2017-09-21 08:16:29 UTC

Description Karan Sandha 2017-04-20 06:42:20 UTC
Description of problem:
Resetting a single brick brings the other brick with the same PID down as well.

Version-Release number of selected component (if applicable):
3.8.4-22

How reproducible:
100% 
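
(Precondition, implied by the summary but not listed in the steps: brick multiplexing must be enabled on the cluster. A minimal sketch of the setup check, assuming the standard cluster.brick-multiplex global option and the nodes used in this report:)

# Enable brick multiplexing cluster-wide (set on "all")
gluster volume set all cluster.brick-multiplex on
# With multiplexing on, the bricks of testvol on a node are served by a
# single glusterfsd process, i.e. they show the same PID in volume status
gluster volume status testvol
pgrep -fa glusterfsd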

Steps to Reproduce:
1. Start a reset-brick operation on one brick of the volume:

[root@K1 ~]# gluster v reset-brick testvol 10.70.47.60:/bricks/brick0/b3 start
volume reset-brick: success: reset-brick start operation successful

2. Check the status of the volume:

[root@K1 b3]# gluster v status testvol
Status of volume: testvol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.47.60:/bricks/brick0/b3         N/A       N/A        N       N/A  
Brick 10.70.46.218:/bricks/brick0/b2        49152     0          Y       374  
Brick 10.70.47.61:/bricks/brick0/b3         49152     0          Y       24892
Brick 10.70.47.60:/bricks/brick1/b3         N/A       N/A        N       N/A  
Brick 10.70.46.218:/bricks/brick1/b2        49152     0          Y       374  
Brick 10.70.47.61:/bricks/brick1/b3         49152     0          Y       24892
Brick 10.70.46.218:/bricks/brick2/b2        49152     0          Y       374  
Brick 10.70.47.61:/bricks/brick2/b3         49152     0          Y       24892
Brick 10.70.47.60:/bricks/brick2/b3         49153     0          Y       1629 
NFS Server on localhost                     2049      0          Y       1653 
Self-heal Daemon on localhost               N/A       N/A        Y       1662 
NFS Server on 10.70.46.218                  2049      0          Y       698  
Self-heal Daemon on 10.70.46.218            N/A       N/A        Y       707  
NFS Server on 10.70.47.61                   2049      0          Y       25123
Self-heal Daemon on 10.70.47.61             N/A       N/A        Y       25132
 
Task Status of Volume testvol
------------------------------------------------------------------------------
Task                 : Rebalance           
ID                   : e686e9ea-ad3d-4135-933d-2836075c16d7
Status               : completed           
 

3. Observe that 10.70.47.60:/bricks/brick1/b3, which shared its brick process (PID) with the brick being reset, has also gone offline (the full reset-brick cycle is sketched below).
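
For reference, a sketch of the full reset-brick cycle around step 1, using the documented commit-force form; only the brick under reset should go offline while the rest of the multiplexed bricks stay up:

# Take the brick offline for maintenance
gluster v reset-brick testvol 10.70.47.60:/bricks/brick0/b3 start
# Check that only this brick shows Online=N in the status output
gluster v status testvol
# Bring the same brick back (source and destination path are identical here)
gluster v reset-brick testvol 10.70.47.60:/bricks/brick0/b3 10.70.47.60:/bricks/brick0/b3 commit force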

Actual results:
Two bricks go down with the same PID.

Expected results:
The other brick sharing the same PID should be unaffected.

Additional info:
logs placed at : rhsqe-repo.lab.eng.blr.redhat.com:/var/www/html/sosreports/<bug>

Comment 4 Atin Mukherjee 2017-04-27 11:56:40 UTC
Upstream patch: https://review.gluster.org/#/c/17128/

Comment 5 Atin Mukherjee 2017-05-15 08:30:05 UTC
Downstream patch: https://code.engineering.redhat.com/gerrit/#/c/106048/

Comment 7 Nag Pavan Chilakam 2017-06-10 08:20:53 UTC
OnQA validation:
reset-brick no longer brings down any other brick using the same PID, so moving to Verified. Tested on 3.8.4-27 with the steps mentioned in the summary.
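
(A sketch of the check behind this validation, reusing the brick layout from the original report; exact bricks and PIDs will differ per setup:)

# Note the PIDs of the other multiplexed bricks on 10.70.47.60
gluster v status testvol | grep 10.70.47.60
gluster v reset-brick testvol 10.70.47.60:/bricks/brick0/b3 start
# Only /bricks/brick0/b3 should now show Online=N; the remaining bricks on
# the node should still be online with their original PID
gluster v status testvol | grep 10.70.47.60
gluster v reset-brick testvol 10.70.47.60:/bricks/brick0/b3 10.70.47.60:/bricks/brick0/b3 commit force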

Comment 9 errata-xmlrpc 2017-09-21 04:37:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2774

