Bug 1450806

Summary: Brick Multiplexing: Brick process shows as online in vol status even when brick is offline
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Nag Pavan Chilakam <nchilaka>
Component: core Assignee: Atin Mukherjee <amukherj>
Status: CLOSED ERRATA QA Contact: Nag Pavan Chilakam <nchilaka>
Severity: high Docs Contact:
Priority: unspecified    
Version: rhgs-3.3 CC: amukherj, rhs-bugs, storage-qa-internal
Target Milestone: ---   
Target Release: RHGS 3.3.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.8.4-26 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-09-21 04:41:45 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1450630, 1458570    
Bug Blocks: 1417151    

Description Nag Pavan Chilakam 2017-05-15 08:25:44 UTC
Description of problem:
=====================
With the brick mux fixes in 3.8.4-25, I see that even if I bring down a brick, vol status still shows the brick as online.

 

Version-Release number of selected component (if applicable):
=======
3.8.4-25

How reproducible:
===
always

Steps to Reproduce:
1. Enable brick mux and create 3 vols.
2. Bring down a brick on, say, vol3 using umount -l of the LV mount.
3. Even though the brick is down, vol status shows the brick as online.
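
The steps above can be sketched as a command sequence. This is only an illustrative outline, not the exact commands the reporter ran: the helper name repro_commands, the replica layout (one brick per host), and the /rhs/brick<N>/<volume> path scheme are assumptions modelled on the paths visible in this report. The function only builds the commands, it does not execute them.

```python
import shlex


def repro_commands(hosts, volumes, victim_mount):
    """Build (but do not run) the commands for the reproduction steps.

    hosts        -- peer addresses, one brick per host (assumed replica layout)
    volumes      -- volume names to create
    victim_mount -- mount point of the LV backing the brick to take down
    """
    # Step 1: enable brick multiplexing cluster-wide, then create and start volumes.
    cmds = [["gluster", "volume", "set", "all", "cluster.brick-multiplex", "on"]]
    for i, vol in enumerate(volumes, start=1):
        # Hypothetical brick path scheme, modelled on /rhs/brick9/test3_9.
        bricks = [f"{h}:/rhs/brick{i}/{vol}" for h in hosts]
        cmds.append(["gluster", "volume", "create", vol,
                     "replica", str(len(hosts)), *bricks])
        cmds.append(["gluster", "volume", "start", vol])
    # Step 2: lazily unmount the LV under one brick.
    cmds.append(["umount", "-l", victim_mount])
    # Step 3: inspect the Online column for the affected volume.
    cmds.append(["gluster", "volume", "status", volumes[-1]])
    return cmds


if __name__ == "__main__":
    for cmd in repro_commands(["10.70.35.45", "10.70.35.130", "10.70.35.122"],
                              ["vol1", "vol2", "vol3"], "/rhs/brick3"):
        print(shlex.join(cmd))
```

Printing the commands rather than running them keeps the sketch safe to try on a machine without gluster installed; on a test cluster each list could be passed to subprocess.run.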

In the case below, I brought down brick 10.70.35.45:/rhs/brick9/test3_9:
 

[root@dhcp35-45 ~]# #umount -l /rhs/brick9

[root@dhcp35-45 ~]# 
Broadcast message from systemd-journald.eng.blr.redhat.com (Mon 2017-05-15 13:45:10 IST):

rhs-brick1-test3_1[13818]: [2017-05-15 08:15:10.240221] M [MSGID: 113075] [posix-helpers.c:1893:posix_health_check_thread_proc] 0-test3_9-posix: health-check failed, going down


Message from syslogd@dhcp35-45 at May 15 13:45:10 ...
 rhs-brick1-test3_1[13818]:[2017-05-15 08:15:10.240221] M [MSGID: 113075] [posix-helpers.c:1893:posix_health_check_thread_proc] 0-test3_9-posix: health-check failed, going down

[root@dhcp35-45 ~]# 

Status of volume: test3_9
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.45:/rhs/brick9/test3_9       49152     0          Y       13818
Brick 10.70.35.130:/rhs/brick9/test3_9      49152     0          Y       13162
Brick 10.70.35.122:/rhs/brick9/test3_9      49152     0          Y       12919
Self-heal Daemon on localhost               N/A       N/A        Y       14273
Self-heal Daemon on 10.70.35.23             N/A       N/A        Y       11842
Self-heal Daemon on 10.70.35.122            N/A       N/A        Y       13264
Self-heal Daemon on 10.70.35.130            N/A       N/A        Y       13509
 
Task Status of Volume test3_9
------------------------------------------------------------------------------
There are no active volume tasks

Comment 3 Atin Mukherjee 2017-05-15 17:36:24 UTC
upstream patch : https://review.gluster.org/17287

Comment 5 Atin Mukherjee 2017-05-16 04:26:15 UTC
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/106263

Comment 7 Nag Pavan Chilakam 2017-06-07 14:46:22 UTC
On 3.8.4-27 I am not seeing the issue anymore, hence moving to verified.

[root@dhcp35-45 ~]# gluster v status
Status of volume: test3_31
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.45:/rhs/brick31/test3_31     N/A       N/A        N       N/A  
Brick 10.70.35.130:/rhs/brick31/test3_31    49152     0          Y       30495
Brick 10.70.35.122:/rhs/brick31/test3_31    49152     0          Y       14828
Self-heal Daemon on localhost               N/A       N/A        Y       27795
Self-heal Daemon on 10.70.35.23             N/A       N/A        Y       26963
Self-heal Daemon on 10.70.35.122            N/A       N/A        Y       17576
Self-heal Daemon on 10.70.35.130            N/A       N/A        Y       807  
 
Task Status of Volume test3_31
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: test3_32
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.45:/rhs/brick32/test3_32     N/A       N/A        N       N/A  
Brick 10.70.35.130:/rhs/brick32/test3_32    49152     0          Y       30495
Brick 10.70.35.122:/rhs/brick32/test3_32    49152     0          Y       14828
Self-heal Daemon on localhost               N/A       N/A        Y       27795
Self-heal Daemon on 10.70.35.23             N/A       N/A        Y       26963
Self-heal Daemon on 10.70.35.130            N/A       N/A        Y       807  
Self-heal Daemon on 10.70.35.122            N/A       N/A        Y       17576
 
Task Status of Volume test3_32
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: test3_33
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.45:/rhs/brick33/test3_33     N/A       N/A        N       N/A  
Brick 10.70.35.130:/rhs/brick33/test3_33    49152     0          Y       30495
Brick 10.70.35.122:/rhs/brick33/test3_33    49152     0          Y       14828
Self-heal Daemon on localhost               N/A       N/A        Y       27795
Self-heal Daemon on 10.70.35.23             N/A       N/A        Y       26963
Self-heal Daemon on 10.70.35.130            N/A       N/A        Y       807  
Self-heal Daemon on 10.70.35.122            N/A       N/A        Y       17576
 
Task Status of Volume test3_33
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: test3_34
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.45:/rhs/brick34/test3_34     N/A       N/A        N       N/A  
Brick 10.70.35.130:/rhs/brick34/test3_34    49152     0          Y       30495
Brick 10.70.35.122:/rhs/brick34/test3_34    49152     0          Y       14828
Self-heal Daemon on localhost               N/A       N/A        Y       27795
Self-heal Daemon on 10.70.35.23             N/A       N/A        Y       26963
Self-heal Daemon on 10.70.35.122            N/A       N/A        Y       17576
Self-heal Daemon on 10.70.35.130            N/A       N/A        Y       807  
 
Task Status of Volume test3_34
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: test3_35
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.45:/rhs/brick35/test3_35     N/A       N/A        N       N/A  
Brick 10.70.35.130:/rhs/brick35/test3_35    49152     0          Y       30495
Brick 10.70.35.122:/rhs/brick35/test3_35    49152     0          Y       14828
Self-heal Daemon on localhost               N/A       N/A        Y       27795
Self-heal Daemon on 10.70.35.23             N/A       N/A        Y       26963
Self-heal Daemon on 10.70.35.122            N/A       N/A        Y       17576
Self-heal Daemon on 10.70.35.130            N/A       N/A        Y       807  
 
Task Status of Volume test3_35
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: test3_36
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.45:/rhs/brick36/test3_36     N/A       N/A        N       N/A  
Brick 10.70.35.130:/rhs/brick36/test3_36    49152     0          Y       30495
Brick 10.70.35.122:/rhs/brick36/test3_36    49152     0          Y       14828
Self-heal Daemon on localhost               N/A       N/A        Y       27795
Self-heal Daemon on 10.70.35.23             N/A       N/A        Y       26963
Self-heal Daemon on 10.70.35.122            N/A       N/A        Y       17576
Self-heal Daemon on 10.70.35.130            N/A       N/A        Y       807  
 
Task Status of Volume test3_36
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: test3_37
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.45:/rhs/brick37/test3_37     N/A       N/A        N       N/A  
Brick 10.70.35.130:/rhs/brick37/test3_37    49152     0          Y       30495
Brick 10.70.35.122:/rhs/brick37/test3_37    49152     0          Y       14828
Self-heal Daemon on localhost               N/A       N/A        Y       27795
Self-heal Daemon on 10.70.35.23             N/A       N/A        Y       26963
Self-heal Daemon on 10.70.35.122            N/A       N/A        Y       17576
Self-heal Daemon on 10.70.35.130            N/A       N/A        Y       807  
 
Task Status of Volume test3_37
------------------------------------------------------------------------------
There are no active volume tasks
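
The verification output above can be summarised in two ways: which bricks are reported offline, and which online bricks share a process (with brick multiplexing, the bricks on one node report the same Pid). The sketch below does both. It is a hedged illustration: the summarize helper and the trimmed sample are assumptions for demonstration, with rows copied from the output above.

```python
import re
from collections import defaultdict

# Brick row of `gluster volume status`: host:path, two port columns, Online, Pid.
BRICK_ROW = re.compile(r"^Brick\s+(\S+?):(/\S+)\s+\S+\s+\S+\s+([YN])\s+(\S+)\s*$")

SAMPLE_STATUS = """\
Status of volume: test3_31
Brick 10.70.35.45:/rhs/brick31/test3_31     N/A       N/A        N       N/A  
Brick 10.70.35.130:/rhs/brick31/test3_31    49152     0          Y       30495
Brick 10.70.35.122:/rhs/brick31/test3_31    49152     0          Y       14828
Self-heal Daemon on localhost               N/A       N/A        Y       27795
Status of volume: test3_32
Brick 10.70.35.45:/rhs/brick32/test3_32     N/A       N/A        N       N/A  
Brick 10.70.35.130:/rhs/brick32/test3_32    49152     0          Y       30495
Brick 10.70.35.122:/rhs/brick32/test3_32    49152     0          Y       14828
"""


def summarize(status_text):
    """Return (offline, by_process).

    offline    -- 'host:path' strings for bricks with Online = N
    by_process -- maps (host, pid) to the brick paths served by that process
    """
    offline, by_process = [], defaultdict(list)
    for line in status_text.splitlines():
        m = BRICK_ROW.match(line)
        if not m:
            continue  # headers, separators, self-heal daemon rows
        host, path, online, pid = m.groups()
        if online == "N":
            offline.append(f"{host}:{path}")
        else:
            by_process[(host, pid)].append(path)
    return offline, dict(by_process)


if __name__ == "__main__":
    offline, by_process = summarize(SAMPLE_STATUS)
    print("offline:", offline)
    for (host, pid), paths in by_process.items():
        print(f"{host} pid {pid} serves {len(paths)} brick(s)")
```

Grouping by (host, pid) makes the multiplexed layout visible: in the sample, both bricks on 10.70.35.130 map to pid 30495, while the bricks on 10.70.35.45 land in the offline list.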

Comment 9 errata-xmlrpc 2017-09-21 04:41:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2774