Bug 1446107

Summary: [Brick MUX] : Rebalance fails.
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Ambarish <asoman>
Component: core Assignee: Mohit Agrawal <moagrawa>
Status: CLOSED ERRATA QA Contact: Prasad Desala <tdesala>
Severity: low Docs Contact:
Priority: low    
Version: rhgs-3.3 CC: amukherj, asoman, bturner, nbalacha, nchilaka, rhinduja, rhs-bugs, skoduri, storage-qa-internal
Target Milestone: ---   
Target Release: RHGS 3.3.0   
Hardware: x86_64   
OS: Linux   
Whiteboard: brick-multiplexing
Fixed In Version: glusterfs-3.8.4-27 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1447392 Environment:
Last Closed: 2017-09-21 04:39:40 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1447392    
Bug Blocks: 1417151    

Description Ambarish 2017-04-27 09:39:11 UTC
Description of problem:
------------------------

Created an EC volume. Enabled brick multiplexing. Added bricks. Triggered rebalance.

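Rebalance failed on gqas014 and gqas015 for volume "butcher" (1 failure each), although the CLI still prints "volume rebalance: butcher: success".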


[root@gqas009 glusterfs]# gluster v rebalance testvol status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
      gqas014.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             0             0            completed        0:00:00
volume rebalance: testvol: success
[root@gqas009 glusterfs]# gluster v rebalance butcher status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0            completed        0:00:00
      gqas014.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             1             0               failed        0:00:02
      gqas015.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             1             0               failed        0:00:02
volume rebalance: butcher: success
[root@gqas009 glusterfs]# 
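
Because the CLI prints "volume rebalance: butcher: success" even when individual nodes report "failed", a check has to look at the per-node status column rather than the final line. The following is a minimal sketch of such a check, assuming the eight-column row layout shown in the output above; the function name failed_rebalance_nodes is hypothetical and not part of gluster, and other glusterfs versions may format the table differently.

```python
# Sketch only: parses the text of `gluster v rebalance <vol> status`
# as it appears in this report. The column layout is an assumption
# taken from that output, not from any gluster API.

SAMPLE = """\
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0            completed        0:00:00
      gqas014.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             1             0               failed        0:00:02
      gqas015.sbu.lab.eng.bos.redhat.com                0        0Bytes             0             1             0               failed        0:00:02
volume rebalance: butcher: success
"""


def failed_rebalance_nodes(status_text):
    """Return the node names whose status column reads 'failed'."""
    failed = []
    for line in status_text.splitlines():
        tokens = line.split()
        # Data rows have 8 whitespace-separated fields:
        # node, files, size, scanned, failures, skipped, status, run time.
        if len(tokens) == 8 and tokens[6] == "failed":
            failed.append(tokens[0])
    return failed


if __name__ == "__main__":
    for node in failed_rebalance_nodes(SAMPLE):
        print("rebalance failed on", node)
```

Rows with a multi-word status such as "in progress" do not have eight fields and are skipped by this sketch, so it only answers whether any node has already failed.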


Version-Release number of selected component (if applicable):
-------------------------------------------------------------

3.8.4-23

How reproducible:
-----------------

2/2


Actual results:
--------------

Rebalance fails.

Expected results:
-----------------

Rebalance should not fail.

Additional info:
----------------

[root@gqas009 glusterfs]# gluster v info
 
Volume Name: butcher
Type: Distributed-Disperse
Volume ID: 98d7434c-0466-4ff3-879b-3ee8c211c7b2
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x (4 + 2) = 12
Transport-type: tcp
Bricks:
Brick1: gqas009.sbu.lab.eng.bos.redhat.com:/bricks2/e1
Brick2: gqas014.sbu.lab.eng.bos.redhat.com:/bricks2/e1
Brick3: gqas015.sbu.lab.eng.bos.redhat.com:/bricks2/e1
Brick4: gqas009.sbu.lab.eng.bos.redhat.com:/bricks1/e1
Brick5: gqas014.sbu.lab.eng.bos.redhat.com:/bricks1/e1
Brick6: gqas015.sbu.lab.eng.bos.redhat.com:/bricks1/e1
Brick7: gqas009.sbu.lab.eng.bos.redhat.com:/bricks6/A1
Brick8: gqas014.sbu.lab.eng.bos.redhat.com:/bricks6/A1
Brick9: gqas015.sbu.lab.eng.bos.redhat.com:/bricks6/A1
Brick10: gqas009.sbu.lab.eng.bos.redhat.com:/bricks8/A1
Brick11: gqas014.sbu.lab.eng.bos.redhat.com:/bricks8/A1
Brick12: gqas015.sbu.lab.eng.bos.redhat.com:/bricks8/A1
Options Reconfigured:
cluster.lookup-optimize: on
transport.address-family: inet
nfs.disable: on
cluster.brick-multiplex: enable
 
Volume Name: testvol
Type: Distribute
Volume ID: 2b12b3e7-a167-4538-b55b-9a4e181c622e
Status: Started
Snapshot Count: 0
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: gqas014.sbu.lab.eng.bos.redhat.com:/bricks11/A
Brick2: gqas014.sbu.lab.eng.bos.redhat.com:/bricks5/a
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
cluster.brick-multiplex: enable
[root@gqas009 glusterfs]# 



[root@gqas009 glusterfs]# gluster v status
Status of volume: butcher
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick gqas009.sbu.lab.eng.bos.redhat.com:/b
ricks2/e1                                   49153     0          Y       23917
Brick gqas014.sbu.lab.eng.bos.redhat.com:/b
ricks2/e1                                   49153     0          Y       23218
Brick gqas015.sbu.lab.eng.bos.redhat.com:/b
ricks2/e1                                   49153     0          Y       23687
Brick gqas009.sbu.lab.eng.bos.redhat.com:/b
ricks1/e1                                   49153     0          Y       23917
Brick gqas014.sbu.lab.eng.bos.redhat.com:/b
ricks1/e1                                   49153     0          Y       23218
Brick gqas015.sbu.lab.eng.bos.redhat.com:/b
ricks1/e1                                   49153     0          Y       23687
Brick gqas009.sbu.lab.eng.bos.redhat.com:/b
ricks6/A1                                   49153     0          Y       23917
Brick gqas014.sbu.lab.eng.bos.redhat.com:/b
ricks6/A1                                   49153     0          Y       23218
Brick gqas015.sbu.lab.eng.bos.redhat.com:/b
ricks6/A1                                   49153     0          Y       23687
Brick gqas009.sbu.lab.eng.bos.redhat.com:/b
ricks8/A1                                   49153     0          Y       23917
Brick gqas014.sbu.lab.eng.bos.redhat.com:/b
ricks8/A1                                   49153     0          Y       23218
Brick gqas015.sbu.lab.eng.bos.redhat.com:/b
ricks8/A1                                   49153     0          Y       23687
Self-heal Daemon on localhost               N/A       N/A        Y       24098
Self-heal Daemon on gqas011.sbu.lab.eng.bos
.redhat.com                                 N/A       N/A        Y       14859
Self-heal Daemon on gqas014.sbu.lab.eng.bos
.redhat.com                                 N/A       N/A        Y       23367
Self-heal Daemon on gqas015.sbu.lab.eng.bos
.redhat.com                                 N/A       N/A        Y       23828
 
Task Status of Volume butcher
------------------------------------------------------------------------------
Task                 : Rebalance           
ID                   : 1314fe4c-0005-476a-b88c-4b52f93ffa62
Status               : failed              
 
Status of volume: testvol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick gqas014.sbu.lab.eng.bos.redhat.com:/b
ricks11/A                                   49153     0          Y       23218
Brick gqas014.sbu.lab.eng.bos.redhat.com:/b
ricks5/a                                    49153     0          Y       23218
 
Task Status of Volume testvol
------------------------------------------------------------------------------
Task                 : Rebalance           
ID                   : 2745a8f1-336b-4bf0-baec-efe661a22dde
Status               : completed           
 
[root@gqas009 glusterfs]#
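
In the status output above, every brick on a given node shows the same Pid, which is what is expected when brick multiplexing runs several bricks in one process. A minimal sketch for confirming this from the text of `gluster v status` follows; it assumes the wrapped two-line brick rows shown above, and the function name bricks_by_pid is hypothetical.

```python
# Sketch only: groups brick paths by (host, pid) from `gluster v status`
# text. Assumes long brick names wrap onto a second line, as in this report.
from collections import defaultdict

SAMPLE = """\
Brick gqas009.sbu.lab.eng.bos.redhat.com:/b
ricks2/e1                                   49153     0          Y       23917
Brick gqas014.sbu.lab.eng.bos.redhat.com:/b
ricks2/e1                                   49153     0          Y       23218
Brick gqas009.sbu.lab.eng.bos.redhat.com:/b
ricks1/e1                                   49153     0          Y       23917
Self-heal Daemon on localhost               N/A       N/A        Y       24098
"""


def bricks_by_pid(status_text):
    """Map (host, pid) to the list of brick paths served by that process."""
    groups = defaultdict(list)
    pending = None  # first half of a brick name wrapped across two lines
    for line in status_text.splitlines():
        tokens = line.split()
        if pending is not None and tokens:
            # Continuation line: rest of the name, then ports, online, pid.
            name, pid = pending + tokens[0], tokens[-1]
            pending = None
        elif tokens[:1] == ["Brick"] and len(tokens) == 2:
            pending = tokens[1]
            continue
        elif tokens[:1] == ["Brick"] and len(tokens) >= 6:
            # Short brick name that fit on a single line.
            name, pid = tokens[1], tokens[-1]
        else:
            continue
        host, path = name.split(":", 1)
        groups[(host, pid)].append(path)
    return dict(groups)


if __name__ == "__main__":
    for (host, pid), paths in bricks_by_pid(SAMPLE).items():
        print(host, pid, paths)
```

If multiplexing is in effect, each host should appear under a single pid with all of its bricks listed together.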

Comment 10 Atin Mukherjee 2017-05-09 15:51:01 UTC
Upstream patch: https://review.gluster.org/#/c/17225

Comment 13 Atin Mukherjee 2017-05-15 04:45:24 UTC
Downstream patch: https://code.engineering.redhat.com/gerrit/#/c/106137

Comment 14 Atin Mukherjee 2017-05-17 05:24:27 UTC
Looks like we have an issue with this patch, moving this bug to POST.

Comment 15 Atin Mukherjee 2017-06-05 04:50:23 UTC
Downstream patch: https://code.engineering.redhat.com/gerrit/#/c/108021/

Comment 17 Prasad Desala 2017-06-29 11:45:56 UTC
Verified this BZ on glusterfs version 3.8.4-31.el7rhgs.x86_64. With the steps given in the description, the problem is no longer seen, hence moving this bug to Verified.

Comment 19 errata-xmlrpc 2017-09-21 04:39:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2774