Description of problem:
------------------------

Created an EC volume. Enabled brick multiplexing. Added bricks. Triggered rebalance.

Rebalance failed.

[root@gqas009 glusterfs]# gluster v rebalance testvol status
Node                                Rebalanced-files  size    scanned  failures  skipped  status     run time in h:m:s
----------------------------------  ----------------  ------  -------  --------  -------  ---------  -----------------
gqas014.sbu.lab.eng.bos.redhat.com  0                 0Bytes  0        0         0        completed  0:00:00
volume rebalance: testvol: success

[root@gqas009 glusterfs]# gluster v rebalance butcher status
Node                                Rebalanced-files  size    scanned  failures  skipped  status     run time in h:m:s
----------------------------------  ----------------  ------  -------  --------  -------  ---------  -----------------
localhost                           0                 0Bytes  0        0         0        completed  0:00:00
gqas014.sbu.lab.eng.bos.redhat.com  0                 0Bytes  0        1         0        failed     0:00:02
gqas015.sbu.lab.eng.bos.redhat.com  0                 0Bytes  0        1         0        failed     0:00:02
volume rebalance: butcher: success
[root@gqas009 glusterfs]#

Version-Release number of selected component (if applicable):
-------------------------------------------------------------

3.8.4-23

How reproducible:
-----------------

2/2

Actual results:
--------------

Rebalance fails.

Expected results:
-----------------

Rebalance should not fail.
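Note that the CLI prints "volume rebalance: butcher: success" even though two nodes report "failed", so the trailing line alone does not show the outcome. A minimal sketch of a per-node check follows. The function name failed_rebalance_nodes is hypothetical, and it assumes the column layout shown in the status output above (status as the second-to-last field, run time as the last).

```shell
#!/bin/sh
# failed_rebalance_nodes (hypothetical helper, not part of gluster):
# reads `gluster v rebalance <vol> status` output on stdin and prints
# the node name of every row whose status column is "failed".
# Assumption: data rows have at least 8 whitespace-separated fields,
# with status second to last and run time last, as in this report.
failed_rebalance_nodes() {
    awk 'NF >= 8 && $(NF-1) == "failed" { print $1 }'
}
```

For example, `gluster v rebalance butcher status | failed_rebalance_nodes` would list the nodes to inspect; an empty result means no node reported "failed".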
Additional info:
----------------

[root@gqas009 glusterfs]# gluster v info

Volume Name: butcher
Type: Distributed-Disperse
Volume ID: 98d7434c-0466-4ff3-879b-3ee8c211c7b2
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x (4 + 2) = 12
Transport-type: tcp
Bricks:
Brick1: gqas009.sbu.lab.eng.bos.redhat.com:/bricks2/e1
Brick2: gqas014.sbu.lab.eng.bos.redhat.com:/bricks2/e1
Brick3: gqas015.sbu.lab.eng.bos.redhat.com:/bricks2/e1
Brick4: gqas009.sbu.lab.eng.bos.redhat.com:/bricks1/e1
Brick5: gqas014.sbu.lab.eng.bos.redhat.com:/bricks1/e1
Brick6: gqas015.sbu.lab.eng.bos.redhat.com:/bricks1/e1
Brick7: gqas009.sbu.lab.eng.bos.redhat.com:/bricks6/A1
Brick8: gqas014.sbu.lab.eng.bos.redhat.com:/bricks6/A1
Brick9: gqas015.sbu.lab.eng.bos.redhat.com:/bricks6/A1
Brick10: gqas009.sbu.lab.eng.bos.redhat.com:/bricks8/A1
Brick11: gqas014.sbu.lab.eng.bos.redhat.com:/bricks8/A1
Brick12: gqas015.sbu.lab.eng.bos.redhat.com:/bricks8/A1
Options Reconfigured:
cluster.lookup-optimize: on
transport.address-family: inet
nfs.disable: on
cluster.brick-multiplex: enable

Volume Name: testvol
Type: Distribute
Volume ID: 2b12b3e7-a167-4538-b55b-9a4e181c622e
Status: Started
Snapshot Count: 0
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: gqas014.sbu.lab.eng.bos.redhat.com:/bricks11/A
Brick2: gqas014.sbu.lab.eng.bos.redhat.com:/bricks5/a
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
cluster.brick-multiplex: enable
[root@gqas009 glusterfs]#

[root@gqas009 glusterfs]# gluster v status
Status of volume: butcher
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick gqas009.sbu.lab.eng.bos.redhat.com:/b
ricks2/e1                                   49153     0          Y       23917
Brick gqas014.sbu.lab.eng.bos.redhat.com:/b
ricks2/e1                                   49153     0          Y       23218
Brick gqas015.sbu.lab.eng.bos.redhat.com:/b
ricks2/e1                                   49153     0          Y       23687
Brick gqas009.sbu.lab.eng.bos.redhat.com:/b
ricks1/e1                                   49153     0          Y       23917
Brick gqas014.sbu.lab.eng.bos.redhat.com:/b
ricks1/e1                                   49153     0          Y       23218
Brick gqas015.sbu.lab.eng.bos.redhat.com:/b
ricks1/e1                                   49153     0          Y       23687
Brick gqas009.sbu.lab.eng.bos.redhat.com:/b
ricks6/A1                                   49153     0          Y       23917
Brick gqas014.sbu.lab.eng.bos.redhat.com:/b
ricks6/A1                                   49153     0          Y       23218
Brick gqas015.sbu.lab.eng.bos.redhat.com:/b
ricks6/A1                                   49153     0          Y       23687
Brick gqas009.sbu.lab.eng.bos.redhat.com:/b
ricks8/A1                                   49153     0          Y       23917
Brick gqas014.sbu.lab.eng.bos.redhat.com:/b
ricks8/A1                                   49153     0          Y       23218
Brick gqas015.sbu.lab.eng.bos.redhat.com:/b
ricks8/A1                                   49153     0          Y       23687
Self-heal Daemon on localhost               N/A       N/A        Y       24098
Self-heal Daemon on gqas011.sbu.lab.eng.bos
.redhat.com                                 N/A       N/A        Y       14859
Self-heal Daemon on gqas014.sbu.lab.eng.bos
.redhat.com                                 N/A       N/A        Y       23367
Self-heal Daemon on gqas015.sbu.lab.eng.bos
.redhat.com                                 N/A       N/A        Y       23828

Task Status of Volume butcher
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : 1314fe4c-0005-476a-b88c-4b52f93ffa62
Status               : failed

Status of volume: testvol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick gqas014.sbu.lab.eng.bos.redhat.com:/b
ricks11/A                                   49153     0          Y       23218
Brick gqas014.sbu.lab.eng.bos.redhat.com:/b
ricks5/a                                    49153     0          Y       23218

Task Status of Volume testvol
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : 2745a8f1-336b-4bf0-baec-efe661a22dde
Status               : completed

[root@gqas009 glusterfs]#
upstream patch : https://review.gluster.org/#/c/17225
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/106137
Looks like we have an issue with this patch, moving this bug to POST.
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/108021/
Verified this BZ on glusterfs version 3.8.4-31.el7rhgs.x86_64. With the steps in the description, the problem is no longer seen, hence moving this bug to Verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2774