Description of problem:
On a distribute-replicate volume added as a storage domain for the VM image store in RHEV, a remove-brick operation was attempted specifying two replica pairs of bricks. The start failed with the message "Bricks not from same subvol for replica". When a single replica pair of bricks was given, the remove-brick operation started successfully. Details given below.

--------------------------------------------
[root@rhs-client45 ~]# gluster volume info

Volume Name: RHS_VM_imagestore
Type: Distributed-Replicate
Volume ID: 8f97ac5c-3269-46be-a0f7-d33e61bc7128
Status: Started
Number of Bricks: 12 x 2 = 24
Transport-type: tcp
Bricks:
Brick1: rhs-client45.lab.eng.blr.redhat.com:/brick1
Brick2: rhs-client37.lab.eng.blr.redhat.com:/brick1
Brick3: rhs-client15.lab.eng.blr.redhat.com:/brick1
Brick4: rhs-client10.lab.eng.blr.redhat.com:/brick1
Brick5: rhs-client45.lab.eng.blr.redhat.com:/brick2
Brick6: rhs-client37.lab.eng.blr.redhat.com:/brick2
Brick7: rhs-client15.lab.eng.blr.redhat.com:/brick2
Brick8: rhs-client10.lab.eng.blr.redhat.com:/brick2
Brick9: rhs-client45.lab.eng.blr.redhat.com:/brick3
Brick10: rhs-client37.lab.eng.blr.redhat.com:/brick3
Brick11: rhs-client15.lab.eng.blr.redhat.com:/brick3
Brick12: rhs-client10.lab.eng.blr.redhat.com:/brick3
Brick13: rhs-client45.lab.eng.blr.redhat.com:/brick4
Brick14: rhs-client37.lab.eng.blr.redhat.com:/brick4
Brick15: rhs-client15.lab.eng.blr.redhat.com:/brick4
Brick16: rhs-client10.lab.eng.blr.redhat.com:/brick4
Brick17: rhs-client45.lab.eng.blr.redhat.com:/brick5
Brick18: rhs-client37.lab.eng.blr.redhat.com:/brick5
Brick19: rhs-client15.lab.eng.blr.redhat.com:/brick5
Brick20: rhs-client10.lab.eng.blr.redhat.com:/brick5
Brick21: rhs-client45.lab.eng.blr.redhat.com:/brick6
Brick22: rhs-client37.lab.eng.blr.redhat.com:/brick6
Brick23: rhs-client15.lab.eng.blr.redhat.com:/brick6
Brick24: rhs-client10.lab.eng.blr.redhat.com:/brick6
Options Reconfigured:
storage.owner-gid: 36
storage.owner-uid: 36
network.remote-dio: on
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off

Volume Name: RHS_extra
Type: Distributed-Replicate
Volume ID: bc9d2c6e-0c5e-4beb-a43d-82c9e0490da3
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: rhs-client45.lab.eng.blr.redhat.com:/test
Brick2: rhs-client37.lab.eng.blr.redhat.com:/test
Brick3: rhs-client15.lab.eng.blr.redhat.com:/test
Brick4: rhs-client10.lab.eng.blr.redhat.com:/test
Options Reconfigured:
storage.owner-gid: 36
storage.owner-uid: 36
network.remote-dio: on
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off

[root@rhs-client45 ~]# gluster volume remove-brick RHS_VM_imagestore rhs-client45.lab.eng.blr.redhat.com:/brick6 rhs-client37.lab.eng.blr.redhat.com:/brick6 rhs-client15.lab.eng.blr.redhat.com:/brick6 rhs-client10.lab.eng.blr.redhat.com:/brick6 start
Bricks not from same subvol for replica

[root@rhs-client45 ~]# gluster volume remove-brick RHS_VM_imagestore rhs-client45.lab.eng.blr.redhat.com:/brick1 rhs-client37.lab.eng.blr.redhat.com:/brick1 rhs-client15.lab.eng.blr.redhat.com:/brick1 rhs-client10.lab.eng.blr.redhat.com:/brick1 start
Bricks not from same subvol for replica

[root@rhs-client45 ~]# gluster volume remove-brick RHS_VM_imagestore rhs-client45.lab.eng.blr.redhat.com:/brick6 rhs-client37.lab.eng.blr.redhat.com:/brick6 start
Remove Brick start successful

[root@rhs-client45 ~]# gluster volume remove-brick RHS_VM_imagestore rhs-client45.lab.eng.blr.redhat.com:/brick6 rhs-client37.lab.eng.blr.redhat.com:/brick6 status
                                Node   Rebalanced-files        size     scanned    failures        status
                           ---------        -----------  ----------  ----------  ----------  ------------
                           localhost                  1           0           9           0   in progress
 rhs-client37.lab.eng.blr.redhat.com                  0           0          31           0     completed
 rhs-client15.lab.eng.blr.redhat.com                  0           0           0           0   not started
 rhs-client10.lab.eng.blr.redhat.com                  0           0           0           0   not started
--------------------------------------------

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. On a distribute-replicate volume used for the RHEV VM image store, attempt a remove-brick operation specifying two replica pairs of bricks.
2. The operation does not start.
3. Retry the remove-brick operation with one replica pair; it starts.

Actual results:
The remove-brick operation starts only when a single replica pair of bricks is specified.

Expected results:
The remove-brick operation must start successfully with any number of replica pairs of bricks given in one command.

Additional info:
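For context, the following is a minimal sketch, not GlusterFS source, of how a replica-2 distribute-replicate volume groups its ordered brick list into replica subvolumes: each consecutive group of replica-count bricks forms one subvolume. The short host names are placeholders. A remove-brick set naming the same brick on all four hosts therefore spans two subvolumes, which is exactly the set the affected build rejected.

```shell
# Illustrative sketch only, not GlusterFS code; host names are hypothetical.
# Consecutive groups of $replica bricks in the volume's ordered brick list
# form the replica subvolumes.
replica=2
i=0
for brick in hostA:/brick6 hostB:/brick6 hostC:/brick6 hostD:/brick6; do
  subvol=$((i / replica))
  echo "$brick -> replica-subvol-$subvol"
  i=$((i + 1))
done
```

The four bricks map to two different subvolumes (0 and 1), so a valid multi-pair removal has to be accepted as a set of whole subvolumes rather than rejected outright.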
Not specific to rhev-rhs, hence updating summary.
Issue reproducible on glusterfs-server-3.4.0.8rhs-1.el6rhs.x86_64
------------------------------------------------------------------
[Thu May 16 17:07:06 root@rhs-client45:~ ] #gluster volume info RHEV-BigBend_extra

Volume Name: RHEV-BigBend_extra
Type: Distributed-Replicate
Volume ID: 02a49c67-dcfd-42ef-98ee-d6690a61617b
Status: Started
Number of Bricks: 10 x 2 = 20
Transport-type: tcp
Bricks:
Brick1: rhs-client45.lab.eng.blr.redhat.com:/rhs/brick4/RHEV-BigBend_extra
Brick2: rhs-client37.lab.eng.blr.redhat.com:/rhs/brick4/RHEV-BigBend_extra
Brick3: rhs-client15.lab.eng.blr.redhat.com:/rhs/brick4/RHEV-BigBend_extra
Brick4: rhs-client4.lab.eng.blr.redhat.com:/rhs/brick4/RHEV-BigBend_extra
Brick5: rhs-client45.lab.eng.blr.redhat.com:/rhs/brick5/RHEV-BigBend_extra
Brick6: rhs-client37.lab.eng.blr.redhat.com:/rhs/brick5/RHEV-BigBend_extra
Brick7: rhs-client15.lab.eng.blr.redhat.com:/rhs/brick5/RHEV-BigBend_extra
Brick8: rhs-client4.lab.eng.blr.redhat.com:/rhs/brick5/RHEV-BigBend_extra
Brick9: rhs-client45.lab.eng.blr.redhat.com:/rhs/brick6/RHEV-BigBend_extra
Brick10: rhs-client37.lab.eng.blr.redhat.com:/rhs/brick6/RHEV-BigBend_extra
Brick11: rhs-client15.lab.eng.blr.redhat.com:/rhs/brick6/RHEV-BigBend_extra
Brick12: rhs-client4.lab.eng.blr.redhat.com:/rhs/brick6/RHEV-BigBend_extra
Brick13: rhs-client45.lab.eng.blr.redhat.com:/rhs/brick8/RHEV-BigBend_extra
Brick14: rhs-client37.lab.eng.blr.redhat.com:/rhs/brick8/RHEV-BigBend_extra
Brick15: rhs-client15.lab.eng.blr.redhat.com:/rhs/brick8/RHEV-BigBend_extra
Brick16: rhs-client4.lab.eng.blr.redhat.com:/rhs/brick8/RHEV-BigBend_extra
Brick17: rhs-client45.lab.eng.blr.redhat.com:/rhs/brick9/RHEV-BigBend_extra
Brick18: rhs-client37.lab.eng.blr.redhat.com:/rhs/brick9/RHEV-BigBend_extra
Brick19: rhs-client15.lab.eng.blr.redhat.com:/rhs/brick9/RHEV-BigBend_extra
Brick20: rhs-client4.lab.eng.blr.redhat.com:/rhs/brick9/RHEV-BigBend_extra
Options Reconfigured:
storage.owner-gid: 36
storage.owner-uid: 36
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off

[Thu May 16 17:07:17 root@rhs-client45:~ ] #gluster volume remove-brick RHEV-BigBend_extra rhs-client45.lab.eng.blr.redhat.com:/rhs/brick9/RHEV-BigBend_extra rhs-client37.lab.eng.blr.redhat.com:/rhs/brick9/RHEV-BigBend_extra rhs-client15.lab.eng.blr.redhat.com:/rhs/brick9/RHEV-BigBend_extra rhs-client4.lab.eng.blr.redhat.com:/rhs/brick9/RHEV-BigBend_extra start
volume remove-brick start: failed: Bricks not from same subvol for replica

[Thu May 16 17:08:23 root@rhs-client45:~ ] #gluster volume remove-brick RHEV-BigBend_extra rhs-client45.lab.eng.blr.redhat.com:/rhs/brick9/RHEV-BigBend_extra rhs-client37.lab.eng.blr.redhat.com:/rhs/brick9/RHEV-BigBend_extra start
volume remove-brick start: success
ID: ec497c4e-4b5d-4761-990e-a6a080797116

[Thu May 16 17:10:27 root@rhs-client45:~ ] #
------------------------------------------------------------------

This behaviour is inconsistent with that of add-brick, wherein multiple replica pairs may be added together. The above bricks were previously added with the command given below.
------------------------------------------------------------------
gluster volume add-brick RHEV-BigBend_extra rhs-client45.lab.eng.blr.redhat.com:/rhs/brick9/RHEV-BigBend_extra rhs-client37.lab.eng.blr.redhat.com:/rhs/brick9/RHEV-BigBend_extra rhs-client15.lab.eng.blr.redhat.com:/rhs/brick9/RHEV-BigBend_extra rhs-client4.lab.eng.blr.redhat.com:/rhs/brick9/RHEV-BigBend_extra
------------------------------------------------------------------
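Until the fix lands, the observations above suggest a workaround: issue one remove-brick start per replica pair instead of one command naming all pairs. A hedged sketch follows; the volume name, host names, and brick paths are placeholders, and "echo" is used so the commands are printed rather than executed.

```shell
# Workaround sketch for affected builds: start remove-brick one replica
# pair at a time. All names below are placeholders; drop "echo" to run
# the commands for real.
VOL=myvolume
set -- "hostA:/rhs/brick9/X hostB:/rhs/brick9/X" \
       "hostC:/rhs/brick8/X hostD:/rhs/brick8/X"
for pair in "$@"; do
  # $pair is intentionally unquoted: it holds two space-separated bricks
  echo gluster volume remove-brick "$VOL" $pair start
done
```

Note that each pair started separately triggers its own rebalance, so pairs should be drained and committed one at a time rather than in parallel.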
Patch Review URL: https://code.engineering.redhat.com/gerrit/#/c/11584/
Can you please review the doc text for technical accuracy?
Doc text looks good to me.
Verified on 3.4.0.57rhs-1.el6rhs.x86_64. Now multiple bricks can be removed:

[root@rhs-client4 mnt]# gluster v remove-brick another rhs-client39.lab.eng.blr.redhat.com:/home/another11 rhs-client9.lab.eng.blr.redhat.com:/home/another10 rhs-client4.lab.eng.blr.redhat.com:/home/another9 rhs-client39.lab.eng.blr.redhat.com:/home/another8 rhs-client9.lab.eng.blr.redhat.com:/home/another7 rhs-client4.lab.eng.blr.redhat.com:/home/another6 start
volume remove-brick start: success
ID: c5a8ebd3-62e6-4ab1-aa57-b0be509a9d55

[root@rhs-client4 mnt]# gluster v remove-brick another rhs-client39.lab.eng.blr.redhat.com:/home/another11 rhs-client9.lab.eng.blr.redhat.com:/home/another10 rhs-client4.lab.eng.blr.redhat.com:/home/another9 rhs-client39.lab.eng.blr.redhat.com:/home/another8 rhs-client9.lab.eng.blr.redhat.com:/home/another7 rhs-client4.lab.eng.blr.redhat.com:/home/another6 status
                               Node   Rebalanced-files        size     scanned    failures     skipped        status   run time in secs
                          ---------        -----------  ----------  ----------  ----------  ----------  ------------   ----------------
                          localhost                 30      15.0MB         129           0           0   in progress               6.00
rhs-client39.lab.eng.blr.redhat.com                 28      13.5MB         121           0           0   in progress               6.00
 rhs-client9.lab.eng.blr.redhat.com                 23      11.5MB         131           0           0   in progress               6.00
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2014-0208.html