Bug 1030932 - [rfe] DHT: Remove-brick- Data is migrating even from non-decommissioned bricks
Status: CLOSED NOTABUG
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: distribute
Version: 2.1
Hardware: x86_64 Linux
Priority: unspecified
Severity: high
Assigned To: Nithya Balachandran
storage-qa-internal@redhat.com
Keywords: FutureFeature
Reported: 2013-11-15 06:10 EST by shylesh
Modified: 2015-11-30 04:54 EST (History)
4 users

Doc Type: Enhancement
Last Closed: 2015-11-30 04:54:57 EST
Type: Bug


Attachments: None
Description shylesh 2013-11-15 06:10:46 EST
Description of problem:
 While decommissioning bricks, data is also migrated from non-decommissioned bricks, which can sometimes lead to data loss.

Version-Release number of selected component (if applicable):
3.4.0.44rhs-1.el6rhs.x86_64

How reproducible:
Not always

Steps to Reproduce:
1. On a distributed-replicate volume in an 11x2 configuration, remove a pair of bricks using remove-brick start.

2. Observe that data is also migrated from the non-decommissioned bricks.
 

More info
----------
Volume Name: dist-rep   
Type: Distributed-Replicate
Volume ID: f93775df-84c4-4c3a-8883-185e94acafe4
Status: Started
Number of Bricks: 11 x 2 = 22
Transport-type: tcp
Bricks:
Brick1: rhs-client4.lab.eng.blr.redhat.com:/home/dist-rep0
Brick2: rhs-client9.lab.eng.blr.redhat.com:/home/dist-rep1
Brick3: rhs-client39.lab.eng.blr.redhat.com:/home/dist-rep2
Brick4: rhs-client4.lab.eng.blr.redhat.com:/home/dist-rep3
Brick5: rhs-client9.lab.eng.blr.redhat.com:/home/dist-rep4
Brick6: rhs-client39.lab.eng.blr.redhat.com:/home/dist-rep5
Brick7: rhs-client4.lab.eng.blr.redhat.com:/home/dist-rep6
Brick8: rhs-client9.lab.eng.blr.redhat.com:/home/dist-rep7
Brick9: rhs-client39.lab.eng.blr.redhat.com:/home/dist-rep8
Brick10: rhs-client4.lab.eng.blr.redhat.com:/home/dist-rep9
Brick11: rhs-client9.lab.eng.blr.redhat.com:/home/dist-rep10 ----> decommissioned pair (dist-rep-replicate-5)
Brick12: rhs-client39.lab.eng.blr.redhat.com:/home/dist-rep11 ----> decommissioned pair (dist-rep-replicate-5)
Brick13: rhs-client4.lab.eng.blr.redhat.com:/home/dist-rep12
Brick14: rhs-client9.lab.eng.blr.redhat.com:/home/dist-rep13
Brick15: rhs-client4.lab.eng.blr.redhat.com:/home/dist-rep14
Brick16: rhs-client9.lab.eng.blr.redhat.com:/home/dist-rep15
Brick17: rhs-client39.lab.eng.blr.redhat.com:/home/dist-rep16
Brick18: rhs-client4.lab.eng.blr.redhat.com:/home/dist-rep17
Brick19: rhs-client39.lab.eng.blr.redhat.com:/home/dist-rep18
Brick20: rhs-client4.lab.eng.blr.redhat.com:/home/dist-rep19
Brick21: rhs-client39.lab.eng.blr.redhat.com:/home/dist-rep20
Brick22: rhs-client4.lab.eng.blr.redhat.com:/home/dist-rep21
Options Reconfigured:
features.quota: off


command
--------
[root@rhs-client4 mnt]# gluster v remove-brick dist-rep rhs-client9.lab.eng.blr.redhat.com:/home/dist-rep10 rhs-client39.lab.eng.blr.redhat.com:/home/dist-rep11 status
                                    Node Rebalanced-files          size       scanned      failures       skipped         status run-time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------   ------------   --------------
                               localhost                0        0Bytes             0             0             0    not started             0.00
      rhs-client9.lab.eng.blr.redhat.com             1518       759.0MB          9759             0             0      completed           404.00
     rhs-client39.lab.eng.blr.redhat.com              961       480.5MB          9330             0             0      completed           386.00


Looking at the rebalance logs from node rhs-client39.lab.eng.blr.redhat.com:
----------------------------



[2013-11-15 09:25:23.281339] I [dht-rebalance.c:672:dht_migrate_file] 0-dist-rep-dht: /5/5/4/1/file.0: attempting to move from dist-rep-replicate-10 to dist-rep-replicate-1

[2013-11-15 09:25:24.399435] I [dht-rebalance.c:881:dht_migrate_file] 0-dist-rep-dht: completed migration of /5/5/4/5/file.0 from subvolume dist-rep-replicate-1 to dist-rep-replicate-0

[2013-11-15 09:25:25.252144] I [dht-rebalance.c:881:dht_migrate_file] 0-dist-rep-dht: completed migration of /5/5/5/2/file.0 from subvolume dist-rep-replicate-10 to dist-rep-replicate-1
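Migrations like the ones above can be flagged programmatically by filtering the rebalance log for source subvolumes outside the decommissioned pair. The following is a minimal sketch, not a GlusterFS tool; the regex and the DECOMMISSIONED set are assumptions based on the log lines shown above (replicate-5 being the decommissioned pair in this test):

```python
import re

# Sample lines from the rebalance log above (line wraps rejoined).
LOG = """\
[2013-11-15 09:25:23.281339] I [dht-rebalance.c:672:dht_migrate_file] 0-dist-rep-dht: /5/5/4/1/file.0: attempting to move from dist-rep-replicate-10 to dist-rep-replicate-1
[2013-11-15 09:25:24.399435] I [dht-rebalance.c:881:dht_migrate_file] 0-dist-rep-dht: completed migration of /5/5/4/5/file.0 from subvolume dist-rep-replicate-1 to dist-rep-replicate-0
[2013-11-15 09:25:25.252144] I [dht-rebalance.c:881:dht_migrate_file] 0-dist-rep-dht: completed migration of /5/5/5/2/file.0 from subvolume dist-rep-replicate-10 to dist-rep-replicate-1
"""

DECOMMISSIONED = {"dist-rep-replicate-5"}  # the pair removed in this test

# Match both "attempting to move from X to Y" and
# "completed migration of PATH from subvolume X to Y".
PATTERN = re.compile(r"(?:move|migration of \S+) from (?:subvolume )?(\S+) to (\S+)")

def unexpected_moves(log):
    """Return (src, dst) pairs whose source subvolume was not decommissioned."""
    moves = []
    for line in log.splitlines():
        m = PATTERN.search(line)
        if m and m.group(1) not in DECOMMISSIONED:
            moves.append((m.group(1), m.group(2)))
    return moves

for src, dst in unexpected_moves(LOG):
    print(f"migrated from non-decommissioned {src} -> {dst}")
```

On the three lines above, every migration source is a non-decommissioned subvolume (replicate-10 and replicate-1), which is the behaviour this bug reports.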


Cluster info
------------
rhs-client9.lab.eng.blr.redhat.com
rhs-client39.lab.eng.blr.redhat.com
rhs-client4.lab.eng.blr.redhat.com


Mounted on 
----------
rhs-client4.lab.eng.blr.redhat.com:/mnt


Attached the sosreports.
Comment 3 Amar Tumballi 2013-12-02 05:02:48 EST
Because the layout changes for existing directories after remove-brick, data does get migrated even from the non-decommissioned bricks.

If this is not the expected behaviour, then the way we handle remove-brick should change.
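The effect described in this comment can be illustrated with a toy model of hash-range layouts. This is not GlusterFS code (real DHT layouts are stored per directory in extended attributes and need not be evenly divided); it only shows why re-dividing the hash space over fewer subvolumes moves range boundaries between the remaining subvolumes too:

```python
# Toy model of DHT-style layout recalculation (not actual GlusterFS code).
# Each subvolume owns a contiguous slice of the 32-bit hash space; removing
# one subvolume and re-dividing the space shifts every remaining boundary,
# so a file's hash can land on a different *remaining* subvolume than before.

HASH_SPACE = 2**32

def layout(subvols):
    """Divide the hash space evenly among subvols; return {name: (start, end)}."""
    n = len(subvols)
    step = HASH_SPACE // n
    ranges = {}
    for i, name in enumerate(subvols):
        start = i * step
        end = HASH_SPACE - 1 if i == n - 1 else (i + 1) * step - 1
        ranges[name] = (start, end)
    return ranges

def owner(ranges, h):
    """Return the subvolume whose range contains hash h."""
    for name, (start, end) in ranges.items():
        if start <= h <= end:
            return name
    raise ValueError(h)

subvols = [f"replicate-{i}" for i in range(11)]          # 11x2 volume
before = layout(subvols)
after = layout([s for s in subvols if s != "replicate-5"])  # decommission one

# A hash near an old range boundary changes owner between two *remaining*
# subvolumes -- exactly the migration seen in the rebalance logs.
h = HASH_SPACE // 11 * 3 + 1000
print(owner(before, h), "->", owner(after, h))  # prints: replicate-3 -> replicate-2
```

Neither the old nor the new owner of the example hash is the decommissioned subvolume, which matches the log lines showing files moving from replicate-10 and replicate-1 while only replicate-5 was removed.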
