Bug 1030932 - [rfe] DHT: Remove-brick- Data is migrating even from non-decommissioned bricks
Summary: [rfe] DHT: Remove-brick- Data is migrating even from non-decommissioned bricks
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: distribute
Version: 2.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Nithya Balachandran
QA Contact: storage-qa-internal@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-11-15 11:10 UTC by shylesh
Modified: 2015-11-30 09:54 UTC (History)
4 users

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-11-30 09:54:57 UTC
Embargoed:


Attachments

Description shylesh 2013-11-15 11:10:46 UTC
Description of problem:
While decommissioning bricks, data is also migrated from the non-decommissioned bricks, which can sometimes lead to data loss.

Version-Release number of selected component (if applicable):
3.4.0.44rhs-1.el6rhs.x86_64

How reproducible:
Not always

Steps to Reproduce:
1. From a distributed-replicate volume in an 11x2 configuration, remove a replica pair of bricks using remove-brick start (see the command sketch below).

2. Data is also migrated from the non-decommissioned bricks.
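
A minimal command sketch of the above, assuming the dist-rep volume and the decommissioned brick pair shown in the volume info below; the remove-brick start/status syntax is the standard gluster CLI, but the exact output will vary:

# Start draining the pair being decommissioned.
gluster volume remove-brick dist-rep \
    rhs-client9.lab.eng.blr.redhat.com:/home/dist-rep10 \
    rhs-client39.lab.eng.blr.redhat.com:/home/dist-rep11 start

# Watch migration progress; once completed, the removal would normally be
# finalised with 'commit'.
gluster volume remove-brick dist-rep \
    rhs-client9.lab.eng.blr.redhat.com:/home/dist-rep10 \
    rhs-client39.lab.eng.blr.redhat.com:/home/dist-rep11 status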
 

More info
----------
Volume Name: dist-rep   
Type: Distributed-Replicate
Volume ID: f93775df-84c4-4c3a-8883-185e94acafe4
Status: Started
Number of Bricks: 11 x 2 = 22
Transport-type: tcp
Bricks:
Brick1: rhs-client4.lab.eng.blr.redhat.com:/home/dist-rep0
Brick2: rhs-client9.lab.eng.blr.redhat.com:/home/dist-rep1
Brick3: rhs-client39.lab.eng.blr.redhat.com:/home/dist-rep2
Brick4: rhs-client4.lab.eng.blr.redhat.com:/home/dist-rep3
Brick5: rhs-client9.lab.eng.blr.redhat.com:/home/dist-rep4
Brick6: rhs-client39.lab.eng.blr.redhat.com:/home/dist-rep5
Brick7: rhs-client4.lab.eng.blr.redhat.com:/home/dist-rep6
Brick8: rhs-client9.lab.eng.blr.redhat.com:/home/dist-rep7
Brick9: rhs-client39.lab.eng.blr.redhat.com:/home/dist-rep8
Brick10: rhs-client4.lab.eng.blr.redhat.com:/home/dist-rep9
Brick11: rhs-client9.lab.eng.blr.redhat.com:/home/dist-rep10 ----> decommissioned pair (dist-rep-replicate-5)
Brick12: rhs-client39.lab.eng.blr.redhat.com:/home/dist-rep11 ----> decommissioned pair (dist-rep-replicate-5)
Brick13: rhs-client4.lab.eng.blr.redhat.com:/home/dist-rep12
Brick14: rhs-client9.lab.eng.blr.redhat.com:/home/dist-rep13
Brick15: rhs-client4.lab.eng.blr.redhat.com:/home/dist-rep14
Brick16: rhs-client9.lab.eng.blr.redhat.com:/home/dist-rep15
Brick17: rhs-client39.lab.eng.blr.redhat.com:/home/dist-rep16
Brick18: rhs-client4.lab.eng.blr.redhat.com:/home/dist-rep17
Brick19: rhs-client39.lab.eng.blr.redhat.com:/home/dist-rep18
Brick20: rhs-client4.lab.eng.blr.redhat.com:/home/dist-rep19
Brick21: rhs-client39.lab.eng.blr.redhat.com:/home/dist-rep20
Brick22: rhs-client4.lab.eng.blr.redhat.com:/home/dist-rep21
Options Reconfigured:
features.quota: off


command
--------
[root@rhs-client4 mnt]# gluster v remove-brick dist-rep rhs-client9.lab.eng.blr.redhat.com:/home/dist-rep10 rhs-client39.lab.eng.blr.redhat.com:/home/dist-rep11 status
                                    Node Rebalanced-files          size       scanned      failures       skipped         status run-time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------   ------------   --------------
                               localhost                0        0Bytes             0             0             0    not started             0.00
      rhs-client9.lab.eng.blr.redhat.com             1518       759.0MB          9759             0             0      completed           404.00
     rhs-client39.lab.eng.blr.redhat.com              961       480.5MB          9330             0             0      completed           386.00


looking at the rebalance logs from node rhs-client39.lab.eng.blr.redhat.com
----------------------------



[2013-11-15 09:25:23.281339] I [dht-rebalance.c:672:dht_migrate_file] 0-dist-rep-dht: /5/5/4/1/file.0: attempting to move from dist-rep-replicate-10 to dist-rep-replicate-1

[2013-11-15 09:25:24.399435] I [dht-rebalance.c:881:dht_migrate_file] 0-dist-rep-dht: completed migration of /5/5/4/5/file.0 from subvolume dist-rep-replicate-1 to dist-rep-replicate-0

[2013-11-15 09:25:25.252144] I [dht-rebalance.c:881:dht_migrate_file] 0-dist-rep-dht: completed migration of /5/5/5/2/file.0 from subvolume dist-rep-replicate-10 to dist-rep-replicate-1
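
One quick way to confirm that files were pulled from subvolumes other than the decommissioned one, assuming the rebalance log lives at the usual /var/log/glusterfs/dist-rep-rebalance.log on each node (adjust the path if this setup logs elsewhere):

# Count migrations per source subvolume; if only the decommissioned pair
# (dist-rep-replicate-5 per the volume info above) were drained, it should
# be the only source listed.
grep 'completed migration' /var/log/glusterfs/dist-rep-rebalance.log \
    | grep -o 'from subvolume dist-rep-replicate-[0-9]*' \
    | sort | uniq -c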


Cluster info
------------
rhs-client9.lab.eng.blr.redhat.com
rhs-client39.lab.eng.blr.redhat.com
rhs-client4.lab.eng.blr.redhat.com


Mounted on 
----------
rhs-client4.lab.eng.blr.redhat.com:/mnt


Sosreports are attached.

Comment 3 Amar Tumballi 2013-12-02 10:02:48 UTC
Because the layout of existing directories is recalculated after remove-brick, data does get migrated even from the non-decommissioned bricks.

If this is not the expected behaviour, then the way we handle remove-brick should change.
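
A minimal way to observe the layout change described above, assuming getfattr is available on the brick nodes and using a directory path taken from the rebalance log; the trusted.glusterfs.dht xattr holds the hash range a directory owns on a given brick, and comparing its value before and after remove-brick start shows ranges being rewritten even on bricks that are not being decommissioned:

# Run on a brick node before and after 'remove-brick start' and compare;
# the hex value encodes the DHT hash range assigned to this directory on
# this brick.
getfattr -n trusted.glusterfs.dht -e hex /home/dist-rep0/5/5/4/1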

