Bug 1573227 - [Remove-brick] Files are not migrated when they are renamed during a remove-brick operation
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: distribute
Version: rhgs-3.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Susant Kumar Palai
QA Contact: Sayalee
Reported: 2018-04-30 14:24 UTC by Prasad Desala
Modified: 2020-01-06 09:19 UTC

Last Closed: 2020-01-06 09:19:40 UTC

Description Prasad Desala 2018-04-30 14:24:03 UTC
Description of problem:
=======================
On a distribute volume, many files were not migrated from the decommissioned bricks.

Version-Release number of selected component (if applicable):
3.12.2-8.el7rhgs.x86_64

How reproducible:
=================
2/2

Steps to Reproduce:
====================
1) Create a 4 brick distribute volume and start it.
2) FUSE mount it on multiple clients.
3) Populate the volume from two clients:
* From one client, create files and directories on the mount point:
python /home/file_dir_ops.py create_deep_dirs_with_files -d 5 -l 5 -f 50 /mnt/dist
* From another client, create files in the root of the mount point:
for i in {1..5000};do cat /etc/redhat-release > new_cat_$i;done
4) Once step-3 is completed, start renaming all files and directories on the mount point
for i in `ls`; do mv $i $i+1;done
5) Once rename completes, remove a brick and wait till remove-brick completes.
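
Steps 4 and 5 depend on the renames having fully finished before the remove-brick starts. The following is a minimal sketch, not part of the original report, of one way to make that ordering checkable. The function names `rename_all` and `count_unrenamed` are hypothetical. Unlike the loop in step 4, this version uses a glob with quoting instead of `ls`, and it skips entries that already end in `+1`.

```shell
# Rename every top-level entry in a directory by appending the literal
# suffix "+1" (the effect of `mv $i $i+1` in step 4), with quoting so that
# names containing whitespace are handled.
rename_all() {
    local dir=$1 entry
    for entry in "$dir"/*; do
        [ -e "$entry" ] || continue      # empty directory: the glob did not match
        case $entry in
            *+1) continue ;;             # already renamed, skip (differs from step 4)
        esac
        mv -- "$entry" "$entry+1"
    done
}

# Print how many top-level entries still lack the "+1" suffix.
count_unrenamed() {
    local dir=$1 entry n=0
    for entry in "$dir"/*; do
        [ -e "$entry" ] || continue
        case $entry in
            *+1) ;;
            *) n=$((n + 1)) ;;
        esac
    done
    echo "$n"
}
```

Under these assumptions, `rename_all /mnt/dist` would replace the loop in step 4, and the remove-brick (for example `gluster volume remove-brick <VOLNAME> <BRICK> start`) would be issued only once `count_unrenamed /mnt/dist` prints 0. Like the original loop, the check covers only entries directly under the mount point.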

Actual results:
===============
There are no file migration failures, but many files were not migrated from the decommissioned bricks.

Expected results:
=================
All files from the decommissioned bricks should get migrated successfully.
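
One way to check this expectation is to list what is still present on the decommissioned brick once remove-brick reports completion. The sketch below is an illustration only: `leftover_files` is a hypothetical helper, the brick path is whatever was passed to remove-brick, and the listing skips the brick's internal `.glusterfs` directory. It does not distinguish real data files from DHT link files, so any output would still need to be inspected by hand.

```shell
# List regular files remaining on a brick, skipping Gluster's internal
# .glusterfs tree. The argument is the brick's backend directory.
leftover_files() {
    local brick=${1%/}                   # drop one trailing slash so -path matches
    find "$brick" -path "$brick/.glusterfs" -prune -o -type f -print
}
```

Run on the storage node that hosts the removed brick, an empty listing would be consistent with the expected result above.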

Comment 4 Nithya Balachandran 2018-05-02 06:02:25 UTC
(In reply to Prasad Desala from comment #0)
> 4) Once step-3 is completed, start renaming all files and directories on the
> mount point
> for i in `ls`; do mv $i $i+1;done
> 5) Once rename completes, remove a brick and wait till remove-brick
> completes.
> 


It looks like the remove-brick was performed _before_ the renames completed - the rebalance logs show error messages with the older names:


[2018-04-30 14:13:40.901213] E [MSGID: 109023] [dht-rebalance.c:2658:gf_defrag_migrate_single_file] 0-dist-dht: Migrate file failed: /new_cat_28 lookup failed [No such file or directory]
[2018-04-30 14:13:43.883442] E [MSGID: 109023] [dht-rebalance.c:2658:gf_defrag_migrate_single_file] 0-dist-dht: Migrate file failed: /new_cat_239 lookup failed [No such file or directory]
[2018-04-30 14:13:43.899613] E [MSGID: 109023] [dht-rebalance.c:2658:gf_defrag_migrate_single_file] 0-dist-dht: Migrate file failed: /new_cat_240 lookup failed [No such file or directory]
[2018-04-30 14:13:43.900365] E [MSGID: 109023] [dht-rebalance.c:2658:gf_defrag_migrate_single_file] 0-dist-dht: Migrate file failed: /new_cat_241 lookup failed [No such file or directory]
[2018-04-30 14:13:43.907045] E [MSGID: 109023] [dht-rebalance.c:2658:gf_defrag_migrate_single_file] 0-dist-dht: Migrate file failed: /new_cat_244 lookup failed [No such file or directory]
[2018-04-30 14:13:43.908748] E [MSGID: 109023] [dht-rebalance.c:2658:gf_defrag_migrate_single_file] 0-dist-dht: Migrate file failed: /new_cat_245 lookup failed [No such file or directory]
[2018-04-30 14:13:43.915455] E [MSGID: 109023] [dht-rebalance.c:2658:gf_defrag_migrate_single_file] 0-dist-dht: Migrate file failed: /new_cat_246 lookup failed [No such file or directory]
[2018-04-30 14:13:43.916884] E [MSGID: 109023] [dht-rebalance.c:2658:gf_defrag_migrate_single_file] 0-dist-dht: Migrate file failed: /new_cat_254 lookup failed [No such file or directory]
[2018-04-30 14:13:43.931054] E [MSGID: 109023] [dht-rebalance.c:2658:gf_defrag_migrate_single_file] 0-dist-dht: Migrate file failed: /new_cat_255 lookup failed [No such file or directory]
[2018-04-30 14:13:43.931458] E [MSGID: 109023] [dht-rebalance.c:2658:gf_defrag_migrate_single_file] 0-dist-dht: Migrate file failed: /new_cat_258 lookup failed [No such file or directory]
[2018-04-30 14:13:43.937588] E [MSGID: 109023] [dht-rebalance.c:2658:gf_defrag_migrate_single_file] 0-dist-dht: Migrate file failed: /new_cat_260 lookup failed [No such file or directory]
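
The check described here (failed lookups that refer to pre-rename names) can be sketched as a small script. This is an illustration under assumptions, not a tool from the report: `failed_lookup_paths` and `renamed_since_scan` are hypothetical names, the log file is passed in explicitly (the rebalance log is typically found under /var/log/glusterfs/ on the node that ran the rebalance, but that location is an assumption here), and the second function is only meaningful for entries directly under the mount point, such as the new_cat_N files.

```shell
# Print the path from every "Migrate file failed: <path> lookup failed" line
# in the given log file.
failed_lookup_paths() {
    sed -n 's/.*Migrate file failed: \(.*\) lookup failed \[.*/\1/p' "$1"
}

# Read paths on stdin; print each one whose old name is gone from the mount
# point while the renamed form (<path>+1) exists.
renamed_since_scan() {
    local mount=$1 path
    while IFS= read -r path; do
        if [ ! -e "$mount$path" ] && [ -e "$mount$path+1" ]; then
            printf '%s\n' "$path"
        fi
    done
}
```

For example, `failed_lookup_paths <rebalance-log> | renamed_since_scan /mnt/dist` would print the failed paths that now exist only under their new names, which is what one would expect to see if the renames were still running when the remove-brick started.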



Are you sure the renames had completed before the remove-brick?

