Bug 1577051

Summary: [Remove-brick+Rename] Failure count shows zero though there are file migration failures
Product: Red Hat Gluster Storage Reporter: Prasad Desala <tdesala>
Component: distribute Assignee: Susant Kumar Palai <spalai>
Status: CLOSED ERRATA QA Contact: Prasad Desala <tdesala>
Severity: medium Docs Contact:
Priority: unspecified    
Version: rhgs-3.4 CC: amukherj, bkunal, nbalacha, rgowdapp, rhs-bugs, sankarshan, sheggodu, spalai, srmukher, storage-qa-internal, vdas
Target Milestone: ---   
Target Release: RHGS 3.4.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: dht-data-loss
Fixed In Version: glusterfs-3.12.2-13 Doc Type: Bug Fix
Doc Text:
Previously, the remove-brick process did not report a failure when a lookup failed during file migration, so it is recommended to check the decommissioned brick for any left-over files before running "remove-brick commit". With this fix, the remove-brick status correctly reports the failure count.
Story Points: ---
Clone Of:
: 1580269 (view as bug list) Environment:
Last Closed: 2018-09-04 06:48:05 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On:    
Bug Blocks: 1503137, 1580269    

Description Prasad Desala 2018-05-11 05:40:44 UTC
Description of problem:
When remove-brick is started while renames are in progress, remove-brick status shows a failure count of 0 even though there are file migration failures.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1) Create a 4-brick distribute volume and start it.
2) FUSE mount it on multiple clients.
3) Create data on the mount point:
* From one client, create files and directories:
python /home/file_dir_ops.py create_deep_dirs_with_files -d 5 -l 5 -f 50 /mnt/dist
* From another client, create files at the root of the mount point:
for i in {1..5000}; do cat /etc/redhat-release > new_cat_$i; done
4) Once step 3 is complete, rename all files and directories on the mount point:
for i in `ls`; do mv $i $i+1; done
5) While the renames are in progress, remove a brick and wait until remove-brick completes.
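Note that the rename loop in step 4 appends the literal string "+1" to each name (there is no shell arithmetic in `mv $i $i+1`). A minimal local sketch of what the loop does, using a scratch directory instead of the gluster mount:

```shell
# Demonstrates the step-4 rename loop locally; no gluster volume needed.
# "file" becomes "file+1" -- the "+1" is a literal suffix, not arithmetic.
tmpdir=$(mktemp -d)
cd "$tmpdir"
touch a b c
for i in *; do mv "$i" "$i+1"; done   # quoted glob variant of the ls loop
ls                                    # prints: a+1  b+1  c+1
cd / && rm -rf "$tmpdir"
```

The repeated renames are what force the rebalance process to race against moving files, which is what exposed the uncounted migration failures.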

Actual results:
Failure count shows zero though there are file migration failures.

Expected results:
Failure count should reflect the actual file migration failures instead of showing 0.

Comment 9 Susant Kumar Palai 2018-05-21 06:56:08 UTC
Upstream patch: https://review.gluster.org/#/c/20044/

Comment 16 Prasad Desala 2018-07-16 06:35:58 UTC
Verified this BZ on glusterfs version 3.12.2-13.el7rhgs.x86_64.

When remove-brick is started while renames are in progress, remove-brick status now shows the failure count for migration failures.

[root@node1]# gluster v remove-brick dist node1:/bricks/brick8/dist-b8 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost             1053       107.2MB          1209            44             0            completed        0:00:53
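For scripted verification, the failures column can be extracted from the status output. A hedged sketch, assuming the column layout shown above (failures is the 5th whitespace-separated field of a node row); the sample line is inlined here, but in practice you would pipe the gluster command's output in:

```shell
# Extract the failure count for a node from `gluster v remove-brick ... status`
# output. Assumes the layout above: failures is field 5 of the node row.
status_output='localhost             1053       107.2MB          1209            44             0            completed        0:00:53'
failures=$(echo "$status_output" | awk '$1 == "localhost" { print $5 }')
echo "$failures"   # prints: 44
```

A non-zero value here, as in the verification run above, confirms the fix is surfacing migration failures.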

Moving this BZ to Verified.

Comment 17 Srijita Mukherjee 2018-09-03 15:42:09 UTC
Have updated the doc text. Kindly review and confirm.

Comment 18 Susant Kumar Palai 2018-09-04 06:35:50 UTC
(In reply to Srijita Mukherjee from comment #17)
> Have updated the doc text. Kindly review and confirm.


Comment 19 errata-xmlrpc 2018-09-04 06:48:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.