Bug 1577051 - [Remove-brick+Rename] Failure count shows zero though there are file migration failures
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: distribute
Version: rhgs-3.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: RHGS 3.4.0
Assignee: Susant Kumar Palai
QA Contact: Prasad Desala
URL:
Whiteboard: dht-data-loss
Depends On:
Blocks: 1503137 1580269
 
Reported: 2018-05-11 05:40 UTC by Prasad Desala
Modified: 2018-09-17 07:27 UTC (History)

Fixed In Version: glusterfs-3.12.2-13
Doc Type: Bug Fix
Doc Text:
Previously, the remove-brick process did not report a failure when a lookup failed. It is recommended to check the decommissioned brick for any leftover files before running "remove-brick commit". With this fix, the remove-brick status shows the failure count.
Clone Of:
Clones: 1580269
Environment:
Last Closed: 2018-09-04 06:48:05 UTC


Links
System                  ID              Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHSA-2018:2607  None      None    None     2018-09-04 06:49:57 UTC

Description Prasad Desala 2018-05-11 05:40:44 UTC
Description of problem:
=======================
When remove-brick is started while renames are in progress, remove-brick status shows the failure count as 0 even though there are file migration failures.

Version-Release number of selected component (if applicable):
3.12.2-9.el7rhgs.x86_64

How reproducible:
always

Steps to Reproduce:
====================
1) Create a 4 brick distribute volume and start it.
2) FUSE mount it on multiple clients.
3) Create data on the mount point from two clients:
* From one client, start creating files and directories on the mount point:
python /home/file_dir_ops.py create_deep_dirs_with_files -d 5 -l 5 -f 50 /mnt/dist
* From another client, create files in the root of the mount point:
for i in {1..5000};do cat /etc/redhat-release > new_cat_$i;done
4) Once step 3 has completed, start renaming all files and directories on the mount point:
for i in `ls`; do mv $i $i+1;done
5) While the renames are in progress, remove a brick and wait until remove-brick completes.
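
Editorial sketch of steps 4 and 5, not the reporter's exact procedure. The rename loop in step 4 appends the literal suffix "+1" to each name and relies on word splitting of `ls` output, so names containing whitespace would break it. The hypothetical helper `rename_all` below does the same rename with quoting; its name and its directory argument are assumptions made for illustration.

```shell
# Hypothetical helper: rename every entry directly under a directory by
# appending the literal suffix "+1", as the loop in step 4 does.
rename_all() {
    local dir="$1" entry
    # The glob is expanded once, before the loop runs, so entries that
    # have already been renamed are not visited a second time.
    for entry in "$dir"/*; do
        [ -e "$entry" ] || continue   # empty directory: the glob stays literal
        mv -- "$entry" "${entry}+1"
    done
}
```

To approximate step 5, one could run `rename_all /mnt/dist &` on a client and then start the removal from a server node with `gluster volume remove-brick dist node1:/bricks/brick8/dist-b8 start`. The volume name and brick path here are taken from the verification run in comment 16 and may not match the original setup.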

Actual results:
==============
The failure count shows zero even though there are file migration failures.

Expected results:
=================
The failure count should be non-zero, since there are file migration failures.

Comment 9 Susant Kumar Palai 2018-05-21 06:56:08 UTC
Upstream patch: https://review.gluster.org/#/c/20044/

Comment 16 Prasad Desala 2018-07-16 06:35:58 UTC
Verified this BZ on glusterfs version 3.12.2-13.el7rhgs.x86_64.

When remove-brick is started while renames are in progress, remove-brick status now shows the failure count for migration failures.

[root@node1]# gluster v remove-brick dist node1:/bricks/brick8/dist-b8 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost             1053       107.2MB          1209            44             0            completed        0:00:53

Moving this BZ to Verified.
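
Editorial sketch: the failure count in the status output above can also be read by a script, for example to block a commit when it is non-zero. The hypothetical function `total_failures` below assumes the column layout shown in this comment (failures in the fifth whitespace-separated field, after a numeric "scanned" field). Other glusterfs versions may format the table differently.

```shell
# Hypothetical helper: sum the "failures" column of remove-brick status
# output read from standard input.
total_failures() {
    awk '
        # Data rows have numeric "scanned" ($4) and "failures" ($5) fields.
        # The header, the dashed separator and any shell prompt line do not
        # match and are skipped.
        $4 ~ /^[0-9]+$/ && $5 ~ /^[0-9]+$/ { sum += $5 }
        END { print sum + 0 }
    '
}
```

Possible usage: `gluster v remove-brick dist node1:/bricks/brick8/dist-b8 status | total_failures`, then commit only if the printed value is 0 and the brick has been checked for leftover files.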

Comment 17 Srijita Mukherjee 2018-09-03 15:42:09 UTC
Have updated the doc text. Kindly review and confirm.

Comment 18 Susant Kumar Palai 2018-09-04 06:35:50 UTC
(In reply to Srijita Mukherjee from comment #17)
> Have updated the doc text. Kindly review and confirm.

ack.
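
Editorial sketch: the doc text for this bug recommends checking the decommissioned brick for leftover files before running "remove-brick commit". One way to do that is to list the regular files that remain under the brick directory while skipping the brick's internal `.glusterfs` directory. The function name `leftover_files` is hypothetical, and the listing may include internal zero-byte link files as well as real data, so each entry still needs a manual look.

```shell
# Hypothetical helper: print regular files remaining on a brick, ignoring
# the internal .glusterfs directory at the brick root.
leftover_files() {
    local brick="${1%/}"   # drop one trailing slash so the -path test matches
    find "$brick" -path "$brick/.glusterfs" -prune -o -type f -print
}
```

For the brick used in comment 16, this would be run on node1 as `leftover_files /bricks/brick8/dist-b8`. Empty output suggests nothing was left behind.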

Comment 19 errata-xmlrpc 2018-09-04 06:48:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607

