1264310 – DHT: Rebalance hang while migrating the files of disperse volume

Summary: DHT: Rebalance hang while migrating the files of disperse volume

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	disperse
Sub Component:
Version:	rhgs-3.1
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Target Release:	RHGS 3.2.0
Assignee:	Ashish Pandey
QA Contact:	Prasad Desala
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1304988 1322299 1351522

Reported:	2015-09-18 07:15 UTC by RajeshReddy
Modified:	2017-03-23 05:23 UTC
CC List:	9 users
Fixed In Version:	glusterfs-3.8.4-1
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Clones:	1304988
Environment:
Last Closed:	2017-03-23 05:23:56 UTC
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2017:0486	0	normal	SHIPPED_LIVE	Moderate: Red Hat Gluster Storage 3.2.0 security, bug fix, and enhancement update	2017-03-23 09:18:45 UTC

Description RajeshReddy 2015-09-18 07:15:40 UTC

Description of problem:
=============================
DHT: Rebalance hangs while migrating the files of a disperse volume.

Version-Release number of selected component (if applicable):
====================
glusterfs-fuse-3.7.1-14

Steps to Reproduce:
=======================
1. Create an EC 2X(4+2) volume, mount it on a client, and run I/O.
2. Create 100K files on the mount and untar the Linux kernel.
3. Run the script to rename the 100K files; at the same time, add 6 bricks and start a rebalance. The rebalance process hangs.


Expected results:
==================
Rebalance should complete without hanging.

Notes:
=========
[root@rhs-client39 ~]# gluster vol rebalance ECVOL4 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost            82364       226.5MB        220915             0             0          in progress          247743.00
      rhs-client9.lab.eng.blr.redhat.com                0        0Bytes             0             0             0            completed           16267.00
volume rebalance: ECVOL4: success:

Comment 2 RajeshReddy 2015-09-18 08:37:56 UTC

Logs are available at the following location:
/home/repo/sosreports/bug.1264310

Comment 3 Sakshi 2015-09-21 12:55:40 UTC

From the statedump on the bricks, it seems that two clients (rename and rebalance) are trying to acquire an inodelk on the same disperse subvol. One of them is granted and the other is blocked, which in turn blocks the rebalance process.

Comment 4 Sakshi 2015-09-21 12:57:12 UTC

Here is an extract of the statedump:
[xlator.features.locks.e-locks.inode]
path=/
mandatory=0
inodelk-count=2
lock-dump.domain.domain=dht.layout.heal
lock-dump.domain.domain=e-disperse-0
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid = 29918, owner=0c6a3764407f0000, client=0x7f6150001150, connection-id=dhcp42-202.lab.eng.blr.redhat.com-24842-2015/09/21-17:03:29:980708-e-client-0-0-0, granted at 2015-09-21 17:26:16
inodelk.inodelk[1](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid = 18446744073709551613, owner=ec320160687f0000, client=0x7f6150081400, connection-id=dhcp42-202.lab.eng.blr.redhat.com-30069-2015/09/21-17:26:24:439084-e-client-0-0-0, blocked at 2015-09-21 17:26:29
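
To pick the granted and blocked lock holders out of a longer statedump, a small filter along these lines may help. It is only a sketch: the function name summarize_inodelks is made up here, and it assumes the inodelk lines have the same layout as in the extract above. As a side note, 18446744073709551613 is what -3 looks like when printed as an unsigned 64-bit value; negative pids are used by internal GlusterFS clients, and this one appears to be the rebalance process.

```bash
#!/bin/bash
# Sketch: read a brick statedump on stdin and print one line per inodelk
# entry, giving its state (ACTIVE/BLOCKED) and the pid that requested it.
# Assumes lines shaped like: inodelk.inodelk[N](STATE)=..., pid = P, ...
summarize_inodelks() {
    awk '
        /^inodelk\.inodelk\[[0-9]+\]\(/ {
            state = $0
            sub(/^[^(]*\(/, "", state)   # drop everything up to and including "("
            sub(/\).*/, "", state)       # drop everything from ")" onwards
            pid = $0
            sub(/.*pid = /, "", pid)     # drop everything up to the pid value
            sub(/,.*/, "", pid)          # drop everything after the pid value
            print state, "pid=" pid
        }'
}

# Usage (the dump file name is a placeholder):
#   summarize_inodelks < brick-statedump.txt
```

For the extract above this prints one ACTIVE line for pid 29918 and one BLOCKED line for the large internal-client pid.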

Comment 5 Sakshi 2015-09-24 06:57:51 UTC

A very simple test case to reproduce the issue:

1) Create a disperse volume
2) FUSE mount
3) Create 100 files (touch ec_mnt/file{1..100}) and a few other directories
4) Run this script, which renames the files in a continuous loop:

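#!/bin/bash
# Run from the root of the FUSE mount (ec_mnt), where file1..file100 live.

echo 'Renaming files'
while :
do
        # Rename every file, then rename it back, forever.
        for i in {1..100}; do mv "file$i" "newfile$i"; done
        for i in {1..100}; do mv "newfile$i" "file$i"; done
done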


5) Add a few more bricks.
6) Start rebalance on the volume. It will remain hung.
7) Stop the script - rebalance resumes.
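
For reference, steps 1, 2, 5 and 6 could look roughly like the sketch below. The volume name, host name, brick paths and mount point are all placeholders chosen for illustration, not values from this setup, and "force" is only there because the sketch puts every brick on one host. By default the script only prints the commands; set DRY_RUN=0 to actually run them.

```bash
#!/bin/bash
# Sketch of the reproduction steps. VOL, HOST, MNT and the brick paths are
# hypothetical placeholders; adjust them before running for real.
VOL=ecvol
HOST=server1
MNT=/mnt/ec_mnt
DRY_RUN=${DRY_RUN:-1}          # 1 = only print the commands (default)

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# bricks FIRST LAST: print "HOST:/bricks/bN" for N in FIRST..LAST
bricks() {
    local i
    for i in $(seq "$1" "$2"); do printf '%s:/bricks/b%d ' "$HOST" "$i"; done
}

# Steps 1-2: create a 4+2 disperse volume, start it and FUSE mount it
run gluster volume create "$VOL" disperse 6 redundancy 2 $(bricks 1 6) force
run gluster volume start "$VOL"
run mount -t glusterfs "$HOST:/$VOL" "$MNT"

# (steps 3-4: create the files and start the rename loop in another shell)

# Step 5: add one more 4+2 set of bricks
run gluster volume add-brick "$VOL" $(bricks 7 12) force

# Step 6: start the rebalance and check on it
run gluster volume rebalance "$VOL" start
run gluster volume rebalance "$VOL" status
```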

After discussion with Pranith, these are some observations:

1) EC takes a blocking inodelk on the parent directory during a rename. If a rename of another file under the same directory arrives while that lock is still held, EC does not release the lock; it goes ahead and renames the new file under the already-held lock.

2) Hence the rebalance is not actually hung; it is blocked on a lock that EC keeps holding (without unlocking) in order to rename multiple files.

3) As soon as the rename is stopped, lock is released and rebalance continues.
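
The effect described in these observations can be imitated outside GlusterFS. The sketch below is only an analogy: it uses flock(1) from util-linux on a temporary file in place of the directory inodelk, and the timings and variable names are arbitrary. One process takes the lock once and keeps it across several back-to-back operations, while a second process tries to take the same lock with a short timeout.

```bash
#!/bin/bash
# Analogy only: flock(1) on a temp file stands in for the directory inodelk.
# Requires util-linux flock and a sleep that accepts fractional seconds.
lock=$(mktemp)
marker="$lock.held"

# "Renamer": takes the lock once and keeps it across several operations.
(
    flock -x 9
    touch "$marker"
    for i in 1 2 3 4 5; do sleep 0.2; done   # five back-to-back "renames"
) 9>"$lock" &
holder=$!

# Wait until the renamer really holds the lock.
while [ ! -e "$marker" ]; do sleep 0.05; done

# "Rebalance": tries to take the same lock, giving up after 0.3 s.
if flock -x -w 0.3 "$lock" true; then during=granted; else during=blocked; fi

wait "$holder"   # the renames stop, so the lock is released
if flock -x -w 0.3 "$lock" true; then after=granted; else after=blocked; fi

echo "while renames run: $during; after they stop: $after"
rm -f "$lock" "$marker"
```

The first attempt is expected to time out while the holder is still working, and the second to succeed once it has exited, which mirrors the rebalance resuming as soon as the rename script is stopped.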

Comment 11 Atin Mukherjee 2016-09-17 15:21:09 UTC

Upstream mainline : http://review.gluster.org/13460
Upstream 3.8 : http://review.gluster.org/15061

And the fix is available in rhgs-3.2.0 as part of rebase to GlusterFS 3.8.4.

Comment 14 Prasad Desala 2016-11-28 05:59:30 UTC

Verified this BZ using glusterfs version: 3.8.4-5.el7rhgs.x86_64.

Below are the steps that were followed to verify this BZ:
1) Created an EC 2X(4+2) volume and started it.
2) FUSE mounted the volume.
3) Created 100K files on the mount and untarred the Linux kernel package.
4) Ran a script to rename the 100K files; at the same time, added 6 bricks and triggered a rebalance.
Did not see any hang in the rebalance process. Rebalance and rename completed successfully without any issues.

Hence, moving this BZ to Verified.

Comment 16 errata-xmlrpc 2017-03-23 05:23:56 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html
