Description of problem:
=============================
DHT: Rebalance hangs while migrating the files of a disperse volume

Version-Release number of selected component (if applicable):
====================
glusterfs-fuse-3.7.1-14

Steps to Reproduce:
=======================
1. Create an EC 2X(4+2) volume, mount it on a client, and do I/O.
2. Create 100K files on the mount and untar the Linux kernel.
3. Run the script to rename the 100K files; at the same time, add 6 bricks and run the rebalance. The rebalance process hangs.

Expected results:
==================
Rebalance should complete without hanging.

Notes:
=========
[root@rhs-client39 ~]# gluster vol rebalance ECVOL4 status
                              Node  Rebalanced-files         size      scanned     failures      skipped        status  run time in secs
                         ---------       -----------  -----------  -----------  -----------  -----------  ------------    --------------
                         localhost             82364      226.5MB       220915            0            0   in progress         247743.00
rhs-client9.lab.eng.blr.redhat.com                 0       0Bytes            0            0            0     completed          16267.00
volume rebalance: ECVOL4: success:
Logs are available at the following location: /home/repo/sosreports/bug.1264310
From the statedump on the bricks, it seems that two clients (rename and rebalance) are trying to acquire inodelk on the same disperse subvol. One of them is granted and the other is blocked which in turn blocks the rebalance process.
Here is an extract of the statedump:

[xlator.features.locks.e-locks.inode]
path=/
mandatory=0
inodelk-count=2
lock-dump.domain.domain=dht.layout.heal
lock-dump.domain.domain=e-disperse-0
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid = 29918, owner=0c6a3764407f0000, client=0x7f6150001150, connection-id=dhcp42-202.lab.eng.blr.redhat.com-24842-2015/09/21-17:03:29:980708-e-client-0-0-0, granted at 2015-09-21 17:26:16
inodelk.inodelk[1](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid = 18446744073709551613, owner=ec320160687f0000, client=0x7f6150081400, connection-id=dhcp42-202.lab.eng.blr.redhat.com-30069-2015/09/21-17:26:24:439084-e-client-0-0-0, blocked at 2015-09-21 17:26:29
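For anyone checking their own bricks for the same symptom, the following is a minimal sketch of how the BLOCKED inodelk entries could be pulled out of a statedump file. It assumes the key=value layout shown in the extract above; the function name blocked_inodelks and the file path in the usage line are hypothetical, and other statedump versions may lay the fields out differently.

```shell
#!/bin/bash
# Hypothetical helper: print every BLOCKED inodelk entry found in a brick
# statedump, prefixed with the most recent lock domain seen before it.
# Assumes the line-oriented key=value layout shown in the extract above.
blocked_inodelks() {
    local dump_file=$1
    awk '
        # Remember the last lock domain announced before the lock entries.
        /^lock-dump\.domain\.domain=/ { domain = substr($0, index($0, "=") + 1) }
        # Report only the entries that are waiting for the lock.
        /^inodelk\.inodelk\[[0-9]+\]\(BLOCKED\)=/ { print domain ": " $0 }
    ' "$dump_file"
}
```

Usage would be along the lines of `blocked_inodelks /path/to/brick-statedump` (placeholder path). For the extract above it would report only the inodelk[1] entry, under the e-disperse-0 domain.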
A very simple test case to reproduce the issue:

1) Create a disperse volume.
2) FUSE mount it.
3) Create 100 files (touch ec_mnt/file{1..100}) and a few other folders.
4) Run this script, which renames the files in a continuous loop:

    #!/bin/bash
    echo 'Renaming files'
    while :
    do
        for i in {1..100}; do mv "file$i" "newfile$i"; done
        for i in {1..100}; do mv "newfile$i" "file$i"; done
    done

5) Add a few more bricks.
6) Start rebalance on the volume. It will remain hung.
7) Stop the script; rebalance resumes.

After discussion with Pranith, these are some observations:

1) EC takes a blocking inodelk during rename. If a rename of another file under the same directory arrives while EC is still holding the blocking inodelk on the parent directory for the first rename, EC does not release the lock; it goes ahead and renames the new file under the already-held lock.
2) Hence rebalance is not hung; it is blocked on a lock that EC keeps holding in order to rename multiple files without unlocking.
3) As soon as the renaming stops, the lock is released and rebalance continues.
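The observations above can be illustrated with a toy model. This is only a sketch of the waiting behaviour, not EC's actual locking code: it counts how many renames complete before a waiting rebalance request obtains the parent-directory lock. The function name renames_before_rebalance and its three parameters are hypothetical. The reuse_lock=0 case is an assumed contrast (the lock is released after each rename and the waiter is next in line); it is not a description of the actual fix.

```shell
#!/bin/bash
# Toy model (hypothetical, not EC code) of a lock holder that may keep the
# lock across back-to-back renames in the same directory.
#   $1 renames     total renames issued by the rename script
#   $2 request_at  rename count at which rebalance asks for the lock
#   $3 reuse_lock  1 = holder reuses the held lock for the next rename,
#                  0 = holder releases the lock after every rename
# Prints the number of renames completed when rebalance gets the lock.
renames_before_rebalance() {
    local renames=$1 request_at=$2 reuse_lock=$3
    local done_renames=0
    while (( done_renames < renames )); do
        # One rename completes under the parent-directory lock.
        (( done_renames += 1 ))
        # Without lock reuse, the waiting rebalance is granted the lock at
        # the first release after it has queued its request.
        if (( done_renames >= request_at )) && (( reuse_lock == 0 )); then
            break
        fi
    done
    echo "$done_renames"
}
```

With `renames_before_rebalance 200 5 1` the model reports 200: rebalance waits until the rename stream ends, which matches observation 3. With `renames_before_rebalance 200 5 0` it reports 5.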
Upstream mainline : http://review.gluster.org/13460
Upstream 3.8 : http://review.gluster.org/15061

The fix is available in rhgs-3.2.0 as part of the rebase to GlusterFS 3.8.4.
Verified this BZ using glusterfs version 3.8.4-5.el7rhgs.x86_64.

Below are the steps that were followed to verify this BZ:

1) Created an EC 2X(4+2) volume and started it.
2) FUSE mounted the volume.
3) Created 100K files on the mount and untarred the Linux kernel package.
4) Ran the script to rename the 100K files; at the same time, added 6 bricks and triggered rebalance.

Did not see any hang in the rebalance process. Rebalance and rename completed successfully without any issues.

Hence, moving this BZ to Verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2017-0486.html