Bug 1570475 - Rebalance on few nodes doesn't seem to complete - stuck at FUTEX_WAIT
Summary: Rebalance on few nodes doesn't seem to complete - stuck at FUTEX_WAIT
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: 3.12
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On: 1565119 1568348 1570476
Blocks:
 
Reported: 2018-04-23 03:08 UTC by Nithya Balachandran
Modified: 2018-10-23 14:41 UTC
CC List: 4 users

Fixed In Version: glusterfs-3.12.10
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1568348
Environment:
Last Closed: 2018-10-23 14:41:47 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Nithya Balachandran 2018-04-23 03:08:13 UTC
+++ This bug was initially created as a clone of Bug #1568348 +++

Concurrent directory renames and fix-layouts can deadlock.

Steps to reproduce this with upstream master:


1. Create a 5 brick pure distribute volume 
2. Mount the volume on 2 different mount points (/mnt/1 and /mnt/2)
3. From /mnt/1 create 3 levels of directories (mkdir -p d0/d1/d2)
4. Add bricks to the volume 
5. Attach gdb to the mount process for /mnt/1 and set a breakpoint at dht_rename_dir_lock1_cbk
6. From /mnt/1 run 'mv d0/d1 d0/d1_a'
In this particular example, the name of the hashed subvol of /d0/d1 is alphabetically greater than that of /d0/d1_a:
[2018-04-17 09:24:02.267020] I [MSGID: 109066] [dht-rename.c:1751:dht_rename] 2-dlock-dht: renaming /d0/d1 (hash=dlock-client-3/cache=dlock-client-0) => /d0/d1_a (hash=dlock-client-0/cache=<nul>)

7. Once the breakpoint is hit in the /mnt/1 process, run the following on the other mount point /mnt/2

setfattr -n "distribute.fix.layout" -v "1" d0


8. Allow gdb to continue

Both processes are now deadlocked.
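
The hang is the classic ABBA lock-ordering problem: per the fix referenced in the comments below, the rename path and the fix-layout/selfheal path take inodelks on the same pair of subvolumes but compute the acquisition order differently. The following is a minimal, self-contained sketch of that pattern only; the mutexes, thread names and sleeps are stand-ins and none of this is actual GlusterFS code.

/* Minimal ABBA-deadlock sketch (illustrative only, not GlusterFS code).
 * lock_a/lock_b stand in for inodelks on two DHT subvolumes; the two
 * threads stand in for the rename and fix-layout code paths. */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;  /* e.g. dlock-client-0 */
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;  /* e.g. dlock-client-3 */

/* "rename" path: takes the locks as b, then a */
static void *renamer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock_b);
    sleep(1);                     /* window for the other thread to run */
    pthread_mutex_lock(&lock_a);  /* blocks forever once fix_layout holds lock_a */
    pthread_mutex_unlock(&lock_a);
    pthread_mutex_unlock(&lock_b);
    return NULL;
}

/* "fix-layout" path: takes the locks as a, then b */
static void *fix_layout(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock_a);
    sleep(1);
    pthread_mutex_lock(&lock_b);  /* blocks forever once renamer holds lock_b */
    pthread_mutex_unlock(&lock_b);
    pthread_mutex_unlock(&lock_a);
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, renamer, NULL);
    pthread_create(&t2, NULL, fix_layout, NULL);
    pthread_join(t1, NULL);       /* never returns; both threads are stuck */
    pthread_join(t2, NULL);
    return 0;
}

With both threads blocked inside pthread_mutex_lock, strace shows them sitting in futex(FUTEX_WAIT), which matches the symptom in the bug title.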




[root@rhgs313-6 ~]# gluster v create dlock server1:/bricks/brick1/deadlock-{1..5} force
volume create: dlock: success: please start the volume to access data
[root@rhgs313-6 ~]# gluster v start dlock
volume start: dlock: success
[root@rhgs313-6 ~]# mount -t glusterfs -s server1:dlock /mnt/fuse1
[root@rhgs313-6 ~]# mount -t glusterfs -s server1:dlock /mnt/fuse2
[root@rhgs313-6 ~]# cd /mnt/fuse1
[root@rhgs313-6 fuse1]# l
total 0
[root@rhgs313-6 fuse1]# 
[root@rhgs313-6 fuse1]# 
[root@rhgs313-6 fuse1]# 
[root@rhgs313-6 fuse1]# mkdir -p d0/d1/d2
[root@rhgs313-6 fuse1]# ll -lR
.:
total 4
drwxr-xr-x. 3 root root 4096 Apr 17 14:49 d0

./d0:
total 4
drwxr-xr-x. 3 root root 4096 Apr 17 14:49 d1

./d0/d1:
total 4
drwxr-xr-x. 2 root root 4096 Apr 17 14:49 d2
[root@rhgs313-6 fuse1]# gluster v add-brick dlock server1:/bricks/brick1/deadlock-{6..7} force
volume add-brick: success

Attach gdb to the /mnt/fuse1 client process and set the breakpoint at dht_rename_dir_lock1_cbk, as described above.


[root@rhgs313-6 fuse1]# mv d0/d1 d0/d1_a


Once gdb stops at the breakpoint:
[root@rhgs313-6 brick1]# cd /mnt/fuse2/
[root@rhgs313-6 fuse2]# ll
total 4
drwxr-xr-x. 3 root root 4096 Apr 17 14:49 d0
[root@rhgs313-6 fuse2]# setfattr -n "distribute.fix.layout" -v "1" d0



This will hang. Allow gdb to continue. /mnt/fuse1 will also hang.

--- Additional comment from Worker Ant on 2018-04-17 06:12:59 EDT ---

REVIEW: https://review.gluster.org/19886 (cluster/dht: Fix dht_rename lock order) posted (#1) for review on master by N Balachandran

--- Additional comment from Worker Ant on 2018-04-22 21:43:39 EDT ---

COMMIT: https://review.gluster.org/19886 committed in master by "Raghavendra G" <rgowdapp> with a commit message- cluster/dht: Fix dht_rename lock order

Fixed dht_order_rename_lock to use the same inodelk ordering
as that of the dht selfheal locks (dictionary order of
lock subvolumes).

Change-Id: Ia3f8353b33ea2fd3bc1ba7e8e777dda6c1d33e0d
fixes: bz#1568348
Signed-off-by: N Balachandran <nbalacha>
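
For context on the fix above: the idea is that every code path that needs inodelks on more than one subvolume should compute the same acquisition order, namely dictionary order of the subvolume names, which is what the selfheal locks already use. A rough illustrative sketch of that ordering step follows; the array contents are taken from the hash/cache subvols in the rename log above, and nothing here is the actual dht_order_rename_lock code.

/* Sketch of the lock-ordering idea behind the fix (not the actual DHT code):
 * sort the subvolumes involved in the rename by name before requesting the
 * inodelks, so the order matches the dictionary-ordered selfheal locks. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static int cmp_subvol_name(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;
    return strcmp(*sa, *sb);      /* dictionary order of subvolume names */
}

int main(void)
{
    /* hashed subvols of src (/d0/d1) and dst (/d0/d1_a) from the log above */
    const char *lock_subvols[] = { "dlock-client-3", "dlock-client-0" };
    size_t n = sizeof(lock_subvols) / sizeof(lock_subvols[0]);

    qsort(lock_subvols, n, sizeof(lock_subvols[0]), cmp_subvol_name);

    /* the inodelks would now be requested in this deterministic order */
    for (size_t i = 0; i < n; i++)
        printf("lock %zu -> %s\n", i + 1, lock_subvols[i]);
    return 0;
}

Because both the rename and the fix-layout/selfheal paths then derive their order from the same name-based sort, the ABBA interleaving shown in the reproduction above can no longer occur.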

Comment 1 Worker Ant 2018-04-23 03:28:15 UTC
REVIEW: https://review.gluster.org/19922 (cluster/dht: Fix dht_rename lock order) posted (#1) for review on release-3.12 by N Balachandran

Comment 2 Worker Ant 2018-05-09 05:04:47 UTC
COMMIT: https://review.gluster.org/19922 committed in release-3.12 by "jiffin tony Thottan" <jthottan> with a commit message- cluster/dht: Fix dht_rename lock order

Fixed dht_order_rename_lock to use the same inodelk ordering
as that of the dht selfheal locks (dictionary order of
lock subvolumes).

Change-Id: Ia3f8353b33ea2fd3bc1ba7e8e777dda6c1d33e0d
BUG: 1570475
Signed-off-by: N Balachandran <nbalacha>

Comment 3 Shyamsundar 2018-10-23 14:41:47 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.12.10, please open a new bug report.

glusterfs-3.12.10 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2018-May/000099.html
[2] https://www.gluster.org/pipermail/gluster-users/

