Description of problem:
=======================
`rm -rvf *` on a FUSE mount hung. It has been stuck in the same state for more than 1.5 hours. The volume is mounted on 4 clients, but rm was issued from only one of them.

Client dump:
=============
Mar 29 15:32:37 dhcp41-254 kernel: rm    D ffff927aa92adee0  0 13643 13625 0x00000084
Mar 29 15:32:37 dhcp41-254 kernel: Call Trace:
Mar 29 15:32:37 dhcp41-254 kernel: [<ffffffff87912f49>] schedule+0x29/0x70
Mar 29 15:32:37 dhcp41-254 kernel: [<ffffffffc066c4e5>] __fuse_request_send+0xf5/0x2e0 [fuse]
Mar 29 15:32:37 dhcp41-254 kernel: [<ffffffff872bbe20>] ? wake_up_atomic_t+0x30/0x30
Mar 29 15:32:37 dhcp41-254 kernel: [<ffffffffc066c6e2>] fuse_request_send+0x12/0x20 [fuse]
Mar 29 15:32:37 dhcp41-254 kernel: [<ffffffffc06727eb>] fuse_lookup_name+0x12b/0x2c0 [fuse]
Mar 29 15:32:37 dhcp41-254 kernel: [<ffffffffc06729ce>] fuse_lookup+0x4e/0x130 [fuse]
Mar 29 15:32:37 dhcp41-254 kernel: [<ffffffff874259b3>] lookup_real+0x23/0x60
Mar 29 15:32:37 dhcp41-254 kernel: [<ffffffff874263c2>] __lookup_hash+0x42/0x60
Mar 29 15:32:37 dhcp41-254 kernel: [<ffffffff8790ac07>] lookup_slow+0x42/0xa7
Mar 29 15:32:37 dhcp41-254 kernel: [<ffffffff8742a248>] path_lookupat+0x838/0x8b0
Mar 29 15:32:37 dhcp41-254 kernel: [<ffffffff872cebfc>] ? try_to_wake_up+0x18c/0x350
Mar 29 15:32:37 dhcp41-254 kernel: [<ffffffff8742af9d>] ? do_last+0x66d/0x12c0
Mar 29 15:32:37 dhcp41-254 kernel: [<ffffffff873f7275>] ? kmem_cache_alloc+0x35/0x1f0
Mar 29 15:32:37 dhcp41-254 kernel: [<ffffffff8742c58f>] ? getname_flags+0x4f/0x1a0
Mar 29 15:32:37 dhcp41-254 kernel: [<ffffffff8742a2eb>] filename_lookup+0x2b/0xc0
Mar 29 15:32:37 dhcp41-254 kernel: [<ffffffff8742d727>] user_path_at_empty+0x67/0xc0
Mar 29 15:32:37 dhcp41-254 kernel: [<ffffffffc0674118>] ? fuse_file_put+0x48/0x80 [fuse]
Mar 29 15:32:37 dhcp41-254 kernel: [<ffffffff8742d791>] user_path_at+0x11/0x20
Mar 29 15:32:37 dhcp41-254 kernel: [<ffffffff87420763>] vfs_fstatat+0x63/0xc0
Mar 29 15:32:37 dhcp41-254 kernel: [<ffffffff87420d84>] SYSC_newfstatat+0x24/0x60

fuse mount bt:
==============
(gdb) thread apply all bt

Thread 11 (Thread 0x7fc24f126700 (LWP 13051)):
#0  0x00007fc256bbbeed in nanosleep () from /lib64/libpthread.so.0
#1  0x00007fc257d64126 in gf_timer_proc (data=0x56104907fb00) at timer.c:165
#2  0x00007fc256bb4dd5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc25647db3d in clone () from /lib64/libc.so.6

Thread 10 (Thread 0x7fc24e925700 (LWP 13052)):
#0  0x00007fc256bbc411 in sigwait () from /lib64/libpthread.so.0
#1  0x0000561047e3a40b in glusterfs_sigwaiter (arg=<optimized out>) at glusterfsd.c:2149
#2  0x00007fc256bb4dd5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc25647db3d in clone () from /lib64/libc.so.6

Thread 9 (Thread 0x7fc24e124700 (LWP 13053)):
#0  0x00007fc2564444fd in nanosleep () from /lib64/libc.so.6
#1  0x00007fc256444394 in sleep () from /lib64/libc.so.6
#2  0x00007fc257d7e93d in pool_sweeper (arg=<optimized out>) at mem-pool.c:481
#3  0x00007fc256bb4dd5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fc25647db3d in clone () from /lib64/libc.so.6

Thread 8 (Thread 0x7fc24d923700 (LWP 13054)):
#0  0x00007fc256bb8cf2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc257d91568 in syncenv_task (proc=proc@entry=0x561049080320) at syncop.c:603
#2  0x00007fc257d92430 in syncenv_processor (thdata=0x561049080320) at syncop.c:695
#3  0x00007fc256bb4dd5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fc25647db3d in clone () from /lib64/libc.so.6

Thread 7 (Thread 0x7fc24d122700 (LWP 13055)):
#0  0x00007fc256bb8cf2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc257d91568 in syncenv_task (proc=proc@entry=0x5610490806e0) at syncop.c:603
#2  0x00007fc257d92430 in syncenv_processor (thdata=0x5610490806e0) at syncop.c:695
#3  0x00007fc256bb4dd5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fc25647db3d in clone () from /lib64/libc.so.6

Thread 6 (Thread 0x7fc24ae64700 (LWP 13057)):
#0  0x00007fc25647e113 in epoll_wait () from /lib64/libc.so.6
#1  0x00007fc257db3252 in event_dispatch_epoll_worker (data=0x5610490a6990) at event-epoll.c:649
#2  0x00007fc256bb4dd5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc25647db3d in clone () from /lib64/libc.so.6

Thread 5 (Thread 0x7fc2580d8700 (LWP 13058)):
#0  0x00007fc256bb8cf2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc249088d3c in iot_worker (data=0x7fc2440b2dc0) at io-threads.c:193
#2  0x00007fc256bb4dd5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc25647db3d in clone () from /lib64/libc.so.6

Thread 4 (Thread 0x7fc242d80700 (LWP 13059)):
#0  0x00007fc25647e113 in epoll_wait () from /lib64/libc.so.6
#1  0x00007fc257db3252 in event_dispatch_epoll_worker (data=0x7fc244164b80) at event-epoll.c:649
#2  0x00007fc256bb4dd5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc25647db3d in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x7fc242240700 (LWP 13062)):
#0  0x00007fc2564745f0 in readv () from /lib64/libc.so.6
#1  0x00007fc257d7fd05 in sys_readv (fd=<optimized out>, iov=<optimized out>, iovcnt=<optimized out>) at syscall.c:295
#2  0x00007fc24f144a3a in fuse_thread_proc (data=0x56104906c8b0) at fuse-bridge.c:4984
#3  0x00007fc256bb4dd5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fc25647db3d in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x7fc241a3f700 (LWP 13063)):
#0  0x00007fc256bb8945 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fc24f12f953 in notify_kernel_loop (data=<optimized out>) at fuse-bridge.c:4004
#2  0x00007fc256bb4dd5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fc25647db3d in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7fc25823a780 (LWP 13050)):
#0  0x00007fc256bb5f47 in pthread_join () from /lib64/libpthread.so.0
#1  0x00007fc257db39b8 in event_dispatch_epoll (event_pool=0x561049065210) at event-epoll.c:746
#2  0x0000561047e370e7 in main (argc=4, argv=<optimized out>) at glusterfsd.c:2562

Version-Release number of selected component (if applicable):
3.12.2-6.el7rhgs.x86_64

How reproducible:
Reported at first occurrence

Steps to Reproduce:
==================
Not sure how to reproduce this issue; as usual, I was trying to remove the data set on the mount point. I have left the system in the same state so it can be used for live debugging. Please see the comments below for system details.

Dataset on mount point: 5000 cat files, linux-4.9.27.tar.gz, linux-4.9.27 (kernel untar)

Actual results:
===============
rm hung.

Expected results:
================
rm should not hang.
Prasad,

Can you please attach statedumps of the client and bricks?

regards,
Raghavendra
Created attachment 1415446 [details] client statedump
Note that dht_selfheal_dir_mkdir doesn't call dht_selfheal_dir_finish on failures, so any failure in dht_selfheal_dir_mkdir results in a hung frame. Also, dht_protect_namespace requires a non-NULL hashed subvol; if the hashed subvol is absent, it returns a failure, which in turn causes dht_selfheal_dir_mkdir to fail. I found dht_protect_namespace logging errors in the rmdir codepath:

[2018-03-29 08:47:05.511404] W [MSGID: 109011] [dht-layout.c:186:dht_layout_search] 0-distrepx3-dht: no subvolume for hash (value) = 2788654309
[2018-03-29 08:47:05.511640] W [MSGID: 109011] [dht-common.c:8247:dht_rmdir_do] 0-distrepx3-dht: Failed to get hashed subvol for /1/linux-4.9.27/arch/arm/mach-bcm (gfid = e27e7994-31bc-4c71-be27-337e39d16401)
[2018-03-29 08:47:05.515732] E [dht-lock.c:1271:dht_protect_namespace] (-->/usr/lib64/glusterfs/3.12.2/xlator/cluster/distribute.so(+0x521a7) [0x7fc249f431a7] -->/usr/lib64/glusterfs/3.12.2/xlator/cluster/distribute.so(+0x5029e) [0x7fc249f4129e] -->/usr/lib64/glusterfs/3.12.2/xlator/cluster/distribute.so(+0x79395) [0x7fc249f6a395] ) 0-distrepx3-dht: invalid argument: subvol [Invalid argument]
[2018-03-29 08:47:05.515882] W [fuse-bridge.c:1457:fuse_unlink_cbk] 0-glusterfs-fuse: 5360: RMDIR() /1/linux-4.9.27/arch/arm/mach-bcm => -1 (Invalid argument)

I suspect a similar condition of not being able to find the hashed subvol caused a failure in dht_selfheal_dir_mkdir, resulting in a hung frame. Will send a patch to fix this.
[root@xyz glusterfs]# grep "hashed subvolume not found" mnt-distrepx3.log*
mnt-distrepx3.log-20180401:[2018-03-29 08:56:34.746447] W [MSGID: 109011] [dht-selfheal.c:1531:dht_selfheal_dir_mkdir] 0-distrepx3-dht: ( O���=B������X��/block) (path: /1/linux-4.9.27/block): hashed subvolume not found [Invalid argument]

This confirms that the hung frame is due to dht_selfheal_dir_mkdir and that the hypothesis is correct.
https://review.gluster.org/#/c/19953/
On 3.12.2-13.el7rhgs.x86_64, I ran `rm -rf *` on a FUSE mount multiple times and did not see any rm hangs. Moving the BZ to Verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:2607