Description of problem:
NFS hangs on large directory access.

Version-Release number of selected component (if applicable):
CentOS Stream 8, updated

How reproducible:
Always

Steps to Reproduce:
1. Mount NFS
2. cd to a large directory (~500000 files)
3. ls

Actual results:
Never returns listings. Unable to interrupt using Ctrl-C.
Message in syslog:

Feb 1 09:49:45 centos50 kernel: INFO: task bash:1326 blocked for more than 120 seconds.
Feb 1 09:49:45 centos50 kernel: Not tainted 4.18.0-448.el8.x86_64 #1
Feb 1 09:49:45 centos50 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 1 09:49:45 centos50 kernel: task:bash state:D stack: 0 pid: 1326 ppid: 1325 flags:0x00004084
Feb 1 09:49:45 centos50 kernel: Call Trace:
Feb 1 09:49:45 centos50 kernel: __schedule+0x2d1/0x870
Feb 1 09:49:45 centos50 kernel: schedule+0x55/0xf0
Feb 1 09:49:45 centos50 kernel: io_schedule+0x12/0x40
Feb 1 09:49:45 centos50 kernel: __lock_page+0x12d/0x230
Feb 1 09:49:45 centos50 kernel: ? file_fdatawait_range+0x20/0x20
Feb 1 09:49:45 centos50 kernel: pagecache_get_page+0x1e6/0x310
Feb 1 09:49:45 centos50 kernel: nfs_readdir_page_get_locked+0x38/0xe0 [nfs]
Feb 1 09:49:45 centos50 kernel: nfs_readdir_page_filler+0x215/0x410 [nfs]
Feb 1 09:49:45 centos50 kernel: nfs_readdir_xdr_to_array+0x2d9/0x310 [nfs]
Feb 1 09:49:45 centos50 kernel: nfs_readdir+0x26a/0xda0 [nfs]
Feb 1 09:49:45 centos50 kernel: ? update_load_avg+0x7e/0x710
Feb 1 09:49:45 centos50 kernel: iterate_dir+0x144/0x1a0
Feb 1 09:49:45 centos50 kernel: ksys_getdents64+0x9c/0x130
Feb 1 09:49:45 centos50 kernel: ? iterate_dir+0x1a0/0x1a0
Feb 1 09:49:45 centos50 kernel: __x64_sys_getdents64+0x16/0x20
Feb 1 09:49:45 centos50 kernel: do_syscall_64+0x5b/0x1b0
Feb 1 09:49:45 centos50 kernel: entry_SYSCALL_64_after_hwframe+0x61/0xc6
Feb 1 09:49:45 centos50 kernel: RIP: 0033:0x7fbe68a4436b
Feb 1 09:49:45 centos50 kernel: Code: Unable to access opcode bytes at RIP 0x7fbe68a44341.
Feb 1 09:49:45 centos50 kernel: RSP: 002b:00007fffe85a09a8 EFLAGS: 00000246 ORIG_RAX: 00000000000000d9
Feb 1 09:49:45 centos50 kernel: RAX: ffffffffffffffda RBX: 000055f0ef7af4e0 RCX: 00007fbe68a4436b
Feb 1 09:49:45 centos50 kernel: RDX: 0000000000100000 RSI: 000055f0ef7af510 RDI: 0000000000000003
Feb 1 09:49:45 centos50 kernel: RBP: 000055f0ef7af510 R08: 0000000000000005 R09: 00007fbe68d0ebc0
Feb 1 09:49:45 centos50 kernel: R10: 0000000000000007 R11: 0000000000000246 R12: ffffffffffffff78
Feb 1 09:49:45 centos50 kernel: R13: 0000000000000000 R14: 000055f0ef8af4e3 R15: 0000000000000000

Expected results:
Directory listing after a reasonable time.

Additional info:
Happens with kernel 4.18.0-448.el8.x86_64 only.
For comparison:
- vanilla kernel 4.19.271 is OK.
- previous kernel-4.18.0-408.el8.x86_64 is OK.
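
The reproduction steps can be scripted. What follows is only a minimal sketch in Python, under assumptions that are not in the report: the helper names populate and count_entries are made up for illustration, and the report does not say how the ~500000 files were created or from which host. On Linux, os.scandir normally reaches the kernel through the same getdents64 path that appears in the call trace.

```python
import os


def populate(directory, count):
    """Create `count` empty files (f0000000, f0000001, ...) in `directory`.

    Hypothetical helper: the report does not say how its files were made.
    """
    os.makedirs(directory, exist_ok=True)
    for i in range(count):
        # Open and close immediately; only the directory entry matters.
        with open(os.path.join(directory, f"f{i:07d}"), "w"):
            pass


def count_entries(directory):
    """Iterate over the directory and return the number of entries seen."""
    total = 0
    with os.scandir(directory) as entries:
        for _ in entries:
            total += 1
    return total
```

To try it, call populate on a directory inside the NFS mount (the path is yours to choose) with a count near 500000, then call count_entries on the same directory. On an affected kernel the second call may block in state D and ignore Ctrl-C, so run it from a session you can afford to lose.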
Reproduced: the task is waiting on a page that it has already locked. This needs upstream commit 648a4548d622 ("NFS: Don't deadlock when cookie hashes collide").
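
To illustrate why a hash collision can leave a task waiting on a lock it already holds, here is a toy model in Python. It is not the kernel code: the PageCache class, the modulo hash and the fill function are all hypothetical, and the only thing taken from this thread is the idea that two cookies can map to the same page. The second lock attempt is non-blocking purely so that the sketch can report the condition instead of hanging.

```python
import threading


class PageCache:
    """Toy page cache with one non-reentrant lock per page index."""

    def __init__(self, nr_pages):
        self.locks = [threading.Lock() for _ in range(nr_pages)]

    def index_for(self, cookie):
        # Hypothetical hash; the real kernel hash is different.
        return cookie % len(self.locks)


def fill(cache, first_cookie, next_cookie):
    """Return True if both pages could be locked in turn.

    Return False if the second page is the one already held, which is
    where a blocking lock call would sleep forever.
    """
    current = cache.index_for(first_cookie)
    cache.locks[current].acquire()  # page for the first cookie is now locked
    try:
        nxt = cache.index_for(next_cookie)
        # A blocking acquire here never returns when nxt == current.
        if not cache.locks[nxt].acquire(blocking=False):
            return False
        cache.locks[nxt].release()
        return True
    finally:
        cache.locks[current].release()
```

With eight pages, cookies 3 and 4 land on different pages and fill succeeds, while cookies 3 and 11 both land on page 3 and fill reports the self-deadlock condition.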
No new issue found in kernel-4.18.0-463.el8 https://beaker.engineering.redhat.com/jobs/7518804 https://beaker.engineering.redhat.com/jobs/7518803 https://beaker.engineering.redhat.com/jobs/7518802
No new issue found from the regression tests.
As of today, CentOS Stream 8 still provides 4.18.0-448.el8 (affected by the NFS bug). Is there an expected release date for kernel-4.18.0-463.el8 on Stream 8?
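
While waiting for the updated package, a host can be checked against the builds named in this thread. The sketch below is a rough aid in Python, not an official check: classify_el8_build and its two sets are hypothetical names, and the sets contain only what this thread states (448 affected; 408 and 463 reported fine). Any other build, including non-4.18.0 kernels, is reported as unknown.

```python
import re

# Build numbers taken from this thread only; nothing else is assumed.
KNOWN_AFFECTED = {448}
KNOWN_OK = {408, 463}


def classify_el8_build(release):
    """Classify a release string such as '4.18.0-448.el8.x86_64'.

    Returns 'affected', 'ok', or 'unknown'.
    """
    match = re.search(r"4\.18\.0-(\d+)", release)
    if match is None:
        return "unknown"
    build = int(match.group(1))
    if build in KNOWN_AFFECTED:
        return "affected"
    if build in KNOWN_OK:
        return "ok"
    return "unknown"
```

On a live system the argument can come from platform.release() in the standard library, which returns the same string as uname -r.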
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: kernel security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2023:2951