Bug 2166364 - NFS hang on large dirs with kernel 4.18.0-448.el8.x86_64
Summary: NFS hang on large dirs with kernel 4.18.0-448.el8.x86_64
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: kernel
Version: CentOS Stream
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: 8.8
Assignee: Benjamin Coddington
QA Contact: Yongcheng Yang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-02-01 15:35 UTC by dm
Modified: 2023-05-16 10:56 UTC
CC List: 9 users

Fixed In Version: kernel-4.18.0-463.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-05-16 09:01:04 UTC
Type: Bug
Target Upstream Version:
Embargoed:




Links
System                  ID                                                 Private  Priority  Status  Summary  Last Updated
Gitlab                  redhat/rhel/src/kernel rhel-8 merge_requests 4188  0        None      None    None     2023-02-01 21:28:41 UTC
Red Hat Issue Tracker   RHELPLAN-147322                                    0        None      None    None     2023-02-01 15:36:14 UTC
Red Hat Product Errata  RHSA-2023:2951                                     0        None      None    None     2023-05-16 09:02:24 UTC

Description dm 2023-02-01 15:35:16 UTC
Description of problem:
NFS hangs on large directory access.

Version-Release number of selected component (if applicable):
CentOS Stream 8, fully updated

How reproducible: always


Steps to Reproduce:
1. Mount an NFS export.
2. cd into a large directory (~500,000 files).
3. Run ls.
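
The steps above can be sketched in Python for anyone who wants a timed, scriptable check. This is only a sketch under assumptions: the export is already mounted, the helper names `populate` and `list_directory` are made up for illustration, and any path passed to them (for example `/mnt/nfs/bigdir`) is hypothetical.

```python
import os
import time


def populate(directory, count):
    """Create `count` empty files in `directory` (created if missing)."""
    os.makedirs(directory, exist_ok=True)
    for i in range(count):
        # Mode "x" fails if the file already exists, so use a fresh directory.
        with open(os.path.join(directory, f"file{i:07d}"), "x"):
            pass


def list_directory(directory):
    """Scan the whole directory once; return (entry_count, elapsed_seconds)."""
    start = time.monotonic()
    with os.scandir(directory) as entries:
        count = sum(1 for _ in entries)
    return count, time.monotonic() - start
```

Calling `populate("/mnt/nfs/bigdir", 500000)` once and then `list_directory("/mnt/nfs/bigdir")` approximates steps 2 and 3. On the affected kernel the scan would presumably block the same way ls does, so run it from a shell you can afford to lose.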

Actual results:
ls never returns a listing.
It cannot be interrupted with Ctrl-C.

Message in syslog:

Feb  1 09:49:45 centos50 kernel: INFO: task bash:1326 blocked for more than 120 seconds.
Feb  1 09:49:45 centos50 kernel:      Not tainted 4.18.0-448.el8.x86_64 #1
Feb  1 09:49:45 centos50 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb  1 09:49:45 centos50 kernel: task:bash            state:D stack:    0 pid: 1326 ppid:  1325 flags:0x00004084
Feb  1 09:49:45 centos50 kernel: Call Trace:
Feb  1 09:49:45 centos50 kernel: __schedule+0x2d1/0x870
Feb  1 09:49:45 centos50 kernel: schedule+0x55/0xf0
Feb  1 09:49:45 centos50 kernel: io_schedule+0x12/0x40
Feb  1 09:49:45 centos50 kernel: __lock_page+0x12d/0x230
Feb  1 09:49:45 centos50 kernel: ? file_fdatawait_range+0x20/0x20
Feb  1 09:49:45 centos50 kernel: pagecache_get_page+0x1e6/0x310
Feb  1 09:49:45 centos50 kernel: nfs_readdir_page_get_locked+0x38/0xe0 [nfs]
Feb  1 09:49:45 centos50 kernel: nfs_readdir_page_filler+0x215/0x410 [nfs]
Feb  1 09:49:45 centos50 kernel: nfs_readdir_xdr_to_array+0x2d9/0x310 [nfs]
Feb  1 09:49:45 centos50 kernel: nfs_readdir+0x26a/0xda0 [nfs]
Feb  1 09:49:45 centos50 kernel: ? update_load_avg+0x7e/0x710
Feb  1 09:49:45 centos50 kernel: iterate_dir+0x144/0x1a0
Feb  1 09:49:45 centos50 kernel: ksys_getdents64+0x9c/0x130
Feb  1 09:49:45 centos50 kernel: ? iterate_dir+0x1a0/0x1a0
Feb  1 09:49:45 centos50 kernel: __x64_sys_getdents64+0x16/0x20
Feb  1 09:49:45 centos50 kernel: do_syscall_64+0x5b/0x1b0
Feb  1 09:49:45 centos50 kernel: entry_SYSCALL_64_after_hwframe+0x61/0xc6
Feb  1 09:49:45 centos50 kernel: RIP: 0033:0x7fbe68a4436b
Feb  1 09:49:45 centos50 kernel: Code: Unable to access opcode bytes at RIP 0x7fbe68a44341.
Feb  1 09:49:45 centos50 kernel: RSP: 002b:00007fffe85a09a8 EFLAGS: 00000246 ORIG_RAX: 00000000000000d9
Feb  1 09:49:45 centos50 kernel: RAX: ffffffffffffffda RBX: 000055f0ef7af4e0 RCX: 00007fbe68a4436b
Feb  1 09:49:45 centos50 kernel: RDX: 0000000000100000 RSI: 000055f0ef7af510 RDI: 0000000000000003
Feb  1 09:49:45 centos50 kernel: RBP: 000055f0ef7af510 R08: 0000000000000005 R09: 00007fbe68d0ebc0
Feb  1 09:49:45 centos50 kernel: R10: 0000000000000007 R11: 0000000000000246 R12: ffffffffffffff78
Feb  1 09:49:45 centos50 kernel: R13: 0000000000000000 R14: 000055f0ef8af4e3 R15: 0000000000000000
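
For triage it can help to reduce a hung-task report like the one above to the blocked task and its call chain. The following is a rough sketch that assumes the syslog line format shown in this report; the function name `parse_hung_task` and the regular expressions are illustrative, not part of any kernel tooling.

```python
import re

# Header line: "INFO: task <comm>:<pid> blocked for more than <N> seconds."
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)
# Stack frame line: "[? ]<function>+0x<offset>/0x<size>[ [module]]"
FRAME_RE = re.compile(
    r"kernel: (?P<uncertain>\? )?(?P<func>[A-Za-z_][\w.]*)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<module>\w+)\])?\s*$"
)


def parse_hung_task(lines):
    """Return a dict for the first hung-task report in `lines`, or None."""
    report = None
    for line in lines:
        header = HUNG_RE.search(line)
        if header and report is None:
            report = {
                "comm": header["comm"],
                "pid": int(header["pid"]),
                "timeout_secs": int(header["secs"]),
                "frames": [],
            }
            continue
        if report is None:
            continue
        frame = FRAME_RE.search(line)
        if frame:
            # Frames prefixed with "? " are ones the unwinder is unsure about.
            report["frames"].append(
                (frame["func"], frame["module"], bool(frame["uncertain"]))
            )
    return report
```

Fed the trace above, the frames tagged with module `nfs` sit directly above `pagecache_get_page` and `__lock_page`, which points at the readdir path blocking on a page lock.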



Expected results:
The directory listing is returned within a reasonable time.


Additional info:
Happens only with kernel 4.18.0-448.el8.x86_64.

For comparison:
- vanilla kernel 4.19.271 is OK
- previous kernel-4.18.0-408.el8.x86_64 is OK
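
The data points in this report can be turned into a quick check of a release string such as the output of `uname -r`. This is a sketch with a hypothetical helper name, `classify`. It assumes every el8 build at or after the "Fixed In Version" build 463 carries the fix, and it reports "unknown" for any build the report does not mention, since the build where the regression first appeared is not stated here.

```python
import re

# Matches el8 kernel releases such as "4.18.0-448.el8.x86_64".
RELEASE_RE = re.compile(r"4\.18\.0-(?P<build>\d+)(?:\.[\d.]+)?\.el8")

FIXED_BUILD = 463     # from "Fixed In Version: kernel-4.18.0-463.el8"
AFFECTED_BUILD = 448  # build the reporter saw the hang on
KNOWN_OK_BUILD = 408  # build the reporter confirmed as working


def classify(release):
    """Return 'fixed', 'affected', 'ok', or 'unknown' for a release string."""
    match = RELEASE_RE.search(release)
    if not match:
        return "unknown"
    build = int(match["build"])
    if build >= FIXED_BUILD:
        return "fixed"  # assumption: later builds keep the fix
    if build == AFFECTED_BUILD:
        return "affected"
    if build == KNOWN_OK_BUILD:
        return "ok"
    return "unknown"
```

For example, `classify(os.uname().release)` (with `import os`) checks the running kernel.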

Comment 1 Benjamin Coddington 2023-02-01 21:17:56 UTC
Reproduced; we're waiting on a page we already locked.

Needs this upstream commit:
648a4548d622 NFS: Don't deadlock when cookie hashes collide
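
As a user-space illustration of "waiting on a page we already locked", here is a toy model in Python. It is not the kernel code and not the upstream patch. The slot count, the modulo "hash", and the class `ToyPageCache` are all invented for the sketch. It only shows why taking a non-reentrant lock twice on the same slot stalls when two cookies map to the same index, and one obvious way to guard against it.

```python
import threading

NUM_SLOTS = 8  # hypothetical and tiny, so a collision is easy to produce


def slot_for(cookie):
    """Toy stand-in for hashing a directory cookie to a page index."""
    return cookie % NUM_SLOTS


class ToyPageCache:
    def __init__(self):
        # One non-reentrant lock per "page".
        self.locks = [threading.Lock() for _ in range(NUM_SLOTS)]

    def fill_naive(self, cookie, next_cookie, timeout=0.1):
        """Hold the current page lock, then lock the next cookie's page.

        Returns False if the second lock cannot be taken in time.
        """
        cur, nxt = slot_for(cookie), slot_for(next_cookie)
        with self.locks[cur]:
            # If nxt == cur we already hold this lock. Without the timeout
            # used here, this call would never return.
            if not self.locks[nxt].acquire(timeout=timeout):
                return False
            self.locks[nxt].release()
            return True

    def fill_guarded(self, cookie, next_cookie):
        """Same as fill_naive, but skip re-locking a page we already hold."""
        cur, nxt = slot_for(cookie), slot_for(next_cookie)
        with self.locks[cur]:
            if nxt == cur:
                return True
            with self.locks[nxt]:
                return True
```

With eight slots, cookies 1 and 9 collide: `fill_naive(1, 9)` times out, while `fill_guarded(1, 9)` returns immediately.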

Comment 14 Yongcheng Yang 2023-02-16 02:24:01 UTC
No new issue found from the regression tests.

Comment 15 dm 2023-03-09 14:12:38 UTC
As of today, CentOS Stream 8 still provides 4.18.0-448.el8 (affected by this NFS bug).

Do you have an expected release date for kernel-4.18.0-463.el8 on Stream 8?

Comment 18 errata-xmlrpc 2023-05-16 09:01:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: kernel security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:2951

