Bug 2160443 - [RHEL 9][NFS] Crash in file_has_perm() when dereferencing a NULL file->f_security
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: kernel
Version: 9.1
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Jeff Layton
QA Contact: Yongcheng Yang
URL:
Whiteboard:
Duplicates: 2164820 2164822 2164887 2165199
Depends On:
Blocks: 2144442
Reported: 2023-01-12 12:52 UTC by Stan Saner
Modified: 2023-06-07 10:06 UTC (History)
CC List: 8 users

Fixed In Version: kernel-5.14.0-253.el9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-05-09 08:11:43 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Gitlab | redhat/centos-stream/src/kernel centos-stream-9 merge_requests 1942 | 0 | None | opened | nfsd: don't free files unconditionally in __nfsd_file_cache_purge | 2023-01-24 21:37:53 UTC
Red Hat Issue Tracker | RHELPLAN-144938 | 0 | None | None | None | 2023-01-12 12:52:50 UTC
Red Hat Product Errata | RHSA-2023:2458 | 0 | None | None | None | 2023-05-09 08:12:10 UTC

Description Stan Saner 2023-01-12 12:52:29 UTC
Description of problem:
-----------------------

The system crashes with the following console messages and kernel stack trace:

message buffer:
---------------
...
[ 7886.767600] Leaked POSIX lock on dev=0xfd:0x9 ino=0x33364eeb93  fl_owner=000000006de0bef7 fl_flags=0x1001 fl_type=0x1 fl_pid=8168
[ 7886.775029] Leaked POSIX lock on dev=0xfd:0x5 ino=0x45077a84e8  fl_owner=00000000ae1b5970 fl_flags=0x1001 fl_type=0x1 fl_pid=8134
[ 7886.779697] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 7886.779733] #PF: supervisor read access in kernel mode
[ 7886.780079] Leaked POSIX lock on dev=0xfd:0x9 ino=0x1c0c7cba46  fl_owner=00000000cd073bee fl_flags=0x1 fl_type=0x1 fl_pid=8198
[ 7886.780621] #PF: error_code(0x0000) - not-present page
[ 7886.784231] PGD 0 P4D 0 
[ 7886.785181] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 7886.786061] CPU: 34 PID: 8199 Comm: nfsd Kdump: loaded Not tainted 5.14.0-229.jlayton.nfsd92.2.el9.x86_64 #1
[ 7886.786949] Hardware name: Supermicro Super Server/H11SSL-NC, BIOS 1.0b 04/27/2018
[ 7886.787882] RIP: 0010:file_has_perm+0x52/0xd0
[ 7886.788750] Code: 25 28 00 00 00 48 89 44 24 20 31 c0 48 03 96 c0 00 00 00 48 63 05 8e d9 03 01 c6 04 24 0c 48 03 47 78 8b 70 04 4c 89 74 24 08 <8b> 12 39 f2 74 29 49 89 e1 41 b8 01 00 00 00 b9 09 00 00 00 48 c7
[ 7886.790440] RSP: 0018:ffff9edfd49f7bf0 EFLAGS: 00010282
[ 7886.790443] RAX: ffff8bba98d9de20 RBX: ffffffffb8ee5a28 RCX: 0000000000000617
[ 7886.790446] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8bb952237080
[ 7886.793505] RBP: ffff8bb952237080 R08: 0000000000000000 R09: ffff8bbf86388af8
[ 7886.794345] R10: ffff8bbc0f7a8700 R11: ffff8bba98d9de20 R12: 0000000000000040
[ 7886.795180] R13: ffff8bbb41d32900 R14: ffff8bb975b7bd00 R15: 0000000000000000
[ 7886.796016] FS:  0000000000000000(0000) GS:ffff8bbd0f880000(0000) knlGS:0000000000000000
[ 7886.796852] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7886.797662] CR2: 0000000000000000 CR3: 000000047d0a8000 CR4: 00000000003506e0
[ 7886.798472] Call Trace:
[ 7886.799267]  <TASK>
[ 7886.800050]  security_file_lock+0x28/0x40
[ 7886.800835]  generic_setlease+0x6b/0x2f0
[ 7886.801613]  nfs4_set_delegation+0x30b/0x710 [nfsd]
[ 7886.802412]  ? nfs4_get_vfs_file+0x1ba/0x360 [nfsd]
[ 7886.803209]  nfs4_open_delegation+0xca/0x1e0 [nfsd]
[ 7886.803993]  nfsd4_process_open2+0x53b/0x9f0 [nfsd]
[ 7886.804485] Leaked POSIX lock on dev=0xfd:0x9 ino=0xe6003d9a4a  fl_owner=00000000a40c3abd fl_flags=0x1001 fl_type=0x1 fl_pid=8156
[ 7886.804775]  ? fh_verify+0x1ea/0x260 [nfsd]
[ 7886.807915]  nfsd4_open+0x3ce/0x4b0 [nfsd]
[ 7886.808876]  nfsd4_proc_compound+0x44b/0x6f0 [nfsd]
[ 7886.809757]  nfsd_dispatch+0x15e/0x290 [nfsd]
[ 7886.810576]  svc_process_common+0x3bc/0x5e0 [sunrpc]
[ 7886.811378]  ? nfsd_svc+0x190/0x190 [nfsd]
[ 7886.812133]  ? nfsd_shutdown_threads+0xa0/0xa0 [nfsd]
[ 7886.812883]  svc_process+0xb7/0xf0 [sunrpc]
[ 7886.813634]  nfsd+0xd5/0x190 [nfsd]
[ 7886.814361]  kthread+0xd9/0x100
[ 7886.815059]  ? kthread_complete_and_exit+0x20/0x20
[ 7886.815761]  ret_from_fork+0x22/0x30
[ 7886.816462]  </TASK>

kernel panic stack trace:
-------------------------

crash> bt
PID: 8199     TASK: ffff8bc13a93b900  CPU: 34   COMMAND: "nfsd"
 #0 [ffff9edfd49f7978] machine_kexec at ffffffffb7a6b117
 #1 [ffff9edfd49f79d0] __crash_kexec at ffffffffb7bc2abd
 #2 [ffff9edfd49f7a98] crash_kexec at ffffffffb7bc3ca8
 #3 [ffff9edfd49f7aa0] oops_end at ffffffffb7a2830b
 #4 [ffff9edfd49f7ac0] page_fault_oops at ffffffffb7a7aefb
 #5 [ffff9edfd49f7b18] exc_page_fault at ffffffffb851e542
 #6 [ffff9edfd49f7b40] asm_exc_page_fault at ffffffffb8600b62
    [exception RIP: file_has_perm+0x52]
    RIP: ffffffffb7ea9662  RSP: ffff9edfd49f7bf0  RFLAGS: 00010282
    RAX: ffff8bba98d9de20  RBX: ffffffffb8ee5a28  RCX: 0000000000000617
    RDX: 0000000000000000  RSI: 0000000000000001  RDI: ffff8bb952237080
    RBP: ffff8bb952237080   R8: 0000000000000000   R9: ffff8bbf86388af8
    R10: ffff8bbc0f7a8700  R11: ffff8bba98d9de20  R12: 0000000000000040
    R13: ffff8bbb41d32900  R14: ffff8bb975b7bd00  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffff9edfd49f7c38] security_file_lock at ffffffffb7ea20d8
 #8 [ffff9edfd49f7c58] generic_setlease at ffffffffb7e302db
 #9 [ffff9edfd49f7ca8] nfs4_set_delegation at ffffffffc136196b [nfsd]
#10 [ffff9edfd49f7d18] nfs4_open_delegation at ffffffffc1361e3a [nfsd]
#11 [ffff9edfd49f7d50] nfsd4_process_open2 at ffffffffc1366e5b [nfsd]
#12 [ffff9edfd49f7dd8] nfsd4_open at ffffffffc135255e [nfsd]
#13 [ffff9edfd49f7e20] nfsd4_proc_compound at ffffffffc1352a8b [nfsd]
#14 [ffff9edfd49f7e68] nfsd_dispatch at ffffffffc133826e [nfsd]
#15 [ffff9edfd49f7e90] svc_process_common at ffffffffc0edc69c [sunrpc]
#16 [ffff9edfd49f7ee0] svc_process at ffffffffc0edc977 [sunrpc]
#17 [ffff9edfd49f7ef8] nfsd at ffffffffc1337cd5 [nfsd]
#18 [ffff9edfd49f7f18] kthread at ffffffffb7b1e9b9
#19 [ffff9edfd49f7f50] ret_from_fork at ffffffffb7a01f82



Version-Release number of selected component (if applicable):
-------------------------------------------------------------
RHEL 9 kernel 5.14.0-229.jlayton.nfsd92.2.el9.x86_64


How reproducible:
-----------------
Unsure whether the problem is solidly reproducible. 
The customer triggered it by running 'exportfs -ar' after changing the exports file.


Actual results:
---------------
The system crashes with the kernel stack trace documented above.


Expected results:
-----------------
No crash, stable system


Additional info:
----------------
The system was running a test kernel with a fix for Bug 2152473, but the cause appears to be unrelated to the problem tracked there.
Further details to follow.

Comment 4 Jeff Layton 2023-01-12 13:57:13 UTC
More random notes:

- the fact that fi_deleg_file and fi_fds[0] are different may suggest that there was some sort of conflicting access that caused the filecache to unhash 0xffff8bb992c763a8 and create a new nfsd_file entry instead of reusing it. That could mean that there was at least one delegation recall involving this nfsd_file.

Comment 5 Jeff Layton 2023-01-16 17:57:43 UTC
Looking more at the logs, the first hints that something is wrong are these:

[ 7886.767600] Leaked POSIX lock on dev=0xfd:0x9 ino=0x33364eeb93  fl_owner=000000006de0bef7 fl_flags=0x1001 fl_type=0x1 fl_pid=8168
[ 7886.775029] Leaked POSIX lock on dev=0xfd:0x5 ino=0x45077a84e8  fl_owner=00000000ae1b5970 fl_flags=0x1001 fl_type=0x1 fl_pid=8134
[ 7886.780079] Leaked POSIX lock on dev=0xfd:0x9 ino=0x1c0c7cba46  fl_owner=00000000cd073bee fl_flags=0x1 fl_type=0x1 fl_pid=8198

The fl_flags (0x1001) indicate that these are non-OFD POSIX locks. The first two are locks that were reclaimed when the server was last rebooted (FL_RECLAIM == 0x1000).
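
The flag reading above can be sketched as a small decoder for the "Leaked POSIX lock" lines. This is only an illustration: FL_RECLAIM == 0x1000 is stated in this comment, while FL_POSIX == 0x1 and FL_OFDLCK == 0x400 are assumed from include/linux/fs.h in kernels of this era and should be checked against the kernel actually in use. The names parse_leak and decode_fl_flags are hypothetical helpers, not part of any existing tool.

```python
import re

# Flag bits. FL_RECLAIM comes from the comment above; FL_POSIX and FL_OFDLCK
# are ASSUMED from include/linux/fs.h of 5.14-era kernels.
KNOWN_FLAGS = {
    "FL_POSIX": 0x1,
    "FL_OFDLCK": 0x400,
    "FL_RECLAIM": 0x1000,
}

LEAK_RE = re.compile(
    r"Leaked POSIX lock on dev=(?P<dev>\S+)\s+ino=(?P<ino>\S+)\s+"
    r"fl_owner=(?P<owner>\S+)\s+fl_flags=(?P<flags>0x[0-9a-fA-F]+)\s+"
    r"fl_type=(?P<type>\S+)\s+fl_pid=(?P<pid>\d+)"
)


def decode_fl_flags(flags):
    """Return the names of the known bits set in flags, plus any leftover bits as hex."""
    names = [name for name, bit in KNOWN_FLAGS.items() if flags & bit]
    leftover = flags
    for bit in KNOWN_FLAGS.values():
        leftover &= ~bit
    if leftover:
        names.append(hex(leftover))
    return names


def parse_leak(line):
    """Parse one 'Leaked POSIX lock' log line; return None if the line does not match."""
    match = LEAK_RE.search(line)
    if match is None:
        return None
    return {
        "dev": match.group("dev"),
        "ino": match.group("ino"),
        "pid": int(match.group("pid")),
        "flags": decode_fl_flags(int(match.group("flags"), 16)),
    }
```

Fed the three lines above, the first two would decode with both the POSIX and reclaim bits set, and the third with the POSIX bit only, matching the reading given in this comment.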

Comment 10 Jeff Layton 2023-01-19 17:40:44 UTC
I had dismissed the info about exportfs as coincidental, but I went back and had a look and there is a call to nfsd_file_cache_purge that occurs when the exports table is flushed. So, I set up a host running pynfs and then ran 'exportfs -rva' in a tight loop, and got this. This crash involves an open stateid and not a delegation, but it's a very similar problem. It's probably more likely to happen with delegations as they tend to be long-lived. I'm working on a patch for this now.

[  131.763247] NFSD: Using nfsdcld client tracking operations.
[  131.764965] NFSD: no clients to reclaim, skipping NFSv4 grace period (net f0000000)
[  337.962027] ------------[ cut here ]------------
[  337.963823] refcount_t: underflow; use-after-free.
[  337.965502] WARNING: CPU: 6 PID: 3401 at lib/refcount.c:28 refcount_warn_saturate+0xba/0x110
[  337.967999] Modules linked in: nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) nls_iso8859_1(E) nls_cp437(E) vfat(E) fat(E) ext4(E) crc16(E) cirrus(E) kvm_intel(E) 9p(E) mbcache(E) joydev(E) virtio_net(E) drm_shmem_helper(E) net_failover(E) kvm(E) jbd2(E) netfs(E) psmouse(E) evdev(E) pcspkr(E) failover(E) irqbypass(E) virtio_balloon(E) drm_kms_helper(E) 9pnet_virtio(E) button(E) drm(E) configfs(E) zram(E) zsmalloc(E) crct10dif_pclmul(E) crc32_pclmul(E) nvme(E) ghash_clmulni_intel(E) virtio_blk(E) sha512_ssse3(E) sha512_generic(E) nvme_core(E) t10_pi(E) virtio_pci(E) virtio(E) crc64_rocksoft_generic(E) aesni_intel(E) crypto_simd(E) crc64_rocksoft(E) virtio_pci_legacy_dev(E) i6300esb(E) cryptd(E) serio_raw(E) crc64(E) virtio_pci_modern_dev(E) virtio_ring(E) btrfs(E) blake2b_generic(E) xor(E) raid6_pq(E) libcrc32c(E) crc32c_generic(E) crc32c_intel(E) autofs4(E)
[  337.992040] CPU: 6 PID: 3401 Comm: nfsd Tainted: G            E      6.2.0-rc3+ #11
[  337.994701] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.1-2.fc37 04/01/2014
[  337.998046] RIP: 0010:refcount_warn_saturate+0xba/0x110
[  337.999852] Code: 01 01 e8 83 e5 4f 00 0f 0b c3 cc cc cc cc 80 3d 60 f4 05 01 00 75 85 48 c7 c7 30 b5 e1 9d c6 05 50 f4 05 01 01 e8 60 e5 4f 00 <0f> 0b c3 cc cc cc cc 80 3d 3b f4 05 01 00 0f 85 5e ff ff ff 48 c7
[  338.005245] RSP: 0018:ffffa36802e4bd50 EFLAGS: 00010282
[  338.006621] RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000000
[  338.008273] RDX: 0000000000000001 RSI: ffffffff9de03ef5 RDI: 00000000ffffffff
[  338.009804] RBP: 0000000000000003 R08: 0000000000000000 R09: ffffa36802e4bc00
[  338.011719] R10: 0000000000000003 R11: ffffffff9e0bfdc8 R12: ffff9578da461b80
[  338.013533] R13: 0000000000000001 R14: ffff9578da422280 R15: ffff9578da461b80
[  338.015238] FS:  0000000000000000(0000) GS:ffff957a37d00000(0000) knlGS:0000000000000000
[  338.017179] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  338.018680] CR2: 00007f324c1e1c08 CR3: 000000020360a004 CR4: 0000000000060ee0
[  338.020377] Call Trace:
[  338.021190]  <TASK>
[  338.021956]  release_all_access+0x96/0x120 [nfsd]
[  338.023192]  ? _raw_spin_unlock+0x15/0x30
[  338.024192]  nfsd4_close+0x275/0x3d0 [nfsd]
[  338.025468]  ? fh_verify+0x45e/0x780 [nfsd]
[  338.027535]  ? __pfx_nfsd4_encode_noop+0x10/0x10 [nfsd]
[  338.028775]  ? nfsd4_encode_operation+0xae/0x280 [nfsd]
[  338.030593]  nfsd4_proc_compound+0x3ae/0x6f0 [nfsd]
[  338.032341]  nfsd_dispatch+0x16a/0x270 [nfsd]
[  338.034667]  svc_process_common+0x2eb/0x660 [sunrpc]
[  338.036614]  ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
[  338.038827]  ? __pfx_nfsd+0x10/0x10 [nfsd]
[  338.040267]  svc_process+0xad/0x100 [sunrpc]
[  338.041981]  nfsd+0xd5/0x190 [nfsd]
[  338.043362]  kthread+0xe9/0x110
[  338.044680]  ? __pfx_kthread+0x10/0x10
[  338.046376]  ret_from_fork+0x2c/0x50
[  338.047892]  </TASK>
[  338.049067] ---[ end trace 0000000000000000 ]---
[  760.792789] BUG: kernel NULL pointer dereference, address: 0000000000000078
[  760.795933] #PF: supervisor read access in kernel mode
[  760.797477] #PF: error_code(0x0000) - not-present page
[  760.799120] PGD 0 P4D 0 
[  760.800140] Oops: 0000 [#1] PREEMPT SMP PTI
[  760.801383] CPU: 2 PID: 3401 Comm: nfsd Tainted: G        W   E      6.2.0-rc3+ #11
[  760.803120] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.1-2.fc37 04/01/2014
[  760.805018] RIP: 0010:filp_close+0x23/0x70
[  760.806099] Code: 90 90 90 90 90 90 90 0f 1f 44 00 00 41 54 55 53 48 8b 47 38 48 85 c0 0f 84 41 e1 6d 00 48 8b 47 28 48 89 fb 48 89 f5 45 31 e4 <48> 8b 40 78 48 85 c0 74 08 e8 6f 70 72 00 41 89 c4 f6 43 45 40 75
[  760.809737] RSP: 0018:ffffa36802e4bc78 EFLAGS: 00010246
[  760.811084] RAX: 0000000000000000 RBX: ffff9578c7d4d600 RCX: 0000000000000000
[  760.812540] RDX: 000000000000098d RSI: 0000000000000000 RDI: ffff9578c7d4d600
[  760.814433] RBP: 0000000000000000 R08: 0000011335048e60 R09: ffff9578f82f1540
[  760.816089] R10: ffffa36802e4bcd0 R11: ffffa36802e4bcd8 R12: 0000000000000000
[  760.817529] R13: 0000000000000001 R14: dead000000000100 R15: ffff9578f82f1558
[  760.818982] FS:  0000000000000000(0000) GS:ffff957a37c80000(0000) knlGS:0000000000000000
[  760.820544] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  760.821734] CR2: 0000000000000078 CR3: 00000001565ce002 CR4: 0000000000060ee0
[  760.823141] Call Trace:
[  760.823808]  <TASK>
[  760.824419]  nfsd_file_free+0xe9/0x210 [nfsd]
[  760.825610]  release_all_access+0x96/0x120 [nfsd]
[  760.826680]  nfs4_free_ol_stateid+0x22/0x60 [nfsd]
[  760.827747]  free_ol_stateid_reaplist+0x61/0x90 [nfsd]
[  760.828858]  release_openowner+0x258/0x2a0 [nfsd]
[  760.829792]  __destroy_client+0x183/0x290 [nfsd]
[  760.830694]  nfsd4_setclientid_confirm+0x1a3/0x4f0 [nfsd]
[  760.831763]  nfsd4_proc_compound+0x3ae/0x6f0 [nfsd]
[  760.832717]  nfsd_dispatch+0x16a/0x270 [nfsd]
[  760.833576]  svc_process_common+0x2eb/0x660 [sunrpc]
[  760.834587]  ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
[  760.835576]  ? __pfx_nfsd+0x10/0x10 [nfsd]
[  760.836462]  svc_process+0xad/0x100 [sunrpc]
[  760.837317]  nfsd+0xd5/0x190 [nfsd]
[  760.838133]  kthread+0xe9/0x110
[  760.838862]  ? __pfx_kthread+0x10/0x10
[  760.839755]  ret_from_fork+0x2c/0x50
[  760.840534]  </TASK>
[  760.841167] Modules linked in: nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) nls_iso8859_1(E) nls_cp437(E) vfat(E) fat(E) ext4(E) crc16(E) cirrus(E) kvm_intel(E) 9p(E) mbcache(E) joydev(E) virtio_net(E) drm_shmem_helper(E) net_failover(E) kvm(E) jbd2(E) netfs(E) psmouse(E) evdev(E) pcspkr(E) failover(E) irqbypass(E) virtio_balloon(E) drm_kms_helper(E) 9pnet_virtio(E) button(E) drm(E) configfs(E) zram(E) zsmalloc(E) crct10dif_pclmul(E) crc32_pclmul(E) nvme(E) ghash_clmulni_intel(E) virtio_blk(E) sha512_ssse3(E) sha512_generic(E) nvme_core(E) t10_pi(E) virtio_pci(E) virtio(E) crc64_rocksoft_generic(E) aesni_intel(E) crypto_simd(E) crc64_rocksoft(E) virtio_pci_legacy_dev(E) i6300esb(E) cryptd(E) serio_raw(E) crc64(E) virtio_pci_modern_dev(E) virtio_ring(E) btrfs(E) blake2b_generic(E) xor(E) raid6_pq(E) libcrc32c(E) crc32c_generic(E) crc32c_intel(E) autofs4(E)
[  760.853527] CR2: 0000000000000078
[  760.854340] ---[ end trace 0000000000000000 ]---
[  760.855261] RIP: 0010:filp_close+0x23/0x70
[  760.856185] Code: 90 90 90 90 90 90 90 0f 1f 44 00 00 41 54 55 53 48 8b 47 38 48 85 c0 0f 84 41 e1 6d 00 48 8b 47 28 48 89 fb 48 89 f5 45 31 e4 <48> 8b 40 78 48 85 c0 74 08 e8 6f 70 72 00 41 89 c4 f6 43 45 40 75
[  760.859350] RSP: 0018:ffffa36802e4bc78 EFLAGS: 00010246
[  760.860356] RAX: 0000000000000000 RBX: ffff9578c7d4d600 RCX: 0000000000000000
[  760.861628] RDX: 000000000000098d RSI: 0000000000000000 RDI: ffff9578c7d4d600
[  760.862898] RBP: 0000000000000000 R08: 0000011335048e60 R09: ffff9578f82f1540
[  760.864172] R10: ffffa36802e4bcd0 R11: ffffa36802e4bcd8 R12: 0000000000000000
[  760.865438] R13: 0000000000000001 R14: dead000000000100 R15: ffff9578f82f1558
[  760.866692] FS:  0000000000000000(0000) GS:ffff957a37c80000(0000) knlGS:0000000000000000
[  760.868053] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  760.869102] CR2: 0000000000000078 CR3: 00000001565ce002 CR4: 0000000000060ee0
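
As a rough illustration of why a purge that frees entries unconditionally can produce the "refcount_t: underflow; use-after-free" warning above, here is a toy model in Python. It is not the kernel code: RefCounted, purge_unconditionally and purge_dropping_cache_ref are hypothetical names, and the "fixed" variant only mirrors the idea in the merge request title ("don't free files unconditionally in __nfsd_file_cache_purge"), not the actual patch.

```python
class RefCounted:
    """Toy stand-in for a cached object; starts with one reference held by the cache."""

    def __init__(self, name):
        self.name = name
        self.refs = 1
        self.freed = False

    def get(self):
        if self.freed:
            raise RuntimeError("use-after-free: get on freed object")
        self.refs += 1

    def put(self):
        if self.refs == 0:
            raise RuntimeError("refcount underflow; use-after-free")
        self.refs -= 1
        if self.refs == 0:
            self.freed = True


def purge_unconditionally(cache):
    """Buggy model: free every entry regardless of outstanding references."""
    for obj in cache.values():
        obj.refs = 0
        obj.freed = True
    cache.clear()


def purge_dropping_cache_ref(cache):
    """Safer model: drop only the cache's own reference; other holders keep theirs."""
    for obj in list(cache.values()):
        obj.put()
    cache.clear()
```

In this model, a holder (standing in for an open stateid or delegation) that took its own reference before the purge hits the underflow error on its final put() after purge_unconditionally, but releases cleanly after purge_dropping_cache_ref.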

Comment 12 Jeff Layton 2023-01-19 19:38:36 UTC
Patch sent to the linux-nfs mailing list:

    https://lore.kernel.org/linux-nfs/20230119192021.83578-1-jlayton@kernel.org/T/#u

For QA, the reproducer is to run a bunch of NFSv4 activity against the server (pynfs is fine, but probably any rw-heavy file-based workload will do) while it's also running "exportfs -ra" in a tight loop. The current 9.2 kernels will crash rather quickly, but with the patch it seems to survive. 

It's probably possible to hit this with NFSv3 too, but that's a little more tricky since v3-only struct nfsd_files don't tend to live that long.

I'm building a test kernel now and will post a link to it once it's done.
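
The export-flushing half of the reproducer above could be driven by something like the following sketch. It assumes it is run as root on the NFS server, with a separate NFSv4 workload (pynfs or similar) running from a client at the same time. The function name flush_exports_loop and its run parameter are hypothetical; run exists only so the loop can be exercised without touching a real server.

```python
import subprocess
import time


def flush_exports_loop(iterations, run=subprocess.run, delay=0.0):
    """Run 'exportfs -ra' repeatedly; return how many invocations exited non-zero.

    run is injectable for dry runs; by default it is subprocess.run.
    """
    failures = 0
    for _ in range(iterations):
        result = run(["exportfs", "-ra"], check=False)
        if result.returncode != 0:
            failures += 1
        if delay:
            time.sleep(delay)
    return failures


if __name__ == "__main__":
    # Tight loop, as in the reproducer; bounded here so the script terminates.
    print(flush_exports_loop(10000))
```

Leaving delay at 0 keeps the loop tight, which is what the reproducer calls for; a small delay may be useful when the goal is only to confirm that a patched kernel survives a longer soak.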

Comment 22 Jeff Layton 2023-01-26 16:28:24 UTC
*** Bug 2164822 has been marked as a duplicate of this bug. ***

Comment 23 Jeff Layton 2023-01-26 16:31:42 UTC
*** Bug 2164820 has been marked as a duplicate of this bug. ***

Comment 24 Jeff Layton 2023-01-26 20:50:03 UTC
*** Bug 2164887 has been marked as a duplicate of this bug. ***

Comment 25 Jeff Layton 2023-01-26 21:01:57 UTC
Centos 9 MR is here: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/1942

Test kernels are available there.

Comment 26 Jeff Layton 2023-01-28 10:43:15 UTC
*** Bug 2165199 has been marked as a duplicate of this bug. ***

Comment 27 Yongcheng Yang 2023-01-28 13:37:30 UTC
(In reply to Jeff Layton from comment #25)
> Centos 9 MR is here:
> https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/1942
> 
> Test kernels are available there.

Verified that the MR build fixes this issue:

https://beaker.engineering.redhat.com/jobs/7475454
https://beaker.engineering.redhat.com/jobs/7475453

Comment 32 Yongcheng Yang 2023-02-08 01:49:28 UTC
Verified in the latest kernel version 5.14.0-261.el9:
https://beaker.engineering.redhat.com/jobs/7510052

There is a boot warning in 5.14.0-253.el9, but no panic occurs:
https://beaker.engineering.redhat.com/jobs/7510054

Reproduced in kernel 5.14.0-252.el9:
https://beaker.engineering.redhat.com/jobs/7510053

Comment 33 daryl herzmann 2023-02-17 18:00:15 UTC
FWIW, 5.14.0-267.el9.x86_64 has fixed this issue for me.  Thanks everyone.

Comment 36 errata-xmlrpc 2023-05-09 08:11:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: kernel security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:2458

