Bug 2138605
Summary: | [Regression] kernel BUG at lib/list_debug.c:26! RIP: 0010:__list_add_valid.cold+0x3a/0x3c | |
---|---|---|---
Product: | Red Hat Enterprise Linux 9 | Reporter: | Zhi Li <yieli>
Component: | kernel | Assignee: | Jeff Layton <jlayton>
kernel sub component: | NFS | QA Contact: | Zhi Li <yieli>
Status: | CLOSED ERRATA | Docs Contact: |
Severity: | unspecified | |
Priority: | unspecified | CC: | bxue, chorn, chuck.lever, jiyin, jlayton, xzhou, yoyang
Version: | 9.2 | Keywords: | Regression, Triaged
Target Milestone: | rc | |
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | kernel-5.14.0-239.el9 | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2023-05-09 08:05:48 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Description
Zhi Li
2022-10-30 04:19:40 UTC
This is one of several bugs that have cropped up around the filecache handling. I'm pursuing a set of fixes upstream that I'm hoping will fix these.

Some notes: I backported a pile of patches from Chuck Lever's upstream tree that I hoped would help this, but hit a different-looking list corruption when testing with NFSv3:

[ 4330.196752] list_del corruption. next->prev should be ffff96a4bedc2bd0, but was ffff96a46e7aa480

crash> bt
PID: 4893   TASK: ffff96a48e570000   CPU: 4   COMMAND: "nfsd"
 #0 [ffffba4c81c6b898] machine_kexec at ffffffff9b069d87
 #1 [ffffba4c81c6b8f0] __crash_kexec at ffffffff9b1c1e0d
 #2 [ffffba4c81c6b9b8] crash_kexec at ffffffff9b1c2ff8
 #3 [ffffba4c81c6b9c0] oops_end at ffffffff9b02827b
 #4 [ffffba4c81c6b9e0] do_trap at ffffffff9b024abe
 #5 [ffffba4c81c6ba30] do_error_trap at ffffffff9b024b75
 #6 [ffffba4c81c6ba70] exc_invalid_op at ffffffff9bb116be
 #7 [ffffba4c81c6ba90] asm_exc_invalid_op at ffffffff9bc00af6
    [exception RIP: __list_del_entry_valid.cold+0x1d]
    RIP: ffffffff9badbb86  RSP: ffffba4c81c6bb48  RFLAGS: 00010246
    RAX: 0000000000000054  RBX: ffff96a4bedc2bd0  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: ffff96a5efb198a0  RDI: ffff96a5efb198a0
    RBP: ffffffffc0d70b60   R8: 0000000000000000   R9: 00000000ffff7fff
    R10: ffffba4c81c6b9e8  R11: ffffffff9cde9748  R12: ffff96a4d0e34100
    R13: 0000000000000000  R14: 0000000000000000  R15: ffff96a4d0e34108
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffffba4c81c6bb40] __list_del_entry_valid.cold at ffffffff9badbb86
 #9 [ffffba4c81c6bb48] list_lru_del at ffffffff9b307325
#10 [ffffba4c81c6bb80] nfsd_file_lru_remove at ffffffffc0cfab72 [nfsd]
#11 [ffffba4c81c6bb98] nfsd_file_unhash_and_queue at ffffffffc0cfb947 [nfsd]
#12 [ffffba4c81c6bbc8] __nfsd_file_close_inode at ffffffffc0cfba1b [nfsd]
#13 [ffffba4c81c6bc50] nfsd_file_close_inode at ffffffffc0cfbb48 [nfsd]
#14 [ffffba4c81c6bc98] nfsd_file_fsnotify_handle_event at ffffffffc0cfbd28 [nfsd]
#15 [ffffba4c81c6bcb0] send_to_group at ffffffff9b40dbd1
#16 [ffffba4c81c6bd10] fsnotify at ffffffff9b40deb3
#17 [ffffba4c81c6bdb0] vfs_unlink at ffffffff9b3cad50
#18 [ffffba4c81c6bdf0] nfsd_unlink at ffffffffc0cf3a1d [nfsd]
#19 [ffffba4c81c6be38] nfsd3_proc_remove at ffffffffc0cfdaff [nfsd]
#20 [ffffba4c81c6be68] nfsd_dispatch at ffffffffc0cec419 [nfsd]
#21 [ffffba4c81c6be90] svc_process_common at ffffffffc0b649ac [sunrpc]
#22 [ffffba4c81c6bee0] svc_process at ffffffffc0b64c87 [sunrpc]
#23 [ffffba4c81c6bef8] nfsd at ffffffffc0cebe95 [nfsd]
#24 [ffffba4c81c6bf18] kthread at ffffffff9b11de29
#25 [ffffba4c81c6bf50] ret_from_fork at ffffffff9b001f82

...the nfsd_file that it complained about is at 0xffff96a4bedc2bc8:

crash> struct nfsd_file ffff96a4bedc2bc8
struct nfsd_file {
  nf_rhash = {
    next = 0xffff96a282e02321
  },
  nf_lru = {
    next = 0xffff96a46e7aa480,
    prev = 0xffff96a47f849d70
  },
  nf_rcu = {
    next = 0xffff96a282807400,
    func = 0x0
  },
  nf_file = 0xffff96a291704500,
  nf_cred = 0xffff96a4bf23c180,
  nf_net = 0xffffffff9da0ad40 <init_net>,
  nf_flags = 0xc,
  nf_inode = 0xffff96a546051500,
  nf_ref = {
    refs = {
      counter = 0x1
    }
  },
  nf_may = 0x2,
  nf_mark = 0xffff96a4bedc13c0,
  nf_birthtime = 0x3ef5d89b78d
}

...which looks fine, but the next one on the LRU has its nf_lru list pointing to itself.
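As a side note on what "pointing to itself" implies: list_del_init() and INIT_LIST_HEAD() leave a list_head with next == prev == its own address (the "empty" state that list_empty() tests for), and the CONFIG_DEBUG_LIST unlink check fires precisely when a neighbour's back-pointer no longer matches. Below is a minimal standalone sketch of both ideas; it is plain userspace C with illustrative helper names, not the kernel's lib/list.h or lib/list_debug.c.

-----------------8<--------------------
/* Standalone illustration, not kernel code: self-pointing list_heads and
 * the next->prev consistency check behind the splat above. */
#include <stdbool.h>
#include <stdio.h>

struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;		/* "empty": both pointers reference h itself */
}

/* roughly what CONFIG_DEBUG_LIST verifies before unlinking an entry */
static bool list_del_entry_valid(struct list_head *entry)
{
	if (entry->next->prev != entry) {
		printf("list_del corruption. next->prev should be %p, but was %p\n",
		       (void *)entry, (void *)entry->next->prev);
		return false;
	}
	if (entry->prev->next != entry) {
		printf("list_del corruption. prev->next should be %p, but was %p\n",
		       (void *)entry, (void *)entry->prev->next);
		return false;
	}
	return true;
}

static void list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	INIT_LIST_HEAD(entry);	/* entry now points back at itself */
}

int main(void)
{
	struct list_head head, a, b;

	/* hand-build head <-> a <-> b <-> head */
	head.next = &a;   a.prev = &head;
	a.next   = &b;    b.prev = &a;
	b.next   = &head; head.prev = &b;

	/* normal removal: b ends up "empty", pointing back at itself */
	list_del_init(&b);

	/*
	 * Simulate the state in the crash dump: a still (or again) names b
	 * as its next neighbour (e.g. via a stale pointer or racy reuse of
	 * the list_head), while b's own pointers reference only itself.
	 */
	a.next = &b;

	/* deleting a now trips the same check that fired in the oops */
	if (!list_del_entry_valid(&a))
		printf("with CONFIG_DEBUG_LIST=y the kernel would BUG() here\n");
	return 0;
}
-----------------8<--------------------

Built and run, this prints the same style of "next->prev should be ..., but was ..." complaint, with the "was" address being the self-pointing entry, which is the same pattern as the entry dumped next.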
crash> struct -l nfsd_file.nf_lru nfsd_file 0xffff96a46e7aa480
struct nfsd_file {
  nf_rhash = {
    next = 0xffff96a522c0e410
  },
  nf_lru = {
    next = 0xffff96a46e7aa480,
    prev = 0xffff96a46e7aa480
  },
  nf_rcu = {
    next = 0xffff96a47e20d700,
    func = 0x0
  },
  nf_file = 0xffff96a2820c7300,
  nf_cred = 0xffff96a500d0e3c0,
  nf_net = 0xffffffff9da0ad40 <init_net>,
  nf_flags = 0xc,
  nf_inode = 0xffff96a546050d00,
  nf_ref = {
    refs = {
      counter = 0x1
    }
  },
  nf_may = 0x4,
  nf_mark = 0xffff96a4bf162960,
  nf_birthtime = 0x3f01c5477f8
}

...and thus the list corruption. Both entries have the same flags set (GC and REFERENCED). The HASHED bit is set when the thing is newly allocated, so this entry was in the hash and then removed at some point.

Looking at the LRU itself:

crash> struct list_lru_node 0xffff96a4d0e34100
struct list_lru_node {
  lock = {
    {
      rlock = {
        raw_lock = {
          {
            val = {
              counter = 0x1
            },
            {
              locked = 0x1,
              pending = 0x0
            },
            {
              locked_pending = 0x1,
              tail = 0x0
            }
          }
        }
      }
    }
  },
  lru = {
    list = {
      next = 0xffff96a47f849d70,
      prev = 0xffff96a47fbe2210
    },
    nr_items = 0x4
  },
  nr_items = 0x4
}

The list is clearly corrupt, but walking from the prev pointer shows all 4 entries:

crash> list 0xffff96a47fbe2210
ffff96a47fbe2210
ffff96a4d0e34108   << head of the list
ffff96a47f849d70
ffff96a4bedc2bd0
ffff96a46e7aa480

So in summary, we have 4 nfsd_file entries on the LRU list, two of them with NFSD_FILE_LRU clear (indicating that they are just waiting to be removed). One of those entries has an empty nf_lru list_head. I don't see how that can happen given the way the list_lru spinlocking works, so this looks a lot like a memory scribble or use-after-free.

I might see one way this could occur. In nfsd_file_close_inode, we do this after queueing the thing to the dispose list:

-----------------8<--------------------
	list_for_each_entry_safe(nf, tmp, &dispose, nf_lru) {
		trace_nfsd_file_closing(nf);
		if (!refcount_dec_and_test(&nf->nf_ref))
			list_del_init(&nf->nf_lru);
	}
-----------------8<--------------------

...but, while it's on the dispose list, we don't necessarily hold the only reference to it. Something else could race in and try to put this object on the LRU just before we add it to the dispose list. I suspect that our use of nf_lru for both the LRU and the dispose list is racy. I'm looking at ways to shore that up now. The simplest may be to just add a new nf_dispose list_head to struct nfsd_file.

In any case, I think the crash I hit yesterday is different from the originally reported bug here, as the backported patches I'm testing have changed this code significantly.

My current theory is that the problem is in nfsd_file_unhash_and_queue, specifically when called from __nfsd_file_close_inode. That code attempts to take a reference to the nf and then queues it to a dispose list_head using the nf_lru list_head in the struct. Once it's on a dispose list, the nf_lru is manipulated locklessly. The problem is that we don't know for certain that we hold the last reference in this case; someone else could be attempting to manipulate the nf_lru at the same time.

I think the solution is to ensure that we don't attempt to repurpose the nf_lru for anything but the actual LRU unless the last reference has been put. I'm working on a patch for that now.
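To make the "separate dispose list_head" idea above concrete, here is a minimal userspace sketch. The struct and field names (fake_nfsd_file, nf_dispose) are hypothetical stand-ins and this is not the actual nfsd patch; the point is only that a dedicated dispose linkage leaves nf_lru alone, so a racing path that still holds a reference has nothing to collide with.

-----------------8<--------------------
/* Hypothetical stand-ins for illustration only; not the actual nfsd patch. */
#include <stdio.h>

struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add_tail(struct list_head *new, struct list_head *head)
{
	new->prev = head->prev;
	new->next = head;
	head->prev->next = new;
	head->prev = new;
}

/* stand-in for struct nfsd_file; nf_dispose is the hypothetical new field */
struct fake_nfsd_file {
	struct list_head nf_lru;	/* owned by the list_lru, under its lock */
	struct list_head nf_dispose;	/* private to the teardown path */
	int nf_ref;
};

int main(void)
{
	struct fake_nfsd_file nf = { .nf_ref = 2 };
	struct list_head lru, dispose;

	INIT_LIST_HEAD(&lru);
	INIT_LIST_HEAD(&dispose);
	INIT_LIST_HEAD(&nf.nf_lru);
	INIT_LIST_HEAD(&nf.nf_dispose);

	/* LRU side: the object sits on the LRU via nf_lru */
	list_add_tail(&nf.nf_lru, &lru);

	/*
	 * Teardown side: queue the object for disposal via nf_dispose.
	 * nf_lru is untouched, so a racing LRU user that still holds a
	 * reference has nothing to corrupt; nf_lru is only unlinked later,
	 * once the final reference is put.
	 */
	list_add_tail(&nf.nf_dispose, &dispose);

	printf("on LRU: %s, on dispose list: %s, refs: %d\n",
	       lru.next == &nf.nf_lru ? "yes" : "no",
	       dispose.next == &nf.nf_dispose ? "yes" : "no",
	       nf.nf_ref);
	return 0;
}
-----------------8<--------------------

This just restates the conclusion above in code form: don't repurpose nf_lru for anything but the actual LRU unless the last reference has been put. The real patch may well look different.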
> How reproducible:
> 100%
Is this really 100% reproducible with this reproducer? I'd be thrilled if so, but I'd like to confirm that, as this problem has been very elusive so far.
*** Bug 2154740 has been marked as a duplicate of this bug. ***

Moving to VERIFIED according to comment#34.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: kernel security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:2458