Bug 2164887 - [nfs] refcount_t: underflow; use-after-free lib/refcount.c:28 refcount_warn_saturat
Keywords:
Status: CLOSED DUPLICATE of bug 2160443
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: kernel
Version: CentOS Stream
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: nfs-maint
QA Contact: Filesystem QE
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-01-26 20:38 UTC by daryl herzmann
Modified: 2023-01-26 20:50 UTC (History)
CC List: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-01-26 20:50:03 UTC
Type: Bug
Target Upstream Version:
Embargoed:




Links
System                 ID               Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker  RHELPLAN-146613  0        None      None    None     2023-01-26 20:41:11 UTC
Red Hat Issue Tracker  RHELPLAN-146614  0        None      None    None     2023-01-26 20:41:13 UTC

Description daryl herzmann 2023-01-26 20:38:06 UTC
Description of problem:

I have been fighting a reproducible NFS kernel panic on a CentOS Stream 9 + ZFS host. I saw that a bunch of NFSv4 fixes went into 5.14.0-239, so I got excited, but I was still able to reproduce the crash just by running `exportfs -a`.

Version-Release number of selected component (if applicable):

Current CentOS Stream 9
5.14.0-239.el9.x86_64

How reproducible:

Seemingly always. ;(

Steps to Reproduce:
1. exportfs -a

Actual results:

[  122.093687] ------------[ cut here ]------------
[  122.093723] refcount_t: underflow; use-after-free.
[  122.093749] WARNING: CPU: 18 PID: 5275 at lib/refcount.c:28 refcount_warn_saturate+0xba/0x110
[  122.093786] Modules linked in: rpcsec_gss_krb5 tls vhost_net vhost vhost_iotlb tap tun xt_CHECKSUM xt_MASQUERADE nft_chain_nat nf_nat rpcrdma rdma_cm iw_cm ib_cm ib_core bridge stp llc nft_counter ipt_REJECT nf_reject_ipv4 xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables dell_rbu nfnetlink ledtrig_audio rfkill intel_rapl_msr video dcdbas intel_rapl_common amd64_edac edac_mce_amd kvm_amd kvm irqbypass rapl pcspkr dell_smbios dell_wmi_descriptor wmi_bmof ipmi_ssif k10temp i2c_piix4 ptdma acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter vfat fat zfs(POE) zunicode(POE) zzstd(OE) ext4 zlua(OE) zavl(POE) icp(POE) mbcache zcommon(POE) jbd2 joydev znvpair(POE) spl(OE) nfsd auth_rpcgss nfs_acl lockd grace sunrpc fuse xfs libcrc32c sd_mod sg mgag200 i2c_algo_bit drm_shmem_helper drm_kms_helper nvme syscopyarea ahci sysfillrect nvme_core sysimgblt libahci crct10dif_pclmul fb_sys_fops crc32_pclmul crc32c_intel nvme_common drm libata ghash_clmulni_intel tg3
[  122.093859]  megaraid_sas ccp t10_pi sp5100_tco wmi dm_mirror dm_region_hash dm_log dm_mod
[  122.094095] CPU: 18 PID: 5275 Comm: nfsd Kdump: loaded Tainted: P           OE    --------- ---  5.14.0-239.el9.x86_64 #1
[  122.094140] Hardware name: Dell Inc. PowerEdge R7525/0590KW, BIOS 2.8.4 06/23/2022
[  122.094164] RIP: 0010:refcount_warn_saturate+0xba/0x110
[  122.094183] Code: 01 01 e8 49 a9 56 00 0f 0b e9 22 98 89 00 80 3d 4a 3e 9b 01 00 75 85 48 c7 c7 88 cb c4 a1 c6 05 3a 3e 9b 01 01 e8 26 a9 56 00 <0f> 0b e9 ff 97 89 00 80 3d 25 3e 9b 01 00 0f 85 5e ff ff ff 48 c7
[  122.094233] RSP: 0018:ffffa6e1cf28fcc8 EFLAGS: 00010282
[  122.094268] RAX: 0000000000000000 RBX: ffff94789aef8208 RCX: 0000000000000027
[  122.094291] RDX: ffff9493bf4998a8 RSI: 0000000000000001 RDI: ffff9493bf4998a0
[  122.094313] RBP: ffff947739115280 R08: 0000000000000000 R09: 00000000ffff7fff
[  122.094335] R10: ffffa6e1cf28fb68 R11: ffffffffa25e96c8 R12: ffff94789aefe024
[  122.094358] R13: ffff94789aef8208 R14: 0000000000000000 R15: ffff9476712e4fd0
[  122.094380] FS:  0000000000000000(0000) GS:ffff9493bf480000(0000) knlGS:0000000000000000
[  122.094406] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  122.094425] CR2: 00007f10fb4d6e10 CR3: 00000020b2c32000 CR4: 0000000000350ee0
[  122.094448] Call Trace:
[  122.094461]  <TASK>
[  122.094471]  nfsd_file_free+0x225/0x230 [nfsd]
[  122.094512]  nfs4_file_put_access+0x7e/0x130 [nfsd]
[  122.094550]  release_all_access+0x6a/0x80 [nfsd]
[  122.094584]  nfs4_free_ol_stateid+0x22/0x60 [nfsd]
[  122.094618]  nfs4_put_stid+0xb1/0x100 [nfsd]
[  122.094652]  nfsd4_close+0x1e3/0x3c0 [nfsd]
[  122.094687]  ? nfsd4_encode_getattr+0x28/0x30 [nfsd]
[  122.094723]  ? nfsd4_encode_operation+0xdc/0x270 [nfsd]
[  122.094759]  nfsd4_proc_compound+0x446/0x6f0 [nfsd]
[  122.094796]  nfsd_dispatch+0x15e/0x290 [nfsd]
[  122.094831]  svc_process_common+0x3bc/0x5e0 [sunrpc]
[  122.094877]  ? nfsd_svc+0x190/0x190 [nfsd]
[  122.094910]  ? nfsd_shutdown_threads+0xa0/0xa0 [nfsd]
[  122.095603]  svc_process+0xb7/0xf0 [sunrpc]
[  122.096261]  nfsd+0xd5/0x190 [nfsd]
[  122.096904]  kthread+0xd9/0x100
[  122.097535]  ? kthread_complete_and_exit+0x20/0x20
[  122.098162]  ret_from_fork+0x22/0x30
[  122.098781]  </TASK>
[  122.099380] ---[ end trace 7a8b3c06a65fce64 ]---
[  122.122108] BUG: kernel NULL pointer dereference, address: 0000000000000000
[  122.122923] #PF: supervisor instruction fetch in kernel mode
[  122.123543] #PF: error_code(0x0010) - not-present page
[  122.124136] PGD 208ac34067 P4D 208ac34067 PUD 208ac32067 PMD 0 
[  122.124720] Oops: 0010 [#1] PREEMPT SMP NOPTI
[  122.125286] CPU: 18 PID: 0 Comm: swapper/18 Kdump: loaded Tainted: P        W  OE    --------- ---  5.14.0-239.el9.x86_64 #1
[  122.125860] Hardware name: Dell Inc. PowerEdge R7525/0590KW, BIOS 2.8.4 06/23/2022
[  122.126423] RIP: 0010:0x0
[  122.126971] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
[  122.127514] RSP: 0018:ffffa6e1ccaacee8 EFLAGS: 00010202
[  122.128044] RAX: 0000000000000000 RBX: 0000000000000004 RCX: 00000000105ba012
[  122.128569] RDX: ffff94789aefe048 RSI: bbb4bbef1e6de0fa RDI: ffff94789aef8220
[  122.129085] RBP: ffff94744628b900 R08: ffffffffa2259364 R09: 0000000000000101
[  122.129593] R10: 0000000000000040 R11: ffffffffa2206100 R12: ffff9493bf4abb40
[  122.130096] R13: 0000000000000003 R14: 000000000000000a R15: 0000000000000000
[  122.130588] FS:  0000000000000000(0000) GS:ffff9493bf480000(0000) knlGS:0000000000000000
[  122.131074] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  122.131553] CR2: ffffffffffffffd6 CR3: 00000020b2c32000 CR4: 0000000000350ee0
[  122.132035] Call Trace:
[  122.132507]  <IRQ>
[  122.132975]  rcu_do_batch+0x1ae/0x4d0
[  122.133449]  rcu_core+0x26a/0x410
[  122.133911]  __do_softirq+0xca/0x2ac
[  122.134366]  __irq_exit_rcu+0xb5/0xe0
[  122.134814]  sysvec_apic_timer_interrupt+0x72/0x90
[  122.135258]  </IRQ>
[  122.135688]  <TASK>
[  122.136117]  asm_sysvec_apic_timer_interrupt+0x16/0x20
[  122.136556] RIP: 0010:mwait_idle+0x51/0x80
[  122.136997] Code: 31 d2 48 89 d1 65 48 8b 04 25 40 8f 01 00 0f 01 c8 48 8b 00 a8 08 75 14 eb 07 0f 00 2d c4 8e 4d 00 31 c0 48 89 c1 fb 0f 01 c9 <eb> 01 fb 65 48 8b 04 25 40 8f 01 00 f0 80 60 02 df e9 79 fc 2c 00
[  122.137929] RSP: 0018:ffffa6e1c8217ed0 EFLAGS: 00000246
[  122.138401] RAX: 0000000000000000 RBX: ffff94744628b900 RCX: 0000000000000000
[  122.138881] RDX: 0000000000000000 RSI: 0000000000000012 RDI: 0000000000164ed2
[  122.139361] RBP: 0000000000000000 R08: 0000001c6eb41521 R09: ffff9474502e5600
[  122.139842] R10: 00000000000002e4 R11: ffff94744a8d0e10 R12: 0000000000000000
[  122.140323] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[  122.140804]  default_idle_call+0x33/0xe0
[  122.141285]  cpuidle_idle_call+0x15d/0x1c0
[  122.141762]  ? ktime_get+0x38/0xa0
[  122.142238]  do_idle+0x7b/0xe0
[  122.142709]  cpu_startup_entry+0x19/0x20
[  122.143180]  start_secondary+0x116/0x140
[  122.143653]  secondary_startup_64_no_verify+0xe5/0xeb
[  122.144129]  </TASK>
[  122.144597] Modules linked in: rpcsec_gss_krb5 tls vhost_net vhost vhost_iotlb tap tun xt_CHECKSUM xt_MASQUERADE nft_chain_nat nf_nat rpcrdma rdma_cm iw_cm ib_cm ib_core bridge stp llc nft_counter ipt_REJECT nf_reject_ipv4 xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables dell_rbu nfnetlink ledtrig_audio rfkill intel_rapl_msr video dcdbas intel_rapl_common amd64_edac edac_mce_amd kvm_amd kvm irqbypass rapl pcspkr dell_smbios dell_wmi_descriptor wmi_bmof ipmi_ssif k10temp i2c_piix4 ptdma acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter vfat fat zfs(POE) zunicode(POE) zzstd(OE) ext4 zlua(OE) zavl(POE) icp(POE) mbcache zcommon(POE) jbd2 joydev znvpair(POE) spl(OE) nfsd auth_rpcgss nfs_acl lockd grace sunrpc fuse xfs libcrc32c sd_mod sg mgag200 i2c_algo_bit drm_shmem_helper drm_kms_helper nvme syscopyarea ahci sysfillrect nvme_core sysimgblt libahci crct10dif_pclmul fb_sys_fops crc32_pclmul crc32c_intel nvme_common drm libata ghash_clmulni_intel tg3
[  122.144637]  megaraid_sas ccp t10_pi sp5100_tco wmi dm_mirror dm_region_hash dm_log dm_mod
[  122.149557] CR2: 0000000000000000

I have a vmcore to share if anybody is interested in looking into this further.
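
For anyone comparing this trace against the one in another report, a rough sketch of pulling the call-trace frames out of a log like the one above may help. This is only an illustration: `FRAME_RE`, `parse_frames`, and `SAMPLE` are names made up here, and the pattern assumes the `[timestamp]  symbol+0xoff/0xsize [module]` layout seen in this report. Lines such as `RIP: ...` and `<TASK>` are deliberately not treated as frames.

```python
import re

# Matches a frame line such as:
#   [  122.094471]  nfsd_file_free+0x225/0x230 [nfsd]
# A leading "? " marks a frame the unwinder reports as unreliable.
# Assumption: the layout matches the log in this report; other kernels
# or console settings may format lines differently.
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?P<unreliable>\? )?"
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<module>\w+)\])?\s*$"
)


def parse_frames(log_text):
    """Return (symbol, module_or_None, is_reliable) for each frame line."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append(
                (
                    match.group("symbol"),
                    match.group("module"),
                    match.group("unreliable") is None,
                )
            )
    return frames


# A few lines copied from the trace above, used only as sample input.
SAMPLE = """\
[  122.094164] RIP: 0010:refcount_warn_saturate+0xba/0x110
[  122.094461]  <TASK>
[  122.094471]  nfsd_file_free+0x225/0x230 [nfsd]
[  122.094687]  ? nfsd4_encode_getattr+0x28/0x30 [nfsd]
[  122.096904]  kthread+0xd9/0x100
"""

if __name__ == "__main__":
    for symbol, module, reliable in parse_frames(SAMPLE):
        marker = "" if reliable else "? "
        print(f"{marker}{symbol} [{module or 'core kernel'}]")
```

Dropping the offsets and keeping only reliable frames gives a list of symbols that can be compared between builds, since offsets tend to shift from one kernel build to the next.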

Comment 1 Jeff Layton 2023-01-26 20:50:03 UTC

*** This bug has been marked as a duplicate of bug 2160443 ***

