Bug 2215429
| Summary: | Panic in __percpu_counter_sum via nfsd_reply_cache_stats_show | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Eirik Fuller <efuller> |
| Component: | kernel | Assignee: | Jeff Layton <jlayton> |
| kernel sub component: | NFS | QA Contact: | JianHong Yin <jiyin> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | unspecified | Priority: | unspecified |
| CC: | chuck.lever, jiyin, jlayton, nfs-team, vbenes, xzhou, yieli, yoyang | | |
| Version: | 9.2 | Keywords: | Regression, Triaged |
| Target Milestone: | rc | Flags: | pm-rhel: mirror+ |
| Target Release: | --- | Hardware: | Unspecified |
| OS: | Linux | Whiteboard: | |
| Fixed In Version: | kernel-5.14.0-334.el9 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-11-07 08:48:08 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | Attachments: | |
Description (Eirik Fuller, 2023-06-16 01:52:51 UTC)
Making this bug public, and cc'ing Chuck in case he has thoughts. Looking at nfsd_umount:
static void nfsd_umount(struct super_block *sb)
{
struct net *net = sb->s_fs_info;
nfsd_shutdown_threads(net);
kill_litter_super(sb);
put_net(net);
}
It looks like we hold a reference to the net while the nfsdfs superblock is mounted, so it seems unlikely that we'd tear down the net while someone was still scraping the file. Does this mean that we're presenting the "filecache" file in nfsdfs before the reply cache has been initialized?
Eirik, are you able to tell from the core whether this happened while the net was being brought up or shut down?
If it happens while the net is coming up, then it should be sufficient to just make nfsd_reply_cache_stats_show check nfsd_net_up before trying to touch any of the stats. If it's happening on net shutdown, though, things are a little trickier, but I'm thinking that's not possible given that the superblock apparently holds a ref to the net.

Ok, I think this is a regression from this patch:
commit f5f9d4a314da88c0a5faa6d168bf69081b7a25ae
Author: Jeff Layton <jlayton>
Date: Wed Jan 11 11:19:59 2023 -0500
nfsd: move reply cache initialization into nfsd startup
There's no need to start the reply cache before nfsd is up and running,
and doing so means that we register a shrinker for every net namespace
instead of just the ones where nfsd is running.
Move it to the per-net nfsd startup instead.
Reported-by: Dai Ngo <dai.ngo>
Signed-off-by: Jeff Layton <jlayton>
Signed-off-by: Chuck Lever <chuck.lever>
The filecache file is available before nfsd is started, so you can probably reproduce this by just creating a new net namespace, mounting its /proc/fs/nfsd, and then reading its filecache file without ever starting nfsd. We need to move the initialization of the percpu variables back into nfsd_init_net. I'll see if I can reproduce this and then send a patch upstream.
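To make that intended fix concrete, here is a rough, non-compilable sketch of the shape such a change could take. Of the names below, only nfsd_init_net and nfsd_reply_cache_stats_init come from this discussion; nfsd_exit_net, the unwind label, and the _destroy counterpart are assumptions about the surrounding code, and the actual change is the patch attached further down.

/*
 * Sketch only: initialize the per-net reply cache counters when the
 * namespace is created, so nfsd_reply_cache_stats_show() never sees an
 * uninitialized nn->counter[], and tear them down when the namespace
 * goes away.
 */
static __net_init int nfsd_init_net(struct net *net)
{
	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
	int retval;

	/* ... existing per-net setup ... */

	retval = nfsd_reply_cache_stats_init(nn);	/* moved here from nfsd startup */
	if (retval)
		goto out_undo_setup;			/* hypothetical unwind label */

	/* ... */
	return 0;

out_undo_setup:
	/* unwind whatever was initialized above */
	return retval;
}

static __net_exit void nfsd_exit_net(struct net *net)
{
	struct nfsd_net *nn = net_generic(net, nfsd_net_id);

	nfsd_reply_cache_stats_destroy(nn);	/* assumed counterpart of the init helper */
	/* ... existing per-net teardown ... */
}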
I'll mark this bug for 9.3, but I think we'll need a backport to 9.2.z as well.
Nice catch, Eirik!
This panic did not occur during Fedora Rawhide testing, with kernel 6.4.0-0.rc6.20230614gitb6dad5178cea.49.fc39; it's not yet clear whether that's due to an upstream fix or the intermittent nature of this problem, but the panics have been consistent enough with RHEL9 kernels to suggest the former.

The panic did not occur with kernel 5.14.0-325.bz2215429.el9, which added just the following patch to kernel 5.14.0-325.el9:

diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c
index 041faa13b852..d02c2ac37cec 100644
--- a/fs/nfsd/nfscache.c
+++ b/fs/nfsd/nfscache.c
@@ -614,7 +614,7 @@ int nfsd_reply_cache_stats_show(struct seq_file *m, void *v)
 		   atomic_read(&nn->num_drc_entries));
 	seq_printf(m, "hash buckets: %u\n", 1 << nn->maskbits);
 	seq_printf(m, "mem usage: %lld\n",
-		   percpu_counter_sum_positive(&nn->counter[NFSD_NET_DRC_MEM_USAGE]));
+		   nn->nfsd_net_up ? percpu_counter_sum_positive(&nn->counter[NFSD_NET_DRC_MEM_USAGE]) : 0);
 	seq_printf(m, "cache hits: %lld\n",
 		   percpu_counter_sum_positive(&nfsdstats.counter[NFSD_STATS_RC_HITS]));
 	seq_printf(m, "cache misses: %lld\n",
@@ -622,7 +622,7 @@ int nfsd_reply_cache_stats_show(struct seq_file *m, void *v)
 	seq_printf(m, "not cached: %lld\n",
 		   percpu_counter_sum_positive(&nfsdstats.counter[NFSD_STATS_RC_NOCACHE]));
 	seq_printf(m, "payload misses: %lld\n",
-		   percpu_counter_sum_positive(&nn->counter[NFSD_NET_PAYLOAD_MISSES]));
+		   nn->nfsd_net_up ? percpu_counter_sum_positive(&nn->counter[NFSD_NET_PAYLOAD_MISSES]) : 0);
 	seq_printf(m, "longest chain len: %u\n", nn->longest_chain);
 	seq_printf(m, "cachesize at longest: %u\n", nn->longest_chain_cachesize);
 	return 0;

Created attachment 1971173 [details]
nfsd: move init of percpu reply_cache_stats counters back to nfsd_init_net
Eirik, would you be able to test this patch? I think this should fix the underlying issue.
Yes, I'll test the patch.

The following command sequence triggers a panic with kernel 5.14.0-284.11.1.el9_2:

mount -t nfsd nfsd /proc/fs/nfsd
cat /proc/fs/nfsd/reply_cache_stats

That seems to be easily reproducible. The first time I tried those commands on Fedora Rawhide I saw the following.

[root@localhost ~]# mount -t nfsd nfsd /proc/fs/nfsd
[root@localhost ~]# cat /proc/fs/nfsd/
clients/ export_stats max_block_size nfsv4leasetime pool_threads supported_krb5_enctypes unlock_ip export_features filecache max_connections nfsv4recoverydir portlist threads v4_end_grace exports filehandle nfsv4gracetime pool_stats reply_cache_stats unlock_filesystem versions
[root@localhost ~]# cat /proc/fs/nfsd/reply_cache_stats
Segmentation fault
[root@localhost ~]#

(that intermediate output is bash tab completion) However, the second time I tried those commands, the system panicked (kernel 6.4.0-0.rc6.20230614gitb6dad5178cea.49.fc39.aarch64).

Strange. I can't reproduce this on Fedora kernels on x86_64. Basically, it appears that nn->counter is never NULL in my testing, though it seems like it should be. I added some trace_printk's to nfsd_reply_cache_stats_init (which is what should initialize nn->counter) and nfsd_reply_cache_stats_show. It looks like nn->counter is already initialized on this host, even though the nfsd_reply_cache_stats_init trace_printk doesn't fire until you start up the nfs server. Is this issue arch-specific? Maybe percpu vars are implemented differently on x86 vs. ARM? (See the simplified __percpu_counter_sum sketch further down.) FWIW, I also can't seem to reproduce this on centos9 x86_64 kernels. I'll see if I can check out an ARM host to test with.

Bug 2031604 comment 11 suggests a possible reason null percpu pointers don't trigger panics with x86_64, the upshot of which is "passing NULL to this_cpu_ptr returns fixed_percpu_data on x86_64", i.e. the resulting pointer is mapped (but pity the fool who tries to modify memory accessed that way :)

I have a scratch build with the patch from attachment 1971173 [details] ready for testing. I'll report the results of that testing here.

(In reply to Eirik Fuller from comment #9)
> Bug 2031604 comment 11 suggests a possible reason null percpu
> pointers don't trigger panics with x86_64, the upshot of which is "passing
> NULL to this_cpu_ptr returns fixed_percpu_data on x86_64", i.e. the
> resulting pointer is mapped (but pity the fool who tries to modify memory
> accessed that way :)
>

Dear lord! That's certainly unexpected. I wonder if that's intentional, or if the behavior of this_cpu_ptr in this case is just a "happy" accident? Many thanks for testing the patch. I'll plan to post it upstream if it looks good.

The NetworkManager-ci tests are still happily churning away on ampere-mtsnow-altra-02-vm-03, but in the meantime:

[root@ampere-mtsnow-altra-02-vm-03 ~]# uname -a
Linux ampere-mtsnow-altra-02-vm-03.lab.eng.rdu2.redhat.com 5.14.0-327.bz2215429.el9.aarch64 #1 SMP PREEMPT_DYNAMIC Fri Jun 16 11:26:40 EDT 2023 aarch64 aarch64 aarch64 GNU/Linux
[root@ampere-mtsnow-altra-02-vm-03 ~]# mount -t nfsd nfsd /proc/fs/nfsd
[root@ampere-mtsnow-altra-02-vm-03 ~]# cat /proc/fs/nfsd/reply_cache_stats
max entries: 0
num entries: 0
hash buckets: 1
mem usage: 0
cache hits: 0
cache misses: 0
not cached: 0
payload misses: 0
longest chain len: 0
cachesize at longest: 0
[root@ampere-mtsnow-altra-02-vm-03 ~]# umount /proc/fs/nfsd
[root@ampere-mtsnow-altra-02-vm-03 ~]#

Thanks for testing it. Patch posted upstream:
https://lore.kernel.org/linux-nfs/20230616191744.202292-1-jlayton@kernel.org/T/#u
Once this goes into mainline I'll make a MR for 9.3.
I'll also nominate this for 9.2.z as it looks like we took that patch that caused the regression into kernel-5.14.0-284.3.1.el9_2.
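For context on the arch-specific behavior discussed above (and the traces that follow), here is a simplified sketch of the summation that __percpu_counter_sum() performs; it is not the verbatim kernel source, and the locking is omitted. The key point is the per-CPU dereference of fbc->counters, which is NULL when percpu_counter_init() was never called for the nfsd_net counters.

/*
 * Simplified sketch of the per-CPU summation in lib/percpu_counter.c
 * (locking omitted; not the exact kernel source).  For this bug, fbc is
 * one of the counters embedded in struct nfsd_net, e.g.
 * &nn->counter[NFSD_NET_DRC_MEM_USAGE].
 */
static s64 percpu_counter_sum_sketch(struct percpu_counter *fbc)
{
	s64 ret = fbc->count;
	int cpu;

	for_each_online_cpu(cpu) {
		/*
		 * If percpu_counter_init() never ran, fbc->counters is NULL
		 * and per_cpu_ptr(NULL, cpu) is just that CPU's percpu
		 * offset.  On aarch64 and ppc64le that address is unmapped,
		 * which is the fault seen in the traces below; on x86_64 the
		 * zero-based percpu segment makes it alias mapped memory, so
		 * the read quietly returns garbage instead of crashing (the
		 * bug 2031604 comment 11 observation quoted above).
		 */
		s32 *pcount = per_cpu_ptr(fbc->counters, cpu);

		ret += *pcount;
	}
	return ret;
}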
This panic also occurs on ppc64le:

[ 20.115939] Kernel attempted to read user page (ffd2e0000) - exploit attempt? (uid: 0)
[ 20.115946] BUG: Unable to handle kernel data access on read at 0xffd2e0000
[ 20.115947] Faulting instruction address: 0xc00000000086fa50
[ 20.115950] Oops: Kernel access of bad area, sig: 11 [#1]
[ 20.115968] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
[ 20.115973] Modules linked in: nfsd auth_rpcgss nfs_acl lockd grace bonding tls rfkill sunrpc pseries_rng drm fuse drm_panel_orientation_quirks xfs libcrc32c sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp vmx_crypto dm_mirror dm_region_hash dm_log dm_mod
[ 20.116002] CPU: 3 PID: 4049 Comm: cat Kdump: loaded Not tainted 5.14.0-327.el9.ppc64le #1
[ 20.116007] NIP: c00000000086fa50 LR: c00000000086fa6c CTR: c00000000086f9e0
[ 20.116011] REGS: c000000006f93810 TRAP: 0300 Not tainted (5.14.0-327.el9.ppc64le)
[ 20.116015] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 48222401 XER: 20040000
[ 20.116025] CFAR: c00000000086fa7c DAR: 0000000ffd2e0000 DSISR: 40000000 IRQMASK: 1
[ 20.116025] GPR00: c00000000086fa10 c000000006f93ab0 c000000002b3bf00 0000000000000000
[ 20.116025] GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 20.116025] GPR08: c000000002b73060 0000000000000000 0000000ffd2e0000 c0080000033200c8
[ 20.116025] GPR12: c00000000086f9e0 c000000fffffd480 0000000000000000 0000000000000000
[ 20.116025] GPR16: 0000000020000000 0000000000000000 0000000000020000 0000000000000001
[ 20.116025] GPR20: 00007ffffcb56068 0000000000400cc0 c00000009458eb88 000000007fff0000
[ 20.116025] GPR24: fffffffffffff000 0000000000000000 c00000009458eb78 0000000000000000
[ 20.116025] GPR28: c000000002b771b8 c000000002b778b0 c00000000b9a8e18 0000000000000000
[ 20.116074] NIP [c00000000086fa50] __percpu_counter_sum+0x70/0xe0
[ 20.116082] LR [c00000000086fa6c] __percpu_counter_sum+0x8c/0xe0
[ 20.116087] Call Trace:
[ 20.116089] [c000000006f93ab0] [c00000000b9a8c00] 0xc00000000b9a8c00 (unreliable)
[ 20.116096] [c000000006f93b00] [c0080000032d5ff8] nfsd_reply_cache_stats_show+0xb0/0x1f0 [nfsd]
[ 20.116126] [c000000006f93b80] [c0000000005ccafc] seq_read_iter+0x25c/0x6b0
[ 20.116132] [c000000006f93c60] [c0000000005cd038] seq_read+0xe8/0x150
[ 20.116137] [c000000006f93d10] [c000000000584648] vfs_read+0xc8/0x240
[ 20.116142] [c000000006f93d60] [c000000000584e14] ksys_read+0x84/0x140
[ 20.116148] [c000000006f93db0] [c00000000002f434] system_call_exception+0x164/0x310
[ 20.116154] [c000000006f93e10] [c00000000000bfe8] system_call_vectored_common+0xe8/0x278
[ 20.116161] --- interrupt: 3000 at 0x7fff9013a8b4
[ 20.116164] NIP: 00007fff9013a8b4 LR: 0000000000000000 CTR: 0000000000000000
[ 20.116167] REGS: c000000006f93e80 TRAP: 3000 Not tainted (5.14.0-327.el9.ppc64le)
[ 20.116171] MSR: 800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 42222408 XER: 00000000
[ 20.116183] IRQMASK: 0
[ 20.116183] GPR00: 0000000000000003 00007ffffcb559b0 0000000109d87f00 0000000000000003
[ 20.116183] GPR04: 00007fff8ffd0000 0000000000020000 0000000000000022 0000000000000000
[ 20.116183] GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 20.116183] GPR12: 0000000000000000 00007fff903fa5c0 0000000000000000 0000000000000000
[ 20.116183] GPR16: 0000000020000000 0000000000000000 0000000000020000 0000000000000000
[ 20.116183] GPR20: 00007ffffcb56068 0000000000000002 0000000000000000 0000000109d61e50
[ 20.116183] GPR24: 0000000109d80110 0000000000020000 0000000109d680e0 000000007ff00000
[ 20.116183] GPR28: 0000000000000003 00007fff8ffd0000 0000000000020000 0000000000000003
[ 20.116228] NIP [00007fff9013a8b4] 0x7fff9013a8b4
[ 20.116231] LR [0000000000000000] 0x0
[ 20.116234] --- interrupt: 3000
[ 20.116236] Instruction dump:
[ 20.116239] 3d220004 7c7b1b78 3860ffff 3ba9b9b0 48000028 60000000 60000000 60000000
[ 20.116249] 3d220003 39097160 e93e0020 7d48502a <7d2a4aaa> 7fff4a14 38a30001 38800800

The trigger for that panic was the following command.

mount -t nfsd nfsd /proc/fs/nfsd && cat /proc/fs/nfsd/reply_cache_stats && umount /proc/fs/nfsd

Makes total sense that ppc64 would also fail. The x86 behavior was making me think I must be crazy. Thanks for explaining it!

I expected ppc64le to be susceptible, based on past experience (bug 2031604 describes various ppc64le panics). As for the x86_64 behavior ... a case could be made for "that behavior is what's crazy" :)

I discovered additional details about the Segmentation fault mentioned in comment 6. With kernel 6.4.0-0.rc6.20230614gitb6dad5178cea.49.fc39.aarch64 (Fedora Rawhide) I repeated the Segmentation fault (strace revealed, not surprisingly, that the SIGSEGV occurred during a read on the file descriptor returned by openat for "/proc/fs/nfsd/reply_cache_stats"), but dmesg output revealed the usual panic messages,

[24178.543628] Unable to handle kernel paging request at virtual address ffff5d359e56f000
[24178.544152] Mem abort info:
[24178.544331] ESR = 0x0000000096000004
[24178.544569] EC = 0x25: DABT (current EL), IL = 32 bits
[24178.544904] SET = 0, FnV = 0
[24178.545097] EA = 0, S1PTW = 0
[24178.545297] FSC = 0x04: level 0 translation fault
[24178.545604] Data abort info:
[24178.545788] ISV = 0, ISS = 0x00000004
[24178.546032] CM = 0, WnR = 0
[24178.546220] swapper pgtable: 4k pages, 48-bit VAs, pgdp=000000004f7cd000
[24178.546643] [ffff5d359e56f000] pgd=0000000000000000, p4d=0000000000000000
[24178.547072] Internal error: Oops: 0000000096000004 [#1] SMP
[24178.547425] Modules linked in: veth qrtr rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache netfs wireguard libcurve25519_generic ip_gre gre ipip ip_tunnel act_gact cls_flower act_mirred cls_matchall sch_ingress sch_tbf sch_prio sch_sfq nf_conntrack_netbios_ns nf_conntrack_broadcast echainiv ah6 ah4 esp6 esp4 xfrm4_tunnel tunnel4 ipcomp ipcomp6 xfrm6_tunnel xfrm_ipcomp tunnel6 chacha20poly1305 camellia_generic xcbc sha512_arm64 des_generic libdes af_key sch_netem openvswitch nsh nf_conncount macsec xt_conntrack xt_comment xt_MASQUERADE iptable_nat ip_tables tun rdma_ucm ib_uverbs rpcrdma rdma_cm iw_cm ib_cm nfsd auth_rpcgss nfs_acl lockd grace raid0 vrf vxlan ip6_udp_tunnel udp_tunnel macvlan ipt_REJECT nft_compat nft_reject_ipv4 nf_nat_h323 nf_conntrack_h323 nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_sip nf_conntrack_sip nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_conntrack_ftp binfmt_misc nft_meta_bridge team_mode_random team_mode_activebackup team_mode_broadcast team_mode_loadbalance
[24178.547500] team_mode_roundrobin team nft_masq ppp_deflate bsd_comp ppp_async pppoe pppox ppp_generic slhc nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set ib_core bluetooth bonding tls 8021q garp mrp nf_tables nfnetlink bridge stp llc dummy rfkill sunrpc vfat fat virtio_net net_failover failover fuse loop zram crct10dif_ce polyval_ce polyval_generic ghash_ce virtio_console virtio_blk virtio_mmio qemu_fw_cfg
[24178.553277] Unloaded tainted modules: netdevsim(OE):4 [last unloaded: veth]
[24178.556855] CPU: 1 PID: 663311 Comm: cat Tainted: G OE ------- --- 6.4.0-0.rc6.20230614gitb6dad5178cea.49.fc39.aarch64 #1
[24178.557664] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[24178.558117] pstate: 204000c5 (nzCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[24178.558577] pc : __percpu_counter_sum+0x60/0xb8
[24178.558881] lr : __percpu_counter_sum+0x7c/0xb8
[24178.559187] sp : ffff80000c5cbab0
[24178.559407] x29: ffff80000c5cbab0 x28: ffff0000c68824e8 x27: 0000000000400cc0
[24178.559877] x26: 000000007ffff000 x25: 0000000000000000 x24: ffffa2cde0174f30
[24178.560348] x23: ffffa2cde016f050 x22: ffffa2cde016ec50 x21: ffffa2cde016f850
[24178.560819] x20: ffff0002eac26608 x19: 0000000000000000 x18: ffffffffffffffff
[24178.561292] x17: 0000000000000000 x16: ffffa2cdddfb6388 x15: ffff80000c5cb9b0
[24178.561762] x14: 0000000000000000 x13: ffff000273aa204e x12: 20203a7374656b63
[24178.562234] x11: 0000000000000001 x10: 000000000000000a x9 : ffffa2cdddfb63bc
[24178.562705] x8 : 000000000000000a x7 : 0000000000000004 x6 : 0000000000000000
[24178.563178] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000001
[24178.563649] x2 : 0000000000000003 x1 : ffffa2cde016f050 x0 : ffff5d359e56f000
[24178.564120] Call trace:
[24178.564284] __percpu_counter_sum+0x60/0xb8
[24178.564564] nfsd_reply_cache_stats_show+0x94/0x1a0 [nfsd]
[24178.564970] seq_read_iter+0xe4/0x490
[24178.565216] seq_read+0x98/0xd8
[24178.565425] vfs_read+0xc8/0x2a8
[24178.565640] ksys_read+0x78/0x118
[24178.565861] __arm64_sys_read+0x24/0x38
[24178.566115] invoke_syscall+0x78/0x100
[24178.566365] el0_svc_common.constprop.0+0xd4/0x100
[24178.566682] do_el0_svc+0x34/0x50
[24178.566903] el0_svc+0x34/0x108
[24178.567114] el0t_64_sync_handler+0x114/0x120
[24178.567404] el0t_64_sync+0x194/0x198
[24178.567647] Code: 52800003 14000005 f860db00 f9401284 (b8a46800)
[24178.568047] ---[ end trace 0000000000000000 ]---
[24178.568353] note: cat[663311] exited with irqs disabled
[24178.568726] note: cat[663311] exited with preempt_count 1

but with no accompanying vmcore (nor reboot). This experiment, like the one in comment 6, was performed in the reservation of a Beaker recipe in which the NetworkManager-ci tests ran to completion without a panic. A second attempt to "cat /proc/fs/nfsd/reply_cache_stats" resulted in the system becoming unresponsive. Checking the console logs revealed a number of soft lockups in sshd, and a second look at the console log from comment 6 revealed the same thing there (that system never did become responsive). In short, with newer kernels this issue does not seem to result in a clean panic, by which I mean the system does not reboot on its own, nor does it save a vmcore. In any case I'd expect the patch linked in comment 12 (or possibly a future incarnation of it) to prevent such trouble, but it might be worthwhile to ponder why Fedora Rawhide kernels apparently don't panic cleanly.

A Beaker recipe is running the NetworkManager-ci tests with kernel 5.14.0-289.el9.aarch64 (which predates the commit linked in comment 3). I expect that to not panic, based on the following output from the same system.
[root@ampere-mtsnow-altra-02-vm-06 ~]# mount -t nfsd nfsd /proc/fs/nfsd && cat /proc/fs/nfsd/reply_cache_stats && umount /proc/fs/nfsd
max entries: 119488
num entries: 0
hash buckets: 2048
mem usage: 0
cache hits: 0
cache misses: 0
not cached: 0
payload misses: 0
longest chain len: 0
cachesize at longest: 0
[root@ampere-mtsnow-altra-02-vm-06 ~]# uname -r
5.14.0-289.el9.aarch64
[root@ampere-mtsnow-altra-02-vm-06 ~]#

The command at the end of comment 13 triggers a panic with kernel 5.14.0-332.el9.ppc64le and with kernel 5.14.0-332.el9.aarch64, but the merge request kernels do not panic in response to the same command.

[root@ibm-p9z-15-lp3 ~]# mount -t nfsd nfsd /proc/fs/nfsd && cat /proc/fs/nfsd/reply_cache_stats && umount /proc/fs/nfsd
max entries: 0
num entries: 0
hash buckets: 1
mem usage: 0
cache hits: 0
cache misses: 0
not cached: 0
payload misses: 0
longest chain len: 0
cachesize at longest: 0
[root@ibm-p9z-15-lp3 ~]# uname -a
Linux ibm-p9z-15-lp3.khw3.lab.eng.bos.redhat.com 5.14.0-332.2735_913139648.el9.ppc64le #1 SMP Tue Jun 27 11:56:05 UTC 2023 ppc64le ppc64le ppc64le GNU/Linux
[root@ibm-p9z-15-lp3 ~]#

[root@ampere-mtsnow-altra-01-vm-01 ~]# mount -t nfsd nfsd /proc/fs/nfsd && cat /proc/fs/nfsd/reply_cache_stats && umount /proc/fs/nfsd
max entries: 0
num entries: 0
hash buckets: 1
mem usage: 0
cache hits: 0
cache misses: 0
not cached: 0
payload misses: 0
longest chain len: 0
cachesize at longest: 0
[root@ampere-mtsnow-altra-01-vm-01 ~]# uname -a
Linux ampere-mtsnow-altra-01-vm-01.lab.eng.rdu2.redhat.com 5.14.0-332.2735_913139648.el9.aarch64 #1 SMP PREEMPT_DYNAMIC Tue Jun 27 12:00:11 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux
[root@ampere-mtsnow-altra-01-vm-01 ~]#

I might also run the NetworkManager-ci tests on those systems with the merge request kernel.

The NetworkManager-ci tests finished with kernel 5.14.0-332.2735_913139648.el9 on the two systems which ran the tests in comment 25. There were assorted failures, but no panics.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: kernel security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6583