Bug 2207969

Summary: [regression] kernel BUG at fs/attr.c:377! RIP: 0010:notify_change+0xbd8/0xd40
Product: Red Hat Enterprise Linux 9 Reporter: Zhi Li <yieli>
Component: kernelAssignee: Jeff Layton <jlayton>
kernel sub component: NFS QA Contact: Zhi Li <yieli>
Status: CLOSED ERRATA Docs Contact:
Severity: unspecified    
Priority: unspecified CC: bxue, chuck.lever, cmaiolin, jiyin, jlayton, nfs-team, xzhou, yoyang
Version: 9.3Keywords: Regression, Triaged
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: kernel-5.14.0-325.el9 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-11-07 08:45:33 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Zhi Li 2023-05-17 13:15:45 UTC
Description of problem:

The following panic was triggered when running NFS luster-racer test on rhel9.3 debug kernel.

[  524.315212] ------------[ cut here ]------------ 
[  524.315220] kernel BUG at fs/attr.c:377! 
[  524.315234] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI 
[  524.331018] CPU: 19 PID: 8733 Comm: nfsd Kdump: loaded Not tainted 5.14.0-313.el9.x86_64+debug #1 
[  524.340942] Hardware name: IBM System x3650 M4 -[7915ON3]-/00J6520, BIOS -[VVE172BUS-3.30]- 06/23/2020 
[  524.351349] RIP: 0010:notify_change+0xbd8/0xd40 
[  524.356428] Code: f5 ff ff 48 89 df e8 c7 a0 eb ff e9 6b f6 ff ff 89 45 a0 e8 1a a1 eb ff 8b 45 a0 e9 91 f6 ff ff e8 6d a1 eb ff e9 47 f8 ff ff <0f> 0b e8 b1 a0 eb ff e9 9f f7 ff ff 48 89 4d c0 e8 b3 a0 eb ff 48 
[  524.377402] RSP: 0018:ffffc90008ecfad0 EFLAGS: 00010202 
[  524.383254] RAX: dffffc0000000000 RBX: ffff88810ac14e28 RCX: 0000000000000000 
[  524.391232] RDX: 1ffff11122ae7835 RSI: 0000000000000001 RDI: ffff88891573c1a8 
[  524.399213] RBP: ffffc90008ecfb40 R08: 000000006464b630 R09: 000000001e46aa78 
[  524.407195] R10: fffffbfff2db322c R11: 0000000000000001 R12: 0000000000001847 
[  524.415178] R13: ffff8888a37f56f8 R14: ffff88891573c180 R15: 0000000000000000 
[  524.423161] FS:  0000000000000000(0000) GS:ffff888fddc00000(0000) knlGS:0000000000000000 
[  524.432211] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 
[  524.438640] CR2: 00007f4defbe4028 CR3: 00000002044f0005 CR4: 00000000000606e0 
[  524.446621] Call Trace: 
[  524.449360]  <TASK> 
[  524.451710]  ? cpuusage_write+0x380/0x380 
[  524.456215]  ? nfsd_setattr+0x4c4/0xbc0 [nfsd] 
[  524.461300]  nfsd_setattr+0x4c4/0xbc0 [nfsd] 
[  524.466186]  ? nfsd_access+0x340/0x340 [nfsd] 
[  524.471155]  ? find_held_lock+0x33/0x120 
[  524.475550]  ? __lock_release+0x4c1/0xa00 
[  524.480042]  ? lock_downgrade+0x130/0x130 
[  524.484525]  ? mark_held_locks+0xa5/0xf0 
[  524.488927]  nfsd3_proc_setattr+0x272/0x390 [nfsd] 
[  524.494391]  ? nfsd3_proc_access+0x2f0/0x2f0 [nfsd] 
[  524.499951]  ? do_raw_spin_unlock+0x55/0x1f0 
[  524.504738]  ? nfsd_cache_lookup+0xc1b/0x15f0 [nfsd] 
[  524.510401]  ? svcxdr_decode_sattr3+0x50d/0xab0 [nfsd] 
[  524.516250]  ? svcxdr_decode_nfs_fh3+0xfa/0x140 [nfsd] 
[  524.522111]  nfsd_dispatch+0x149/0x7f0 [nfsd] 
[  524.527092]  svc_process_common+0xbec/0x1c80 [sunrpc] 
[  524.532889]  ? svc_generic_rpcbind_set+0x510/0x510 [sunrpc] 
[  524.539253]  ? ktime_get+0x14e/0x180 
[  524.543260]  ? nfsd_svc+0x4c0/0x4c0 [nfsd] 
[  524.547950]  ? lockdep_hardirqs_on+0x79/0x100 
[  524.552830]  ? recalibrate_cpu_khz+0x10/0x10 
[  524.557617]  svc_process+0x57e/0x870 [sunrpc] 
[  524.562627]  ? __validate_process_creds+0xf5/0x310 
[  524.567993]  nfsd+0x2fe/0x5b0 [nfsd] 
[  524.572105]  ? nfsd_shutdown_threads+0x2a0/0x2a0 [nfsd] 
[  524.578050]  kthread+0x2a7/0x350 
[  524.581667]  ? kthread_complete_and_exit+0x20/0x20 
[  524.587020]  ret_from_fork+0x22/0x30 
[  524.591040]  </TASK> 
[  524.593485] Modules linked in: nfsv3 ext4 mbcache jbd2 loop rpcrdma rdma_cm iw_cm ib_cm ib_core nfsd nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass rapl iTCO_wdt iTCO_vendor_support intel_cstate mgag200 ipmi_ssif drm_shmem_helper cdc_ether drm_kms_helper usbnet syscopyarea sysfillrect intel_uncore mii sysimgblt fb_sys_fops ipmi_si pcspkr ipmi_devintf lpc_ich ipmi_msghandler ioatdma drm fuse xfs libcrc32c sd_mod t10_pi sg wmi crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel igb megaraid_sas i2c_algo_bit dca dm_mirror dm_region_hash dm_log dm_mod 
[  524.665311] ---[ end trace 6662ebcbed98b4e5 ]--- 


Version-Release number of selected component (if applicable):
kernel-5.14.0-313.el9+debug

How reproducible:
reliable

Steps to Reproduce:
1.clone https://beaker.engineering.redhat.com/jobs/7857408
2.
3.

Actual results:
kernel BUG at fs/attr.c:377!

Expected results:
No panic


Additional info:
I could have already hit this on -312.el9 and not trigger this on -311.el9. So it's likely a regression introduced in kernel-5.14.0-312.el9. Please help to confirm.

Beaker jobs:
test on kernel-5.14.0-312.el9:  panic
https://beaker.engineering.redhat.com/jobs/7857257
https://beaker.engineering.redhat.com/jobs/7857259

test on kernel-5.14.0-311.el9:
https://beaker.engineering.redhat.com/jobs/7857265
https://beaker.engineering.redhat.com/jobs/7857264

Comment 1 Jeff Layton 2023-05-17 13:44:42 UTC
This is the BUG that tripped:

        /*                                                                                          
         * We now pass ATTR_KILL_S*ID to the lower level setattr function so                        
         * that the function has the ability to reinterpret a mode change                           
         * that's due to these bits. This adds an implicit restriction that                         
         * no function will ever call notify_change with both ATTR_MODE and                         
         * ATTR_KILL_S*ID set.                                                                      
         */                                                                                         
        if ((ia_valid & (ATTR_KILL_SUID|ATTR_KILL_SGID)) &&                                         
            (ia_valid & ATTR_MODE))                                                                 
                BUG();    

I suspect this means that nfsd_sanitize_attrs is being called too early. I think we need to do that closer to the end, after everything else has had a change to set up the ia_valid. Why this started happening with the latest MR though, I'm not sure. I'll have to do so me before and after comparison.

Stay tuned...

Comment 2 Jeff Layton 2023-05-17 14:48:37 UTC
Have you seen this happen more than once? Did you happen to collect a vmcore?

I'm asking because I don't see a way that we could legitimately get into this situation given the current NFSv3 setattr code in nfsd.
Also, when I look at the test log, there is a KASAN warning just after this BUG() was called. It may be fallout from the BUG() call itself, but it makes me wonder if we got into a situation where this was affected by some sort of memory corruption.

Comment 3 Jeff Layton 2023-05-17 14:56:15 UTC
Nevermind! I think I see what probably happened.

It turns out that notify_change can alter the ia_valid field. Now that we can end up retrying that call due to a delegation, we need to account for that. I'll need to run this by the maintainer upstream to see what he wants to do. Stay tuned!

Comment 4 Chuck Lever 2023-05-17 15:25:43 UTC
notify_change() has modified the passed-in attributes since before the git era. That seems like a brittle API. It would be nicer if that parameter was const. Maybe we can propose that to the VFS maintainers.

Barring acceptance of that idea, I guess __nfsd_setattr has to save a copy of size_attr, and restore that copy if it has to retry.

Comment 5 Chuck Lever 2023-05-17 15:27:07 UTC
Err. OK, it is not __nfsd_setattr() that would save it, it would be nfsd_setattr().

Comment 6 Jeff Layton 2023-05-17 15:44:55 UTC
(In reply to Chuck Lever from comment #5)
> Err. OK, it is not __nfsd_setattr() that would save it, it would be
> nfsd_setattr().

I took a quick look at the other callers and I don't see any others that use the structure for anything after the call. We probably could make it const, but we'd likely just need to clone the iattr in notify_change itself. Several of the functions it calls rely on the alterations it makes.

For now, I think we ought to just make a copy in nfsd_setattr. It's not a super high performance codepath anyway, and it's only 80 bytes on my machine. I'm testing that now and will send it out soon (assuming it works).

Comment 23 errata-xmlrpc 2023-11-07 08:45:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: kernel security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6583