Bug 2181403
Summary: | [RHEL 9] BUG nfsd_file: Objects remaining in nfsd_file on __kmem_cache_shutdown() | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 9 | Reporter: | Zhi Li <yieli> |
Component: | kernel | Assignee: | Jeff Layton <jlayton> |
kernel sub component: | NFS | QA Contact: | Zhi Li <yieli> |
Status: | CLOSED DUPLICATE | Docs Contact: | |
Severity: | unspecified | ||
Priority: | unspecified | CC: | chuck.lever, jiyin, nfs-team, xzhou, yoyang |
Version: | 9.2 | Keywords: | Triaged |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2023-05-15 14:38:18 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Zhi Li
2023-03-24 03:08:49 UTC
(In reply to JianHong Yin from comment #6)
> '''
> nfs]$ cat regression/bz1227851-open-loop-on-NFSv4/nfsd_read_bad_stateid.stp
> global c = 0
> probe module("nfsd").function("nfsd4_read").return
> {
>         # BAD_STATEID == 10025;
>         if (c == 0) {
>                 $return = 0x29270000;
>                 exit()
>         }
> }
> '''

(cc'ing Chuck from Oracle, who is the upstream nfsd maintainer)

Oh! That script looks unsafe, and could cause just the symptoms you're seeing here. This is the bottom bit of nfsd4_read:

-------------------8<---------------------
        /* check stateid */
        status = nfs4_preprocess_stateid_op(rqstp, cstate, &cstate->current_fh,
                                        &read->rd_stateid, RD_STATE,
                                        &read->rd_nf, NULL);

        read->rd_rqstp = rqstp;
        read->rd_fhp = &cstate->current_fh;
        return status;
}
-------------------8<---------------------

It calls nfs4_preprocess_stateid_op which, if successful, takes a reference to an nfsd_file and fills out rd_nf with a pointer to it. The expectation is that if nfs4_preprocess_stateid_op fails, then rd_nf will not be filled out.

The script, however, only overrides the return code and does not release the reference held in rd_nf. That leaves outstanding references to those nfsd_files and causes this warning when we go to tear down the cache.

The nfsd code is not completely blameless here, however. In the NFSv4 compound processing, the op_release function is not called if op_func returns an error! It looks like that might also cause a memory leak in the layoutget code if that hits an error in an inopportune place.

So while I suspect this systemtap script is the cause of the problem you're seeing, it does point out some potential memory leaks in other places. Let's turn this bug into one to fix that structural issue, and make sure we call op_release regardless of success or failure of op_func. We'll need to audit all of the op_release functions and make sure they're safe to call even when op_func fails, but there aren't that many of them. That should also allow this systemtap script to work as expected.
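To make the leak concrete, here is a minimal userspace sketch of the pattern described above. It is not the nfsd code: every name in it (demo_file, demo_read, demo_op_func, demo_op_release, demo_outstanding_refs) is hypothetical, and the forced status simply stands in for the return-value override done by the systemtap script. It only assumes what the comment states, namely that op_func takes a reference and that op_release is skipped when op_func reports an error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted nfsd_file. */
struct demo_file {
    int refcount;
};

/* Hypothetical stand-in for the READ arguments (compare read->rd_nf). */
struct demo_read {
    struct demo_file *nf;
};

/*
 * Models op_func: takes a reference, stores it in args->nf, then
 * reports force_status. A nonzero force_status imitates the return
 * value being overridden after the reference was already taken.
 */
static int demo_op_func(struct demo_read *args, struct demo_file *file,
                        int force_status)
{
    file->refcount++;
    args->nf = file;
    return force_status;
}

/* Models op_release: safe to call whether or not nf was filled in. */
static void demo_op_release(struct demo_read *args)
{
    if (args->nf == NULL)
        return;
    assert(args->nf->refcount > 0);
    args->nf->refcount--;
    args->nf = NULL;
}

/*
 * Runs one operation and returns the number of references still held.
 * always_release == 0: release is skipped on error (the leaky pattern).
 * always_release != 0: release runs unconditionally (the proposed fix).
 */
int demo_outstanding_refs(int always_release, int force_status)
{
    struct demo_file file = { .refcount = 0 };
    struct demo_read args = { .nf = NULL };
    int status = demo_op_func(&args, &file, force_status);

    if (status == 0 || always_release)
        demo_op_release(&args);
    return file.refcount;
}
```

Calling demo_outstanding_refs(0, 10025) leaves one reference outstanding, while demo_outstanding_refs(1, 10025) leaves none. The NULL check in demo_op_release is the kind of property the audit of the real op_release functions would need to confirm before they can be called on the failure path.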
Patch posted to linux-nfs mailing list:

https://lore.kernel.org/linux-nfs/20230327102137.15412-1-jlayton@kernel.org/T/#u

*** This bug has been marked as a duplicate of bug 2183621 ***