Bug 2110576

Summary: RHEL-9 nfsd server post_wcc fixes - clients see increased revalidations
Product: Red Hat Enterprise Linux 9
Reporter: Benjamin Coddington <bcodding>
Component: kernel
Assignee: Benjamin Coddington <bcodding>
kernel sub component: NFS
QA Contact: Yongcheng Yang <yoyang>
Status: CLOSED ERRATA
Docs Contact:
Severity: unspecified
Priority: unspecified
CC: nfs-maint, xzhou, yoyang
Version: 9.0
Keywords: Regression, Triaged
Target Milestone: rc
Target Release: 9.1
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: kernel-5.14.0-138.el9
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-11-15 11:10:03 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Deadline: 2022-08-29

Description Benjamin Coddington 2022-07-25 16:23:52 UTC
While testing a RHEL-9 server, I observed increased RPC counts and decreased NFS client performance compared to RHEL-8.  On closer inspection, I saw that the NFS client was unable to maintain a working access cache: it sent repeated duplicate ACCESS RPC calls for the same inodes.

The NFS client depends on the server returning a valid post-op changeid for OPEN, and it appears that RHEL-9 servers are missing this upstream fix:

58f258f65267 Revert "nfsd: skip some unnecessary stats in the v4 case"

Without that fix, the OPEN(CREATE) post-op changeid is zero on filesystems that do not yet support iversion (ext4, tmpfs, ...).
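
To make the effect on the access cache concrete, here is a toy model. It is not kernel code and not the real NFS client logic; the class and function names (ToyCachedInode, simulate) are hypothetical. It assumes a client that trusts the post-op changeid from the reply, later compares it with the change attribute it fetches from the server, and drops its cached ACCESS result on a mismatch.

```python
"""Toy model (not kernel code) of a client cache keyed on a change attribute."""


class ToyCachedInode:
    def __init__(self, change):
        self.change = change        # change attribute the client believes is current
        self.access_valid = False   # whether the cached ACCESS result may be reused
        self.access_rpcs = 0        # number of ACCESS calls sent to the server

    def check_access(self):
        # Send an ACCESS call only when no valid cached result exists.
        if not self.access_valid:
            self.access_rpcs += 1
            self.access_valid = True

    def apply_post_op(self, before, after):
        # Assumption: an unexpected pre-op value means someone else changed
        # the inode, so the cached ACCESS result is dropped.
        if before != self.change:
            self.access_valid = False
        # The client trusts the post-op value reported by the server.
        self.change = after

    def revalidate(self, server_change):
        # A later attribute fetch that disagrees with the cached value
        # invalidates the cached ACCESS result.
        if server_change != self.change:
            self.access_valid = False
            self.change = server_change


def simulate(creates, broken_server):
    """Return the number of ACCESS calls sent across `creates` file creations."""
    server_change = 1
    inode = ToyCachedInode(server_change)
    for _ in range(creates):
        inode.check_access()
        before = server_change
        server_change += 1
        # A broken server reports zero as the post-op changeid.
        after = 0 if broken_server else server_change
        inode.apply_post_op(before, after)
        inode.revalidate(server_change)
        inode.check_access()
    return inode.access_rpcs


if __name__ == "__main__":
    print("working server ACCESS calls:", simulate(3, broken_server=False))
    print("broken server ACCESS calls: ", simulate(3, broken_server=True))
```

In this model, three creates cost one ACCESS call against the working server and four against the broken one, which is the same shape as the duplicate ACCESS traffic described above.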

I'd like to take both patches from the upstream posting:
https://lore.kernel.org/linux-nfs/164078891040.2414.11995954842150988952.stgit@bazille.1015granger.net/

A simple reproducer is to export an ext4 filesystem, create a file on an NFSv4 mount of that export, and check whether nfs.changeid4.after is set to zero in the replies:

A broken server's replies to an OPEN and a REMOVE:
[root@bcodding-rhel9 ~]# tshark -r rhel9.pcap -T fields -e frame.number -e nfs.opcode -e nfs.changeid4.after nfs.changeid4.after==0
5	53,22,18,10,3,9	0
17	53,22,28	0

A working server's replies to the same operations (the tshark filter matches nothing):
[root@bcodding-rhel9 ~]# tshark -r /tmp/pcap -T fields -e frame.number -e nfs.opcode -e nfs.changeid4.after nfs.changeid4.after==0
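
For repeated testing, the check can be scripted. The following is a minimal sketch, assuming the three tab-separated fields produced by the tshark commands above (frame number, comma-joined opcodes, comma-joined changeid values). The function names are hypothetical, and the opcode table only covers the NFSv4 operations that appear in this report.

```python
"""Flag tshark field lines whose post-op changeid is zero (sketch)."""

# NFSv4 operation numbers seen in the captures above.
OPCODE_NAMES = {
    3: "ACCESS",
    9: "GETATTR",
    10: "GETFH",
    18: "OPEN",
    22: "PUTFH",
    28: "REMOVE",
    53: "SEQUENCE",
}


def parse_fields_line(line):
    """Split one 'frame<TAB>opcodes<TAB>changeids' line into integers."""
    frame, opcodes, changeids = line.rstrip("\n").split("\t")
    return (
        int(frame),
        [int(op) for op in opcodes.split(",") if op],
        [int(value) for value in changeids.split(",") if value],
    )


def zero_changeid_frames(lines):
    """Return (frame, [operation names]) for frames with a zero changeid."""
    hits = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        frame, opcodes, changeids = parse_fields_line(line)
        if 0 in changeids:
            names = [OPCODE_NAMES.get(op, str(op)) for op in opcodes]
            hits.append((frame, names))
    return hits


if __name__ == "__main__":
    # Sample lines copied from the broken-server capture in this report.
    sample = ["5\t53,22,18,10,3,9\t0\n", "17\t53,22,28\t0\n"]
    for frame, names in zero_changeid_frames(sample):
        print(f"frame {frame}: zero post-op changeid in {','.join(names)}")
```

An empty result corresponds to the working server case; any returned frame indicates the server is still sending a zero post-op changeid.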

Comment 11 errata-xmlrpc 2022-11-15 11:10:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: kernel security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:8267