Bug 2218844

Summary: [RHEL 9] rename() duplicates paths for a file on NFSv4 volume
Product: Red Hat Enterprise Linux 9 Reporter: Zhi Li <yieli>
Component: kernel Assignee: Jeff Layton <jlayton>
kernel sub component: NFS QA Contact: Zhi Li <yieli>
Status: CLOSED ERRATA Docs Contact:
Severity: unspecified    
Priority: unspecified CC: bcodding, jiyin, jlayton, jwboyer, nfs-team, shdunne, xzhou, yoyang
Version: 9.3 Keywords: Regression, Triaged
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: kernel-5.14.0-362.4.1.el9_3 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-11-07 08:49:05 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2209174    

Description Zhi Li 2023-06-30 08:22:32 UTC
Description of problem:
- rename() duplicates paths for a file on an NFSv4 volume.

Version-Release number of selected component (if applicable):
kernel-5.14.0-331.el9.x86_64

How reproducible:
always

Steps to Reproduce:
1. Mount an NFSv4 volume with noac on node1 and node2.

[root@node1 ~]# mount -o noac server:/srv/export /mnt/nfs
[root@node2 ~]# mount -o noac server:/srv/export /mnt/nfs

2. Create directories and a file.

[root@node1 ~]# mkdir /mnt/nfs/{A,B}; touch /mnt/nfs/A/f

3. Move the file from A to B on node1.

[root@node1 ~]# mv /mnt/nfs/A/f /mnt/nfs/B/

4. cat the file on node2, and move it from B to A.

[root@node2 ~]# cat /mnt/nfs/B/f; mv /mnt/nfs/B/f /mnt/nfs/A/

5. Move it again from A to B on node1.

[root@node1 ~]# mv /mnt/nfs/A/f /mnt/nfs/B/

6. cat the file again on node2, and move it from B to A again.

[root@node2 ~]# cat /mnt/nfs/B/f; mv /mnt/nfs/B/f /mnt/nfs/A/
mv: `/mnt/nfs/B/f' and `/mnt/nfs/A/f' are the same file

Actual results:
- The file appears under both paths, and mv fails with "... are the same file".
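
The "are the same file" message from mv generally means that both paths resolved to the same device and inode on the client. A minimal sketch for confirming that on node2 is shown here; the helper names `same_file` and `dev_ino` are hypothetical and are not part of the original report.

```shell
#!/bin/bash
# Hypothetical helpers for inspecting the duplicate path; not from the report.

# Succeeds when both paths exist and share the same device and inode numbers.
same_file() {
    [ "$1" -ef "$2" ]
}

# Prints "device:inode" for a path so the two views can be compared by eye.
dev_ino() {
    stat -c '%d:%i' -- "$1"
}
```

For example, `same_file /mnt/nfs/A/f /mnt/nfs/B/f && echo duplicate` on node2 after step 5 would be expected to print "duplicate" on an affected kernel, assuming the stale path is still cached.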

Expected results:
- No duplicate paths, and mv succeeds.

Additional info:

This issue exists on rhel9.3 but does not occur on rhel8.9. Please confirm whether this is a valid bug for rhel9.

Beaker job:
Test passed on rhel8.9 (kernel-4.18.0-498.el8):
https://beaker.engineering.redhat.com/jobs/7988933

Test failed on rhel9.3 (kernel-5.14.0-331.el9):
https://beaker.engineering.redhat.com/jobs/8017247

Comment 2 Murphy Zhou 2023-09-05 02:03:01 UTC
Hi team,

Has anyone looked into this regression?

Comment 5 Benjamin Coddington 2023-09-08 16:47:03 UTC
Finally bisected back to:
638e3e7d9493 nfsd: use the getattr operation to fetch i_version

Could it be that the assumption about the change attr for directories is incorrect?  I am digging in, but maybe you see the problem right away, Jeff?
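
As a rough aid when looking at directory attributes, the sketch below snapshots a directory's mtime and ctime around a command and reports whether they moved. This is only a proxy: stat(1) does not print the NFSv4 change attribute, and the sketch says nothing about what nfsd actually sends on the wire. The helper names `dir_stamp` and `stamp_changed` are hypothetical.

```shell
#!/bin/bash
# Hypothetical helpers; timestamps are a proxy, not the NFSv4 change attribute.

# Prints the directory's mtime and ctime with full precision.
dir_stamp() {
    stat -c '%y|%z' -- "$1"
}

# Runs a command and succeeds if the directory's timestamps changed across it.
# Usage: stamp_changed DIR COMMAND [ARGS...]
stamp_changed() {
    local dir=$1 before after
    shift
    before=$(dir_stamp "$dir")
    "$@"
    after=$(dir_stamp "$dir")
    [ "$before" != "$after" ]
}
```

Run on the server against the exported filesystem, something like `stamp_changed /srv/export/A mv /srv/export/A/f /srv/export/B/` would be expected to succeed, since a rename updates the timestamps of both parent directories.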

Comment 6 Benjamin Coddington 2023-09-08 18:03:01 UTC
This doesn't reproduce on:
615e95831ec3 (HEAD -> bisect) Merge tag 'v6.6-vfs.ctime' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Jeff suspects this is the xfs change attr issue he's already fixed upstream, and has a couple of directions to choose from for a fix here, so handing this over.

Here's my single-system reproducer (commented lines are for a non-restart bisect attempt):
#!/bin/bash

set -o xtrace
set -o errexit

umount -a -t nfs4

sudo rmmod nfsv4 nfsv3 nfs || true

#sudo systemctl stop nfs-server

#sudo umount /proc/fs/nfsd || true
#sleep 1

#sudo rmmod nfsd rpcsec_gss_krb5 auth_rpcgss lockd nfs_acl || true

#
#LOCALVERSION="" make -C /devel/linux-nfs -j10 M=fs/nfs
#LOCALVERSION="" make -C /devel/linux-nfs -j10 M=fs/nfsd
#
#sudo insmod /devel/linux-nfs/fs/nfs/nfs.ko
#sudo insmod /devel/linux-nfs/fs/nfs/nfsv4.ko

#sudo modprobe lockd
#sudo modprobe nfs_acl
#sudo modprobe auth_rpcgss
#sudo modprobe rpcsec_gss_krb5

#sudo insmod /devel/linux-nfs/fs/nfsd/nfsd.ko

sudo systemctl start nfs-server

mkdir /mnt/localhost1 || true
mkdir /mnt/localhost2 || true
mount -t nfs -ov4,sec=sys,noac,nosharecache localhost:/exports /mnt/localhost1
mount -t nfs -ov4,sec=sys,noac,nosharecache localhost:/exports /mnt/localhost2

rm -rf /exports/{A,B}

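# Create two directories and a file through the first mount.
mkdir /mnt/localhost1/{A,B}
touch /mnt/localhost1/A/f

# Move the file from A to B through the first mount.
mv /mnt/localhost1/A/f /mnt/localhost1/B/

# Read it through the second mount, then move it back to A there.
cat /mnt/localhost2/B/f
mv /mnt/localhost2/B/f /mnt/localhost2/A/

# Move it from A to B again through the first mount.
mv /mnt/localhost1/A/f /mnt/localhost1/B/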

umount -a -t nfs4
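
The expected result in the description is that the file is visible under only one path. A minimal sketch of a check for that is shown below. The function name `check_single_path` is hypothetical, and the call would have to run before the final umount in the script above.

```shell
#!/bin/bash
# Hypothetical check, not part of the original reproducer.
# Succeeds only if exactly one of A/f and B/f is visible under the mount point.
check_single_path() {
    local mnt=$1 count=0
    if [ -e "$mnt/A/f" ]; then count=$((count + 1)); fi
    if [ -e "$mnt/B/f" ]; then count=$((count + 1)); fi
    echo "paths visible under $mnt: $count"
    [ "$count" -eq 1 ]
}
```

Calling `check_single_path /mnt/localhost1` and `check_single_path /mnt/localhost2` after the last mv would flag a duplicate on either mount. Because the reproducer sets errexit, a non-zero return would stop the script at that point.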

Comment 7 Jeff Layton 2023-09-09 17:21:05 UTC
This turns out to be an ancient nfsd bug that only manifested with the changes to how we fetch and present the change attr. Patch posted here for upstream:

    https://lore.kernel.org/linux-nfs/ZPyMyv1nNFV2whKP@tissot.1015granger.net/T/#t

Comment 27 Zhi Li 2023-09-20 06:08:19 UTC
Moving to VERIFIED according to comment #26.

Comment 29 errata-xmlrpc 2023-11-07 08:49:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: kernel security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6583