Bug 1286127

Summary: DHT + rebalance : rename of files fails with an error 'No such file or directory' even though files are present.
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Susant Kumar Palai <spalai>
Component: distribute Assignee: Raghavendra G <rgowdapp>
Status: CLOSED WONTFIX QA Contact: storage-qa-internal <storage-qa-internal>
Severity: high Docs Contact:
Priority: high    
Version: rhgs-3.1 CC: kramdoss, moagrawa, mzywusko, nbalacha, nlevinki, racpatel, rgowdapp, rhs-bugs, smohan, spalai, storage-qa-internal, tdesala, vbellur
Target Milestone: --- Keywords: ZStream
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard: dht-rename-file, dht-fops-while-rebal, dht-3.2.0-stretch
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1064283
: 1395217 (view as bug list) Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1064283    
Bug Blocks: 1395133, 1395217, 1398554    

Comment 2 Raghavendra G 2016-06-28 09:02:17 UTC
Most likely this bug is due to cached-subvol changes during rebalance. Since rhs-3.1, dht_rename on files acquires locks, so this bug is most likely fixed. It needs to be retested.

Comment 3 Raghavendra G 2016-06-28 09:07:23 UTC
Please note that there is a small race window between the lookup on the file(s) and the rename fop. If the file gets migrated in this window, we can still run into rename errors (as the cached-subvol has changed). To fix this bug completely, rename also needs to handle cached-subvol changes, the way open (dht_open2), stat (dht_stat2), etc. do.
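
To make the race concrete, here is a toy model in Python. It is not glusterfs code: Subvol, migrate, lookup, rename_naive and rename_with_recheck are all hypothetical names, and the "recheck" path is only an illustration of the general idea of refreshing the cached-subvol after ENOENT. It does not reproduce what dht_open2/dht_stat2 actually do.

```python
import errno


class Subvol:
    """Hypothetical stand-in for a DHT subvolume: just a set of file names."""

    def __init__(self, name):
        self.name = name
        self.files = set()

    def rename(self, src, dst):
        # A subvol that does not hold src answers ENOENT, like the brick would.
        if src not in self.files:
            raise FileNotFoundError(errno.ENOENT, "No such file or directory", src)
        self.files.remove(src)
        self.files.add(dst)


def migrate(name, source, target):
    """Model rebalance moving a file's data from one subvol to another."""
    source.files.remove(name)
    target.files.add(name)


def lookup(name, subvols):
    """Return the subvol currently holding the file (the cached-subvol)."""
    for subvol in subvols:
        if name in subvol.files:
            return subvol
    return None


def rename_naive(cached, src, dst):
    """Rename using whatever cached-subvol the earlier lookup returned."""
    cached.rename(src, dst)


def rename_with_recheck(cached, subvols, src, dst):
    """On ENOENT, refresh the cached-subvol once and retry (assumed behaviour)."""
    try:
        cached.rename(src, dst)
        return cached
    except FileNotFoundError:
        fresh = lookup(src, subvols)
        if fresh is None:
            raise  # the file really is gone
        fresh.rename(src, dst)
        return fresh


if __name__ == "__main__":
    old, new = Subvol("subvol-0"), Subvol("subvol-1")
    old.files.add("rename_0_file_32")

    cached = lookup("rename_0_file_32", [old, new])  # lookup caches subvol-0
    migrate("rename_0_file_32", old, new)            # rebalance wins the race

    try:
        rename_naive(cached, "rename_0_file_32", "rename_1_file_32")
    except FileNotFoundError as err:
        print("naive rename failed:", err.strerror)

    final = rename_with_recheck(cached, [old, new],
                                "rename_0_file_32", "rename_1_file_32")
    print("rename succeeded on", final.name)
```

In this model the file is never missing from the volume; only the client's idea of where it lives is stale, which matches the symptom of ENOENT on a file that a later lookup still finds.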

Comment 4 Raghavendra G 2017-04-04 04:39:16 UTC
This can also happen when:
1. the layout of the parent directory has changed,
2. no lookup was sent on src/dst, so no entry corresponding to src/dst is present on the newly hashed-subvols, and
3. a rename is issued.

Since a rename expects an entry on hashed-subvol, an attempt to unlink/rename/link might fail.

To summarize, this bug can happen because of either
1. changes in the layout, or
2. migration of the file in the window between the lookup and rename fops.

Comment 5 Prasad Desala 2017-07-27 13:19:14 UTC
Observed the same issue on glusterfs version 3.8.4-35.el7rhgs.x86_64.
Steps:
======
1) On an nfs-ganesha setup, create a distributed-replicate volume and start it.
2) NFS-mount it on multiple clients.
3) Create a few files from the mount point.
4) Add a few bricks and trigger rebalance.
5) From one client, start renaming the files; from another client, start changing file permissions and issuing continuous lookups.
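
For anyone re-testing, the rename side of step 5 can be scripted so that each failure is recorded together with whether the source file is still visible. This is only a sketch: rename_batch is a hypothetical helper, the rename_0_file_N / rename_1_file_N names are taken from the mv output in this comment, and the mount path in the usage line is an assumption to be replaced with the real mount point.

```python
import os
import sys


def rename_batch(directory, indices,
                 src_prefix="rename_0_file_", dst_prefix="rename_1_file_"):
    """Rename <src_prefix>N to <dst_prefix>N for each N in indices.

    Returns a list of (src_path, errno, src_still_visible) tuples, one per
    failed rename, so that ENOENT on a file that is still present stands out.
    """
    failures = []
    for i in indices:
        src = os.path.join(directory, f"{src_prefix}{i}")
        dst = os.path.join(directory, f"{dst_prefix}{i}")
        try:
            os.rename(src, dst)
        except OSError as err:
            # Check right after the failure whether a lookup still finds src.
            failures.append((src, err.errno, os.path.exists(src)))
    return failures


if __name__ == "__main__":
    # Usage (path and count are assumptions): python rename_batch.py /mnt/nfs 1000
    mount_dir, count = sys.argv[1], int(sys.argv[2])
    for src, err_no, visible in rename_batch(mount_dir, range(count)):
        print(f"{src}: errno={err_no} still_visible={visible}")
```

A failure line with errno 2 (ENOENT) and still_visible=True corresponds to the symptom reported here.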

The rename operation failed for a few files with the error 'No such file or directory', even though a lookup from the mount point finds those files.

Mount point:
=============
mv: cannot move ‘rename_0_file_32’ to ‘rename_1_file_32’: No such file or directory
mv: cannot move ‘rename_0_file_38’ to ‘rename_1_file_38’: No such file or directory
mv: cannot move ‘rename_0_file_66’ to ‘rename_1_file_66’: No such file or directory
mv: cannot move ‘rename_0_file_75’ to ‘rename_1_file_75’: No such file or directory
mv: cannot move ‘rename_0_file_79’ to ‘rename_1_file_79’: No such file or directory
mv: cannot move ‘rename_0_file_142’ to ‘rename_1_file_142’: No such file or directory
mv: cannot move ‘rename_0_file_218’ to ‘rename_1_file_218’: No such file or directory
mv: cannot move ‘rename_0_file_222’ to ‘rename_1_file_222’: No such file or directory
mv: cannot move ‘rename_0_file_239’ to ‘rename_1_file_239’: No such file or directory
mv: cannot move ‘rename_0_file_295’ to ‘rename_1_file_295’: No such file or directory
mv: cannot move ‘rename_0_file_300’ to ‘rename_1_file_300’: No such file or directory
mv: cannot move ‘rename_0_file_375’ to ‘rename_1_file_375’: No such file or directory
mv: cannot move ‘rename_0_file_400’ to ‘rename_1_file_400’: No such file or directory
mv: cannot move ‘rename_0_file_426’ to ‘rename_1_file_426’: No such file or directory
mv: cannot move ‘rename_0_file_514’ to ‘rename_1_file_514’: No such file or directory
mv: cannot move ‘rename_0_file_525’ to ‘rename_1_file_525’: No such file or directory
mv: cannot move ‘rename_0_file_556’ to ‘rename_1_file_556’: No such file or directory
mv: cannot move ‘rename_0_file_679’ to ‘rename_1_file_679’: No such file or directory
mv: cannot move ‘rename_0_file_809’ to ‘rename_1_file_809’: No such file or directory
mv: cannot move ‘rename_0_file_817’ to ‘rename_1_file_817’: No such file or directory

Comment 7 Prasad Desala 2018-04-09 09:17:02 UTC
Hit this issue on 3.4.0 (3.12.2-7) while following the same steps as in the description.