Bug 1286127 - DHT + rebalance : rename of files fails with an error 'No such file or directory' even though files are present.
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: distribute
Version: rhgs-3.1
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Raghavendra G
QA Contact: storage-qa-internal@redhat.com
URL:
Whiteboard: dht-rename-file, dht-fops-while-rebal...
Depends On: 1064283
Blocks: 1395133 1395217 1398554
 
Reported: 2015-11-27 11:39 UTC by Susant Kumar Palai
Modified: 2018-04-16 18:03 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 1064283
Clones: 1395217
Environment:
Last Closed:
Embargoed:



Comment 2 Raghavendra G 2016-06-28 09:02:17 UTC
Most likely this bug is due to cached-subvol changes during rebalance. Since rhs-3.1, dht_rename on files acquires locks, so this bug is most likely fixed. It needs to be retested.

Comment 3 Raghavendra G 2016-06-28 09:07:23 UTC
Please note that there is a small race window between the lookup on the file(s) and the rename fop. If the file gets migrated in this window, we can still run into rename errors (as the cached-subvol has changed). To fix this bug completely, rename also needs to handle cached-subvol changes, as open (dht_open2), stat (dht_stat2), etc. already do.
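
The race window described above can be illustrated with a toy model. This is only a sketch, not GlusterFS code: ToyVolume, rename_with_stale_cache and rename_with_recheck are hypothetical names, subvolumes are modelled as plain sets of file names, and the "look up again and retry once" handling is an assumption about what cached-subvol handling could look like, not a description of dht_open2 or dht_stat2.

```python
import errno


class ToyVolume:
    """Toy stand-in for a distributed volume: each subvolume is a set of names."""

    def __init__(self, subvol_count):
        self.subvols = [set() for _ in range(subvol_count)]

    def create(self, name, subvol):
        self.subvols[subvol].add(name)

    def lookup(self, name):
        """Return the index of the subvolume that currently holds the file."""
        for index, files in enumerate(self.subvols):
            if name in files:
                return index
        raise FileNotFoundError(errno.ENOENT, "No such file or directory", name)

    def migrate(self, name, dst):
        """Model of rebalance: move the file to another subvolume."""
        src = self.lookup(name)
        self.subvols[src].remove(name)
        self.subvols[dst].add(name)

    def rename_on(self, subvol, old, new):
        """Rename on one specific subvolume; fails if the file is not there."""
        if old not in self.subvols[subvol]:
            raise FileNotFoundError(errno.ENOENT, "No such file or directory", old)
        self.subvols[subvol].remove(old)
        self.subvols[subvol].add(new)


def rename_with_stale_cache(vol, cached, old, new):
    """Trust the subvolume remembered from an earlier lookup (failing pattern)."""
    vol.rename_on(cached, old, new)


def rename_with_recheck(vol, cached, old, new):
    """Assumed handling: on ENOENT, look up again and retry once."""
    try:
        vol.rename_on(cached, old, new)
    except FileNotFoundError:
        vol.rename_on(vol.lookup(old), old, new)
```

In this model, calling lookup, then migrate, then rename_with_stale_cache raises ENOENT even though the file still exists on the other subvolume, while rename_with_recheck succeeds in the same sequence.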

Comment 4 Raghavendra G 2017-04-04 04:39:16 UTC
This can also happen when:
1. the layout of the parent directory has changed,
2. no lookup was sent on src/dst afterwards, so no entry corresponding to src/dst is present on the newly hashed subvols, and
3. a rename is issued.

Since a rename expects an entry on hashed-subvol, an attempt to unlink/rename/link might fail.
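
The layout-change case can also be sketched as a toy model. Everything here is an assumption made for illustration: ToyDir and hashed_subvol are hypothetical names, the placement rule (byte sum modulo the subvolume count) is not the hash DHT actually uses, and the "lookup creates a linkto entry on the hashed subvolume" behaviour is modelled in the simplest possible way.

```python
import errno


def hashed_subvol(name, subvol_count):
    """Toy placement rule (an assumption): byte sum modulo subvolume count."""
    return sum(name.encode()) % subvol_count


class ToyDir:
    """One directory spread over subvolumes; each maps name -> 'data' or 'linkto'."""

    def __init__(self, subvol_count):
        self.entries = [dict() for _ in range(subvol_count)]

    def add_brick(self):
        """Adding a subvolume changes the layout, so hashed subvolumes may move."""
        self.entries.append({})

    def create(self, name):
        self.entries[hashed_subvol(name, len(self.entries))][name] = "data"

    def lookup(self, name):
        """Find the file anywhere, then ensure the hashed subvolume has an entry."""
        if not any(name in entries for entries in self.entries):
            raise FileNotFoundError(errno.ENOENT, "No such file or directory", name)
        hashed = hashed_subvol(name, len(self.entries))
        self.entries[hashed].setdefault(name, "linkto")

    def rename(self, old, new):
        """Rename that expects an entry for the source on its hashed subvolume."""
        hashed = hashed_subvol(old, len(self.entries))
        if old not in self.entries[hashed]:
            raise FileNotFoundError(errno.ENOENT, "No such file or directory", old)
        for entries in self.entries:
            kind = entries.pop(old, None)  # drop old data and linkto entries
            if kind == "data":
                entries[new] = "data"      # data file stays on its subvolume
        self.lookup(new)                   # give the new name a hashed entry
```

In this model, a file whose hashed subvolume changes after add_brick fails to rename with ENOENT until a lookup has been issued on it, which matches the three conditions listed above.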

To summarize, this bug can happen because of either:
1. changes in layout, or
2. migration of the file in the window between the lookup and rename fops.

Comment 5 Prasad Desala 2017-07-27 13:19:14 UTC
Observed the same issue on glusterfs version 3.8.4-35.el7rhgs.x86_64.
Steps:
======
1) On an nfs-ganesha setup, create a distributed-replicate volume and start it.
2) NFS-mount it on multiple clients.
3) Create a few files from the mount point.
4) Add a few bricks and trigger rebalance.
5) From one client, start renaming the files; from another client, start changing file permissions and issuing continuous lookups.

The rename operation failed for a few files with the error 'No such file or directory', although a lookup from the mount point still finds those files.

Mount point:
=============
mv: cannot move ‘rename_0_file_32’ to ‘rename_1_file_32’: No such file or directory
mv: cannot move ‘rename_0_file_38’ to ‘rename_1_file_38’: No such file or directory
mv: cannot move ‘rename_0_file_66’ to ‘rename_1_file_66’: No such file or directory
mv: cannot move ‘rename_0_file_75’ to ‘rename_1_file_75’: No such file or directory
mv: cannot move ‘rename_0_file_79’ to ‘rename_1_file_79’: No such file or directory
mv: cannot move ‘rename_0_file_142’ to ‘rename_1_file_142’: No such file or directory
mv: cannot move ‘rename_0_file_218’ to ‘rename_1_file_218’: No such file or directory
mv: cannot move ‘rename_0_file_222’ to ‘rename_1_file_222’: No such file or directory
mv: cannot move ‘rename_0_file_239’ to ‘rename_1_file_239’: No such file or directory
mv: cannot move ‘rename_0_file_295’ to ‘rename_1_file_295’: No such file or directory
mv: cannot move ‘rename_0_file_300’ to ‘rename_1_file_300’: No such file or directory
mv: cannot move ‘rename_0_file_375’ to ‘rename_1_file_375’: No such file or directory
mv: cannot move ‘rename_0_file_400’ to ‘rename_1_file_400’: No such file or directory
mv: cannot move ‘rename_0_file_426’ to ‘rename_1_file_426’: No such file or directory
mv: cannot move ‘rename_0_file_514’ to ‘rename_1_file_514’: No such file or directory
mv: cannot move ‘rename_0_file_525’ to ‘rename_1_file_525’: No such file or directory
mv: cannot move ‘rename_0_file_556’ to ‘rename_1_file_556’: No such file or directory
mv: cannot move ‘rename_0_file_679’ to ‘rename_1_file_679’: No such file or directory
mv: cannot move ‘rename_0_file_809’ to ‘rename_1_file_809’: No such file or directory
mv: cannot move ‘rename_0_file_817’ to ‘rename_1_file_817’: No such file or directory
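
The client-side rename loop from step 5 could be scripted roughly as follows. This is a minimal sketch, not the script used in the test run: rename_batch is a hypothetical helper, the file-name prefixes are taken from the mv output above, and the directory argument stands for whatever mount point the client uses.

```python
import os


def rename_batch(directory, count,
                 src_prefix="rename_0_file_", dst_prefix="rename_1_file_"):
    """Rename <src_prefix>N to <dst_prefix>N for N in range(count).

    Returns the source paths whose rename failed with ENOENT even though
    the source is still visible afterwards, which is the symptom reported here.
    """
    suspicious = []
    for n in range(count):
        src = os.path.join(directory, f"{src_prefix}{n}")
        dst = os.path.join(directory, f"{dst_prefix}{n}")
        try:
            os.rename(src, dst)
        except FileNotFoundError:
            # A genuinely missing source is not interesting; only record
            # the case where the file can still be found after the failure.
            if os.path.lexists(src):
                suspicious.append(src)
    return suspicious
```

Running it against the mount point while rebalance is in progress, with a second client changing permissions and issuing lookups as in step 5, should return the files that hit the error.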

Comment 7 Prasad Desala 2018-04-09 09:17:02 UTC
Hit this issue on 3.4.0 (3.12.2-7) while following the same steps as in the description.

