Description of problem:
*********************************
While 100000 files were being created from a CIFS mount and removed with rm -rf from a second client, listing the directory from a third client showed "No such file or directory" for a few files and unknown permissions for a few others.

-rw-r--r--. 1 root root 15 Oct 19 09:56 file2510
-?????????? ? ? ? ? ? file2511
-?????????? ? ? ? ? ? file2512
-rw-r--r--. 1 root root 15 Oct 19 09:56 file2513
-?????????? ? ? ? ? ? file2514
-rw-r--r--. 1 root root 15 Oct 19 09:56 file2515
-rw-r--r--. 1 root root 15 Oct 19 09:56 file2516
-rw-r--r--. 1 root root 15 Oct 19 09:56 file2517
-rw-r--r--. 1 root root 15 Oct 19 09:56 file2518
-rw-r--r--. 1 root root 15 Oct 19 09:56 file2519
-rw-r--r--. 1 root root 15 Oct 19 09:53 file252
-rw-r--r--. 1 root root 15 Oct 19 09:56 file2520
-rw-r--r--. 1 root root 15 Oct 19 09:56 file2521

Version-Release number of selected component (if applicable):
glusterfs-3.8.4-2.26.git0a405a4.el7rhgs.x86_64

How reproducible:
Twice

Steps to Reproduce:
1. As mentioned in the description.
2. Mount the volume over CIFS, create 100000 files, run rm -rf from another client, and run ll from a third client.

Actual results:
"No such file or directory" is reported for a few files (which may happen because the files are not yet removed), but unknown file permissions are shown as well.

Expected results:
The listing should not show unknown file permissions or "No such file or directory".

Additional info:
Could you please try disabling readdir-ahead and md-cache? Does that have any effect?
This does not look like an md-cache issue. If an unlink lands between the readdirp and the follow-up lookup/stat call, we end up in this scenario. I cross-checked this with the DHT team (Rafi) and also simulated the scenario using gdb in a non-md-cache setup. The same behavior is seen with the kernel NFS server and multiple NFS clients, where one client is running "rm -f" and the other client is running "ls -l".
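The interleaving described above can be replayed deterministically in a single process. This is only a minimal sketch on a local temporary directory, not GlusterFS code: the function name stat_after_unlink and the file names are made up for illustration, and os.listdir/os.lstat stand in for the readdirp and follow-up stat calls.

```python
import os
import tempfile


def stat_after_unlink(num_files=5, victim="file2"):
    """Replay the readdir -> unlink -> stat ordering in one process.

    Hypothetical helper for illustration; returns the names that were
    listed and the subset whose follow-up lstat failed with ENOENT.
    """
    missing = []
    with tempfile.TemporaryDirectory() as d:
        for i in range(num_files):
            with open(os.path.join(d, f"file{i}"), "w") as f:
                f.write("data\n")

        # Step 1 (listing client): read the directory entries.
        names = sorted(os.listdir(d))

        # Step 2 (removing client): an unlink lands before the stat calls.
        os.unlink(os.path.join(d, victim))

        # Step 3 (listing client): stat every name returned in step 1.
        for name in names:
            try:
                os.lstat(os.path.join(d, name))
            except FileNotFoundError:
                missing.append(name)
    return names, missing


if __name__ == "__main__":
    listed, missing = stat_after_unlink()
    print("listed:", listed)
    print("stat failed with ENOENT:", missing)
```

The removed name is still in the list from step 1, so the listing client knows the entry existed but has no attributes to show for it.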
Nithya - could you have a look at it as comment 3 claims that this could be a DHT issue?
I don't think this is something new; it is expected with the readdirp/stat/unlink race. Was this seen in earlier releases?
This is expected behavior. I could recreate the same on XFS as well.

ls: cannot access 'd/6351': No such file or directory
ls: cannot access 'd/6357': No such file or directory
ls: cannot access 'd/6364': No such file or directory
ls: cannot access 'd/6366': No such file or directory
total 0
??????????? ? ? ? ? ? 6229
??????????? ? ? ? ? ? 6230
??????????? ? ? ? ? ? 6231
??????????? ? ? ? ? ? 6232
??????????? ? ? ? ? ? 6233
??????????? ? ? ? ? ? 6234

Have 3 terminals.
On the first, execute: while true; do touch d/{1..10000}; done
On the second, execute: while true; do ls -l d; done
On the third, execute: while true; do rm -f d/*; done

You will see the output above. Closing this as not a bug for now.
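To show how both the "cannot access" messages and the "?" rows can come from the same failed stat, here is a rough Python sketch of a long listing that falls back to placeholders. It is an approximation for illustration, not the actual ls implementation: long_listing is a hypothetical helper, and the column layout is simplified.

```python
import os
import stat
import tempfile


def long_listing(directory, names):
    """Format names roughly like `ls -l`, with placeholders when lstat fails.

    Hypothetical helper; `names` is a directory snapshot taken earlier.
    """
    errors, rows = [], []
    for name in names:
        path = os.path.join(directory, name)
        try:
            st = os.lstat(path)
        except FileNotFoundError:
            # The entry was listed but vanished before the stat call.
            errors.append(f"cannot access '{path}': No such file or directory")
            rows.append(f"?????????? ? ? ? ? ? {name}")
            continue
        rows.append(f"{stat.filemode(st.st_mode)} {st.st_nlink} {st.st_size} {name}")
    return errors, rows


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for n in ("6229", "6230"):
            open(os.path.join(d, n), "w").close()
        snapshot = sorted(os.listdir(d))       # directory read
        os.unlink(os.path.join(d, "6230"))     # concurrent rm -f
        errs, rows = long_listing(d, snapshot)  # follow-up stat per entry
        print("\n".join(errs + rows))
```

Each vanished entry produces one error message and one placeholder row, which matches the shape of the output pasted above.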
(In reply to Atin Mukherjee from comment #4)
> Nithya - could you have a look at it as comment 3 claims that this could be
> a DHT issue?

I should have been more explicit in my statement. As Pranith mentioned, this is not a bug. That is why I mentioned that the issue can be easily reproduced with kernel NFS. A local filesystem like XFS would be a little faster, which is why we may not always see the issue there, but Pranith gave an example on XFS as well. I wanted to discuss this with QE before closing this bug and hence did not change the bug status.
As the Bug is closed with sufficient data points, clearing the needinfo.