Bug 2232278 - find not obeying option -ignore_readdir_race
Keywords:
Status: ON_QA
Alias: None
Deadline: 2023-08-21
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: findutils
Version: 8.8
Hardware: Unspecified
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Lukáš Zaoral
QA Contact: Radka Brychtova
URL:
Whiteboard:
Depends On:
Blocks: 2232514 2232519
Reported: 2023-08-16 05:20 UTC by Daniel Reynolds
Modified: 2023-08-17 11:46 UTC

Fixed In Version: findutils-4.6.0-21.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2232514 2232519
Environment:
Last Closed:
Type: Bug
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
GNU Savannah | 45930 | 0 | None | None | None | 2023-08-16 08:38:06 UTC
Gitlab | redhat/centos-stream/rpms findutils merge_requests 3 | 0 | None | merged | fix find not obeying option -ignore_readdir_race in symlink_loop | 2023-08-17 11:10:59 UTC
Red Hat Issue Tracker | RHELPLAN-165884 | 0 | None | None | None | 2023-08-16 05:21:20 UTC

Description Daniel Reynolds 2023-08-16 05:20:11 UTC
Description of problem:

1. find is being used on NFS file systems.
2. Two processes, each running e.g. 'find DIR -name PATTERN -ignore_readdir_race -exec rm {} \;', look for files to delete in the same directory.
3. One archive process works as expected.
4. The other archive process running 'find' reports the error message

    'find: '/path/to/file': No such file or directory'

Note: this only shows up on NFS file systems, not on local file systems.

Version-Release number of selected component (if applicable):
- findutils-4.6.0-20.el8.x86_64

How reproducible:
Sometimes

Steps to Reproduce:
1. Start two find processes deleting files from the same directory, as above.
2. Repeat many times; occasionally find reports the error 'find: '/path/to/file': No such file or directory'
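
The steps above could be driven by a small harness such as the following. This is only a sketch: the function names are hypothetical, and DIR and PATTERN must be supplied by the caller. It starts both find processes before waiting on either, so that they overlap in the same directory.

```python
import subprocess


def build_find_cmd(directory, pattern):
    """Return the argv for the find invocation quoted in the report."""
    return [
        "find", directory,
        "-name", pattern,
        "-ignore_readdir_race",
        "-exec", "rm", "{}", ";",
    ]


def run_racing_finds(directory, pattern, workers=2):
    """Start `workers` find processes at once.

    Returns one (exit code, stderr text) pair per process. The captured
    stderr also contains anything printed by the rm children, because
    they inherit the same stderr as their parent find.
    """
    procs = [
        subprocess.Popen(
            build_find_cmd(directory, pattern),
            stderr=subprocess.PIPE,
            text=True,
        )
        for _ in range(workers)
    ]
    results = []
    for proc in procs:
        _, err = proc.communicate()
        results.append((proc.returncode, err))
    return results
```

Calling run_racing_finds() repeatedly against a freshly populated directory on an NFS mount, and checking each result for a non-zero exit code or a line starting with 'find:', would correspond to step 2. Since the failure is intermittent, a clean run proves nothing on its own.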

Actual results:
'find: '/path/to/file': No such file or directory'

Expected results:
With option -ignore_readdir_race find should ignore missing files when trying to stat() them.
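
To make the expectation concrete, here is a minimal Python sketch of the behaviour being asked for. It is not find's implementation; walk_once and its before_stat hook are hypothetical names used only for illustration. The directory is listed first, each name is then examined with lstat(), and an entry that has vanished in between is skipped silently when the flag is set.

```python
import os
import tempfile


def walk_once(directory, ignore_readdir_race=False, before_stat=None):
    """List `directory`, then lstat() each name, as a find-like tool would.

    Returns (found, errors). `before_stat` is a hypothetical hook used only
    to simulate another process deleting files between the two steps.
    """
    names = sorted(os.listdir(directory))    # the directory-listing step
    if before_stat is not None:
        before_stat()
    found, errors = [], []
    for name in names:
        path = os.path.join(directory, name)
        try:
            os.lstat(path)                   # stat without following symlinks
        except FileNotFoundError as exc:
            if ignore_readdir_race:
                continue                     # entry vanished after listing: skip silently
            errors.append(f"{path}: {exc.strerror}")
            continue
        found.append(name)
    return found, errors


def demo():
    """Delete one file between listing and stat, with the flag off and on."""
    results = {}
    for flag in (False, True):
        with tempfile.TemporaryDirectory() as d:
            for name in ("a.log", "b.log"):
                open(os.path.join(d, name), "w").close()
            results[flag] = walk_once(
                d,
                ignore_readdir_race=flag,
                before_stat=lambda: os.remove(os.path.join(d, "a.log")),
            )
    return results
```

With the flag off, demo() records one error for the vanished a.log; with the flag on, the same race produces no error and b.log is still found.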

Additional info:
- High business impact for the customer; this is blocking a migration.
- An strace confirming the behaviour was captured and is attached to the case. See the extract below.

~~~
231057 18:30:01.360720 getdents64(4</infadata/Backup/FIND_POC/Target>, [{d_ino=154801, d_off=11878423, ... d_name="MC_EDH_SIDE_CAR_GCP_POST_VBUP20230730-02:18:29_SmLYdODCg7.log"}] ...
231057 18:30:27.998095 newfstatat(5</infadata/Backup/FIND_POC/Target>, "MC_EDH_SIDE_CAR_GCP_POST_VBUP20230730-02:18:29_SmLYdODCg7.log", 0x55dc79775fe8, AT_SYMLINK_NOFOLLOW) = -1 ENOENT (No such file or directory) <0.000327>
231057 18:30:27.999508 newfstatat(5</infadata/Backup/FIND_POC/Target>, "MC_EDH_SIDE_CAR_GCP_POST_VBUP20230730-02:18:29_SmLYdODCg7.log", 0x7ffe36e02980, AT_SYMLINK_NOFOLLOW) = -1 ENOENT (No such file or directory) <0.000343>
231057 18:30:28.000927 write(2</infadata/Backup/FIND_POC/Log/find_and_remove_GLT0.log>, "find: ", 6) = 6 <0.000416>
231057 18:30:28.002800 write(2</infadata/Backup/FIND_POC/Log/find_and_remove_GLT0.log>, "\342\200\230/infadata/Backup/FIND_POC/Target/MC_EDH_SIDE_CAR_GCP_POST_VBUP20230730-02:18:29_SmLYdODCg7.log\342\200\231", 100) = 100 <0.000506>
231057 18:30:28.004705 write(2</infadata/Backup/FIND_POC/Log/find_and_remove_GLT0.log>, ": No such file or directory", 27) = 27 <0.000412>
[snip]
231057 18:30:30.384816 +++ exited with 1 +++
~~~
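
The size of the race window visible in this extract can be worked out from the two timestamps. A small sketch follows; seconds_between is a hypothetical helper name. Because the extract is abbreviated, the figure describes only what is shown here, not everything the process did in between.

```python
from datetime import datetime


def seconds_between(start, end, fmt="%H:%M:%S.%f"):
    """Return the elapsed seconds between two strace wall-clock timestamps."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


# Timestamps copied from the extract above (PID 231057).
listed = "18:30:01.360720"      # getdents64() returned the entry
first_stat = "18:30:27.998095"  # first newfstatat() failed with ENOENT

window = seconds_between(listed, first_stat)
print(f"{window:.6f} s between listing and stat")
# prints "26.637375 s between listing and stat"
```

A gap of this length leaves ample time for the other process to remove the file after it was listed but before it was examined.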

Comment 2 Lukáš Zaoral 2023-08-16 08:38:06 UTC
Hello!
Thank you for the report!  I was able to reproduce this issue on my machine.  The problem was already reported upstream a few years back (https://savannah.gnu.org/bugs/?45930), but the fix has not been committed upstream yet.  So it is possible to encounter it on any release of RHEL or Fedora (even RHEL 7); it just happens that NFS on RHEL 8 has "better" timings to trigger it.  Fortunately, the fix seems trivial.

I'll update you ASAP when I have additional information to share.

Regards,
Lukas

