Description of problem:
Running find in a "watch" loop to monitor changes.

Version-Release number of selected component:
1:findutils-4.6.0-24.fc31

Additional info:
reporter: libreport-2.12.0
backtrace_rating: 4
cgroup: 0::/user.slice/user-1601400001.slice/user/gnome-terminal-server.service
cmdline: find . -type f
crash_function: leave_dir
executable: /usr/bin/find
journald_cursor: s=0c3752e789814a58940002600f8f501b;i=436d;b=da731d6eaffa435b8e2ac8126a7e3112;m=467bab6d3;t=5a321997c16f4;x=10a195077aa90357
kernel: 5.5.15-200.fc31.x86_64
rootdir: /
runlevel: N 5
type: CCpp
uid: 1601400001

Truncated backtrace:
Thread no. 1 (5 frames)
 #2 leave_dir at ../../../gl/lib/fts-cycle.c:136
 #4 fts_build at ../../../gl/lib/fts.c:1390
 #5 fts_read at ../../../gl/lib/fts.c:968
 #6 find at ../../find/ftsfind.c:576
 #7 process_all_startpoints at ../../find/ftsfind.c:638

Potential duplicate: bug 1558249
Created attachment 1678354 [details] File: backtrace
Created attachment 1678355 [details] File: core_backtrace
Created attachment 1678356 [details] File: cpuinfo
Created attachment 1678357 [details] File: dso_list
Created attachment 1678358 [details] File: environ
Created attachment 1678359 [details] File: limits
Created attachment 1678360 [details] File: maps
Created attachment 1678361 [details] File: mountinfo
Created attachment 1678362 [details] File: open_fds
Created attachment 1678363 [details] File: proc_pid_status
Is the crash reproducible? If yes, could you please try to run find with the -noleaf option?
(In reply to Kamil Dudka from comment #11)
> Is the crash reproducible? If yes, could you please try to run find with
> the -noleaf option?

Not easily. It was just a watch -n1 "find . -type f | wc -l" loop. It ran for over an hour, and only one of those runs resulted in the crash. I've used find for more than two decades and I don't remember ever seeing it dump core or segfault before. I suspect a race condition, i.e. find may have been using a directory inode while that inode was being removed by another process. Pure guesswork, though.
(In reply to Peter Larsen from comment #12)
> I suspect a race condition - ie. find may have been using a directory inode
> as that inode was being removed by another process. Pure guess work though.

This should not result in a crash under normal circumstances.

What is known to break is the leaf optimization on less commonly used file systems. The optimization allows find to traverse a directory tree recursively without calling stat() on every node to check whether it is a directory. But the optimization works properly only on file systems that report link counts properly, which CIFS, for example, does not. find (more precisely gnulib's FTS module) maintains a white-list and a black-list of file system types to enable or disable the leaf optimization automatically. The -noleaf option of find can be used on Fedora to explicitly disable the optimization to ease debugging of such cases. See bug #1558249 for an example of a similar bug that was fixed recently.
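For readers unfamiliar with the link-count assumption mentioned above, here is a minimal sketch of the idea, not the gnulib implementation. On file systems with traditional Unix semantics a directory has two links (its entry in the parent and its own ".") plus one per subdirectory (each child's ".."), so a traversal that has already seen nlink - 2 subdirectories can treat the remaining entries as non-directories. The helper names `expected_subdirs` and `check_dir` are made up for illustration, and GNU stat and GNU find are assumed.

```shell
#!/bin/sh
# Hypothetical helper: given a directory's hard-link count, print how many
# subdirectories it should contain under traditional link-count semantics,
# or "unknown" when the count is too small to be trusted.
expected_subdirs() {
    nlink=$1
    if [ "$nlink" -lt 2 ]; then
        echo unknown
    else
        echo $((nlink - 2))
    fi
}

# Hypothetical helper: compare the prediction with the real number of
# subdirectories of directory $1. Assumes GNU stat (%h = hard-link count)
# and GNU find (-mindepth/-maxdepth).
check_dir() {
    nlink=$(stat -c %h "$1")
    actual=$(find "$1" -mindepth 1 -maxdepth 1 -type d | wc -l | tr -d ' ')
    echo "nlink=$nlink predicted=$(expected_subdirs "$nlink") actual=$actual"
}
```

Running `check_dir .` on a directory where "predicted" and "actual" disagree points at a file system on which the assumption does not hold; that is the situation in which running find with -noleaf is worth trying.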
(In reply to Kamil Dudka from comment #13)
> (In reply to Peter Larsen from comment #12)
> > I suspect a race condition - ie. find may have been using a directory inode
> > as that inode was being removed by another process. Pure guess work though.
>
> This should not result in a crash under normal circumstances.
>
> What is known to break is the leaf optimization on less commonly used file
> systems. The optimization allows to traverse a directory tree recursively
> without calling stat() on all its nodes to check whether they are
> directories or not. But the optimization works properly only on file
> systems that report link counts properly, which for example CIFS is not.
> find (more precisely gnulib's FTS module) maintains a white-list and a
> black-list of file system types to enable or disable the leaf optimization
> automatically. The -noleaf option of find can be used on Fedora to
> explicitly disable the optimization to ease debugging of such cases. See
> bug #1558249 for an example of similar bug that was fixed recently.

OK, this was done on XFS: the standard /home LVM volume on a clean Fedora 31 install. The running process was creating/altering about 65000 files under a base directory in the home directory. Since it's a long process, I had "find" list all the files and counted them so I could trace what was happening. Definitely not using CIFS. Note that the install program ran from an NFS share, but find only monitored the destination directory.

I'll try to attempt this install again this weekend on a new VM. Unfortunately this software requires about 150GB of disk space to install, so I first have to see if I have that space somewhere. If -noleaf works for me, that solves the issue.

You're welcome to try this yourself. The software in question can be downloaded for free here: https://www.xilinx.com/member/forms/download/xef.html?filename=Xilinx_Unified_2019.2_1106_2127_Lin64.bin but it's obviously not part of Fedora.
I give it a destination directory under $HOME and then run the find in a separate terminal against this directory while the install is on-going.
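The monitoring setup described above could be sketched roughly as follows. This is only an illustration under assumptions: the helper names `count_files`, `find_status` and `monitor` are hypothetical, and the loop is bounded so that it terminates, unlike the original watch loop. It relies on the usual shell convention that an exit status above 128 means the command was killed by a signal (139 corresponds to SIGSEGV).

```shell
#!/bin/sh
# Hypothetical helper: count regular files below directory $1, using the
# same pipeline as the watch loop from this report.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Hypothetical helper: run find on $1, discard its output, and print its
# exit status.
find_status() {
    find "$1" -type f >/dev/null 2>&1
    echo $?
}

# Hypothetical helper: run find on $1 up to $2 times, pausing $3 seconds
# (default 1) between runs, and stop as soon as find dies from a signal.
monitor() {
    dir=$1
    runs=$2
    interval=${3:-1}
    i=0
    while [ "$i" -lt "$runs" ]; do
        status=$(find_status "$dir")
        if [ "$status" -gt 128 ]; then
            echo "run $i: find killed by signal $((status - 128))"
            return 1
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    echo "no crash in $runs run(s)"
}
```

For example, `monitor "$HOME/installdir" 3600` would approximate an hour of one-second polling while the installer writes to that directory.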
Did find operate on a single file system? If not, was there any automount involved? I remember a similar bug report where a recursive bind mount triggered by automount crashed find: bug #1188498
(In reply to Kamil Dudka from comment #15)
> Did find operate on a single file system? If not, was there any automount
> involved?
>
> I remember a similar bug report where a recursive bind mount triggered by
> automount crashed find: bug #1188498

Single file system, $HOME/installdir, not using FUSE or anything like that. The install process (not find) read from NFS and wrote to $HOME/installdir. Note that inspecting the directory shows a ton of symbolic and hard links. It looks like all of them point to files within this structure; none of them link to system-wide locations on different file systems.
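One way to double-check the single-file-system answer is to count the distinct device numbers seen while walking the tree. The sketch below is an assumption-laden illustration: `device_count` and `fs_type` are hypothetical helper names, and GNU find (-printf with %D) and GNU stat (-f -c %T) are assumed. A bind mount of the same file system shares its device number, so this check would not reveal the recursive bind mount scenario from bug #1188498; the mountinfo attachment is the better source for that.

```shell
#!/bin/sh
# Hypothetical helper: print how many distinct device numbers (st_dev)
# occur among the directories below $1. Assumes GNU find's -printf.
device_count() {
    find "$1" -type d -printf '%D\n' | sort -u | wc -l | tr -d ' '
}

# Hypothetical helper: print the file system type of $1 in human-readable
# form. Assumes GNU stat.
fs_type() {
    stat -f -c %T "$1"
}
```

A result of 1 from `device_count "$HOME/installdir"` would mean that find crossed no boundary to a different file system during the traversal.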
There was a similar bug report upstream this week and they have an experimental patch for it: https://lists.gnu.org/archive/html/bug-gnulib/2020-04/msg00069.html I will build test packages with the above patch applied...
Experimental packages of findutils with the upstream patch applied are available in the following copr: https://copr.fedorainfracloud.org/coprs/kdudka/findutils-rhbz1823247/
dist-git commit: https://src.fedoraproject.org/rpms/findutils/c/d84db4f6
FEDORA-2020-b1c7f64b0b has been submitted as an update to Fedora 31. https://bodhi.fedoraproject.org/updates/FEDORA-2020-b1c7f64b0b
FEDORA-2020-4ff071d8e5 has been submitted as an update to Fedora 32. https://bodhi.fedoraproject.org/updates/FEDORA-2020-4ff071d8e5
FEDORA-2020-4ff071d8e5 has been pushed to the Fedora 32 testing repository. You will soon be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-4ff071d8e5` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-4ff071d8e5 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2020-b1c7f64b0b has been pushed to the Fedora 31 testing repository. You will soon be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-b1c7f64b0b` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-b1c7f64b0b See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2020-4ff071d8e5 has been pushed to the Fedora 32 stable repository. If the problem still persists, please make a note of it in this bug report.
FEDORA-2020-b1c7f64b0b has been pushed to the Fedora 31 stable repository. If the problem still persists, please make a note of it in this bug report.