Bug 1823247 - [abrt] findutils: leave_dir(): find killed by SIGABRT
Summary: [abrt] findutils: leave_dir(): find killed by SIGABRT
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: findutils
Version: 31
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kamil Dudka
QA Contact: Fedora Extras Quality Assurance
URL: https://retrace.fedoraproject.org/faf...
Whiteboard: abrt_hash:d5864717700e558dab70b44a1e8...
Depends On:
Blocks:
Reported: 2020-04-13 01:36 UTC by Peter Larsen
Modified: 2020-05-10 04:50 UTC (History)

Fixed In Version: findutils-4.7.0-4.fc33 findutils-4.7.0-4.fc32 findutils-4.6.0-25.fc31
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-04-28 02:31:16 UTC
Type: ---


Attachments
File: backtrace (14.99 KB, text/plain), 2020-04-13 01:36 UTC, Peter Larsen
File: core_backtrace (2.06 KB, text/plain), 2020-04-13 01:36 UTC, Peter Larsen
File: cpuinfo (2.07 KB, text/plain), 2020-04-13 01:36 UTC, Peter Larsen
File: dso_list (738 bytes, text/plain), 2020-04-13 01:36 UTC, Peter Larsen
File: environ (5.64 KB, text/plain), 2020-04-13 01:36 UTC, Peter Larsen
File: limits (1.29 KB, text/plain), 2020-04-13 01:36 UTC, Peter Larsen
File: maps (4.84 KB, text/plain), 2020-04-13 01:36 UTC, Peter Larsen
File: mountinfo (2.64 KB, text/plain), 2020-04-13 01:36 UTC, Peter Larsen
File: open_fds (629 bytes, text/plain), 2020-04-13 01:36 UTC, Peter Larsen
File: proc_pid_status (1.40 KB, text/plain), 2020-04-13 01:36 UTC, Peter Larsen

Description Peter Larsen 2020-04-13 01:36:45 UTC
Description of problem:
Running find in a "watch" loop to monitor changes. 

Version-Release number of selected component:
1:findutils-4.6.0-24.fc31

Additional info:
reporter:       libreport-2.12.0
backtrace_rating: 4
cgroup:         0::/user.slice/user-1601400001.slice/user/gnome-terminal-server.service
cmdline:        find . -type f
crash_function: leave_dir
executable:     /usr/bin/find
journald_cursor: s=0c3752e789814a58940002600f8f501b;i=436d;b=da731d6eaffa435b8e2ac8126a7e3112;m=467bab6d3;t=5a321997c16f4;x=10a195077aa90357
kernel:         5.5.15-200.fc31.x86_64
rootdir:        /
runlevel:       N 5
type:           CCpp
uid:            1601400001

Truncated backtrace:
Thread no. 1 (5 frames)
 #2 leave_dir at ../../../gl/lib/fts-cycle.c:136
 #4 fts_build at ../../../gl/lib/fts.c:1390
 #5 fts_read at ../../../gl/lib/fts.c:968
 #6 find at ../../find/ftsfind.c:576
 #7 process_all_startpoints at ../../find/ftsfind.c:638

Potential duplicate: bug 1558249

Comment 1 Peter Larsen 2020-04-13 01:36:47 UTC
Created attachment 1678354 [details]
File: backtrace

Comment 2 Peter Larsen 2020-04-13 01:36:48 UTC
Created attachment 1678355 [details]
File: core_backtrace

Comment 3 Peter Larsen 2020-04-13 01:36:49 UTC
Created attachment 1678356 [details]
File: cpuinfo

Comment 4 Peter Larsen 2020-04-13 01:36:50 UTC
Created attachment 1678357 [details]
File: dso_list

Comment 5 Peter Larsen 2020-04-13 01:36:51 UTC
Created attachment 1678358 [details]
File: environ

Comment 6 Peter Larsen 2020-04-13 01:36:52 UTC
Created attachment 1678359 [details]
File: limits

Comment 7 Peter Larsen 2020-04-13 01:36:53 UTC
Created attachment 1678360 [details]
File: maps

Comment 8 Peter Larsen 2020-04-13 01:36:54 UTC
Created attachment 1678361 [details]
File: mountinfo

Comment 9 Peter Larsen 2020-04-13 01:36:55 UTC
Created attachment 1678362 [details]
File: open_fds

Comment 10 Peter Larsen 2020-04-13 01:36:56 UTC
Created attachment 1678363 [details]
File: proc_pid_status

Comment 11 Kamil Dudka 2020-04-14 11:24:51 UTC
Is the crash reproducible?  If yes, could you please try to run find with the -noleaf option?

Comment 12 Peter Larsen 2020-04-14 15:24:53 UTC
(In reply to Kamil Dudka from comment #11)
> Is the crash reproducible?  If yes, could you please try to run find with
> the -noleaf option?

Not easily. It was just a "watch -n1 'find . -type f | wc -l'" loop. It ran for over an hour, and only one of those runs resulted in the crash.  I've used find for more than two decades and I don't remember ever seeing a coredump/segfault from it before.

I suspect a race condition - i.e. find may have been using a directory inode as that inode was being removed by another process. Pure guesswork though.
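
Because the original loop piped find into wc, the shell only saw wc's exit status, so a crashed run is easy to miss. A minimal sketch of a wrapper that keeps find's own status follows. `count_files` is a hypothetical helper name, not part of findutils, and the count assumes that no file name contains a newline.

```shell
# Hypothetical helper: count regular files under a directory while
# preserving find's exit status (a pipe to wc would hide it).
count_files() {
    dir=$1
    list=$(mktemp) || return 1
    find "$dir" -type f > "$list"
    status=$?
    # Some wc implementations pad the number with spaces; strip them.
    count=$(wc -l < "$list" | tr -d ' ')
    rm -f "$list"
    echo "$count"
    # Shells report death by signal N as exit status 128 + N.
    if [ "$status" -gt 128 ]; then
        echo "find killed by signal $((status - 128))" >&2
    fi
    return "$status"
}
```

In a loop this could be used as `while sleep 1; do count_files . || break; done`, which stops at the first run where find exits non-zero. On Linux, a status of 134 would correspond to SIGABRT (128 + 6).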

Comment 13 Kamil Dudka 2020-04-14 16:08:33 UTC
(In reply to Peter Larsen from comment #12)
> I suspect a race condition - ie. find may have been using a directory inode
> as that inode was being removed by another process. Pure guess work though.

This should not result in a crash under normal circumstances.

What is known to break is the leaf optimization on less commonly used file systems.  The optimization allows find to traverse a directory tree recursively without calling stat() on all its nodes to check whether they are directories or not.  But the optimization works properly only on file systems that report link counts properly, which, for example, CIFS does not.  find (more precisely gnulib's FTS module) maintains a white-list and a black-list of file system types to enable or disable the leaf optimization automatically.  The -noleaf option of find can be used on Fedora to explicitly disable the optimization to ease debugging of such cases.  See bug #1558249 for an example of a similar bug that was fixed recently.
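
To illustrate the link-count assumption behind the leaf optimization: on traditional Unix file systems a directory's link count is 2 plus the number of its subdirectories, so once that many subdirectories have been seen, the remaining entries can be assumed to be non-directories. The sketch below only checks whether a given directory follows that convention. `leaf_check` is a hypothetical helper name, and `stat -c %h` assumes GNU coreutils.

```shell
# Hypothetical helper: compare a directory's link count with the
# "2 + number of subdirectories" convention the leaf optimization relies on.
leaf_check() {
    dir=$1
    # %h prints the number of hard links (st_nlink) of the directory.
    nlink=$(stat -c %h "$dir")
    # Count immediate subdirectories only, including hidden ones.
    subdirs=$(find "$dir" -mindepth 1 -maxdepth 1 -type d | wc -l | tr -d ' ')
    echo "nlink=$nlink subdirs=$subdirs"
    # Succeeds only when the convention holds for this directory.
    [ "$nlink" -eq $((subdirs + 2)) ]
}
```

A non-zero return from `leaf_check` on some directory would suggest that the file system does not report link counts in the traditional way there, which is the situation in which the optimization is unsafe.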

Comment 14 Peter Larsen 2020-04-14 18:09:12 UTC
(In reply to Kamil Dudka from comment #13)
> (In reply to Peter Larsen from comment #12)
> > I suspect a race condition - ie. find may have been using a directory inode
> > as that inode was being removed by another process. Pure guess work though.
> 
> This should not result in a crash under normal circumstances.
> 
> What is known to break is the leaf optimization on less commonly used file
> systems.  The optimization allows to traverse a directory tree recursively
> without calling stat() on all its nodes to check whether they are
> directories or not.  But the optimization works properly only on file
> systems that report link counts properly, which for example CIFS is not. 
> find (more precisely gnulib's FTS module) maintains a white-list and a
> black-list of file system types to enable or disable the leaf optimization
> automatically.  The -noleaf option of find can be used on Fedora to
> explicitly disable the optimization to ease debugging of such cases.  See
> bug #1558249 for an example of similar bug that was fixed recently.

Ok - this was done on XFS - standard /home LVM volume on a clean Fedora 31 install. The running process was creating/altering about 65000 files under a base directory of the home directory. Since it's a long process I had "find" list all the files and counted them so I could trace what was happening.  Definitely not using CIFS.  Note the install program ran from an NFS share, but find only monitored the destination directory.

I'll try to attempt this install again this weekend on a new VM. Unfortunately this software requires about 150GB of disk space to install so I first have to see if I have that space somewhere. If -noleaf works, that solves the issue for me. You're welcome to try this yourself - the software in question can be downloaded for free here: https://www.xilinx.com/member/forms/download/xef.html?filename=Xilinx_Unified_2019.2_1106_2127_Lin64.bin but it's obviously not part of Fedora. I give it a destination directory under $HOME and then run the find in a separate terminal against this directory while the install is ongoing.
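
A sketch of how the -noleaf test could be run side by side with the default traversal, comparing the two counts. `compare_noleaf` is a hypothetical helper name. It assumes a find build that accepts -noleaf (GNU findutils does) and a tree that is not changing between the two runs, so during a live install the counts may differ for unrelated reasons.

```shell
# Hypothetical helper: run the same traversal with and without -noleaf
# and report whether both see the same number of regular files.
compare_noleaf() {
    dir=$1
    default_count=$(find "$dir" -type f | wc -l | tr -d ' ')
    # -noleaf is placed before the tests, as find expects for options.
    noleaf_count=$(find "$dir" -noleaf -type f | wc -l | tr -d ' ')
    echo "default=$default_count noleaf=$noleaf_count"
    [ "$default_count" = "$noleaf_count" ]
}
```

On a static tree, a mismatch between the two counts would point toward the leaf optimization rather than a race with the installer.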

Comment 15 Kamil Dudka 2020-04-15 11:00:54 UTC
Did find operate on a single file system?  If not, was there any automount involved?

I remember a similar bug report where a recursive bind mount triggered by automount crashed find: bug #1188498
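
One way to answer the single-file-system question is to list the distinct device numbers of the directories under the tree. This is only a sketch: `count_devices` is a hypothetical helper name, and `-printf '%D\n'` assumes GNU find.

```shell
# Hypothetical helper: print how many distinct device numbers (st_dev)
# appear among the directories below the given start point.
count_devices() {
    dir=$1
    # %D is the device number of each directory; mount points are
    # directories, so restricting to -type d is enough to spot them.
    find "$dir" -type d -printf '%D\n' | sort -u | wc -l | tr -d ' '
}
```

A result greater than 1 would mean the traversal crosses at least one mount point. In that case `find -xdev` restricts the walk to the file system of the start point.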

Comment 16 Peter Larsen 2020-04-15 14:10:57 UTC
(In reply to Kamil Dudka from comment #15)
> Did find operate on a single file system?  If not, was there any automount
> involved?
> 
> I remember a similar bug report where a recursive bind mount triggered by
> automount crashed find: bug #1188498

Single file system $HOME/installdir - not using FUSE or anything like that.  The install process (not find) read from NFS and wrote to $HOME/installdir.  Note, investigating the directory shows a ton of symbolic and hard links - it looks like all the references are local to files within this structure and do not point to system-wide locations (different file systems).
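
The link-heavy layout described above can be summarized with two counts. A sketch, with `link_summary` as a hypothetical helper name. `-links +1` matches regular files that have more than one hard link, so every name of a hard-linked file is counted.

```shell
# Hypothetical helper: count symbolic links and hard-linked regular
# files below a directory.
link_summary() {
    dir=$1
    symlinks=$(find "$dir" -type l | wc -l | tr -d ' ')
    # Regular files whose link count is greater than one.
    hardlinked=$(find "$dir" -type f -links +1 | wc -l | tr -d ' ')
    echo "symlinks=$symlinks hardlinked=$hardlinked"
}
```

Running `link_summary "$HOME/installdir"` would give a rough idea of how much of the tree consists of links.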

Comment 17 Kamil Dudka 2020-04-16 13:26:33 UTC
There was a similar bug report upstream this week and they have an experimental patch for it:

    https://lists.gnu.org/archive/html/bug-gnulib/2020-04/msg00069.html

I will build test packages with the above patch applied...

Comment 18 Kamil Dudka 2020-04-16 16:53:56 UTC
Experimental packages of findutils with the upstream patch applied are available in the following copr:

    https://copr.fedorainfracloud.org/coprs/kdudka/findutils-rhbz1823247/

Comment 19 Kamil Dudka 2020-04-17 16:28:02 UTC
dist-git commit: https://src.fedoraproject.org/rpms/findutils/c/d84db4f6

Comment 20 Fedora Update System 2020-04-23 14:43:53 UTC
FEDORA-2020-b1c7f64b0b has been submitted as an update to Fedora 31. https://bodhi.fedoraproject.org/updates/FEDORA-2020-b1c7f64b0b

Comment 21 Fedora Update System 2020-04-23 14:43:55 UTC
FEDORA-2020-4ff071d8e5 has been submitted as an update to Fedora 32. https://bodhi.fedoraproject.org/updates/FEDORA-2020-4ff071d8e5

Comment 22 Fedora Update System 2020-04-23 20:46:17 UTC
FEDORA-2020-4ff071d8e5 has been pushed to the Fedora 32 testing repository.
In a short time you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-4ff071d8e5`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-4ff071d8e5

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 23 Fedora Update System 2020-04-25 04:18:46 UTC
FEDORA-2020-b1c7f64b0b has been pushed to the Fedora 31 testing repository.
In a short time you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-b1c7f64b0b`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-b1c7f64b0b

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 24 Fedora Update System 2020-04-28 02:31:16 UTC
FEDORA-2020-4ff071d8e5 has been pushed to the Fedora 32 stable repository.
If the problem still persists, please make note of it in this bug report.

Comment 25 Fedora Update System 2020-05-10 04:50:39 UTC
FEDORA-2020-b1c7f64b0b has been pushed to the Fedora 31 stable repository.
If the problem still persists, please make note of it in this bug report.

