Bug 683836 - ext4 crash and umount race condition
Summary: ext4 crash and umount race condition
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 15
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-03-10 13:22 UTC by Albert Strasheim
Modified: 2011-05-10 09:29 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2011-05-10 09:06:39 UTC
Type: ---
Embargoed:


Attachments
mount script (1.50 KB, text/plain)
2011-03-10 13:22 UTC, Albert Strasheim
screenshot (611.05 KB, image/bmp)
2011-03-10 13:23 UTC, Albert Strasheim
screenshot 1 (30.54 KB, image/png)
2011-03-10 13:24 UTC, Albert Strasheim
screenshot 2 (28.45 KB, image/png)
2011-03-10 13:24 UTC, Albert Strasheim
screenshot 3 (28.39 KB, image/png)
2011-03-10 13:25 UTC, Albert Strasheim
kernel crash (3.15 KB, text/plain)
2011-03-10 13:36 UTC, Albert Strasheim


Links
System ID Private Priority Status Summary Last Updated
Linux Kernel 32312 0 None None None Never
Red Hat Bugzilla 703399 0 unspecified CLOSED umount race condition 2021-02-22 00:41:40 UTC

Internal Links: 703399

Description Albert Strasheim 2011-03-10 13:22:41 UTC
Created attachment 483433 [details]
mount script

Description of problem:

Unmounting many ext4 file systems in parallel causes umount to report "not mounted" for some of them, even when /etc/mtab is a symlink to /proc/mounts.

Also, running the mount script a few times causes the kernel to crash. One crash dump attached.

Version-Release number of selected component (if applicable):

kernel-2.6.38-0.rc8.git0.2.fc14.x86_64

How reproducible:

Always

Steps to Reproduce:
1. boot with loop.max_loop=256
2. sudo ./testext4.py
3. cat /proc/mounts | grep /dev/loop | cut -d " " -f 1 | perl -ne 's/^/sudo umount /; s/$/&/; print' | sh -x
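
For readers who prefer to drive step 3 from a script, here is a minimal Python sketch of roughly the same parallel unmount. It is not part of the attached testext4.py; the function names are hypothetical, and it assumes umount is on the PATH and that the script runs as root. Unlike the grep in step 3, it only matches entries whose device field starts with /dev/loop.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def loop_devices(mounts_text):
    """Return the device field of every /dev/loop* entry in /proc/mounts-style text."""
    devices = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("/dev/loop"):
            devices.append(fields[0])
    return devices


def umount(device):
    """Run umount on one device and return (device, exit status, stderr text)."""
    result = subprocess.run(["umount", device], capture_output=True, text=True)
    return device, result.returncode, result.stderr.strip()


def umount_all_in_parallel(devices):
    """Start one umount per device at the same time and collect the results."""
    if not devices:
        return []
    # One worker per device, so every umount is in flight concurrently.
    with ThreadPoolExecutor(max_workers=len(devices)) as pool:
        return list(pool.map(umount, devices))


def main():
    """Unmount every loop device listed in /proc/mounts and print the failures."""
    with open("/proc/mounts") as f:
        devices = loop_devices(f.read())
    for device, status, message in umount_all_in_parallel(devices):
        if status != 0:
            print(f"{device}: exit {status}: {message}")
```

Calling main() as root after running the mount script should print one line per umount that failed.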
  
Actual results:

A few of the umounts fail with "not mounted". If you check /proc/mounts, the file systems are in fact mounted and another attempt to umount them succeeds.
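
The check described above (a device that umount reported as "not mounted" is still listed in /proc/mounts) can be sketched in Python as follows. This is an illustrative helper, not something from the attached script; the function name and the idea of passing in a list of failed devices are assumptions.

```python
def still_mounted(failed_devices, mounts_text):
    """Return the failed devices that still appear in /proc/mounts-style text.

    A non-empty result means umount claimed "not mounted" for a file system
    that the kernel still lists as mounted.
    """
    mounted = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if fields:
            # The first field of each /proc/mounts line is the device.
            mounted.add(fields[0])
    return [device for device in failed_devices if device in mounted]
```

In practice mounts_text would be the contents of /proc/mounts read right after the parallel umount run, and failed_devices the devices for which umount exited with an error.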

Also, running ./testext4.py a few times with umounts in between causes the kernel to crash. One screenshot attached.

I am running on a machine with 24 cores.

Comment 1 Albert Strasheim 2011-03-10 13:23:04 UTC
Created attachment 483435 [details]
screenshot

Comment 2 Albert Strasheim 2011-03-10 13:24:09 UTC
Created attachment 483436 [details]
screenshot 1

Comment 3 Albert Strasheim 2011-03-10 13:24:42 UTC
Created attachment 483437 [details]
screenshot 2

Comment 4 Albert Strasheim 2011-03-10 13:25:29 UTC
Created attachment 483438 [details]
screenshot 3

Comment 5 Albert Strasheim 2011-03-10 13:36:12 UTC
Created attachment 483442 [details]
kernel crash

Comment 6 Eric Sandeen 2011-03-15 16:21:45 UTC
Comment #5 does indeed look like an unmount race: a completion is being called on something that has already gone away.

Might be worth testing with a completely different filesystem (xfs, perhaps) to see whether some of this might be more of a vfs issue.

Comment 7 Albert Strasheim 2011-03-25 08:53:50 UTC
Same umount race with XFS.

Comment 8 Albert Strasheim 2011-03-30 05:05:56 UTC
This probably needs to be reported upstream?

Comment 9 Eric Sandeen 2011-03-30 18:28:15 UTC
(In reply to comment #8)
> This probably needs to be reported upstream?

Actually yes, that would be good.  Upstream has more bandwidth for bugs than I do alone.  :)

Thanks,
-Eric

Comment 10 Albert Strasheim 2011-03-31 04:56:24 UTC
Reported here:

https://bugzilla.kernel.org/show_bug.cgi?id=32312

Comment 11 Albert Strasheim 2011-05-07 20:02:40 UTC
According to Jan Kara, this bug should have been fixed by commit 0aeea18964173715a1037034ef6838198f319319 by lczerner, which went into 2.6.39-rc1.

Comment 12 Chuck Ebbert 2011-05-09 11:18:54 UTC
(In reply to comment #11)
> According to Jan Kara, this bug should have been fixed by commit
> 0aeea18964173715a1037034ef6838198f319319 by lczerner, which went
> into 2.6.39-rc1.

That patch actually went in 2.6.38 just before release.

Comment 13 Albert Strasheim 2011-05-09 11:29:01 UTC
I have retested with kernel-2.6.38.4-20.fc15.x86_64, and the bug is still there.

So if 0aeea18964173715a1037034ef6838198f319319 is in 2.6.38, it doesn't fix this bug.

Comment 14 Lukáš Czerner 2011-05-10 08:18:50 UTC
(In reply to comment #13)
> I have retested with kernel-2.6.38.4-20.fc15.x86_64, and the bug is still
> there.

Did the crash appear on kernel-2.6.38.4-20.fc15.x86_64 as well? Or just the "not mounted" error?

Comment 15 Albert Strasheim 2011-05-10 08:43:00 UTC
Hello

I did a few tests and it seems the crash is fixed in kernel-2.6.38.4-20.fc15.x86_64, but the "not mounted" error still appears.

Regards

Albert

Comment 16 Albert Strasheim 2011-05-10 08:43:21 UTC
Hello

I did a few tests and it seems the crash is fixed in kernel-2.6.38.4-20.fc15.x86_64, but the "not mounted" error still appears.

Regards

Albert

Comment 17 Lukáš Czerner 2011-05-10 09:06:39 UTC
Good, that's what I thought. So Jan was right: the crash has been fixed by that commit. The "not mounted" (EINVAL) error is, however, a completely different problem and should have its own bz entry. So if you do not mind, I will close this one and you can open a new bz for the "not mounted" error (please cc me).

Thanks Albert!
-Lukas

