Bug 997972 - fsck crash with corrupted file system
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: e2fsprogs
Version: 20
Hardware: Unspecified Unspecified
Priority: unspecified
Severity: high
Assigned To: Eric Sandeen
QA Contact: Fedora Extras Quality Assurance
Depends On:
Blocks: 997982
Reported: 2013-08-16 12:14 EDT by Hubert Kario
Modified: 2014-02-18 15:02 EST (History)
CC List: 4 users

See Also:
Fixed In Version: e2fsprogs-1.42.9
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 997982
Environment:
Last Closed: 2014-02-18 15:02:27 EST
Type: Bug


Attachments
Corrupted file system (105.02 KB, application/gzip)
2013-08-16 12:14 EDT, Hubert Kario

Description Hubert Kario 2013-08-16 12:14:14 EDT
Created attachment 787358
Corrupted file system

Description of problem:
When checking the attached corrupted file system, fsck crashes with a segmentation fault.

Version-Release number of selected component (if applicable):
e2fsprogs-1.42.8-1.fc20.x86_64

How reproducible:
Always

Steps to Reproduce:
1. gunzip disk.img.00000000000000000006.wrk_7.gz
2. losetup /dev/loop0 disk.img.00000000000000000006.wrk_7
3. fsck -f -p /dev/loop0

Actual results:
fsck from util-linux 2.23.1
/dev/loop0: recovering journal
fsck.ext4: Bad magic number in super-block while trying to re-open /dev/loop0
Signal (11) SIGSEGV si_code=SEGV_MAPERR fault addr=0x61
fsck.ext4[0x426771]
/lib64/libc.so.6(+0x37300)[0x7f20dc272300]
/lib64/libext2fs.so.2(ext2fs_mmp_stop+0xd)[0x359b02316d]
fsck.ext4(fatal_error+0x42)[0x41db52]
fsck.ext4(e2fsck_run_ext3_journal+0x243)[0x41d0c3]
fsck.ext4(main+0x6ef)[0x409b7f]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f20dc25cfa5]
fsck.ext4[0x40be75]

Expected results:
no crash

Additional info:
The file system image was created by truncating every write to the disk to 256 bytes.
Comment 1 Hubert Kario 2013-08-16 12:20:25 EDT
The bug is also present on Fedora 18:

e2fsprogs-1.42.5-1.fc18.x86_64

fsck from util-linux 2.22.2
/dev/loop0: recovering journal
fsck.ext4: Bad magic number in super-block while trying to re-open /dev/loop0
Signal (11) SIGSEGV si_code=SEGV_MAPERR fault addr=0x61
fsck.ext4[0x426f20]
/lib64/libc.so.6[0x35e5035cd0]
/lib64/libext2fs.so.2(ext2fs_mmp_stop+0x1c)[0x35e60237cc]
fsck.ext4(fatal_error+0x42)[0x41dc92]
fsck.ext4(e2fsck_run_ext3_journal+0x2d3)[0x41d1d3]
fsck.ext4(main+0x71e)[0x409a3e]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x35e5021a05]
fsck.ext4[0x40bc6d]
Comment 2 Eric Sandeen 2013-08-16 12:34:36 EDT
persists upstream too
Comment 3 Eric Sandeen 2013-08-16 12:41:29 EDT
Program received signal SIGSEGV, Segmentation fault.
ext2fs_mmp_stop (fs=0x67c3b0) at mmp.c:374
374		if (!(fs->super->s_feature_incompat & EXT4_FEATURE_INCOMPAT_MMP) ||
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.107.el6.x86_64
(gdb) bt
#0  ext2fs_mmp_stop (fs=0x67c3b0) at mmp.c:374
#1  0x00000000004249ce in fatal_error (ctx=0x67c000, msg=<value optimized out>) at util.c:59
#2  0x0000000000423183 in e2fsck_run_ext3_journal (ctx=0x67c000) at journal.c:973
#3  0x0000000000410574 in main (argc=<value optimized out>, argv=<value optimized out>) at unix.c:1500
(gdb) p fs->super
$1 = (struct ext2_super_block *) 0x0
Comment 4 Eric Sandeen 2013-08-16 13:01:08 EDT
It's trying to stop the multiple mount protection (MMP) crud, but there's no superblock set up (because of the bad magic number failure), so fs->super is NULL when ext2fs_mmp_stop() dereferences it.

This avoids it, at least:

diff --git a/e2fsck/util.c b/e2fsck/util.c
index 9eaf557..18005f4 100644
--- a/e2fsck/util.c
+++ b/e2fsck/util.c
@@ -55,7 +55,7 @@ void fatal_error(e2fsck_t ctx, const char *msg)
 		fprintf (stderr, "e2fsck: %s\n", msg);
 	if (!fs)
 		goto out;
-	if (fs->io) {
+	if (fs->io && fs->super) {
 		ext2fs_mmp_stop(ctx->fs);
 		if (ctx->fs->io->magic == EXT2_ET_MAGIC_IO_CHANNEL)
 			io_channel_flush(ctx->fs->io);


but then you just get:

# e2fsck/e2fsck.static -fy test.img 
e2fsck 1.43-WIP (20-Jun-2013)
test.img: recovering journal
e2fsck/e2fsck.static: Bad magic number in super-block while trying to re-open test.img

test.img: ********** WARNING: Filesystem still has errors **********

and looking for a backup superblock doesn't work; replaying the journal seems to wipe them all out.

What the heck happened to this filesystem? :)  (mounting -o norecovery,ro yields a filesystem with no files in it)

Are you fuzz-testing here?
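
For anyone who wants to see the guard from the diff in isolation, here is a minimal sketch. It does not use the real e2fsprogs headers: the mock_* structures, MOCK_INCOMPAT_MMP and both functions are hypothetical stand-ins that only mirror the shape of fatal_error() and ext2fs_mmp_stop() as they appear in the backtrace and patch above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for the libext2fs structures.
 * The real ones have many more fields; only what the guard needs is kept. */
struct mock_super { unsigned int s_feature_incompat; };
struct mock_io { int magic; };
struct mock_fs {
	struct mock_io *io;
	struct mock_super *super;	/* NULL when the re-open failed */
};

#define MOCK_INCOMPAT_MMP 0x0100u	/* illustrative feature flag */

/* Mirrors the shape of ext2fs_mmp_stop(): it reads fs->super without
 * checking it, so the caller must guarantee that it is non-NULL. */
static int mock_mmp_stop(struct mock_fs *fs)
{
	assert(fs != NULL && fs->super != NULL);
	if (!(fs->super->s_feature_incompat & MOCK_INCOMPAT_MMP))
		return 0;	/* MMP not enabled, nothing to stop */
	return 1;		/* MMP would be stopped here */
}

/* Mirrors the patched branch in fatal_error(): only touch MMP when both
 * the I/O channel and the superblock exist.
 * Returns 1 if the MMP cleanup path was taken, 0 if it was skipped. */
static int mock_fatal_cleanup(struct mock_fs *fs)
{
	if (!fs)
		return 0;
	if (fs->io && fs->super) {
		mock_mmp_stop(fs);
		return 1;
	}
	return 0;
}
```

With fs->io set and fs->super NULL (the state gdb shows in comment 3), mock_fatal_cleanup() skips the MMP call instead of dereferencing a null pointer. Without the "&& fs->super" part the call would be made and the unchecked read would fault, as in the original trace.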
Comment 5 Eric Sandeen 2013-08-16 13:12:26 EDT
patch sent upstream:

http://marc.info/?l=linux-ext4&m=137667276009490&w=2
Comment 6 Hubert Kario 2013-08-16 13:17:15 EDT
(In reply to Eric Sandeen from comment #4)
> 
> What the heck happened to this filesystem? :)  (mounting -o norecovery,ro
> yields a filesystem with no files in it)
> 
> Are you fuzz-testing here?

More or less. I'm working on a file system checker that records all the writes that go to the file system and then replays them one by one (or sector by sector), possibly with errors. In the end I want to have a tool that can simulate any imaginable HDD (or SSD) failure mode.

In this specific case, it was truncating all writes that go to the file system to 256 bytes, so "a bit" of information was lost.

So the answer to "What the heck happened to this filesystem?" would be:
It fell into a Bl**Tec blender :)
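
To make the failure mode concrete, the truncating replay could look roughly like the sketch below. This is not the actual tool: struct recorded_write, replay_truncated() and TRUNCATE_LIMIT are hypothetical names, and it assumes that the part of each write beyond the limit is simply never applied, so the image keeps whatever was at those offsets before.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TRUNCATE_LIMIT 256	/* bytes kept per write, as in this report */

/* Hypothetical record of one write captured while the file system ran. */
struct recorded_write {
	size_t offset;			/* byte offset into the image */
	size_t len;			/* length of the original write */
	const unsigned char *data;	/* payload of the original write */
};

/* Replays the recorded writes into the image in order, keeping at most
 * `limit` bytes of each one. Assumption: the rest of each write is
 * dropped and the old contents stay in place.
 * Returns the total number of bytes actually written. */
static size_t replay_truncated(unsigned char *image, size_t image_size,
			       const struct recorded_write *writes,
			       size_t count, size_t limit)
{
	size_t applied = 0;

	assert(image != NULL);
	for (size_t i = 0; i < count; i++) {
		const struct recorded_write *w = &writes[i];
		size_t n;

		if (w->offset >= image_size)
			continue;	/* write starts past the end of the image */
		n = w->len < limit ? w->len : limit;
		if (n > image_size - w->offset)
			n = image_size - w->offset;	/* clamp to image end */
		memcpy(image + w->offset, w->data, n);
		applied += n;
	}
	return applied;
}
```

Under that assumption, any write larger than the limit (a full file system block, for example) lands only partially, which would fit the journal replay leaving no usable superblock behind.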
Comment 7 Eric Sandeen 2013-08-16 13:19:00 EDT
Oof.
Comment 8 Fedora End Of Life 2013-09-16 12:34:46 EDT
This bug appears to have been reported against 'rawhide' during the Fedora 20 development cycle.
Changing version to '20'.

More information and reason for this action is here:
https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora20
Comment 9 Eric Sandeen 2014-02-18 15:02:27 EST
Fixed in e2fsprogs-1.42.9
