Bug 997982
Summary: | fsck crash with corrupted file system | |
---|---|---|---
Product: | Red Hat Enterprise Linux 7 | Reporter: | Hubert Kario <hkario>
Component: | e2fsprogs | Assignee: | Lukáš Czerner <lczerner>
Status: | CLOSED CURRENTRELEASE | QA Contact: | Eryu Guan <eguan>
Severity: | high | Docs Contact: |
Priority: | unspecified | |
Version: | 7.0 | CC: | eguan, esandeen, josef, kzak, lczerner, oliver, rwheeler, tthakur
Target Milestone: | rc | Keywords: | Regression, Reopened
Target Release: | --- | |
Hardware: | All | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | e2fsprogs-1.42.9-4.el7 | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | 997972 | Environment: |
Last Closed: | 2014-06-13 09:19:30 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 997972 | |
Bug Blocks: | | |
Attachments: | | |
Description
Hubert Kario
2013-08-16 16:49:56 UTC
Is this from fuzz testing?

commit 7ff040f30f0ff3bf5e2c832da3cb577e00a52d60
Author: Eric Sandeen <sandeen>
Date: Mon Sep 9 10:33:20 2013 -0400

e2fsck: don't try to stop mmp if there is no superblock set up

Under some failure cases, we can get to fatal_error() without even having a superblock set up. In that case, ext2fs_mmp_stop() will segfault when it tries to dereference fs->super.

Check for the existence of a superblock before we go down the ext2fs_mmp_stop() path to avoid this problem.

Reported-by: Hubert Kario <hkario>
Addresses-Red-Hat-Bugzilla: #997972
Signed-off-by: Eric Sandeen <sandeen>
Signed-off-by: "Theodore Ts'o" <tytso>

Ho hum, I guess it doesn't fix it after all.

Actually, as of that commit, it does not crash. Something after that seems to have broken it again. Fantastico!

No, wait ;) For me it does work in 1.42.9 as well as git upstream. Hubert, can you re-test w/ latest e2fsprogs-v1.42.9 in RHEL7?

Thanks,
-Eric

I can confirm that e2fsprogs-1.42.9-3.el7.x86_64 doesn't crash with this fs image.

I disagree, it is still reproducible for me.

# cp disk.img.00000000000000000006.back disk.img.00000000000000000006
# fsck.ext4 disk.img.00000000000000000006
# fsck.ext4 disk.img.00000000000000000006
e2fsck 1.42.9 (28-Dec-2013)
disk.img.00000000000000000006: recovering journal
fsck.ext4: Bad magic number in super-block while trying to re-open disk.img.00000000000000000006
Signal (11) SIGSEGV si_code=SEGV_MAPERR fault addr=0x7f9000000005
fsck.ext4[0x4275c1]
/lib64/libc.so.6(+0x35a00)[0x7f9081598a00]
fsck.ext4(fatal_error+0x50)[0x41e410]
fsck.ext4(e2fsck_run_ext3_journal+0x2d3)[0x41d973]
fsck.ext4(main+0x6ef)[0x409b8f]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f9081584af5]
fsck.ext4[0x40bec9]

yum info e2fsprogs
Loaded plugins: auto-update-debuginfo, product-id, subscription-manager
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
Installed Packages
Name        : e2fsprogs
Arch        : x86_64
Version     : 1.42.9
Release     : 3.el7
Size        : 2.4 M
Repo        : installed
From repo   : beaker-Server
Summary     : Utilities for managing ext2, ext3, and ext4 filesystems
URL         : http://e2fsprogs.sourceforge.net/
License     : GPLv2
Description : The e2fsprogs package contains a number of utilities for creating,
            : checking, modifying, and correcting any inconsistencies in second,
            : third and fourth extended (ext2/ext3/ext4) filesystems. E2fsprogs
            : contains e2fsck (used to repair filesystem inconsistencies after an
            : unclean shutdown), mke2fs (used to initialize a partition to contain
            : an empty ext2 filesystem), debugfs (used to examine the internal
            : structure of a filesystem, to manually repair a corrupted
            : filesystem, or to create test cases for e2fsck), tune2fs (used to
            : modify filesystem parameters), and most of the other core ext2fs
            : filesystem utilities.
            :
            : You should install the e2fsprogs package if you need to manage the
            : performance of an ext2, ext3, or ext4 filesystem.

Moreover, this is not really fixed upstream. It is just a coincidence that the crash does not show up upstream; the real problem is still present in both RHEL7 and upstream.

The real problem is that while ext2fs_free() will actually free the ext2_filsys structure, the caller will still have the pointer set (ctx->fs), which may result in a null pointer dereference while accessing (ctx->fs->io->magic), because the io structure has been freed properly and its pointer has been set to NULL.

In e2fsprogs we're not using ext2fs_free() in some places (in other places we do). Since the pointer to fs is not set to NULL, some places are setting it to NULL manually. This, however, is in contrast with ext2fs_free_mem(), which will set the pointer to NULL for you (a pointer to the pointer is expected). This probably confused people, so the best fix would be for ext2fs_free() to take a pointer to the pointer as well and set the pointer to NULL for us.
I am testing the fix right now; it should fix the issue for RHEL7 as well as upstream (even though we cannot reproduce this particular case). However, I am not sure whether we want to get it into RHEL7, since we would have to push it before it is pulled upstream.

Thanks!
-Lukas

Interesting, in my case the output looks like this:

[root@rhel7-64 tmp]# gunzip disk.img.00000000000000000006.gz
[root@rhel7-64 tmp]# cp disk.img.00000000000000000006{,.new}
[root@rhel7-64 tmp]# e2fsck -f disk.img.00000000000000000006.new
e2fsck 1.42.9 (28-Dec-2013)
disk.img.00000000000000000006.new: recovering journal
e2fsck: Bad magic number in super-block while trying to re-open disk.img.00000000000000000006.new

disk.img.00000000000000000006.new: ********** WARNING: Filesystem still has errors **********

[root@rhel7-64 tmp]# echo $?
12

Created attachment 865039
Patch to fix the real problem

Here is a patch to fix the problem mentioned in the comment above. It seems to work fairly well, so I'll test it some more and send it to the list. Then we can think about porting it back to RHEL7 if there is still time to do so.
Hubert, you might be just lucky enough that one of the pointers was overwritten by 0. When it comes to referencing freed memory you can never know what to find there. It's 100% reliably reproducible for me.

-Lukas

Ahh yes, under valgrind I can get it to crash:

# valgrind --free-fill=c0 e2fsck -f -p disk.img.00000000000000000006.new
==12702== Memcheck, a memory error detector
==12702== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==12702== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==12702== Command: e2fsck -f -p disk.img.00000000000000000006.new
==12702==
==12702== Warning: noted but unhandled ioctl 0x127c with no size/direction hints
==12702==    This could cause spurious value errors to appear.
==12702==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
disk.img.00000000000000000006.new: recovering journal
==12702== Warning: noted but unhandled ioctl 0x127c with no size/direction hints
==12702==    This could cause spurious value errors to appear.
==12702==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
e2fsck: Bad magic number in super-block while trying to re-open disk.img.00000000000000000006.new
==12702== Invalid read of size 8
==12702==    at 0x41E3F3: fatal_error (in /usr/sbin/e2fsck)
==12702==    by 0x41D972: e2fsck_run_ext3_journal (in /usr/sbin/e2fsck)
==12702==    by 0x409B8E: main (in /usr/sbin/e2fsck)
==12702==  Address 0x5ea5298 is 8 bytes inside a block of size 296 free'd
==12702==    at 0x4C29577: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==12702==    by 0x41D797: e2fsck_run_ext3_journal (in /usr/sbin/e2fsck)
==12702==    by 0x409B8E: main (in /usr/sbin/e2fsck)
==12702==
==12702== Invalid read of size 8
==12702==    at 0x41E3FA: fatal_error (in /usr/sbin/e2fsck)
==12702==    by 0x41D972: e2fsck_run_ext3_journal (in /usr/sbin/e2fsck)
==12702==    by 0x409B8E: main (in /usr/sbin/e2fsck)
==12702==  Address 0x5ea52b0 is 32 bytes inside a block of size 296 free'd
==12702==    at 0x4C29577: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==12702==    by 0x41D797: e2fsck_run_ext3_journal (in /usr/sbin/e2fsck)
==12702==    by 0x409B8E: main (in /usr/sbin/e2fsck)
==12702==
==12702== Invalid read of size 8
==12702==    at 0x4E56906: ext2fs_mmp_stop (in /usr/lib64/libext2fs.so.2.4)
==12702==    by 0x41E408: fatal_error (in /usr/sbin/e2fsck)
==12702==    by 0x41D972: e2fsck_run_ext3_journal (in /usr/sbin/e2fsck)
==12702==    by 0x409B8E: main (in /usr/sbin/e2fsck)
==12702==  Address 0x5ea52b0 is 32 bytes inside a block of size 296 free'd
==12702==    at 0x4C29577: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==12702==    by 0x41D797: e2fsck_run_ext3_journal (in /usr/sbin/e2fsck)
==12702==    by 0x409B8E: main (in /usr/sbin/e2fsck)
==12702==
==12702== Invalid read of size 1
==12702==    at 0x4E5690D: ext2fs_mmp_stop (in /usr/lib64/libext2fs.so.2.4)
==12702==    by 0x41E408: fatal_error (in /usr/sbin/e2fsck)
==12702==    by 0x41D972: e2fsck_run_ext3_journal (in /usr/sbin/e2fsck)
==12702==    by 0x409B8E: main (in /usr/sbin/e2fsck)
==12702==  Address 0xc0c0c0c0c0c0c121 is not stack'd, malloc'd or (recently) free'd
==12702==
Signal (11) SIGSEGV si_code=SI_KERNEL fault addr=(nil)
e2fsck[0x4275c1]
/lib64/libc.so.6(+0x35a00)[0x58f8a00]
/lib64/libext2fs.so.2(ext2fs_mmp_stop+0xd)[0x4e5690d]
e2fsck(fatal_error+0x49)[0x41e409]
e2fsck(e2fsck_run_ext3_journal+0x2d3)[0x41d973]
e2fsck(main+0x6ef)[0x409b8f]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x58e4af5]
e2fsck[0x40bec9]
==12702==
==12702== HEAP SUMMARY:
==12702==     in use at exit: 2,628 bytes in 53 blocks
==12702==   total heap usage: 295 allocs, 242 frees, 166,211 bytes allocated
==12702==
==12702== LEAK SUMMARY:
==12702==    definitely lost: 0 bytes in 0 blocks
==12702==    indirectly lost: 0 bytes in 0 blocks
==12702==      possibly lost: 0 bytes in 0 blocks
==12702==    still reachable: 2,628 bytes in 53 blocks
==12702==         suppressed: 0 bytes in 0 blocks
==12702== Rerun with --leak-check=full to see details of leaked memory
==12702==
==12702== For counts of detected and suppressed errors, rerun with: -v
==12702== ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 3 from 3)

Ok, thanks Lukas, and sorry for the premature closing - just trying to get some bugs behind us. ;) Thanks for looking into this. Sounds like you have a patch for this and we need to get blocker status set? Thanks!

I cannot reproduce the crash either if I run e2fsck test.img directly, but following comment 13 I can hit the crash with e2fsprogs-1.42.9-3.el7

[root@hp-dl388eg8-01 ~]# valgrind --free-fill=c0 e2fsck test.img
==1801== Memcheck, a memory error detector
==1801== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==1801== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==1801== Command: e2fsck test.img
==1801==
e2fsck 1.42.9 (28-Dec-2013)
==1801== Warning: noted but unhandled ioctl 0x127c with no size/direction hints
==1801==    This could cause spurious value errors to appear.
==1801==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
test.img: recovering journal
==1801== Warning: noted but unhandled ioctl 0x127c with no size/direction hints
==1801==    This could cause spurious value errors to appear.
==1801==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
e2fsck: Bad magic number in super-block while trying to re-open test.img
==1801== Invalid read of size 8
==1801==    at 0x41E3F3: fatal_error (in /usr/sbin/e2fsck)
==1801==    by 0x41D972: e2fsck_run_ext3_journal (in /usr/sbin/e2fsck)
==1801==    by 0x409B8E: main (in /usr/sbin/e2fsck)
==1801==  Address 0x5ea5d48 is 8 bytes inside a block of size 296 free'd
==1801==    at 0x4C29577: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==1801==    by 0x41D797: e2fsck_run_ext3_journal (in /usr/sbin/e2fsck)
==1801==    by 0x409B8E: main (in /usr/sbin/e2fsck)
==1801==
==1801== Invalid read of size 8
==1801==    at 0x41E3FA: fatal_error (in /usr/sbin/e2fsck)
==1801==    by 0x41D972: e2fsck_run_ext3_journal (in /usr/sbin/e2fsck)
==1801==    by 0x409B8E: main (in /usr/sbin/e2fsck)
==1801==  Address 0x5ea5d60 is 32 bytes inside a block of size 296 free'd
==1801==    at 0x4C29577: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==1801==    by 0x41D797: e2fsck_run_ext3_journal (in /usr/sbin/e2fsck)
==1801==    by 0x409B8E: main (in /usr/sbin/e2fsck)
==1801==
==1801== Invalid read of size 8
==1801==    at 0x4E56906: ext2fs_mmp_stop (in /usr/lib64/libext2fs.so.2.4)
==1801==    by 0x41E408: fatal_error (in /usr/sbin/e2fsck)
==1801==    by 0x41D972: e2fsck_run_ext3_journal (in /usr/sbin/e2fsck)
==1801==    by 0x409B8E: main (in /usr/sbin/e2fsck)
==1801==  Address 0x5ea5d60 is 32 bytes inside a block of size 296 free'd
==1801==    at 0x4C29577: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==1801==    by 0x41D797: e2fsck_run_ext3_journal (in /usr/sbin/e2fsck)
==1801==    by 0x409B8E: main (in /usr/sbin/e2fsck)
==1801==
==1801== Invalid read of size 1
==1801==    at 0x4E5690D: ext2fs_mmp_stop (in /usr/lib64/libext2fs.so.2.4)
==1801==    by 0x41E408: fatal_error (in /usr/sbin/e2fsck)
==1801==    by 0x41D972: e2fsck_run_ext3_journal (in /usr/sbin/e2fsck)
==1801==    by 0x409B8E: main (in /usr/sbin/e2fsck)
==1801==  Address 0xc0c0c0c0c0c0c121 is not stack'd, malloc'd or (recently) free'd
==1801==
Signal (11) SIGSEGV si_code=SI_KERNEL fault addr=(nil)
e2fsck[0x4275c1]
/lib64/libc.so.6(+0x35a00)[0x58f8a00]
/lib64/libext2fs.so.2(ext2fs_mmp_stop+0xd)[0x4e5690d]
e2fsck(fatal_error+0x49)[0x41e409]
e2fsck(e2fsck_run_ext3_journal+0x2d3)[0x41d973]
e2fsck(main+0x6ef)[0x409b8f]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x58e4af5]
e2fsck[0x40bec9]
==1801==
==1801== HEAP SUMMARY:
==1801==     in use at exit: 3,430 bytes in 77 blocks
==1801==   total heap usage: 333 allocs, 256 frees, 167,418 bytes allocated
==1801==
==1801== LEAK SUMMARY:
==1801==    definitely lost: 0 bytes in 0 blocks
==1801==    indirectly lost: 0 bytes in 0 blocks
==1801==      possibly lost: 0 bytes in 0 blocks
==1801==    still reachable: 3,430 bytes in 77 blocks
==1801==         suppressed: 0 bytes in 0 blocks
==1801== Rerun with --leak-check=full to see details of leaked memory
==1801==
==1801== For counts of detected and suppressed errors, rerun with: -v
==1801== ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 3 from 3)

With e2fsprogs-1.42.9-4.el7 I cannot hit the crash.

Set to VERIFIED.

This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.