Bug 693033

Summary: [xfs] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
Product: Red Hat Enterprise Linux 6 Reporter: Eryu Guan <eguan>
Component: kernelAssignee: Dave Chinner <dchinner>
Status: CLOSED DUPLICATE QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: high Docs Contact:
Priority: unspecified    
Version: 6.1CC: dchinner, rwheeler
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-10-13 03:12:05 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description Flags
xfs bad image generated by fsfuzzer none

Description Eryu Guan 2011-04-02 06:36:54 UTC
Created attachment 489548 [details]
xfs bad image generated by fsfuzzer

Description of problem:
Running fstest (from fsfuzzer) on bad image generated by fsfuzzer got kernel BUG.

Version-Release number of selected component (if applicable):
2.6.32-122.el6

How reproducible:
Everytime

Steps to Reproduce:
1. Compile fstest from fsfuzzer
http://people.redhat.com/sgrubb/files/fsfuzzer-0.7.tar.gz
you should add #include <sys/stat.h> in fstest.c to make it compiled
2. mount -o loop xfs.bad.img /mnt/test
3. /path/to/fstest /mnt/test
4. Wait for kernel BUG
  
Actual results:
Kernel BUG

Expected results:
System not crash

Additional info:
This is not a regression, it also happens on 6.0 GA kernel.

Not every bad image causes this kernel BUG, but they all do generate huge call trace when testing with fstest, is this an issue? When testing on ext4 bad image there is no such call trace.

Filesystem "loop0": corrupt dinode 12357, extent total = -1744830463, nblocks = 11993113.  Unmount and run xfs_repair.
ffff880264289500: 49 4e 81 a4 02 02 00 00 00 00 00 00 00 00 00 00  IN..............
Filesystem "loop0": XFS internal error xfs_iformat(1) at line 332 of file fs/xfs/xfs_inode.c.  Caller 0xffffffffa02afaba

Pid: 7564, comm: fstest Not tainted 2.6.32-122.el6.x86_64 #1
Call Trace:
 [<ffffffffa02a51d3>] ? xfs_error_report+0x43/0x50 [xfs]
 [<ffffffffa02afaba>] ? xfs_iread+0x16a/0x1b0 [xfs]
 [<ffffffffa02a523a>] ? xfs_corruption_error+0x5a/0x70 [xfs]
 [<ffffffffa02af80c>] ? xfs_iformat+0x43c/0x580 [xfs]
 [<ffffffffa02afaba>] ? xfs_iread+0x16a/0x1b0 [xfs]
 [<ffffffffa02afaba>] ? xfs_iread+0x16a/0x1b0 [xfs]
 [<ffffffff8118b01e>] ? inode_init_always+0x11e/0x1c0
 [<ffffffffa02aa3a4>] ? xfs_iget+0x2c4/0x6c0 [xfs]
 [<ffffffffa02a9b11>] ? xfs_iunlock+0x91/0x100 [xfs]
 [<ffffffffa02c7b46>] ? xfs_lookup+0xc6/0x110 [xfs]
 [<ffffffffa02d35c4>] ? xfs_vn_lookup+0x54/0xa0 [xfs]
 [<ffffffff8117ee3b>] ? do_lookup+0x18b/0x220
 [<ffffffff8117f439>] ? __link_path_walk+0x569/0x820
 [<ffffffff8117fdba>] ? path_walk+0x6a/0xe0
 [<ffffffff8117ff8b>] ? do_path_lookup+0x5b/0xa0
 [<ffffffff81180117>] ? user_path_at+0x57/0xa0
 [<ffffffff8117d775>] ? putname+0x35/0x50
 [<ffffffff81180122>] ? user_path_at+0x62/0xa0
 [<ffffffff8117643c>] ? vfs_fstatat+0x3c/0x80
 [<ffffffff81261485>] ? _atomic_dec_and_lock+0x55/0x80
 [<ffffffff81261485>] ? _atomic_dec_and_lock+0x55/0x80
 [<ffffffff811765ab>] ? vfs_stat+0x1b/0x20
 [<ffffffff811765d4>] ? sys_newstat+0x24/0x50
 [<ffffffff810d1872>] ? audit_syscall_entry+0x272/0x2a0
 [<ffffffff8100b172>] ? system_call_fastpath+0x16/0x1b

Comment 2 Eryu Guan 2011-04-02 06:41:52 UTC
Call trace

BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: [<ffffffffa027801e>] xfs_alloc_fix_freelist+0x2e/0x480 [xfs]
PGD 2711c9067 PUD 275ac2067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_map
CPU 0
Modules linked in: xfs exportfs autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table ipv6 dm_mirror dm_region_hash dm_log cdc_ether usbnet mii serio_raw iTCO_wdt iTCO_vendor_support i2c_i801 i2c_core shpchp sg ioatdma dca i7core_edac edac_core bnx2 ext4 mbcache jbd2 sd_mod crc_t10dif pata_acpi ata_generic ata_piix mptsas mptscsih mptbase scsi_transport_sas dm_mod [last unloaded: microcode]

Modules linked in: xfs exportfs autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table ipv6 dm_mirror dm_region_hash dm_log cdc_ether usbnet mii serio_raw iTCO_wdt iTCO_vendor_support i2c_i801 i2c_core shpchp sg ioatdma dca i7core_edac edac_core bnx2 ext4 mbcache jbd2 sd_mod crc_t10dif pata_acpi ata_generic ata_piix mptsas mptscsih mptbase scsi_transport_sas dm_mod [last unloaded: microcode]
Pid: 7279, comm: fstest Tainted: G        W  ----------------   2.6.32-122.el6.x86_64 #1 System x3550 M3 -[7944I21]-
RIP: 0010:[<ffffffffa027801e>]  [<ffffffffa027801e>] xfs_alloc_fix_freelist+0x2e/0x480 [xfs]
RSP: 0018:ffff880473d31b58  EFLAGS: 00010292
RAX: 0000000000000000 RBX: ffff880473d31c38 RCX: 000000000000000c
RDX: ffff880473d001f8 RSI: 0000000000000002 RDI: ffff880473d31c38
RBP: ffff880473d31c28 R08: 0000000000000000 R09: ffff8802730c1d48
R10: 000000000000000a R11: 0000000000000000 R12: 000000000000de3a
R13: 0000000000000002 R14: 0000000000000000 R15: ffff880473d00000
FS:  00007f25c5ecd700(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000010 CR3: 000000027513c000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process fstest (pid: 7279, threadinfo ffff880473d30000, task ffff880475ad8b00)
Stack:
 ffff880274d5fe00 ffff880273dfa0c0 ffff880473d31bd8 0000000000000000
<0> ffff880473d31be8 0000000000000000 ffff880273dfa100 0004125000000010
<0> ffff880273dfa0e0 ffff880273dfa0d0 ffff88026a533d18 0000000000000000
Call Trace:
 [<ffffffffa02b9278>] ? xfs_perag_get+0x48/0xc0 [xfs]
 [<ffffffffa02784f3>] xfs_free_extent+0x83/0xc0 [xfs]
 [<ffffffffa0281f0d>] xfs_bmap_finish+0x15d/0x1a0 [xfs]
 [<ffffffffa02a7d73>] xfs_itruncate_finish+0x183/0x390 [xfs]
 [<ffffffffa02c1622>] xfs_free_eofblocks+0x1d2/0x2a0 [xfs]
 [<ffffffffa02c1c50>] xfs_release+0x110/0x1d0 [xfs]
 [<ffffffffa02cb8c5>] xfs_file_release+0x15/0x20 [xfs]
 [<ffffffff811727b5>] __fput+0xf5/0x210
 [<ffffffff811728f5>] fput+0x25/0x30
 [<ffffffff8116de6d>] filp_close+0x5d/0x90
 [<ffffffff8116df45>] sys_close+0xa5/0x100
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b
Code: e5 41 57 41 56 41 55 41 54 53 48 81 ec a8 00 00 00 0f 1f 44 00 00 48 8b 47 18 4c 8b 7f 08 48 89 fb 41 89 f5 48 89 85 48 ff ff ff <80> 78 10 00 4c 8b 27 0f 84 a5 02 00 00 48 c7 45 c0 00 00 00 00
RIP  [<ffffffffa027801e>] xfs_alloc_fix_freelist+0x2e/0x480 [xfs]
 RSP <ffff880473d31b58>
CR2: 0000000000000010

Comment 4 Dave Chinner 2011-04-03 00:55:22 UTC
(In reply to comment #0)

<only addressing the call trace output in this response>

> Not every bad image causes this kernel BUG, but they all do generate huge call
> trace when testing with fstest, is this an issue? When testing on ext4 bad
> image there is no such call trace.

Isn't it obvious what the traces XFS is producing are?

> Filesystem "loop0": corrupt dinode 12357, extent total = -1744830463, nblocks =
> 11993113.  Unmount and run xfs_repair.

That is, you've corrupted the filesystem and XFS is detecting the corruption and reporting exactly how it came across it. This is what an enterprise filesystem is supposed to do when it encounters corruption. ;)

If ext4 is not telling you everything it knows about the corruptions that are being detected (or even detecting them), then that is a problem with ext4, not XFS.

Comment 5 Dave Chinner 2011-04-03 01:15:57 UTC
(In reply to comment #2)
> Call trace
> 
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
> IP: [<ffffffffa027801e>] xfs_alloc_fix_freelist+0x2e/0x480 [xfs]
.....
>  [<ffffffffa02b9278>] ? xfs_perag_get+0x48/0xc0 [xfs]
>  [<ffffffffa02784f3>] xfs_free_extent+0x83/0xc0 [xfs]
>  [<ffffffffa0281f0d>] xfs_bmap_finish+0x15d/0x1a0 [xfs]
>  [<ffffffffa02a7d73>] xfs_itruncate_finish+0x183/0x390 [xfs]
>  [<ffffffffa02c1622>] xfs_free_eofblocks+0x1d2/0x2a0 [xfs]
>  [<ffffffffa02c1c50>] xfs_release+0x110/0x1d0 [xfs]
>  [<ffffffffa02cb8c5>] xfs_file_release+0x15/0x20 [xfs]
>  [<ffffffff811727b5>] __fput+0xf5/0x210
>  [<ffffffff811728f5>] fput+0x25/0x30
>  [<ffffffff8116de6d>] filp_close+0x5d/0x90
>  [<ffffffff8116df45>] sys_close+0xa5/0x100
>  [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b

This is almost certainly a corrupted block number in an extent tree. On a debug kernel it would have fired an ASSERT because the AG number is greater than the number of AGs in the filesytem. Hence xfs_perag_get() returned a NULL pointer, and the first thing that xfs_alloc_fix_freelist() does is dereference pag->pagf_init which is at offset 0x10 into the perag structure.

Indeed, one of the errors xfs_check outputs:

inode 19353 bad block number 30601643628 [7471104,1644], offset 0

which translates as AG 74771104. Given this is a single AG filesystem, we have a winner. I'll put together a fix for upstream and test it, and go from there.

BTW, when you fuzz a filesystem, you might want to fuzz a filesystem that you can actually run fsck on successfully. i.e. not just test the kernel code, but also exercise the check and repair code to see if it detects and corrects all the errors you introduced. On this filesystem image, xfs_repair tells you straight away that superblock is corrupt and that it cannot repair the fs because there is no secondary superblock:

Phase 1 - find and verify superblock...
bad primary superblock - inconsistent filesystem geometry in realtime filesystem component !!!

attempting to find secondary superblock...
.Sorry, could not find valid secondary superblock
Exiting now.

Basically, a single AG filesystem in XFS is a big no-no because it has no redundant superblocks to enable recovery. You should be testing fuzzing with a 32MB, 2AG filesystem at minimum.

Cheers,

Dave.

Comment 6 RHEL Program Management 2011-04-04 02:50:03 UTC
Since RHEL 6.1 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 7 Eryu Guan 2011-04-07 08:08:54 UTC
(In reply to comment #5)
> 
> Basically, a single AG filesystem in XFS is a big no-no because it has no
> redundant superblocks to enable recovery. You should be testing fuzzing with a
> 32MB, 2AG filesystem at minimum.

I'll update the script, thanks!
> 
> Cheers,
> 
> Dave.

Comment 8 Dave Chinner 2011-09-19 05:26:59 UTC
This is fixed in RHEL 6.2 by this commit:

commit be1c42e4920e32f877503a96c07a8a93c84ae53b
Author: Dave Chinner <dchinner>
Date:   Tue Aug 9 22:55:16 2011 -0400

    [fs] xfs: catch bad block numbers freeing extents.
    
    Message-id: <1312930551-29224-327-git-send-email-dchinner>
    Patchwork-id: 39900
    O-Subject: [REHL6.2 PATCH 326/361] xfs: catch bad block numbers freeing extents.
    Bugzilla: 695827
    RH-Acked-by: Eric Sandeen <sandeen>
    RH-Acked-by: Christoph Hellwig <chellwig>
    
    RH BZ: 695827 - XFS update to 2.6.39 code base for delayed logging
    Upstream Commit: be65b18a10e62321c5ba09a1dc0f70babeb0eba1
    
    Upstream commit message:
    
    A fuzzed filesystem crashed a kernel when freeing an extent with a
    block number beyond the end of the filesystem. Convert all the debug
    asserts in xfs_free_extent() to active checks so that we catch bad
    extents and return that the filesytsem is corrupted rather than
    crashing.
    
    Signed-off-by: Dave Chinner <dchinner>
    Reviewed-by: Christoph Hellwig <hch>
    Reviewed-by: Alex Elder <aelder>
    Signed-off-by: Kyle McMartin <kmcmarti>

What's the next state the BZ needs to go to so that QA can verify it is fixed?

Comment 9 RHEL Program Management 2011-10-07 15:28:43 UTC
Since RHEL 6.2 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 10 Eryu Guan 2011-10-13 03:12:05 UTC
Mark it as a dup of the big update bug 695827.
I've verified this bug has been fixed.

*** This bug has been marked as a duplicate of bug 695827 ***