Bug 868233

Summary:            [xfs/md] NULL pointer dereference - xfs_alloc_ioend_bio
Product:            Red Hat Enterprise Linux 6
Component:          kernel
Version:            6.4
Hardware:           x86_64
OS:                 Linux
Status:             CLOSED ERRATA
Severity:           high
Priority:           unspecified
Keywords:           Regression
Reporter:           Boris Ranto <branto>
Assignee:           Eric Sandeen <esandeen>
QA Contact:         Boris Ranto <branto>
CC:                 bfoster, dchinner, esandeen, jcpunk, Jes.Sorensen, rwheeler
Target Milestone:   rc
Fixed In Version:   kernel-2.6.32-337.el6
Doc Type:           Bug Fix
Type:               Bug
Last Closed:        2013-02-21 06:51:38 UTC

Description Boris Ranto 2012-10-19 10:19:53 UTC
Description of problem:
The kernel panics in xfs_alloc_ioend_bio when the test uses a write-transient (faulty) md device for storage and runs a stress test on it.

Version-Release number of selected component (if applicable):
kernel-2.6.32-309.el6
kernel-2.6.32-328.el6

How reproducible:
Always

Steps to Reproduce:
1. Create a lot of small files on a write-transient md device

  
Actual results:
NULL pointer dereference at xfs_alloc_ioend_bio

Expected results:
No panic.

Additional info:
This is actually a regression; I could not hit this in kernel-2.6.32-279.el6.

The call trace:

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffffa0514af3>] xfs_alloc_ioend_bio+0x33/0x50 [xfs]
PGD 10481b067 PUD 21520d067 PMD 0 
Oops: 0002 [#1] SMP 
last sysfs file: /sys/devices/virtual/block/md5/md/metadata_version
CPU 2 
Modules linked in: faulty ext2 xfs exportfs autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ipv6 microcode i2c_i801 sg iTCO_wdt iTCO_vendor_support shpchp e1000e snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc xhci_hcd ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif firewire_ohci firewire_core crc_itu_t ahci nouveau ttm drm_kms_helper drm i2c_algo_bit i2c_core mxm_wmi video output wmi dm_multipath dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_debug]

Pid: 4362, comm: faulty Not tainted 2.6.32-328.el6.x86_64 #1                  /DX79SI
RIP: 0010:[<ffffffffa0514af3>]  [<ffffffffa0514af3>] xfs_alloc_ioend_bio+0x33/0x50 [xfs]
RSP: 0018:ffff88021710fab8  EFLAGS: 00010206
RAX: 0000000000000000 RBX: ffff880216c172d0 RCX: ffff88021a5804c0
RDX: 0000000000000060 RSI: ffff880215254f00 RDI: 0000000000000286
RBP: ffff88021710fac8 R08: 0000000000001244 R09: 0000000100001244
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff880217e877b8 R14: ffff88021710fd98 R15: 0000000000000000
FS:  00007f5e4e074700(0000) GS:ffff880028240000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000219161000 CR4: 00000000000406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process faulty (pid: 4362, threadinfo ffff88021710e000, task ffff880214b0eae0)
Stack:
 ffff880213c41ea0 ffff880216c172d0 ffff88021710fb18 ffffffffa0514c4e
<d> ffff88021710fb18 ffffffffa0513530 ffff880217e877b8 ffff880216c172d0
<d> ffffea0007232bf8 0000000000000000 ffff880217e877b8 0000000000000001
Call Trace:
 [<ffffffffa0514c4e>] xfs_submit_ioend+0xfe/0x110 [xfs]
 [<ffffffffa0513530>] ? xfs_setfilesize_trans_alloc+0x50/0xb0 [xfs]
 [<ffffffffa0515551>] xfs_vm_writepage+0x321/0x5b0 [xfs]
 [<ffffffff8112cf57>] __writepage+0x17/0x40
 [<ffffffff8112e2d9>] write_cache_pages+0x1c9/0x4b0
 [<ffffffff8112cf40>] ? __writepage+0x0/0x40
 [<ffffffff8112e5e4>] generic_writepages+0x24/0x30
 [<ffffffffa051479d>] xfs_vm_writepages+0x5d/0x80 [xfs]
 [<ffffffff8112e611>] do_writepages+0x21/0x40
 [<ffffffff8111a8fb>] __filemap_fdatawrite_range+0x5b/0x60
 [<ffffffff8111aec3>] filemap_fdatawrite_range+0x13/0x20
 [<ffffffffa051aa16>] xfs_flush_pages+0x76/0xb0 [xfs]
 [<ffffffffa0511718>] xfs_release+0x1e8/0x250 [xfs]
 [<ffffffffa0519945>] xfs_file_release+0x15/0x20 [xfs]
 [<ffffffff81183575>] __fput+0xf5/0x210
 [<ffffffff811836b5>] fput+0x25/0x30
 [<ffffffff8117eb4d>] filp_close+0x5d/0x90
 [<ffffffff8117ec25>] sys_close+0xa5/0x100
 [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
Code: 08 0f 1f 44 00 00 48 89 fb 48 8b 7f 30 e8 26 5a ca e0 bf 10 00 00 00 89 c6 e8 da 6c ca e0 48 8b 53 20 48 c1 ea 09 48 0f af 53 18 <48> 89 10 48 8b 53 30 48 89 50 10 48 83 c4 08 5b c9 c3 66 66 2e 
RIP  [<ffffffffa0514af3>] xfs_alloc_ioend_bio+0x33/0x50 [xfs]
 RSP <ffff88021710fab8>
CR2: 0000000000000000

Comment 3 Eric Sandeen 2012-10-19 19:59:12 UTC
Hum, -299 seems to be the first kernel that regressed and there are no fs/xfs changes in there (or fs/ changes for that matter).  Looking into it.

Comment 4 Brian Foster 2012-10-19 20:21:23 UTC
I was able to reproduce this readily on a VM so I ran a bisect that narrowed in on the following commit (this does not appear to be XFS related):

commit fb0650c2790fc0198540ac92b2cd86326e6bf9de
Author: Mike Snitzer <snitzer>
Date:   Fri Aug 3 19:24:03 2012 -0400

    [block] do not artificially constrain max_sectors for stacking drivers
    
    Message-id: <1344021906-26216-2-git-send-email-snitzer>
    Patchwork-id: 49302
    O-Subject: [RHEL6.4 PATCH 01/64] block: do not artificially constrain max_sectors for stacking drivers
    Bugzilla: 844968
    RH-Acked-by: Alasdair Kergon <agk>
    
    BZ: 844968

Comment 5 Eric Sandeen 2012-10-19 20:23:25 UTC
Thanks Brian, I landed at the same commit almost simultaneously ;)

Comment 6 Mike Snitzer 2012-10-19 22:10:25 UTC
It looks like MD's faulty.c isn't properly stacking its limits.

The various raid levels use disk_stack_limits, but faulty.c does not.

I'd wager the issue is that faulty.c is just using the limits set by blk_set_stacking_limits (via drivers/md/md.c:md_alloc), which is a bug: blk_set_stacking_limits only establishes a baseline upon which a stacking device should stack its member devices' limits (via one of {blk,bdev,disk}_stack_limits).
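
For context, a sketch of the contract described above, based on the upstream block layer of that era (illustrative, not the exact RHEL6 source): after the backported commit, blk_set_stacking_limits() leaves the queue limits wide open and relies on the stacking driver to narrow them from its member devices; the disk_stack_limits() call in the proposed fix below is that missing second step for faulty.c.

/*
 * Illustrative, abbreviated sketch of blk_set_stacking_limits() as changed
 * by the upstream commit bisected to in comment 4.  The md core calls this
 * from md_alloc(), so every md personality starts from these permissive
 * values and is expected to stack each member rdev's limits on top of them.
 */
void blk_set_stacking_limits(struct queue_limits *lim)
{
	blk_set_default_limits(lim);

	/* Inherit limits from component devices */
	lim->max_segments = USHRT_MAX;
	lim->max_hw_sectors = UINT_MAX;
	lim->max_sectors = UINT_MAX;	/* no longer artificially capped */
}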

Comment 7 Eric Sandeen 2012-10-19 22:24:47 UTC
Thanks Mike, this did it, I'll send it upstream.

index 75c3bfd..a262192 100644
--- a/drivers/md/faulty.c
+++ b/drivers/md/faulty.c
@@ -315,8 +315,11 @@ static int run(struct mddev *mddev)
        }
        conf->nfaults = 0;
 
-       list_for_each_entry(rdev, &mddev->disks, same_set)
+       list_for_each_entry(rdev, &mddev->disks, same_set) {
                conf->rdev = rdev;
+               disk_stack_limits(mddev->gendisk, rdev->bdev,
+                                 rdev->data_offset << 9);
+       }
 
        md_set_array_sectors(mddev, faulty_size(mddev, 0, 0));
        mddev->private = conf;


FWIW, I think xfs may also not be handling a failed bio_alloc caused by the much larger nvecs value.

Comment 8 Eric Sandeen 2012-10-19 23:32:23 UTC
Confirmed, xfs isn't handling an xfs_alloc_ioend_bio() allocation failure with 64k nvecs...
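
For reference, a sketch of roughly what xfs_alloc_ioend_bio() looked like in upstream XFS at the time (illustrative, not the exact RHEL6 source): with faulty.c advertising an effectively unlimited max_sectors, bio_get_nr_vecs() returns a huge nvecs, bio_alloc() cannot allocate that many vecs and returns NULL, and the first store through the NULL bio is the faulting instruction in the oops above.

STATIC struct bio *
xfs_alloc_ioend_bio(
	struct buffer_head	*bh)
{
	int		nvecs = bio_get_nr_vecs(bh->b_bdev);
	struct bio	*bio = bio_alloc(GFP_NOIO, nvecs);

	ASSERT(bio->bi_private == NULL);	/* compiled out on non-debug kernels */
	bio->bi_sector = bh->b_blocknr * (bh->b_size >> 9);	/* NULL deref when bio == NULL */
	bio->bi_bdev = bh->b_bdev;
	return bio;
}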

Comment 9 Eric Sandeen 2012-10-19 23:48:33 UTC
Sent patch upstream:

http://marc.info/?l=linux-raid&m=135069023019874&w=2

Comment 11 RHEL Program Management 2012-10-24 15:12:26 UTC
This request was evaluated by Red Hat Product Management for
inclusion in a Red Hat Enterprise Linux release.  Product
Management has requested further review of this request by
Red Hat Engineering, for potential inclusion in a Red Hat
Enterprise Linux release for currently deployed products.
This request is not yet committed for inclusion in a release.

Comment 12 Jarod Wilson 2012-10-29 18:51:49 UTC
Patch(es) available on kernel-2.6.32-337.el6

Comment 17 errata-xmlrpc 2013-02-21 06:51:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0496.html