Bug 1061339

Summary:

NULL pointer dereference when TRIM is issued on MD device

Product:

[Fedora] Fedora

Reporter:

Richard W.M. Jones <rjones>

Component:

kernel

Assignee:

Jes Sorensen <Jes.Sorensen>

Status:

CLOSED RAWHIDE

QA Contact:

Fedora Extras Quality Assurance <extras-qa>

Severity:

unspecified

Docs Contact:

Priority:

unspecified

Version:

rawhide

CC:

esandeen, gansalmon, itamar, jonathan, josef, kernel-maint, kzak, madhu.chinakonda, msnitzer, oliver

Target Milestone:

---

Target Release:

---

Hardware:

Unspecified

OS:

Unspecified

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2014-02-14 19:11:11 UTC

Type:

Bug

Attachments:

Description	Flags
log file	none
log file (md only case)	none

Description Richard W.M. Jones 2014-02-04 16:05:31 UTC

Description of problem:

mke2fs -t ext2 -F -b 4096 /dev/VG/LV1
mke2fs 1.42.9 (28-Dec-2013)
[   44.142483] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
[   44.142483] IP: [<ffffffff8122040a>] bio_trim+0x1a/0x40
[   44.142483] PGD 1d193067 PUD 1d1c1067 PMD 0 
[   44.142483] Oops: 0000 [#1] SMP 
[   44.142483] Modules linked in: raid1 kvm_amd snd_pcsp snd_pcm kvm snd_timer snd soundcore serio_raw ata_generic pata_acpi virtio_balloon virtio_pci virtio_mmio virtio_net virtio_scsi virtio_blk virtio_console virtio_rng virtio_ring virtio ideapad_laptop sparse_keymap rfkill sym53c8xx scsi_transport_spi crc8 crc_ccitt crc32 crc_itu_t libcrc32c megaraid megaraid_sas megaraid_mbox megaraid_mm
[   44.142483] CPU: 0 PID: 229 Comm: mke2fs Tainted: G        W    3.14.0-0.rc1.git0.1.fc21.x86_64 #1
[   44.142483] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[   44.142483] task: ffff88001c100000 ti: ffff88001c0e4000 task.ti: ffff88001c0e4000
[   44.142483] RIP: 0010:[<ffffffff8122040a>]  [<ffffffff8122040a>] bio_trim+0x1a/0x40
[   44.142483] RSP: 0018:ffff88001c0e5b88  EFLAGS: 00000246
[   44.142483] RAX: ffff88001d13f020 RBX: 0000000000000000 RCX: 000000000000b690
[   44.142483] RDX: 0000000000008000 RSI: 0000000000000000 RDI: 0000000000000000
[   44.142483] RBP: ffff88001c0e5b98 R08: 00000000000174a0 R09: ffff88001f0174a0
[   44.142483] R10: 0000000000000000 R11: ffffea0000744fc0 R12: 0000000001000000
[   44.142483] R13: 0000000000000000 R14: ffff88001c0bfe80 R15: ffff88001d16df00
[   44.142483] FS:  00007fe89c7817c0(0000) GS:ffff88001f000000(0000) knlGS:0000000000000000
[   44.142483] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   44.142483] CR2: 0000000000000028 CR3: 000000001c0e7000 CR4: 00000000000006f0
[   44.142483] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   44.142483] DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000
[   44.142483] Stack:
[   44.142483]  0000000000000001 0000000000000000 ffff88001c0e5c80 ffffffffa01923f3
[   44.142483]  ffff88001c0e5c50 ffffc90000125040 0000000000008000 ffff88001d16df60
[   44.142483]  0000000000003000 ffff88001c0e5c18 ffffffff00008000 0000000000000001
[   44.142483] Call Trace:
[   44.142483]  [<ffffffffa01923f3>] make_request+0x4c3/0xcd0 [raid1]
[   44.142483]  [<ffffffff810c8ec6>] ? check_preempt_wakeup+0x166/0x250
[   44.142483]  [<ffffffff81555e85>] md_make_request+0xe5/0x230
[   44.142483]  [<ffffffff81326c20>] generic_make_request+0xe0/0x130
[   44.142483]  [<ffffffff81326ce8>] submit_bio+0x78/0x160
[   44.142483]  [<ffffffff81220bfe>] ? bio_alloc_bioset+0x1ce/0x2f0
[   44.142483]  [<ffffffff811fcc73>] ? pollwake+0x73/0x90
[   44.142483]  [<ffffffff8133243b>] blkdev_issue_discard+0x1fb/0x2c0
[   44.142483]  [<ffffffff81336da5>] blkdev_ioctl+0x635/0x7d0
[   44.142483]  [<ffffffff811e83a7>] ? do_sync_write+0x67/0xa0
[   44.142483]  [<ffffffff81222d11>] block_ioctl+0x41/0x50
[   44.142483]  [<ffffffff811fbf90>] do_vfs_ioctl+0x2e0/0x4a0
[   44.142483]  [<ffffffff811fc1f1>] SyS_ioctl+0xa1/0xc0
[   44.142483]  [<ffffffff816fbbe9>] system_call_fastpath+0x16/0x1b
[   44.142483] Code: 01 e9 75 ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 41 54 41 89 d4 41 c1 e4 09 85 f6 53 48 89 fb 75 06 <44> 3b 67 28 74 14 3e 80 63 10 f7 c1 e6 09 48 89 df e8 f0 fe ff 
[   44.142483] RIP  [<ffffffff8122040a>] bio_trim+0x1a/0x40
[   44.142483]  RSP <ffff88001c0e5b88>
[   44.142483] CR2: 0000000000000028
[   44.144483] ---[ end trace f318ded04f590341 ]---
guestfsd: error: ext2: /dev/VG/LV1: mke2fs 1.42.9 (28-Dec-2013)
libguestfs: trace: mkfs = -1 (error)
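
Reading the oops above: CR2 is 0x28 and RDI (the first argument register on x86_64) is zero, and the marked instruction bytes <44> 3b 67 28 decode to a compare against memory at RDI+0x28. That is consistent with bio_trim() being called with a NULL bio and faulting on a field 0x28 bytes into it. A minimal shell sketch for pulling those two values out of a saved oops follows; the helper names oops_fault_addr and oops_reg are made up for illustration and are not part of any existing tool.

```shell
# Hypothetical helpers for reading an x86_64 oops from stdin.

# Print the faulting address from the first "CR2: <16 hex digits>" field.
oops_fault_addr() {
    sed -n 's/.*CR2: \([0-9a-f]\{16\}\).*/\1/p' | head -n 1
}

# Print the value of one general-purpose register, e.g. "oops_reg RDI".
# Only matches registers printed as a plain 16-digit hex value.
oops_reg() {
    sed -n "s/.* $1: \([0-9a-f]\{16\}\).*/\1/p" | head -n 1
}
```

For the trace above, oops_fault_addr prints 0000000000000028 and oops_reg RDI prints 0000000000000000. RSP is printed with a segment prefix in this format, so oops_reg RSP prints nothing.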

Version-Release number of selected component (if applicable):

kernel 0:3.14.0-0.rc1.git0.1.fc21
e2fsprogs-1.42.9-2.fc21.x86_64

How reproducible:

Unknown, at least once.

Steps to Reproduce:
1. Run the libguestfs test suite in Rawhide.

Additional info:

http://kojipkgs.fedoraproject.org//work/tasks/9085/6489085/build.log
http://kojipkgs.fedoraproject.org//work/tasks/9085/6489085/root.log

Comment 1 Richard W.M. Jones 2014-02-04 16:10:04 UTC

I should note:

This is running under virtualization.  I don't have an easy means
to test this on baremetal, so don't ask me to do that.

The backing disk is virtio-scsi.

It was all working fine about 2 weeks ago.

Comment 2 Eric Sandeen 2014-02-04 17:07:05 UTC

Heh, userspace doing I/O should never cause a kernel bug.  This is a kernel bug, not e2fsprogs.

Looks like possibly a problem in dm discard handling.
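
The call trace goes through blkdev_ioctl and blkdev_issue_discard, which fits mke2fs discarding the device before writing the filesystem. Whether a block device advertises discard can be read from sysfs: a non-zero queue/discard_max_bytes means discards are accepted. A minimal sketch, assuming the kernel name of the device is known (for example md127 rather than /dev/md/test); supports_discard and the SYSFS_BLOCK override are illustrative names, not an existing tool.

```shell
# Succeed if the named block device advertises discard support.
# SYSFS_BLOCK is a hypothetical override so the check can be pointed
# at a directory other than /sys/block.
supports_discard() {
    f="${SYSFS_BLOCK:-/sys/block}/$1/queue/discard_max_bytes"
    [ -r "$f" ] && [ "$(cat "$f")" -gt 0 ]
}

# Example:
#   if supports_discard md127; then echo "md127 accepts discards"; fi
```

If the device does advertise discard, running mke2fs with -E nodiscard would be one way to check whether the discard is what triggers the oops; this report does not say whether that was tried.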

Comment 3 Richard W.M. Jones 2014-02-05 12:22:22 UTC

I was pointed to this patch, and tested it, but it did *NOT*
fix this bug.
https://lkml.org/lkml/2014/2/4/107

Comment 4 Josh Boyer 2014-02-07 14:37:41 UTC

Can you recreate this with no previous kernel oopses/warnings present?  Likely so, but we'd like to make sure something else didn't mess up kernel memory, since your oops already has the 'W' taint set.
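
The 'W' in the Tainted: field means the kernel had already emitted a warning before this oops. On a running system the same information is exposed as a bitmask in /proc/sys/kernel/tainted, where bit 9 (value 512) corresponds to 'W'. A small sketch for checking it before re-running the reproducer; has_warn_taint is a hypothetical helper name.

```shell
# Succeed if the given taint bitmask has the 'W' (warning) bit set.
# Bit 9 (value 512) is the warning taint flag.
has_warn_taint() {
    [ $(( ($1 >> 9) & 1 )) -eq 1 ]
}

# Example, on a live system:
#   if has_warn_taint "$(cat /proc/sys/kernel/tainted)"; then
#       echo "kernel already tainted by a warning"
#   fi
```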

Comment 5 Richard W.M. Jones 2014-02-07 15:16:40 UTC

Created attachment 860554
log file

The shortest reproducer I can come up with (using guestfish) is:

guestfish -xv -N part -N part \
  md-create test "/dev/sda1 /dev/sdb1" : \
  pvcreate /dev/md/test : \
  vgcreate VG /dev/md/test : \
  lvcreate LV VG 32 : \
  mkfs ext4 /dev/VG/LV

The full output (including the actual commands being run by
guestfsd) is attached.

Unfortunately there is an earlier problem (in kvm_amd module).
This is automatically loaded because I'm running this under TCG
so the guest thinks that nested (AMD) virt is available.  Not sure
how to get rid of this.

Comment 6 Richard W.M. Jones 2014-02-07 15:24:34 UTC

I renamed the kvm-amd.ko file so it wouldn't get loaded.  The
mkfs bug reported here still occurs.

Comment 7 Mike Snitzer 2014-02-08 13:43:04 UTC

Given md_make_request in the stack trace, this looks like an MD bug, not DM.

Reassigning to Jes.

Comment 8 Richard W.M. Jones 2014-02-08 14:09:17 UTC

Created attachment 860894
log file (md only case)

(In reply to Mike Snitzer from comment #7)
> Given md_make_request in the stack trace, this looks like an MD bug, not DM.

You are correct.  In fact the problem happens with a pure MD
device, as in this test case:

guestfish -xv -N part -N part \
  md-create test "/dev/sda1 /dev/sdb1" : \
  mkfs ext4 /dev/md/test

The full output from this test is attached.

Comment 9 Jes Sorensen 2014-02-08 14:15:18 UTC

Could you please provide the actual commands run to create the device, and
the /proc/mdstat output?

It would be interesting to know whether this happens on non virtio-scsi.

I don't have an easy way to test this, so please don't expect me to.

Jes

Comment 10 Jes Sorensen 2014-02-08 14:18:02 UTC

(In reply to Jes Sorensen from comment #9)
> Could you please provide the actual commands run to create the device, and
> the /proc/mdstat output?
> 
> It would be interesting to know whether this happens on non virtio-scsi.
> 
> I don't have an easy way to test this, so please don't expect me to.

...test virtio-scsi that is.

Comment 11 Richard W.M. Jones 2014-02-08 14:30:59 UTC

(In reply to Jes Sorensen from comment #9)
> Could you please provide the actual commands run to create the device

It's in the output attached above, but in brief the commands
run are:

mdadm --create --run test --level raid1 --raid-devices 2 /dev/sda1 /dev/sdb1
wipefs -a --force /dev/md/test
mke2fs -t ext4 -F /dev/md/test

The mke2fs command is the one which fails.

> and /proc/mdstat output.

The /proc/mdstat after creation of the MD device but before
running mke2fs is:

Personalities : [raid1] 
md127 : active raid1 sdb1[1] sda1[0]
      102144 blocks super 1.2 [2/2] [UU]
      [==>..................]  resync = 14.5% (14848/102144) finish=0.0min speed=14848K/sec
      
unused devices: <none>

I guess the resync does not complete before the mke2fs runs, because
the commands are run in series as fast as possible.
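
To test that guess, the reproducer could wait for the initial resync to finish before running mke2fs. mdadm has a --wait option for this; alternatively the state can be read from /proc/mdstat. A minimal sketch of the latter, where md_sync_in_progress is a hypothetical helper and the pattern is based only on the mdstat output shown above:

```shell
# Succeed if the mdstat text on stdin shows a resync, recovery or
# reshape progress line for any array (it does not single out one array).
md_sync_in_progress() {
    grep -Eq '(resync|recovery|reshape) *= *[0-9.]+%'
}

# Example: block until no array is syncing, then run mke2fs.
#   while md_sync_in_progress < /proc/mdstat; do sleep 1; done
```

If the oops still occurs after the resync has finished, the resync can be ruled out as a factor.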

> It would be interesting to know whether this happens on non virtio-scsi.

The following script uses [QEMU-emulated] IDE, and it also fails in the same way,
so it seems to have nothing to do with virtio-scsi.

--------------------------------------------
#!/bin/bash -

export LIBGUESTFS_BACKEND=direct

rm -f /tmp/test1.img /tmp/test2.img
truncate -s 100M /tmp/test1.img
truncate -s 100M /tmp/test2.img

guestfish -xv <<EOF
add-drive-opts /tmp/test1.img iface:ide
add-drive-opts /tmp/test2.img iface:ide
run
part-disk /dev/sda mbr
part-disk /dev/sdb mbr
md-create test "/dev/sda1 /dev/sdb1"
mkfs ext4 /dev/md/test
EOF

Comment 12 Richard W.M. Jones 2014-02-11 12:23:24 UTC

Kent Overstreet posted a patch which fixes the problem for me.

https://lkml.org/lkml/2014/2/10/809
[PATCH] block: Fix cloning of discard/write same bios

Comment 13 Josh Boyer 2014-02-14 19:11:11 UTC

This should be fixed with the rc2-git4 kernel that will be built today.