Bug 971553 - Creating RAID4 LV raises kernel WARNIN followed by invalid opcode: 0000 [#1] SMP at kernel BUG at drivers/block/virtio_blk.c:368! when creating FS
Creating RAID4 LV raises kernel WARNIN followed by invalid opcode: 0000 [#1] ...
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: lvm2 (Show other bugs)
7.0
Unspecified Unspecified
unspecified Severity unspecified
: rc
: ---
Assigned To: Jonathan Earl Brassow
cluster-qe@redhat.com
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2013-06-06 15:39 EDT by Marian Csontos
Modified: 2015-07-14 14:24 EDT (History)
10 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-07-14 14:24:02 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Marian Csontos 2013-06-06 15:39:16 EDT
Description of problem:
creating raid4 LV results in "bio idx 0 >= vcnt 0" WARNING: at fs/bio.c:819 bio_advance+0xbe/0xd0() followed by invalid opcode: 0000 [#1] SMP at kernel BUG at drivers/block/virtio_blk.c:368! when creating FS

Version-Release number of selected component (if applicable):
kernel-3.10.0-0.rc4.59.el7.x86_64
lvm2-2.02.99-0.46.el7.x86_64
(lvmetad is running)

How reproducible:
2/2

Steps to Reproduce:
1. lvcreate -n raid -i 4 --type raid4 -L 3.5G shuffle
2. wait for sync
3. mkfs /dev/shuffle/raid

Actual results:
LV locks up. Kernel thread stopped.

Expected results:
Pass or fail gracefully

Additional info:

Relevant data from messages:

Running `lvcreate -n raid -i 4 --type raid4 -L 3.5G shuffle`:

    Jun  6 09:27:06 zaphodc1-node02 kernel: [  366.749667] device-mapper: raid: Superblocks created for new array
    Jun  6 09:27:06 zaphodc1-node02 kernel: [  366.758279] md/raid:mdX: not clean -- starting background reconstruction
    Jun  6 09:27:06 zaphodc1-node02 kernel: [  366.759325] md/raid:mdX: device dm-11 operational as raid disk 4
    Jun  6 09:27:06 zaphodc1-node02 kernel: [  366.760260] md/raid:mdX: device dm-9 operational as raid disk 3
    Jun  6 09:27:06 zaphodc1-node02 kernel: [  366.761100] md/raid:mdX: device dm-7 operational as raid disk 2
    Jun  6 09:27:06 zaphodc1-node02 kernel: [  366.761931] md/raid:mdX: device dm-5 operational as raid disk 1
    Jun  6 09:27:06 zaphodc1-node02 kernel: [  366.762829] md/raid:mdX: device dm-3 operational as raid disk 0
    Jun  6 09:27:06 zaphodc1-node02 kernel: [  366.764506] md/raid:mdX: allocated 5394kB
    Jun  6 09:27:06 zaphodc1-node02 kernel: [  366.765602] md/raid:mdX: raid level 5 active with 5 out of 5 devices, algorithm 4
    Jun  6 09:27:06 zaphodc1-node02 kernel: [  366.767169] Choosing daemon_sleep default (5 sec)
    Jun  6 09:27:06 zaphodc1-node02 kernel: [  366.768104] created bitmap (1 pages) for device mdX
    Jun  6 09:27:06 zaphodc1-node02 kernel: [  366.780001] mdX: bitmap file is out of date, doing full recovery
    Jun  6 09:27:06 zaphodc1-node02 kernel: [  366.784467] mdX: bitmap initialized from disk: read 1 pages, set 1792 of 1792 bits
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.798082] md: resync of RAID array mdX
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.798534] ------------[ cut here ]------------
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799002] WARNING: at fs/bio.c:819 bio_advance+0xbe/0xd0()
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013] bio idx 0 >= vcnt 0
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013] Modules linked in: dm_raid raid456 raid1 raid10 async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx lockd sctp sunrpc sg xfs libcrc32c microcode pcspkr virtio_balloon i2c_piix4 virtio_net i2c_core ata_generic pata_acpi ata_piix libata virtio_blk floppy dm_mirror dm_region_hash dm_log dm_mod
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013] CPU: 0 PID: 4128 Comm: mdX_resync Not tainted 3.10.0-0.rc4.59.el7.x86_64 #1
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  0000000000000009 ffff88003fc03cf8 ffffffff81600cc7 ffff88003fc03d30
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  ffffffff8105f621 0000000000001000 ffff88003d373e20 0000000000000000
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  0000000000001000 ffff8800375ee758 ffff88003fc03d90 ffffffff8105f68c
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013] Call Trace:
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  <IRQ>  [<ffffffff81600cc7>] dump_stack+0x19/0x1b
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff8105f621>] warn_slowpath_common+0x61/0x80
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff8105f68c>] warn_slowpath_fmt+0x4c/0x50
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff811cdfce>] bio_advance+0xbe/0xd0
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff812c3c57>] blk_update_request+0xa7/0x3e0
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff812c3fac>] blk_update_bidi_request+0x1c/0x80
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff812c6d17>] __blk_end_bidi_request+0x17/0x40
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff812c6e1f>] __blk_end_request_all+0x1f/0x30
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffffa004c34b>] virtblk_done+0x11b/0x250 [virtio_blk]
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff81388b78>] vring_interrupt+0x38/0x90
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff810eeefe>] handle_irq_event_percpu+0x3e/0x1e0
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff810ef0d7>] handle_irq_event+0x37/0x60
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff810f1abf>] handle_edge_irq+0x6f/0x120
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff8101329f>] handle_irq+0xbf/0x150
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff81067eff>] ? irq_enter+0x4f/0x90
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff816110cd>] do_IRQ+0x4d/0xc0
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff816070ed>] common_interrupt+0x6d/0x6d
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  <EOI>  [<ffffffff810619d7>] ? vprintk_emit+0x1c7/0x520
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff815fae52>] printk+0x67/0x69
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff8149ad8a>] md_do_sync+0x38a/0xda0
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff8109a52c>] ? update_curr+0xec/0x170
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff81097c58>] ? __enqueue_entity+0x78/0x80
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff81097b30>] ? account_entity_dequeue+0x80/0xa0
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff8109aa03>] ? dequeue_entity+0x143/0x5c0
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff81091d35>] ? check_preempt_curr+0x85/0xa0
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff81099175>] ? set_next_entity+0x95/0xb0
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff810105be>] ? __switch_to+0x13e/0x4a0
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff8107122b>] ? recalc_sigpending+0x1b/0x50
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff8149809d>] md_thread+0x12d/0x170
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff81497f70>] ? rdev_attr_show+0x90/0x90
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff81084300>] kthread+0xc0/0xd0
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff81084240>] ? insert_kthread_work+0x40/0x40
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff8160f2ec>] ret_from_fork+0x7c/0xb0
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013]  [<ffffffff81084240>] ? insert_kthread_work+0x40/0x40
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.799013] ---[ end trace 737f57722fdda8b8 ]---
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.829860] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.830408] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync.
    Jun  6 09:27:07 zaphodc1-node02 kernel: [  366.831295] md: using 128k window, over a total of 917504k.

And after `mkfs /dev/shuffle/raid`:

    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.064665] ------------[ cut here ]------------
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016] kernel BUG at drivers/block/virtio_blk.c:368!
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016] invalid opcode: 0000 [#1] SMP
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016] Modules linked in: dm_raid raid456 raid1 raid10 async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx lockd sctp sunrpc sg xfs libcrc32c microcode pcspkr virtio_balloon i2c_piix4 virtio_net i2c_core ata_generic pata_acpi ata_piix libata virtio_blk floppy dm_mirror dm_region_hash dm_log dm_mod
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016] CPU: 0 PID: 4124 Comm: mdX_raid5 Tainted: G        W   --------------   3.10.0-0.rc4.59.el7.x86_64 #1                                                                                                            
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016] task: ffff88003d171640 ti: ffff88003d33c000 task.ti: ffff88003d33c000
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016] RIP: 0010:[<ffffffffa004c9ff>]  [<ffffffffa004c9ff>] virtblk_request+0x20f/0x220 [virtio_blk]
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016] RSP: 0018:ffff88003d33dcb0  EFLAGS: 00010006
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016] RAX: 0000000000010001 RBX: ffff88002c362420 RCX: 0000000000000000
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016] RDX: 0000000000000000 RSI: ffff8800375b5960 RDI: ffff88002c2ff028
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016] RBP: ffff88003d33dce0 R08: ffff8800375b5960 R09: 0000000000000000
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016] R10: 0000000000000002 R11: 0000000000000000 R12: ffff88003b84a000
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016] R13: ffff8800375b5960 R14: ffff88003b84a090 R15: ffff88002c2ff028
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016] FS:  0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016] CR2: 00007fa6c30a61d0 CR3: 0000000037509000 CR4: 00000000000006f0
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016] Stack:
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016]  0000000300000000 ffff8800375b5960 ffff8800375b5960 0000000000000009
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016]  ffff8800375b5960 0000000000000000 ffff88003d33dcf8 ffffffff812c46d3
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016]  ffff88002c2ffbe8 ffff88003d33dd30 ffffffff812c487a ffff88002c2ffbe8
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016] Call Trace:
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016]  [<ffffffff812c46d3>] __blk_run_queue+0x33/0x40
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016]  [<ffffffff812c487a>] queue_unplugged+0x2a/0xa0
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016]  [<ffffffff812c732d>] blk_flush_plug_list+0x15d/0x1f0
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016]  [<ffffffff812c7784>] blk_finish_plug+0x14/0x40
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016]  [<ffffffffa02bc2ae>] raid5d+0x4fe/0x580 [raid456]
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016]  [<ffffffff8106e7a0>] ? __internal_add_timer+0x130/0x130
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016]  [<ffffffff8149809d>] md_thread+0x12d/0x170
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016]  [<ffffffff81085180>] ? wake_up_bit+0x30/0x30
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016]  [<ffffffff81497f70>] ? rdev_attr_show+0x90/0x90
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016]  [<ffffffff81084300>] kthread+0xc0/0xd0
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016]  [<ffffffff81084240>] ? insert_kthread_work+0x40/0x40
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016]  [<ffffffff8160f2ec>] ret_from_fork+0x7c/0xb0
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016]  [<ffffffff81084240>] ? insert_kthread_work+0x40/0x40
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016] Code: 00 00 00 89 43 14 e9 c9 fe ff ff 90 c7 43 10 02 00 00 00 48 c7 43 18 00 00 00 00 41 0f b7 87 ec 00 00 00 89 43 14 e9 a9 fe ff ff <0f> 0b 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016] RIP  [<ffffffffa004c9ff>] virtblk_request+0x20f/0x220 [virtio_blk]
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016]  RSP <ffff88003d33dcb0>
    Jun  6 09:27:22 zaphodc1-node02 kernel: [  382.065016] ---[ end trace 737f57722fdda8b9 ]---

dmsetup status:

    shuffle-raid_rmeta_4: 0 8192 linear 
    shuffle-raid_rimage_4: 0 1835008 linear 
    shuffle-raid: 0 7340032 raid raid4 5 AAAAA 1835008/1835008 idle 67976
    shuffle-raid_rmeta_3: 0 8192 linear 
    shuffle-raid_rimage_3: 0 1835008 linear 
    shuffle-raid_rmeta_2: 0 8192 linear 
    rhel-swap: 0 4259840 linear 
    rhel-root: 0 9388032 linear 
    shuffle-raid_rimage_2: 0 1835008 linear 
    shuffle-raid_rmeta_1: 0 8192 linear 
    shuffle-raid_rimage_1: 0 1835008 linear 
    shuffle-raid_rmeta_0: 0 8192 linear 
    shuffle-raid_rimage_0: 0 1835008 linear
Comment 1 Marian Csontos 2013-06-06 15:40:19 EDT
Jon, if you can not reproduce, what more debugging information you need from me?
Comment 2 Mikulas Patocka 2013-06-06 16:26:59 EDT
Kent Overstreet has done most of the changes that are related to this bug.
I'd forward the bug to him...
Comment 3 Mikulas Patocka 2013-06-06 16:36:42 EDT
... the bug could be possibly caused by a bio with bi_size that doesn't match
the size of the vector.

Add a test to generic_make_request that checks that bi_size matches the sum
of length of all entries in the vector (exclude bios with (bi_rw &
BIO_NO_ADVANCE_ITER_MASK) from this test) and make it crash with BUG if the
size doesn't match. Try to reproduce the bug again and you'll find the exact
driver in the stack that produced the corrupted bio from the stacktrace.
Comment 4 Mike Snitzer 2013-06-07 08:27:15 EDT
Pretty sure you'll just need this fix:

http://git.kernel.dk/?p=linux-block.git;a=commit;h=4997b72ee62930cb841d185398ea547d979789f4
Comment 5 Mike Snitzer 2013-06-07 08:29:36 EDT
(In reply to Mike Snitzer from comment #4)
> Pretty sure you'll just need this fix:
> 
> http://git.kernel.dk/?p=linux-block.git;a=commit;
> h=4997b72ee62930cb841d185398ea547d979789f4

BTW, this fix is staged in Jens' 'for-linus' branch for his next 3.10-rcX pull request.
Comment 6 Marian Csontos 2013-06-10 14:39:03 EDT
Thanks Mike. The patch has fixed the problem. I am eager to see this in RHEL7 tree.
Comment 7 Jonathan Earl Brassow 2013-06-19 09:39:22 EDT
I'll keep an eye out for it.
Comment 8 Jonathan Earl Brassow 2013-06-20 11:56:14 EDT
Upstream in v3.10-rc6:
commit 4997b72ee62930cb841d185398ea547d979789f4
Author: Kent Overstreet <koverstreet@google.com>
Date:   Thu May 30 08:44:39 2013 +0200

    raid5: Initialize bi_vcnt
    
    The patch that converted raid5 to use bio_reset() forgot to initialize
    bi_vcnt.
    
    Signed-off-by: Kent Overstreet <koverstreet@google.com>
    Cc: NeilBrown <neilb@suse.de>
    Cc: linux-raid@vger.kernel.org
    Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
Comment 9 Marian Csontos 2013-07-01 09:56:05 EDT
Patch is included in RHEL7 nightly build. Works for me. Thanks.
Comment 10 Jonathan Earl Brassow 2013-07-24 19:03:50 EDT
Yes, patch is in RHEL7 kernel tree under the same commit ID.

Note You need to log in before you can comment on or make changes to this bug.