Bug 1056807

Summary: altering thin pool volumes can cause a kernel BUG
Product: Red Hat Enterprise Linux 7
Reporter: Corey Marthaler <cmarthal>
Component: lvm2
Assignee: LVM and device-mapper development team <lvm-team>
lvm2 sub component: Thin Provisioning
QA Contact: Cluster QE <mspqa-list>
Status: CLOSED DUPLICATE
Docs Contact:
Severity: high
Priority: high
CC: agk, heinzm, jbrassow, msnitzer, prajnoha, prockai, thornber, zkabelac
Version: 7.0
Keywords: Triaged
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-01-23 21:35:24 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Corey Marthaler 2014-01-23 00:00:10 UTC
Description of problem:
Playing around with thin pool configuration, in this case an lvrename.


Create 16 PV(s) for thinpool_8_2673 on host-073.virt.lab.msp.redhat.com
Create VG thinpool_8_2673 on host-073.virt.lab.msp.redhat.com
make needed LVs on host-073.virt.lab.msp.redhat.com
lvcreate --thinpool thinpool_8_26730 -L 1G thinpool_8_2673
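
For reference, a minimal sketch of the setup above. Only the lvcreate line is taken verbatim from the harness log; the 16 PV device names are hypothetical stand-ins:

# Hypothetical PV/VG setup (device names assumed, not from the log)
pvcreate /dev/vd[b-q]
vgcreate thinpool_8_2673 /dev/vd[b-q]
# Create the 1G thin pool (verbatim from the harness log)
lvcreate --thinpool thinpool_8_26730 -L 1G thinpool_8_2673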


--- ALTERATION ITERATION 1 ---------------------------

Skipping reduce (Not supported on thin pool volumes)
VOLUME EXTENSION from 256 to 266 on host-073.virt.lab.msp.redhat.com
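
The log does not show the command behind the extension step; assuming the 256/266 figures are logical extents (a 1G pool at the default 4MiB extent size is 256 extents), it would be along the lines of:

# Assumed form of the extension step (extent counts inferred, not logged)
lvextend -l 266 thinpool_8_2673/thinpool_8_26730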

VOLUME MINOR DEV NUM CHANGE to 175 on host-073.virt.lab.msp.redhat.com
host-073.virt.lab.msp.redhat.com: lvchange -ay -f -My --major 255 --minor 175 /dev/thinpool_8_2673/thinpool_8_26730
verifying new minor num on host-073.virt.lab.msp.redhat.com
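
One way to do that verification (not necessarily what the harness runs) is to ask lvs for the kernel major/minor numbers of the pool LV:

# Hypothetical check: should report 255 175 after the lvchange above
lvs --noheadings -o lv_kernel_major,lv_kernel_minor thinpool_8_2673/thinpool_8_26730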

VOLUME RENAME from /dev/thinpool_8_2673/thinpool_8_26730 to rename_192 on host-073.virt.lab.msp.redhat.com
Didn't receive heartbeat for 120 seconds
lvrename failed:


Jan 22 17:10:58 host-073 qarshd[28585]: Running cmdline: lvrename thinpool_8_2673 /dev/thinpool_8_2673/thinpool_8_26730 rename_192
Jan 22 17:10:58 host-073 lvm[15196]: No longer monitoring thin thinpool_8_2673-thinpool_8_26730-tpool.
[528497.004741] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
[528497.005023] BUG: unable to handle kernel paging request at ffff880010446800
[528497.005023] IP: [<ffff880010446800>] 0xffff8800104467ff
[528497.005023] PGD 1db8067 PUD 1db9067 PMD 10494063 PTE 8000000010446163
[528497.005023] Oops: 0011 [#1] SMP 
[528497.005023] Modules linked in: dm_thin_pool dm_persistent_data dm_bufio dm_bio_prison ext4 mbcache jbd2 dm_raid raid456 raid1 raid10 async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx binfmt_misc bridge stp llc sd_mod crct10dif_generic crc_t10dif crct10dif_common iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi iptable_filter ip_tables sctp sg microcode virtio_balloon pcspkr serio_raw i2c_piix4 virtio_net i2c_core mperf nfsd auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c ata_generic pata_acpi virtio_blk ata_piix virtio_pci virtio_ring virtio libata floppy dm_mirror dm_region_hash dm_log dm_mod
[528497.005023] CPU: 0 PID: 15196 Comm: dmeventd Not tainted 3.10.0-67.el7.x86_64 #1
[528497.005023] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[528497.005023] task: ffff88003d1eab40 ti: ffff880011852000 task.ti: ffff880011852000
[528497.005023] RIP: 0010:[<ffff880010446800>]  [<ffff880010446800>] 0xffff8800104467ff
[528497.005023] RSP: 0000:ffff880011853a50  EFLAGS: 00010286
[528497.005023] RAX: ffff880010446800 RBX: 0000000000000080 RCX: ffffffffa03be050
[528497.005023] RDX: 0000000000000000 RSI: ffff880011853b70 RDI: ffff88003b8e18d8
[528497.005023] RBP: ffff880011853ae8 R08: 0000000000000000 R09: 0000000000000000
[528497.005023] R10: 0000000000000100 R11: 0000000000000000 R12: 0000000000000000
[528497.005023] R13: ffff88003b8e18d8 R14: ffff880011853b70 R15: 000000000000044c
[528497.005023] FS:  00007fe72b6d3800(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[528497.005023] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[528497.005023] CR2: ffff880010446800 CR3: 000000003d240000 CR4: 00000000000006f0
[528497.005023] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[528497.005023] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[528497.005023] Stack:
[528497.005023]  ffffffff811464b5 ffff880011853aa8 ffffffff8115762c fffffff411853ad8
[528497.005023]  ffffffff8113c389 0000000000000016 000000000001dba0 0000000000000080
[528497.005023]  ffff88003ba22ae8 0000000000000000 000000000001db9f 0000000000000020
[528497.005023] Call Trace:
[528497.005023]  [<ffffffff811464b5>] ? shrink_slab+0xa5/0x330
[528497.005023]  [<ffffffff8115762c>] ? compact_zone+0xfc/0x400
[528497.005023]  [<ffffffff8113c389>] ? __rmqueue+0x89/0x440
[528497.005023]  [<ffffffff8114962b>] do_try_to_free_pages+0x38b/0x4a0
[528497.005023]  [<ffffffff81149811>] try_to_free_pages+0xd1/0x170
[528497.005023]  [<ffffffff8113efc2>] __alloc_pages_nodemask+0x672/0xa00
[528497.005023]  [<ffffffff81142bd1>] ? release_pages+0x1c1/0x200
[528497.005023]  [<ffffffff8117ce8a>] alloc_pages_vma+0x9a/0x140
[528497.005023]  [<ffffffff81191120>] do_huge_pmd_anonymous_page+0x190/0x4a0
[528497.005023]  [<ffffffff8115ed97>] handle_mm_fault+0x2f7/0x3a0
[528497.005023]  [<ffffffff815c7e86>] __do_page_fault+0x146/0x510
[528497.005023]  [<ffffffff81163769>] ? vma_merge+0x229/0x330
[528497.005023]  [<ffffffff811648f9>] ? do_brk+0x209/0x330
[528497.005023]  [<ffffffff815c826a>] do_page_fault+0x1a/0x70
[528497.005023]  [<ffffffff815c4608>] page_fault+0x28/0x30
[528497.005023] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 10 00 07 00 00 
[528497.005023] RIP  [<ffff880010446800>] 0xffff8800104467ff
[528497.005023]  RSP <ffff880011853a50>
[528497.005023] CR2: ffff880010446800
[528497.005023] ---[ end trace b12845f56f1f7f2d ]---
[528497.005023] Kernel panic - not syncing: Fatal exception




Version-Release number of selected component (if applicable):
3.10.0-67.el7.x86_64
lvm2-2.02.103-10.el7    BUILT: Tue Jan  7 07:44:33 CST 2014
lvm2-libs-2.02.103-10.el7    BUILT: Tue Jan  7 07:44:33 CST 2014
lvm2-cluster-2.02.103-10.el7    BUILT: Tue Jan  7 07:44:33 CST 2014
device-mapper-1.02.82-10.el7    BUILT: Tue Jan  7 07:44:33 CST 2014
device-mapper-libs-1.02.82-10.el7    BUILT: Tue Jan  7 07:44:33 CST 2014
device-mapper-event-1.02.82-10.el7    BUILT: Tue Jan  7 07:44:33 CST 2014
device-mapper-event-libs-1.02.82-10.el7    BUILT: Tue Jan  7 07:44:33 CST 2014
device-mapper-persistent-data-0.2.8-3.el7    BUILT: Fri Dec 27 13:40:56 CST 2013
cmirror-2.02.103-10.el7    BUILT: Tue Jan  7 07:44:33 CST 2014



How reproducible:
Reproduced this twice now.

Comment 1 Corey Marthaler 2014-01-23 00:02:10 UTC
Second time was while adding and deleting volume tags.
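
Stripped of the long stress tag, the tag alteration in the journal below boils down to something like this simplified form (not the exact harness command):

# Simplified version of the lvchange shown in the journal below
lvchange --deltag L7/ --addtag L9! /dev/thinpool_5_2726/thinpool_5_27260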

Jan 22 17:08:41 host-026 qarshd[13625]: Running cmdline: lvchange --deltag L7/ --addtag L9! --addtag lvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_configlvm_config /dev/thinpool_5_2726/thinpool_5_27260
Jan 22 17:08:42 host-026 systemd: Starting qarsh Per-Connection Server...
Jan 22 17:08:42 host-026 systemd: Started qarsh Per-Connection Server.
Jan 22 17:08:42 host-026 qarshd[13629]: Talking to peer ::ffff:10.15.80.47:39634 (IPv6)
Jan 22 17:08:42 host-026 qarshd[13629]: Running cmdline: lvs /dev/thinpool_5_2726/thinpool_5_27260 --noheadings -o lv_tags
Jan 22 17:08:42 host-026 systemd: Starting qarsh Per-Connection Server...
Jan 22 17:08:42 host-026 systemd: Started qarsh Per-Connection Server.
Jan 22 17:08:42 host-026 qarshd[13634]: Talking to peer ::ffff:10.15.80.47:39635 (IPv6)
[526665.437017] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
[526665.437017] BUG: unable to handle kernel paging request at ffff88001be7bc00
[526665.437017] IP: [<ffff88001be7bc00>] 0xffff88001be7bbff
[526665.437017] PGD 1db8067 PUD 1db9067 PMD 370da063 PTE 800000001be7b163
[526665.437017] Oops: 0011 [#1] SMP 
[526665.437017] Modules linked in: dm_thin_pool dm_persistent_data dm_bufio dm_bio_prison dm_raid raid456 raid1 raid10 async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx dm_log_userspace binfmt_misc bridge stp llc sd_mod crct10dif_generic crc_t10dif crct10dif_common iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi iptable_filter ip_tables sctp sg i2c_piix4 serio_raw microcode pcspkr i2c_core virtio_balloon virtio_net mperf nfsd auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c ata_generic pata_acpi virtio_blk ata_piix virtio_pci libata virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod
[526665.437017] CPU: 0 PID: 427 Comm: systemd-journal Not tainted 3.10.0-67.el7.x86_64 #1
[526665.437017] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[526665.437017] task: ffff88003d14b610 ti: ffff880037428000 task.ti: ffff880037428000
[526665.437017] RIP: 0010:[<ffff88001be7bc00>]  [<ffff88001be7bc00>] 0xffff88001be7bbff
[526665.437017] RSP: 0000:ffff880037429a50  EFLAGS: 00010286
[526665.437017] RAX: ffff88001be7bc00 RBX: 0000000000000080 RCX: 0000000000000000
[526665.437017] RDX: 000000000000c5de RSI: ffff880037429b70 RDI: ffff88003bb272d8
[526665.437017] RBP: ffff880037429ae8 R08: 0000000000000000 R09: 0000000000000040
[526665.437017] R10: 0000000000000100 R11: 0000000000000000 R12: 000000000000000c
[526665.437017] R13: ffff88003bb272d8 R14: ffff880037429b70 R15: 0000000000000000
[526665.437017] FS:  00007f8f6face840(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[526665.437017] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[526665.437017] CR2: ffff88001be7bc00 CR3: 000000003d679000 CR4: 00000000000006f0
[526665.437017] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[526665.437017] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[526665.437017] Stack:
[526665.437017]  ffffffff811464b5 ffff880037429aa8 ffffffff8115762c 0000000037429ad8
[526665.437017]  ffffffff8113c389 0000000000000000 000000000001b6a4 0000000000000080
[526665.437017]  ffff88003c22f068 0000000000000000 000000000001b6a3 0000000000000020
[526665.437017] Call Trace:
[526665.437017]  [<ffffffff811464b5>] ? shrink_slab+0xa5/0x330
[526665.437017]  [<ffffffff8115762c>] ? compact_zone+0xfc/0x400
[526665.437017]  [<ffffffff8113c389>] ? __rmqueue+0x89/0x440
[526665.437017]  [<ffffffff8114962b>] do_try_to_free_pages+0x38b/0x4a0
[526665.437017]  [<ffffffff81149811>] try_to_free_pages+0xd1/0x170
[526665.437017]  [<ffffffff8113efc2>] __alloc_pages_nodemask+0x672/0xa00
[526665.437017]  [<ffffffff81142bd1>] ? release_pages+0x1c1/0x200
[526665.437017]  [<ffffffff8117ce8a>] alloc_pages_vma+0x9a/0x140
[526665.437017]  [<ffffffff81191120>] do_huge_pmd_anonymous_page+0x190/0x4a0
[526665.437017]  [<ffffffff8115ed97>] handle_mm_fault+0x2f7/0x3a0
[526665.437017]  [<ffffffff815c7e86>] __do_page_fault+0x146/0x510
[526665.437017]  [<ffffffff81165525>] ? do_mmap_pgoff+0x305/0x3c0
[526665.437017]  [<ffffffff81150cb9>] ? vm_mmap_pgoff+0x99/0xc0
[526665.437017]  [<ffffffff815c826a>] do_page_fault+0x1a/0x70
[526665.437017]  [<ffffffff815c4608>] page_fault+0x28/0x30
[526665.437017] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <45> a1 7f 81 ff ff ff ff 80 00 00 00 00 00 00 00 00 00 00 00 00 
[526665.437017] RIP  [<ffff88001be7bc00>] 0xffff88001be7bbff
[526665.437017]  RSP <ffff880037429a50>
[526665.437017] CR2: ffff88001be7bc00
[526665.437017] ---[ end trace 67a2e0bb9a0a1de3 ]---
[526665.437017] Kernel panic - not syncing: Fatal exception

Comment 2 Mike Snitzer 2014-01-23 01:10:43 UTC
Pretty certain this is a dup of bug#1056647.

You could try this brew-built kernel to validate that this is the case:
http://brewweb.devel.redhat.com/brew/taskinfo?taskID=6918718

I'll leave this BZ open for now.

Comment 3 Mike Snitzer 2014-01-23 16:47:39 UTC
(In reply to Mike Snitzer from comment #2)
> Pretty certain this is a dup of bug#1056647.
> 
> You could try this brew-built kernel to validate that this is the case:
> http://brewweb.devel.redhat.com/brew/taskinfo?taskID=6918718
> 
> I'll leave this BZ open for now.

Please use this brew build instead (still building but...):

http://brewweb.devel.redhat.com/brew/taskinfo?taskID=6924773

Comment 4 Mike Snitzer 2014-01-23 21:35:24 UTC

*** This bug has been marked as a duplicate of bug 1056647 ***

Comment 5 Corey Marthaler 2014-01-24 00:08:08 UTC
The new kernel does appear to fix this issue; I haven't hit the problem in over 3 hours of testing.