Bug 1201473
Summary: | unsynced raid snapshot creation/deletion causes panic
---|---
Product: | Red Hat Enterprise Linux 6
Reporter: | Corey Marthaler <cmarthal>
Component: | lvm2
Assignee: | Heinz Mauelshagen <heinzm>
lvm2 sub component: | Mirroring and RAID (RHEL6)
QA Contact: | cluster-qe <cluster-qe>
Status: | CLOSED NEXTRELEASE
Docs Contact: |
Severity: | urgent
Priority: | unspecified
CC: | agk, dhoward, heinzm, jbrassow, msnitzer, prajnoha, prockai, tlavigne, zkabelac
Version: | 6.7
Keywords: | Regression, TestBlocker
Target Milestone: | rc
Target Release: | ---
Hardware: | x86_64
OS: | Linux
Whiteboard: |
Fixed In Version: |
Doc Type: | Bug Fix
Doc Text: |
Story Points: | ---
Clone Of: |
Environment: |
Last Closed: | 2015-12-17 18:18:44 UTC
Type: | Bug
Regression: | ---
Mount Type: | ---
Documentation: | ---
CRM: |
Verified Versions: |
Category: | ---
oVirt Team: | ---
RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | ---
Target Upstream Version: |
Embargoed: |
Bug Depends On: |
Bug Blocks: | 1268411
Attachments: |
Description
Corey Marthaler
2015-03-12 18:36:26 UTC
Version-Release number of selected component (if applicable):
2.6.32-540.el6.x86_64
lvm2-2.02.117-1.el6 BUILT: Wed Mar 4 09:30:04 CST 2015
lvm2-libs-2.02.117-1.el6 BUILT: Wed Mar 4 09:30:04 CST 2015
lvm2-cluster-2.02.117-1.el6 BUILT: Wed Mar 4 09:30:04 CST 2015
udev-147-2.57.el6 BUILT: Thu Jul 24 08:48:47 CDT 2014
device-mapper-1.02.94-1.el6 BUILT: Wed Mar 4 09:30:04 CST 2015
device-mapper-libs-1.02.94-1.el6 BUILT: Wed Mar 4 09:30:04 CST 2015
device-mapper-event-1.02.94-1.el6 BUILT: Wed Mar 4 09:30:04 CST 2015
device-mapper-event-libs-1.02.94-1.el6 BUILT: Wed Mar 4 09:30:04 CST 2015
device-mapper-persistent-data-0.3.2-1.el6 BUILT: Fri Apr 4 08:43:06 CDT 2014
cmirror-2.02.117-1.el6 BUILT: Wed Mar 4 09:30:04 CST 2015

I can not reproduce this bug on a physical machine. I can however continue to reproduce this on my virt machines, even running a newer kernel. I've played around with the size of the raid volume on both types of machines to get varying degrees of raid sync. I'll attach the info on the set up of my virt machines.

2.6.32-544.el6.x86_64
lvm2-2.02.117-1.el6 BUILT: Wed Mar 4 09:30:04 CST 2015
lvm2-libs-2.02.117-1.el6 BUILT: Wed Mar 4 09:30:04 CST 2015
lvm2-cluster-2.02.117-1.el6 BUILT: Wed Mar 4 09:30:04 CST 2015
udev-147-2.61.el6 BUILT: Mon Mar 2 05:08:11 CST 2015
device-mapper-1.02.94-1.el6 BUILT: Wed Mar 4 09:30:04 CST 2015
device-mapper-libs-1.02.94-1.el6 BUILT: Wed Mar 4 09:30:04 CST 2015
device-mapper-event-1.02.94-1.el6 BUILT: Wed Mar 4 09:30:04 CST 2015
device-mapper-event-libs-1.02.94-1.el6 BUILT: Wed Mar 4 09:30:04 CST 2015
device-mapper-persistent-data-0.3.2-1.el6 BUILT: Fri Apr 4 08:43:06 CDT 2014
cmirror-2.02.117-1.el6 BUILT: Wed Mar 4 09:30:04 CST 2015

BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
IP: [<0000000000000004>] 0x4
PGD 0
Oops: 0010 [#1] SMP
last sysfs file: /sys/devices/virtual/block/dm-9/dm/suspended
CPU 0
Modules linked in: dm_snapshot dm_bufio dm_raid raid10 raid1 raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx iptable_filter ip_tables autofs4 sg sd_mod crc_t10dif be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi dm_multipath microcode serio_raw virtio_balloon virtio_net i2c_piix4 i2c_core ext4 jbd2 mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
Pid: 26, comm: md_misc/0 Not tainted 2.6.32-544.el6.x86_64 #1 Red Hat KVM
RIP: 0010:[<0000000000000004>] [<0000000000000004>] 0x4
RSP: 0018:ffff88003ea5be08 EFLAGS: 00010202
RAX: 0000000000000004 RBX: ffff880037fb6800 RCX: ffff880002218e88
RDX: 0000000000000000 RSI: ffff880002218e88 RDI: 000000066474e551
RBP: ffff88003ea5be20 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff880002218e80
R13: ffffffffa042e3e0 R14: ffff88003ea5bfd8 R15: ffff880002218e88
FS: 0000000000000000(0000) GS:ffff880002200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000004 CR3: 000000003ce30000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process md_misc/0 (pid: 26, threadinfo ffff88003ea58000, task ffff88003ea4eab0)
Stack:
ffffffffa0005859 ffff88003ea5be30 ffff880002218e80 ffff88003ea5be30
<d> ffffffffa042e3f9 ffff88003ea5bee0 ffffffff8109a730 0000000000000000
<d> 0000000000000000 ffff88003ea5be60 ffff88003ea4f128 ffff88003ea4eab0
Call Trace:
[<ffffffffa0005859>] ? dm_table_event+0x49/0x60 [dm_mod]
[<ffffffffa042e3f9>] do_table_event+0x19/0x20 [dm_raid]
[<ffffffff8109a730>] worker_thread+0x170/0x2a0
[<ffffffff810a1300>] ? autoremove_wake_function+0x0/0x40
[<ffffffff8109a5c0>] ? worker_thread+0x0/0x2a0
[<ffffffff810a0e6e>] kthread+0x9e/0xc0
[<ffffffff8109a5c0>] ? worker_thread+0x0/0x2a0
[<ffffffff8100c28a>] child_rip+0xa/0x20
[<ffffffff810a0dd0>] ? kthread+0x0/0xc0
[<ffffffff8100c280>] ? child_rip+0x0/0x20
Code: Bad RIP value.
RIP [<0000000000000004>] 0x4
RSP <ffff88003ea5be08>
CR2: 0000000000000004
---[ end trace ac1a7c7bdfa3a583 ]---
Kernel panic - not syncing: Fatal exception
Pid: 26, comm: md_misc/0 Tainted: G D --------------- 2.6.32-544.el6.x86_64 #1
Call Trace:
[<ffffffff8153426f>] ? panic+0xa7/0x16f
[<ffffffff81539044>] ? oops_end+0xe4/0x100
[<ffffffff8104e8cb>] ? no_context+0xfb/0x260
[<ffffffff8104eb55>] ? __bad_area_nosemaphore+0x125/0x1e0
[<ffffffff8104ec23>] ? bad_area_nosemaphore+0x13/0x20
[<ffffffff8104f31c>] ? __do_page_fault+0x30c/0x500
[<ffffffff8106f6d2>] ? enqueue_entity+0x112/0x440
[<ffffffff810602c4>] ? check_preempt_wakeup+0x1a4/0x260
[<ffffffff8106fa64>] ? enqueue_task_fair+0x64/0x100
[<ffffffff8105a7ec>] ? check_preempt_curr+0x7c/0x90
[<ffffffff810670de>] ? try_to_wake_up+0x24e/0x3e0
[<ffffffff8153af6e>] ? do_page_fault+0x3e/0xa0
[<ffffffffa042e3e0>] ? do_table_event+0x0/0x20 [dm_raid]
[<ffffffff81538325>] ? page_fault+0x25/0x30
[<ffffffffa042e3e0>] ? do_table_event+0x0/0x20 [dm_raid]
[<ffffffffa0005859>] ? dm_table_event+0x49/0x60 [dm_mod]
[<ffffffffa042e3f9>] ? do_table_event+0x19/0x20 [dm_raid]
[<ffffffff8109a730>] ? worker_thread+0x170/0x2a0
[<ffffffff810a1300>] ? autoremove_wake_function+0x0/0x40
[<ffffffff8109a5c0>] ? worker_thread+0x0/0x2a0
[<ffffffff810a0e6e>] ? kthread+0x9e/0xc0
[<ffffffff8109a5c0>] ? worker_thread+0x0/0x2a0
[<ffffffff8100c28a>] ? child_rip+0xa/0x20
[<ffffffff810a0dd0>] ? kthread+0x0/0xc0
[<ffffffff8100c280>] ? child_rip+0x0/0x20

Created attachment 1004251 [details]
my virt setup information
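For anyone triaging the oops above, the call trace can be split into structured frames to see which modules are involved (here `dm_mod` and `dm_raid`). The following is only a minimal sketch: the helper name `parse_frames`, the regular expression, and the `SAMPLE` string (an excerpt of the first trace) are illustrative assumptions, not part of any kernel or lvm2 tooling. Frames the kernel prefixed with `?` are marked as unreliable.

```python
import re

# Matches frames such as "[<ffffffffa0005859>] ? dm_table_event+0x49/0x60 [dm_mod]".
# The pattern is an assumption based on the trace format shown in this report.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unsure>\?\s+)?"
    r"(?P<func>[A-Za-z_][\w.]*)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(text):
    """Return one dict per stack frame found in the oops text (hypothetical helper)."""
    frames = []
    for m in FRAME_RE.finditer(text):
        frames.append({
            "addr": int(m.group("addr"), 16),
            "func": m.group("func"),
            "offset": int(m.group("off"), 16),
            "size": int(m.group("size"), 16),
            # Frames without a "[module]" suffix belong to the core kernel image.
            "module": m.group("module") or "kernel",
            # A leading "?" means the unwinder was not sure about this frame.
            "reliable": m.group("unsure") is None,
        })
    return frames


# Excerpt of the first call trace from this report.
SAMPLE = (
    "[<ffffffffa0005859>] ? dm_table_event+0x49/0x60 [dm_mod] "
    "[<ffffffffa042e3f9>] do_table_event+0x19/0x20 [dm_raid] "
    "[<ffffffff8109a730>] worker_thread+0x170/0x2a0 "
    "[<ffffffff810a1300>] ? autoremove_wake_function+0x0/0x40"
)

frames = parse_frames(SAMPLE)
for frame in frames:
    marker = " " if frame["reliable"] else "?"
    print(f'{marker} {frame["module"]}: {frame["func"]}+{frame["offset"]:#x}')
```

The sketch works on the flattened single-line form as well as on one-frame-per-line output, since it scans for frame patterns rather than splitting on newlines. Entries without a symbol, such as the bad `RIP` value `0x4`, are skipped.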
Heinz,

My virt nodes use iscsi devices for storage.

(In reply to Corey Marthaler from comment #11)
> Heinz,
>
> My virt nodes use iscsi devices for storage.

Can you reproduce on your VMs with another type of storage?

This issue needs to be described in the Release Notes for RHEL 6.7. Content Services needs your input to make that happen. Please complete the Doc Text field for this bug by April 20 using the Cause, Consequence, Workaround, and Result model, as follows:

Cause: Actions or circumstances that cause this bug to occur on a customer's system.
Consequence: What happens to the customer's system or application when the bug occurs?
Workaround (if any): If a workaround for the issue exists, describe it in detail. If more than one workaround is available, describe each one.
Result: Describe what happens when a workaround is applied. If the issue is completely circumvented by the workaround, state so. Any side effects caused by the workaround should also be noted here. If no reliable workaround exists, try to describe some preventive measures that help to avoid the bug scenario.

Moving back to assigned so any updated patch is not forgotten for posting in 6.8.