Bug 1111748 - kernel BUG at drivers/md/persistent-data/dm-btree-spine.c when taking snapshot of a cached XFS
Summary: kernel BUG at drivers/md/persistent-data/dm-btree-spine.c when taking snapshot of a cached XFS
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: lvm2
Version: 7.0
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Joe Thornber
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks: 1113520 1119326
 
Reported: 2014-06-20 22:43 UTC by Corey Marthaler
Modified: 2023-03-08 07:26 UTC (History)
CC List: 8 users

Fixed In Version: lvm2-2.02.112-1.el7
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1113759 (view as bug list)
Environment:
Last Closed: 2015-03-05 13:09:17 UTC
Target Upstream Version:
Embargoed:


Attachments: (none)


Links
Red Hat Product Errata RHBA-2015:0513 -- lvm2 bug fix and enhancement update (status: SHIPPED_LIVE, priority: normal, last updated 2015-03-05 16:14:41 UTC)

Description Corey Marthaler 2014-06-20 22:43:15 UTC
Description of problem:

This issue appears to require the following:
 1. cached volume
 2. xfs on top and mounted
 3. some i/o written to the fs
 4. snapshot attempt

If any of the above is missing, the kernel BUG isn't tripped and the snapshot attempt succeeds.
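A condensed, copy-pasteable sketch of those four steps (device paths, VG name, and sizes are illustrative; the mkfs/mount invocations are assumed, since the session below only echoes "Placing an xfs filesystem..."). The original session follows.

# 1. cached volume (origin on slow PV, cache pool on fast PV)
lvcreate -L 4G -n corigin cache_sanity /dev/sdf1
lvcreate -L 2G -n pool cache_sanity /dev/sda1
lvcreate -L 8M -n pool_meta cache_sanity /dev/sda1
lvconvert --type cache-pool --cachemode writethrough --poolmetadata cache_sanity/pool_meta cache_sanity/pool
lvconvert --type cache --cachepool cache_sanity/pool cache_sanity/corigin
# 2. xfs on top, mounted
mkfs.xfs /dev/cache_sanity/corigin
mkdir -p /mnt/corigin && mount /dev/cache_sanity/corigin /mnt/corigin
# 3. some i/o written to the fs
dd if=/dev/zero of=/mnt/corigin/ddfile bs=1M count=100
# 4. snapshot attempt -- this is where the kernel BUG fires
lvcreate -s /dev/cache_sanity/corigin -c 64 -n merge -L 500M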

lvcreate -L 4G -n corigin cache_sanity /dev/sdf1

Create cache data and cache metadata (fast) volumes
lvcreate -L 2G -n pool cache_sanity /dev/sda1
lvcreate -L 8M -n pool_meta cache_sanity /dev/sda1
WARNING: xfs signature detected on /dev/cache_sanity/pool_meta at offset 0. Wipe it? [y/n] 
  1 existing signature left on the device.

Create cache pool volume by combining the cache data and cache metadata (fast) volumes
lvconvert --type cache-pool --cachemode writethrough --poolmetadata cache_sanity/pool_meta cache_sanity/pool
Create cached volume by combining the cache pool (fast) and origin (slow) volumes
lvconvert --type cache --cachepool cache_sanity/pool cache_sanity/corigin

Placing an xfs filesystem on origin volume
Mounting origin volume

[root@host-015 ~]# lvs -a -o +devices
  LV               Attr       LSize   Pool Origin          Data%  Devices
  corigin          Cwi-aoC---   4.00g pool [corigin_corig]        corigin_corig(0)
  [corigin_corig]  -wi-ao----   4.00g                             /dev/sdf1(0)
  [lvol0_pmspare]  ewi-------   8.00m                             /dev/sdg1(0)
  pool             Cwi-a-C---   2.00g                             pool_cdata(0)
  [pool_cdata]     Cwi-aoC---   2.00g                             /dev/sda1(0)
  [pool_cmeta]     ewi-aoC---   8.00m                             /dev/sda1(512)

[root@host-015 ~]# df -h
Filesystem                        Size  Used Avail Use% Mounted on
/dev/mapper/cache_sanity-corigin  4.0G   33M  4.0G   1% /mnt/corigin

[root@host-015 ~]# dd if=/dev/zero of=/mnt/corigin/ddfile bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 0.595037 s, 176 MB/s

[root@host-015 ~]# lvcreate -s /dev/cache_sanity/corigin -c 64 -n merge -L 500M

[883088.722481] XFS (dm-2): Mounting Filesystem
[883088.765490] XFS (dm-2): Ending clean mount
[883120.221176] node_check: 6 callbacks suppressed
[883120.222037] device-mapper: btree spine: node_check failed: csum 549844194 != wanted 549945722
[883120.223520] device-mapper: block manager: btree_node validator check failed for block 4
[883120.224939] ------------[ cut here ]------------
[883120.225743] kernel BUG at drivers/md/persistent-data/dm-btree-spine.c:178!
[883120.225881] invalid opcode: 0000 [#1] SMP 
[883120.225881] Modules linked in: dm_snapshot dm_cache_mq dm_cache() dm_raid raid456 raid1 raid10 async_raid6_recov async_memcpy async_pq async_xor async_tx dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio fuse btrfs zlib_deflate raid6_pq xor vfat msdos fat ext4 mbcache jbd2 binfmt_misc bridge stp llc sd_mod crct10dif_generic crc_t10dif crct10dif_common iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi iptable_filter ip_tables sctp sg i2c_piix4 pcspkr i2c_core serio_raw virtio_balloon virtio_net mperf nfsd auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c ata_generic pata_acpi virtio_blk ata_piix libata virtio_pci virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod
[883120.225881] CPU: 0 PID: 18467 Comm: lvcreate Tainted: G            -------------- T 3.10.0-123.el7.x86_64 #1
[883120.225881] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[883120.225881] task: ffff88003bc06660 ti: ffff88000b1c0000 task.ti: ffff88000b1c0000
[883120.225881] RIP: 0010:[<ffffffffa052f761>]  [<ffffffffa052f761>] ro_node+0x21/0x30 [dm_persistent_data]
[883120.225881] RSP: 0018:ffff88000b1c1b20  EFLAGS: 00010246
[883120.225881] RAX: 0000000000000000 RBX: ffff88000b1c1b80 RCX: 0000000000000000
[883120.225881] RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff88000b1c1b80
[883120.225881] RBP: ffff88000b1c1b70 R08: 0000000000000000 R09: 0000000000020c4f
[883120.225881] R10: 6567616e616d206b R11: 636f6c62203a7265 R12: ffffffffa05287a0
[883120.225881] R13: ffffffffa05287a0 R14: ffff88000b1c1bd0 R15: ffff88003b780800
[883120.225881] FS:  00007fb22160e880(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[883120.225881] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[883120.225881] CR2: 00007fd47003e000 CR3: 000000000c35a000 CR4: 00000000000006f0
[883120.225881] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[883120.225881] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[883120.225881] Stack:
[883120.225881]  ffffffffa052d522 ffffffac810dead1 ffff88000b1c1b80 ffffffff819e28e0
[883120.225881]  000000009204f187 0000000000000004 ffffffffa05287a0 ffff88000b1c1bd0
[883120.225881]  ffff88000455f800 ffff88003b780800 ffff88000b1c1bc0 ffffffffa052e21e
[883120.225881] Call Trace:
[883120.225881]  [<ffffffffa052d522>] ? walk_node+0x42/0x100 [dm_persistent_data]
[883120.225881]  [<ffffffffa05287a0>] ? block_dec+0x160/0x160 [dm_persistent_data]
[883120.225881]  [<ffffffffa052e21e>] dm_btree_walk+0x4e/0x80 [dm_persistent_data]
[883120.225881]  [<ffffffffa05bba30>] ? complete_migration+0x30/0x30 [dm_cache]
[883120.225881]  [<ffffffffa05285dc>] dm_array_walk+0x3c/0x60 [dm_persistent_data]
[883120.225881]  [<ffffffffa05be700>] ? blocks_are_unmapped_or_clean+0xd0/0xd0 [dm_cache]
[883120.225881]  [<ffffffffa05bf51f>] dm_cache_load_mappings+0x7f/0xe0 [dm_cache]
[883120.225881]  [<ffffffffa05bba30>] ? complete_migration+0x30/0x30 [dm_cache]
[883120.225881]  [<ffffffff810f0000>] ? kdb_register_repeat+0x270/0x270
[883120.225881]  [<ffffffffa05bdef9>] cache_preresume+0xf9/0x1a0 [dm_cache]
[883120.225881]  [<ffffffffa0005ff9>] dm_table_resume_targets+0x49/0xe0 [dm_mod]
[883120.225881]  [<ffffffffa000389c>] dm_resume+0x4c/0xd0 [dm_mod]
[883120.225881]  [<ffffffffa0008bcb>] dev_suspend+0x12b/0x250 [dm_mod]
[883120.225881]  [<ffffffffa0008aa0>] ? table_load+0x380/0x380 [dm_mod]
[883120.225881]  [<ffffffffa00094e5>] ctl_ioctl+0x255/0x500 [dm_mod]
[883120.225881]  [<ffffffff8123db34>] ? SYSC_semtimedop+0x264/0xd10
[883120.225881]  [<ffffffffa00097a3>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
[883120.225881]  [<ffffffff811c2f25>] do_vfs_ioctl+0x2e5/0x4c0
[883120.225881]  [<ffffffff81257a2e>] ? file_has_perm+0xae/0xc0
[883120.225881]  [<ffffffff8123c601>] ? wake_up_sem_queue_do+0x31/0x60
[883120.225881]  [<ffffffff811c31a1>] SyS_ioctl+0xa1/0xc0
[883120.225881]  [<ffffffff815f2119>] system_call_fastpath+0x16/0x1b
[883120.225881] Code: ff ff 5d c3 0f 0b 0f 1f 40 00 0f 1f 44 00 00 8b 47 08 85 c0 74 15 83 e8 01 55 48 98 48 8b 7c c7 10 48 89 e5 e8 91 99 ff ff 5d c3 <0f> 0b 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 
[883120.225881] RIP  [<ffffffffa052f761>] ro_node+0x21/0x30 [dm_persistent_data]
[883120.225881]  RSP <ffff88000b1c1b20>
[883120.289367] ---[ end trace 39cd9818e7719e97 ]---
[883120.290102] Kernel panic - not syncing: Fatal exception



Version-Release number of selected component (if applicable):
3.10.0-123.el7.x86_64
lvm2-2.02.105-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
lvm2-libs-2.02.105-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
lvm2-cluster-2.02.105-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
device-mapper-1.02.84-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
device-mapper-libs-1.02.84-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
device-mapper-event-1.02.84-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
device-mapper-event-libs-1.02.84-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
device-mapper-persistent-data-0.3.2-1.el7    BUILT: Thu Apr  3 09:58:51 CDT 2014
cmirror-2.02.105-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014


How reproducible:
Every time

Comment 1 Joe Thornber 2014-06-24 09:06:12 UTC
I've added a test to the device-mapper-test-suite that recreates this scenario by driving device-mapper directly (cutting LVM out of the loop):

https://github.com/jthornber/device-mapper-test-suite/blob/master/lib/dmtest/tests/stacking/stacking_tests.rb

With my kernel tree this passes fine.

The next step is to test with a RHEL7 kernel.
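(For anyone without the test suite handy, the rough shape of driving the same stack with dmsetup directly is something like the following. This is an illustrative sketch only, not the actual test from the suite; the device names, sector counts, and 512-sector cache block size are assumptions.)

# fast device split into cache metadata + cache data, origin on the slow device
dmsetup create cmeta   --table "0 16384 linear /dev/sda1 0"
dmsetup create cdata   --table "0 4194304 linear /dev/sda1 16384"
dmsetup create corigin --table "0 8388608 linear /dev/sdf1 0"
# zero the start of the metadata device so the cache target formats a fresh superblock
dd if=/dev/zero of=/dev/mapper/cmeta bs=4K count=1
# cache table: <metadata dev> <cache dev> <origin dev> <block size> <#features> [features] <policy> <#policy args>
dmsetup create cache --table "0 8388608 cache /dev/mapper/cmeta /dev/mapper/cdata /dev/mapper/corigin 512 1 writethrough default 0"
mkfs.xfs /dev/mapper/cache && mount /dev/mapper/cache /mnt/corigin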

Comment 2 Joe Thornber 2014-06-24 14:15:57 UTC
Test passes on a RHEL7 kernel.

Handing bug back to LVM team for their take.

Comment 3 Joe Thornber 2014-08-27 15:19:43 UTC
Taking this back, since it seems to have stalled.

Comment 4 Petr Rockai 2014-10-22 06:46:05 UTC
I could not reproduce this with current LVM and a RHEL7 kernel (I used smaller devices and a smaller filesystem, but that does not seem critical for reproducing the bug). The exact command from the original report is no longer accepted; you now need to take a snapshot of the _corig volume explicitly (which might be related to why the bug no longer happens). Presumed fixed.
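(In other words, where comment #0 snapshotted the cached LV directly, the tools now want the hidden origin sub-LV named explicitly. Roughly, and purely as an illustration of the changed invocation -- names and sizes are taken from comment #0:)

# original invocation from comment #0 -- now rejected by lvcreate
lvcreate -s /dev/cache_sanity/corigin -c 64 -n merge -L 500M
# explicit snapshot of the hidden _corig sub-LV instead
lvcreate -s cache_sanity/corigin_corig -n merge -L 500M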

Comment 6 Corey Marthaler 2014-11-13 23:41:43 UTC
Hit this on the latest rpms (w/o snaps being involved) by doing many iterations of volume configuration/changing operations. I was unable to reproduce this more than once so this may be rare now. I'll work on getting a new reproducer since the original one in comment #0 no longer applies.
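(The harness output below gives the flavour; very roughly, the failing iteration boils down to a loop of this shape. This is a paraphrase of the log, not the actual test harness; VG/LV names, sizes, and the iteration count are examples.)

for i in $(seq 100); do
    # build a fresh cached LV
    lvcreate -L 4G -n cachevol$i vg /dev/sdb1
    lvcreate -L 1G -n cachevol${i}_fast vg /dev/sdc1
    lvcreate -L 8M -n cachevol${i}_fast_meta vg /dev/sdc1
    lvconvert --yes --type cache-pool --poolmetadata vg/cachevol${i}_fast_meta vg/cachevol${i}_fast
    lvconvert --yes --type cache --cachepool vg/cachevol${i}_fast vg/cachevol$i
    # alteration pass: rename and back, export/import, flip read-only and back
    lvrename vg cachevol$i renamed$i && lvrename vg renamed$i cachevol$i
    vgchange -an vg && vgexport vg && vgimport vg && vgchange -ay vg
    lvchange -p r vg/cachevol$i
    vgchange -ay vg
    lvchange -p rw vg/cachevol$i    # <-- iteration 93 died here (Comm: lvchange in the oops below)
    lvremove -f vg/cachevol$i
done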


3.10.0-184.el7.x86_64
lvm2-2.02.112-1.el7    BUILT: Tue Nov 11 09:39:35 CST 2014
lvm2-libs-2.02.112-1.el7    BUILT: Tue Nov 11 09:39:35 CST 2014
lvm2-cluster-2.02.112-1.el7    BUILT: Tue Nov 11 09:39:35 CST 2014
device-mapper-1.02.91-1.el7    BUILT: Tue Nov 11 09:39:35 CST 2014
device-mapper-libs-1.02.91-1.el7    BUILT: Tue Nov 11 09:39:35 CST 2014
device-mapper-event-1.02.91-1.el7    BUILT: Tue Nov 11 09:39:35 CST 2014
device-mapper-event-libs-1.02.91-1.el7    BUILT: Tue Nov 11 09:39:35 CST 2014
device-mapper-persistent-data-0.3.2-1.el7    BUILT: Thu Apr  3 09:58:51 CDT 2014
cmirror-2.02.112-1.el7    BUILT: Tue Nov 11 09:39:35 CST 2014



[...]
====================================================================
   Iter 93 started at Wed Nov 12 21:46:41 CST 2014
====================================================================

VOLUME CREATION: cache_1_1276, convert= snap=

8 disk(s) to be used:
        host-115.virt.lab.msp.redhat.com=/dev/sdh /dev/sdg /dev/sdf /dev/sdd /dev/sda /dev/sde /dev/sdc /dev/sdb
removing PV /dev/sdf1 on host-115.virt.lab.msp.redhat.com
removing PV /dev/sda2 on host-115.virt.lab.msp.redhat.com
removing PV /dev/sdg1 on host-115.virt.lab.msp.redhat.com
removing PV /dev/sda1 on host-115.virt.lab.msp.redhat.com
removing PV /dev/sdg2 on host-115.virt.lab.msp.redhat.com
removing PV /dev/sdd2 on host-115.virt.lab.msp.redhat.com
removing PV /dev/sdd1 on host-115.virt.lab.msp.redhat.com
removing PV /dev/sdf2 on host-115.virt.lab.msp.redhat.com

creating lvm devices...
Create 2 PV(s) for cache_1_1276 on host-115.virt.lab.msp.redhat.com
Create VG cache_1_1276 on host-115.virt.lab.msp.redhat.com
make needed LVs on host-115.virt.lab.msp.redhat.com
Create origin (slow) volume
lvcreate  -L 4G -n cache_1_12760 cache_1_1276 /dev/sdb1
Create cache (fast) volume
lvcreate  -L 1G -n cache_1_12760_fast cache_1_1276 /dev/sdc1
Create cache metadata (fast) volume
lvcreate  -L 8M -n cache_1_12760_fast_meta cache_1_1276 /dev/sdc1
Create cache pool volume by combining the cache data and cache metadata (fast) volumes
lvconvert --type cache-pool --poolmetadata cache_1_1276/cache_1_12760_fast_meta cache_1_1276/cache_1_12760_fast --yes
  WARNING: Converting logical volume cache_1_1276/cache_1_12760_fast and cache_1_1276/cache_1_12760_fast_meta to pool's data and metadata volumes.
  THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)

Create cached volume by combining the cache pool (fast) and origin (slow) volumes
lvconvert --type cache --cachepool cache_1_1276/cache_1_12760_fast cache_1_1276/cache_1_12760 --yes

[...]

--- ALTERATION ITERATION 1 ---------------------------

Skipping reduce (Not supported on thin pool, raid, or cache volumes)
Skipping extend (Not supported on thin pool, raid, or cache volumes)

VOLUME RENAME from /dev/cache_1_1276/cache_1_12760 to rename_388 on host-115.virt.lab.msp.redhat.com
VOLUME RENAME back to /dev/cache_1_1276/cache_1_12760 from rename_388 on host-115.virt.lab.msp.redhat.com


deactivating VG cache_1_1276 on host-115.virt.lab.msp.redhat.com
EXPORTING VOLUME GROUP on host-115.virt.lab.msp.redhat.com
attempting to activate on host-115.virt.lab.msp.redhat.com
  Volume group "cache_1_1276" is exported
IMPORTING VOLUME GROUP on host-115.virt.lab.msp.redhat.com

activating VG cache_1_1276 on host-115.virt.lab.msp.redhat.com
CHANGING LV WRITE ACCESS TO RO on host-115.virt.lab.msp.redhat.com
verifying ro access flag on on host-115.virt.lab.msp.redhat.com

activating VG cache_1_1276 on host-115.virt.lab.msp.redhat.com
attempting to write to lv on host-115.virt.lab.msp.redhat.com
dd: writing to ‘/dev/cache_1_1276/cache_1_12760’: Operation not permitted
1+0 records in
0+0 records out
0 bytes (0 B) copied, 0.000311297 s, 0.0 kB/s

CHANGING LV WRITE ACCESS BACK TO RW on host-115.virt.lab.msp.redhat.com
Didn't receive heartbeat from host-115.virt.lab.msp.redhat.com for 120 seconds
write access change attempt failed on host-115.virt.lab.msp.redhat.com



[13640.377423] device-mapper: btree spine: node_check failed: csum 2719545050 != wanted 2719646530
[13640.378437] device-mapper: block manager: btree_node validator check failed for block 4
[13640.379362] ------------[ cut here ]------------
[13640.379886] kernel BUG at drivers/md/persistent-data/dm-btree-spine.c:178!
[13640.380332] invalid opcode: 0000 [#1] SMP 
[13640.380332] Modules linked in: dm_cache_mq dm_cache() dm_thin_pool dm_persistent_data dm_bio_prison dm_raid raid456 raid1 raid10 async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx dm_snapshot dm_bufio sd_mod crct10dif_generic crc_t10dif crct10dif_common iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi iptable_filter ip_tables ppdev serio_raw pcspkr virtio_balloon parport_pc parport i2c_piix4 i2c_core dm_multipath xfs libcrc32c ata_generic pata_acpi virtio_net virtio_blk ata_piix libata virtio_pci virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod
[13640.380332] CPU: 0 PID: 4306 Comm: lvchange Tainted: G            -------------- T 3.10.0-184.el7.x86_64 #1
[13640.380332] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[13640.380332] task: ffff88003a09e660 ti: ffff880000154000 task.ti: ffff880000154000
[13640.380332] RIP: 0010:[<ffffffffa02f4721>]  [<ffffffffa02f4721>] ro_node+0x21/0x30 [dm_persistent_data]
[13640.380332] RSP: 0018:ffff880000157b20  EFLAGS: 00010246
[13640.380332] RAX: 0000000000000000 RBX: ffff880000157b80 RCX: 0000000000000000
[13640.380332] RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff880000157b80
[13640.380332] RBP: ffff880000157b70 R08: 0000000000000000 R09: 0000000000007314
[13640.380332] R10: 6567616e616d206b R11: 636f6c62203a7265 R12: ffffffffa02ed7a0
[13640.380332] R13: ffffffffa02ed7a0 R14: ffff880000157bd0 R15: ffff88003ccb6800
[13640.380332] FS:  00007f714875d880(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[13640.380332] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[13640.380332] CR2: 00007f4312d1e028 CR3: 000000003d4d7000 CR4: 00000000000006f0
[13640.380332] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[13640.380332] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[13640.380332] Stack:
[13640.380332]  ffffffffa02f24e2 ffffffac00000246 ffff880000157b80 ffff880000157b70
[13640.380332]  00000000e913e5a1 0000000000000004 ffffffffa02ed7a0 ffff880000157bd0
[13640.380332]  ffff88002b13dc00 ffff88003ccb6800 ffff880000157bc0 ffffffffa02f31de
[13640.380332] Call Trace:
[13640.380332]  [<ffffffffa02f24e2>] ? walk_node+0x42/0x100 [dm_persistent_data]
[13640.380332]  [<ffffffffa02ed7a0>] ? block_dec+0x160/0x160 [dm_persistent_data]
[13640.380332]  [<ffffffffa02f31de>] dm_btree_walk+0x4e/0x80 [dm_persistent_data]
[13640.380332]  [<ffffffffa03159f0>] ? complete_migration+0x30/0x30 [dm_cache]
[13640.380332]  [<ffffffffa02ed5dc>] dm_array_walk+0x3c/0x60 [dm_persistent_data]
[13640.380332]  [<ffffffffa0318640>] ? blocks_are_unmapped_or_clean+0xd0/0xd0 [dm_cache]
[13640.380332]  [<ffffffffa031950f>] dm_cache_load_mappings+0x7f/0xe0 [dm_cache]
[13640.380332]  [<ffffffffa03159f0>] ? complete_migration+0x30/0x30 [dm_cache]
[13640.380332]  [<ffffffffa0317e69>] cache_preresume+0xf9/0x1a0 [dm_cache]
[13640.380332]  [<ffffffffa00064a9>] dm_table_resume_targets+0x49/0xe0 [dm_mod]
[13640.380332]  [<ffffffffa000392c>] dm_resume+0x4c/0xd0 [dm_mod]
[13640.380332]  [<ffffffffa0008ccb>] dev_suspend+0x12b/0x250 [dm_mod]
[13640.380332]  [<ffffffffa0008ba0>] ? table_load+0x380/0x380 [dm_mod]
[13640.380332]  [<ffffffffa00095e5>] ctl_ioctl+0x255/0x500 [dm_mod]
[13640.380332]  [<ffffffff812530d4>] ? SYSC_semtimedop+0x264/0xd10
[13640.380332]  [<ffffffffa00098a3>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
[13640.380332]  [<ffffffff811d8185>] do_vfs_ioctl+0x2e5/0x4c0
[13640.380332]  [<ffffffff8126d01e>] ? file_has_perm+0xae/0xc0
[13640.380332]  [<ffffffff81251b01>] ? unmerge_queues+0x1/0x70
[13640.380332]  [<ffffffff811d8401>] SyS_ioctl+0xa1/0xc0
[13640.380332]  [<ffffffff81611de9>] system_call_fastpath+0x16/0x1b
[13640.380332] Code: ff ff 5d c3 0f 0b 0f 1f 40 00 0f 1f 44 00 00 8b 47 08 85 c0 74 15 83 e8 01 55 48 98 48 8b 7c c7 10 48 89 e5 e8 d1 99 ff ff 5d c3 <0f> 0b 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 
[13640.380332] RIP  [<ffffffffa02f4721>] ro_node+0x21/0x30 [dm_persistent_data]
[13640.380332]  RSP <ffff880000157b20>
[13640.429171] ---[ end trace f102357b28798cc2 ]---
[13640.429714] Kernel panic - not syncing: Fatal exception

Comment 7 Mike Snitzer 2014-11-17 14:58:03 UTC
(In reply to Corey Marthaler from comment #6)
> Hit this on the latest rpms (w/o snaps being involved) by doing many
> iterations of volume configuration/changing operations. I was unable to
> reproduce this more than once so this may be rare now. I'll work on getting
> a new reproducer since the original one in comment #0 no longer applies.
> 
> 
> 3.10.0-184.el7.x86_64

You should be testing with this brew build kernel:
https://brewweb.devel.redhat.com/taskinfo?taskID=8236908

(It includes the patchset to bring the RHEL7.1 kernel to "full support" for dm-cache, including the fix for the bug#1080894 crash you shared in comment#6)

Moving back to ON_QA...

Comment 8 Mike Snitzer 2014-11-17 15:02:17 UTC
(In reply to Mike Snitzer from comment #7)

> You should be testing with this brew build kernel:
> https://brewweb.devel.redhat.com/taskinfo?taskID=8236908
> 
> (It includes the patchset to bring the RHEL7.1 kernel to "full support" for
> dm-cache, including the fix for the bug#1080894 crash you shared in
> comment#6)

It seems that scratch brew build has already been cleaned up; here is another one I just kicked off: http://brewweb.devel.redhat.com/brew/taskinfo?taskID=8254038

Comment 9 Corey Marthaler 2015-01-09 21:07:36 UTC
Fix verified in the latest rpms. 

3.10.0-219.el7.x86_64
lvm2-2.02.114-4.el7    BUILT: Wed Jan  7 07:07:47 CST 2015
lvm2-libs-2.02.114-4.el7    BUILT: Wed Jan  7 07:07:47 CST 2015
lvm2-cluster-2.02.114-4.el7    BUILT: Wed Jan  7 07:07:47 CST 2015
device-mapper-1.02.92-4.el7    BUILT: Wed Jan  7 07:07:47 CST 2015
device-mapper-libs-1.02.92-4.el7    BUILT: Wed Jan  7 07:07:47 CST 2015
device-mapper-event-1.02.92-4.el7    BUILT: Wed Jan  7 07:07:47 CST 2015
device-mapper-event-libs-1.02.92-4.el7    BUILT: Wed Jan  7 07:07:47 CST 2015
device-mapper-persistent-data-0.4.1-2.el7    BUILT: Wed Nov 12 12:39:46 CST 2014
cmirror-2.02.114-4.el7    BUILT: Wed Jan  7 07:07:47 CST 2015


[...]
====================================================================
   Iter 558 started at Fri Jan  9 14:59:52 CST 2015
====================================================================
VOLUME CREATION: cache_3_1055, convert= snap=

Comment 11 errata-xmlrpc 2015-03-05 13:09:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0513.html

