Bug 1149008

Summary: exclusive activation of thin pool and thin origin device is not maintained when mixing and merging thin and non-thin snapshots
Product: Red Hat Enterprise Linux 7
Reporter: Corey Marthaler <cmarthal>
Component: lvm2
Assignee: Zdenek Kabelac <zkabelac>
lvm2 sub component: Thin Provisioning
QA Contact: cluster-qe <cluster-qe>
Status: CLOSED WONTFIX
Severity: medium
Priority: low
CC: agk, heinzm, jbrassow, msnitzer, prajnoha, thornber, zkabelac
Version: 7.3
Keywords: Triaged
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Doc Type: If docs needed, set a value
Last Closed: 2020-12-15 07:31:25 UTC
Type: Bug
Bug Blocks: 1119323
Attachments: -vvvv of the lvconvert

Description Corey Marthaler 2014-10-02 22:07:22 UTC
Created attachment 943583 [details]
-vvvv of the lvconvert

Description of problem:
[root@hayes-02 ~]# pvscan
  PV /dev/mapper/mpathep1   VG snapper_thinp   lvm2 [249.99 GiB / 249.99 GiB free]
  PV /dev/mapper/mpathap1   VG snapper_thinp   lvm2 [249.99 GiB / 249.99 GiB free]
  PV /dev/mapper/mpathbp1   VG snapper_thinp   lvm2 [249.99 GiB / 249.99 GiB free]
  PV /dev/mapper/mpathcp1   VG snapper_thinp   lvm2 [249.99 GiB / 249.99 GiB free]
  PV /dev/mapper/mpathdp1   VG snapper_thinp   lvm2 [249.99 GiB / 249.99 GiB free]

[root@hayes-02 ~]# vgs
  VG            #PV #LV #SN Attr   VSize  VFree
  snapper_thinp   5   0   0 wz--nc  1.22t 1.22t
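Exclusive activation only matters here because snapper_thinp is a clustered VG: the `c` in the sixth position of the `wz--nc` attr string above marks it as clustered. A minimal sketch of checking that bit in a script (attr character positions per vgs(8); in real use the string would come from `vgs --noheadings -o vg_attr snapper_thinp`):

```shell
# Succeed if a vg_attr string marks the VG as clustered.
# Per vgs(8), the sixth attr character is 'c' for a clustered VG,
# as in the "wz--nc" shown above for snapper_thinp.
is_clustered() {
    case "$1" in
        ?????c*) return 0 ;;   # sixth char is 'c': clustered
        *)       return 1 ;;
    esac
}
```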

[root@hayes-02 ~]# lvcreate --activate ey --thinpool POOL --profile thin-performance --zero y -L 500M --poolmetadatasize 100M snapper_thinp
  Logical volume "lvol0" created
  Logical volume "POOL" created
[root@hayes-02 ~]# lvs -a -o +devices
  LV              Attr       LSize   Pool Origin Data%  Meta%  Devices
  POOL            twi-a-tz-- 500.00m             0.00   0.04   POOL_tdata(0)
  [POOL_tdata]    Twi-ao---- 500.00m                           /dev/mapper/mpathep1(25)
  [POOL_tmeta]    ewi-ao---- 100.00m                           /dev/mapper/mpathdp1(0)
  [lvol0_pmspare] ewi------- 100.00m                           /dev/mapper/mpathep1(0)

[root@hayes-02 ~]# lvcreate --activate ey --virtualsize 100M -T snapper_thinp/POOL -n origin
  Logical volume "origin" created
[root@hayes-02 ~]# lvs -a -o +devices
  LV              Attr       LSize   Pool Origin Data%  Meta%  Devices
  POOL            twi-a-tz-- 500.00m             0.00   0.04   POOL_tdata(0)
  [POOL_tdata]    Twi-ao---- 500.00m                           /dev/mapper/mpathep1(25)
  [POOL_tmeta]    ewi-ao---- 100.00m                           /dev/mapper/mpathdp1(0)
  [lvol0_pmspare] ewi------- 100.00m                           /dev/mapper/mpathep1(0)
  origin          Vwi-a-tz-- 100.00m POOL        0.00

[root@hayes-02 ~]# lvcreate --activate ey -s /dev/snapper_thinp/origin -L 100M -n invalid1
  Logical volume "invalid1" created
[root@hayes-02 ~]# lvs -a -o +devices
  LV              Attr       LSize   Pool Origin Data%  Meta%  Devices
  POOL            twi-a-tz-- 500.00m             0.00   0.04   POOL_tdata(0)
  [POOL_tdata]    Twi-ao---- 500.00m                           /dev/mapper/mpathep1(25)
  [POOL_tmeta]    ewi-ao---- 100.00m                           /dev/mapper/mpathdp1(0)
  invalid1        swi-a-s--- 100.00m      origin 0.00          /dev/mapper/mpathep1(150)
  [lvol0_pmspare] ewi------- 100.00m                           /dev/mapper/mpathep1(0)
  origin          owi-a-tz-- 100.00m POOL        0.00
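The snapshots in this scenario are named invalid* because the test normally overfills them before merging; whether an old-style (COW) snapshot has actually been invalidated can be read off its lv_attr state character. A minimal sketch (attr positions per lvs(8): character 1 is 's' for a snapshot, character 5 is 'I' for "Invalid snapshot"; in real use the string would come from `lvs --noheadings -o lv_attr snapper_thinp/invalid1`):

```shell
# Succeed if an lv_attr string marks an old-style snapshot as
# invalidated (overfilled), per the lvs(8) attr bit layout.
snapshot_invalidated() {
    case "$1" in
        s???I*) return 0 ;;   # a snapshot whose state char is 'I'
        *)      return 1 ;;
    esac
}
```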

[root@hayes-02 ~]# lvconvert --merge /dev/snapper_thinp/invalid1 --yes
  Merging of volume invalid1 started.
  origin: Merged: 100.0%
  Merge of snapshot into logical volume origin has finished.
  Logical volume "invalid1" successfully removed

[root@hayes-02 ~]# lvs -a -o +devices
  LV              Attr       LSize   Pool Origin Data%  Meta%  Devices
  POOL            twi-a-tz-- 500.00m             0.00   0.04   POOL_tdata(0)
  [POOL_tdata]    Twi-ao---- 500.00m                           /dev/mapper/mpathep1(25)
  [POOL_tmeta]    ewi-ao---- 100.00m                           /dev/mapper/mpathdp1(0)
  [lvol0_pmspare] ewi------- 100.00m                           /dev/mapper/mpathep1(0)
  origin          Vwi-a-tz-- 100.00m POOL        0.00

[root@hayes-02 ~]# lvcreate --activate ey -s /dev/snapper_thinp/origin -n invalid2
  Logical volume "invalid2" created
[root@hayes-02 ~]# lvs -a -o +devices
  LV              Attr       LSize   Pool Origin Data%  Meta%  Devices
  POOL            twi-a-tz-- 500.00m             0.00   0.04   POOL_tdata(0)
  [POOL_tdata]    Twi-ao---- 500.00m                           /dev/mapper/mpathep1(25)
  [POOL_tmeta]    ewi-ao---- 100.00m                           /dev/mapper/mpathdp1(0)
  invalid2        Vwi---tz-k 100.00m POOL origin
  [lvol0_pmspare] ewi------- 100.00m                           /dev/mapper/mpathep1(0)
  origin          Vwi-a-tz-- 100.00m POOL        0.00

[root@hayes-02 ~]# lvconvert --merge /dev/snapper_thinp/invalid2 --yes -vvvv > /tmp/lvconvert 2>&1
[root@hayes-02 ~]# lvs -a -o +devices
  LV              Attr       LSize   Pool Origin Data%  Meta%  Devices
  POOL            twi-a-tz-- 500.00m             0.00   0.04   POOL_tdata(0)
  [POOL_tdata]    Twi-ao---- 500.00m                           /dev/mapper/mpathep1(25)
  [POOL_tmeta]    ewi-ao---- 100.00m                           /dev/mapper/mpathdp1(0)
  [lvol0_pmspare] ewi------- 100.00m                           /dev/mapper/mpathep1(0)
  origin          Vwi-a-tz-- 100.00m POOL        0.00


# Now, all of a sudden, the POOL_tdata/POOL_tmeta and origin volumes are active on the other nodes in the cluster
[root@hayes-01 ~]# lvs -a -o +devices
  LV              Attr       LSize   Pool Origin Data%  Meta%  Devices
  POOL            twi---tz-- 500.00m             0.00   0.04   POOL_tdata(0)
  [POOL_tdata]    Twi-ao---- 500.00m                           /dev/mapper/mpathap1(25)
  [POOL_tmeta]    ewi-ao---- 100.00m                           /dev/mapper/mpathep1(0)
  [lvol0_pmspare] ewi------- 100.00m                           /dev/mapper/mpathap1(0)
  origin          Vwi-a-tz-- 100.00m POOL        0.00

[root@hayes-03 ~]# lvs -a -o +devices
  LV              Attr       LSize   Pool Origin Data%  Meta%  Devices
  POOL            twi---tz-- 500.00m             0.00   0.04   POOL_tdata(0)
  [POOL_tdata]    Twi-ao---- 500.00m                           /dev/mapper/mpathap1(25)
  [POOL_tmeta]    ewi-ao---- 100.00m                           /dev/mapper/mpathdp1(0)
  [lvol0_pmspare] ewi------- 100.00m                           /dev/mapper/mpathap1(0)
  origin          Vwi-a-tz-- 100.00m POOL        0.00

# From here on, any new "normal" (old-style) snapshot attempt will fail since the origin is no longer activated exclusively (shouldn't a thin snapshot attempt fail as well?)
[root@hayes-02 tmp]# lvcreate --activate ey -s /dev/snapper_thinp/origin -n invalid3
  Logical volume "invalid3" created
[root@hayes-02 tmp]# lvcreate --activate ey -s /dev/snapper_thinp/origin -L 100M -n invalid4
  origin must be active exclusively to create snapshot

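Before creating further old-style snapshots, a script could verify that nothing in the VG has silently lost exclusive activation, which is the state this bug leaves the origin in. A minimal sketch, under the assumption that the `--binary` output mode and the `lv_active_locally`/`lv_active_exclusively` report fields are available in the installed lvm2 (worth confirming against `lvs -o help`); the function here only parses report text of the form `<lv_name> <active 0|1> <exclusive 0|1>`:

```shell
# List LVs that are active but not exclusively active. Input lines are
#   <lv_name> <active 0|1> <exclusively 0|1>
# as would be produced, in real use, by something like:
#   lvs --binary --noheadings -o lv_name,lv_active_locally,lv_active_exclusively snapper_thinp
non_exclusive_active() {
    awk '$2 == 1 && $3 != 1 { print $1 }'
}
```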

Version-Release number of selected component (if applicable):
2.6.32-502.el6.x86_64
lvm2-2.02.111-2.el6    BUILT: Mon Sep  1 06:46:43 CDT 2014
lvm2-libs-2.02.111-2.el6    BUILT: Mon Sep  1 06:46:43 CDT 2014
lvm2-cluster-2.02.111-2.el6    BUILT: Mon Sep  1 06:46:43 CDT 2014
udev-147-2.57.el6    BUILT: Thu Jul 24 08:48:47 CDT 2014
device-mapper-1.02.90-2.el6    BUILT: Mon Sep  1 06:46:43 CDT 2014
device-mapper-libs-1.02.90-2.el6    BUILT: Mon Sep  1 06:46:43 CDT 2014
device-mapper-event-1.02.90-2.el6    BUILT: Mon Sep  1 06:46:43 CDT 2014
device-mapper-event-libs-1.02.90-2.el6    BUILT: Mon Sep  1 06:46:43 CDT 2014
device-mapper-persistent-data-0.3.2-1.el6    BUILT: Fri Apr  4 08:43:06 CDT 2014
cmirror-2.02.111-2.el6    BUILT: Mon Sep  1 06:46:43 CDT 2014

Comment 3 Corey Marthaler 2016-10-05 21:04:57 UTC
Update on this scenario: when run with cached thin pool data volumes, it may result in kernel panics.


./snapper_thinp -r /usr/tests/sts-rhel7.3 -l /usr/tests/sts-rhel7.3/ -R ../../../resource-STSRHTS555.xml  -E host-081 -e invalidated_thin_snap_merge -c

creating lvm devices...
host-081: pvcreate /dev/sdg1 /dev/sdh1 /dev/sde1 /dev/sdf1 /dev/sdc1 /dev/sdd1 /dev/sda1
host-081: vgcreate  snapper_thinp /dev/sdg1 /dev/sdh1 /dev/sde1 /dev/sdf1 /dev/sdc1 /dev/sdd1 /dev/sda1

============================================================
Iteration 1 of 1 started at Wed Oct  5 14:56:26 CDT 2016
============================================================
SCENARIO - [invalidated_thin_snap_merge]
Create "invalidated" (full) thin snapshots and then verify that merge attempts will not cause problem
Making pool volume
Converting *cached* volume to thin pool data device
lvcreate --activate ey  --zero y -L 4M -n meta snapper_thinp /dev/sda1
lvcreate --activate ey  --zero y -L 500M -n POOL snapper_thinp /dev/sda1
lvcreate --activate ey --zero y -L 400M -n cpool snapper_thinp /dev/sdc1
lvcreate --activate ey --zero y -L 8M -n cpool_meta snapper_thinp /dev/sdc1
Create cache pool volume by combining the cache data and cache metadata (fast) volumes
lvconvert --yes --type cache-pool --poolmetadata snapper_thinp/cpool_meta snapper_thinp/cpool
  WARNING: Converting logical volume snapper_thinp/cpool and snapper_thinp/cpool_meta to cache pool's data and metadata volumes with metadata wiping.
  THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)
Create cached volume by combining the cache pool (fast) and origin (slow) volumes
lvconvert --yes --type cache --cachepool snapper_thinp/cpool snapper_thinp/POOL
Create thin pool volume by combining the cached thin data and meta volumes
lvconvert --zero y --thinpool snapper_thinp/POOL --poolmetadata meta --yes
  WARNING: Converting logical volume snapper_thinp/POOL and snapper_thinp/meta to thin pool's data and metadata volumes with metadata wiping.
  THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)

Sanity checking pool device (POOL) metadata
examining superblock
examining devices tree
examining mapping tree
checking space map counts


Making origin volume
lvcreate --activate ey --virtualsize 100M -T snapper_thinp/POOL -n origin
lvcreate --activate ey --virtualsize 100M -T snapper_thinp/POOL -n other1
lvcreate --activate ey -V 100M -T snapper_thinp/POOL -n other2
lvcreate --activate ey -V 100M -T snapper_thinp/POOL -n other3
lvcreate --activate ey --virtualsize 100M -T snapper_thinp/POOL -n other4
lvcreate --activate ey -V 100M -T snapper_thinp/POOL -n other5
  WARNING: Sum of all thin volume sizes (600.00 MiB) exceeds the size of thin pool snapper_thinp/POOL (500.00 MiB)!

lvcreate --activate ey -k n -s /dev/snapper_thinp/origin -n invalid1
Filling snapshot /dev/snapper_thinp/invalid1
dd if=/dev/zero of=/dev/snapper_thinp/invalid1 bs=1M count=101
dd: error writing ‘/dev/snapper_thinp/invalid1’: No space left on device
101+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 5.09591 s, 20.6 MB/s
Attempt to merge back an invalidated snapshot volume
lvconvert --merge /dev/snapper_thinp/invalid1 --yes

lvcreate --activate ey -k n -s /dev/snapper_thinp/origin -n invalid2
Filling snapshot /dev/snapper_thinp/invalid2
dd if=/dev/zero of=/dev/snapper_thinp/invalid2 bs=1M count=101
dd: error writing ‘/dev/snapper_thinp/invalid2’: No space left on device
101+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 5.97496 s, 17.5 MB/s
Attempt to merge back an invalidated snapshot volume
lvconvert --merge /dev/snapper_thinp/invalid2 --yes


# host-082

[432011.901213] device-mapper: space map common: unable to decrement a reference count below 0
[432011.903167] device-mapper: cache: 253:6: metadata operation 'dm_cache_set_dirty' failed: error = -22
[432011.905228] device-mapper: cache: 253:6: aborting current metadata transaction
[432011.907740] ------------[ cut here ]------------
[432011.908881] WARNING: at drivers/md/dm-bufio.c:1500 dm_bufio_client_destroy+0x1e0/0x1f0 [dm_bufio]()
[432011.910906] Modules linked in: dm_cache_smq dm_cache dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio dm_log_userspace gfs2 ip6table_filter ip6_tables binfmt_misc dlm sd_mod crc_t10dif crct10dif_generic crct10dif_common sg iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi iptable_filter i2c_piix4 ppdev virtio_balloon pcspkr i6300esb i2c_core parport_pc parport dm_multipath nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi virtio_net virtio_blk ata_piix libata serio_raw virtio_pci virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod
[432011.924697] CPU: 0 PID: 3157 Comm: clvmd Tainted: G        W      ------------   3.10.0-511.el7.x86_64 #1
[432011.926857] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[432011.928171]  0000000000000000 0000000067106829 ffff88003b5e7af8 ffffffff81685e8c
[432011.929995]  ffff88003b5e7b30 ffffffff81085820 ffff88001742f400 ffff88001742f428
[432011.931837]  0000000000000000 ffff8800150ca818 ffff88001742f428 ffff88003b5e7b40
[432011.933677] Call Trace:
[432011.934297]  [<ffffffff81685e8c>] dump_stack+0x19/0x1b
[432011.935487]  [<ffffffff81085820>] warn_slowpath_common+0x70/0xb0
[432011.936890]  [<ffffffff8108596a>] warn_slowpath_null+0x1a/0x20
[432011.938238]  [<ffffffffa035caa0>] dm_bufio_client_destroy+0x1e0/0x1f0 [dm_bufio]
[432011.939923]  [<ffffffffa03ec2a5>] dm_block_manager_destroy+0x15/0x20 [dm_persistent_data]
[432011.941781]  [<ffffffffa044c9f8>] __destroy_persistent_data_objects+0x28/0x30 [dm_cache]
[432011.943620]  [<ffffffffa044e8c5>] dm_cache_metadata_abort+0x25/0x60 [dm_cache]
[432011.945263]  [<ffffffffa0447c9d>] metadata_operation_failed+0x8d/0x110 [dm_cache]
[432011.946995]  [<ffffffffa044bab6>] cache_postsuspend+0x296/0x4b0 [dm_cache]
[432011.948570]  [<ffffffff8131b67b>] ? kobject_uevent_env+0x1ab/0x620
[432011.949987]  [<ffffffff810b1600>] ? wake_up_atomic_t+0x30/0x30
[432011.951354]  [<ffffffffa00068fa>] dm_table_postsuspend_targets+0x4a/0x60 [dm_mod]
[432011.953056]  [<ffffffffa0002b23>] __dm_destroy+0x2e3/0x320 [dm_mod]
[432011.954484]  [<ffffffffa0003833>] dm_destroy+0x13/0x20 [dm_mod]
[432011.955859]  [<ffffffffa00093de>] dev_remove+0x11e/0x180 [dm_mod]
[432011.957254]  [<ffffffffa00092c0>] ? dev_suspend+0x250/0x250 [dm_mod]
[432011.958718]  [<ffffffffa0009a35>] ctl_ioctl+0x1e5/0x500 [dm_mod]
[432011.960135]  [<ffffffffa0009d63>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
[432011.961584]  [<ffffffff81211eb5>] do_vfs_ioctl+0x2d5/0x4b0
[432011.962916]  [<ffffffff812aea3e>] ? file_has_perm+0xae/0xc0
[432011.964201]  [<ffffffff81292e01>] ? wake_up_sem_queue_do+0x11/0x60
[432011.965628]  [<ffffffff81212131>] SyS_ioctl+0xa1/0xc0
[432011.966807]  [<ffffffff81696489>] system_call_fastpath+0x16/0x1b
[432011.968171] ---[ end trace 7f98a93d71d141b8 ]---
[432011.969256] device-mapper: bufio: leaked buffer 7, hold count 1, list 0
[432011.970814] ------------[ cut here ]------------
[432011.971770] kernel BUG at drivers/md/dm-bufio.c:1516!
[432011.971770] invalid opcode: 0000 [#1] SMP 
[432011.971770] Modules linked in: dm_cache_smq dm_cache dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio dm_log_userspace gfs2 ip6table_filter ip6_tables binfmt_misc dlm sd_mod crc_t10dif crct10dif_generic crct10dif_common sg iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi iptable_filter i2c_piix4 ppdev virtio_balloon pcspkr i6300esb i2c_core parport_pc parport dm_multipath nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi virtio_net virtio_blk ata_piix libata serio_raw virtio_pci virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod
[432011.971770] CPU: 0 PID: 3157 Comm: clvmd Tainted: G        W      ------------   3.10.0-511.el7.x86_64 #1
[432011.971770] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[432011.971770] task: ffff88003bac3ec0 ti: ffff88003b5e4000 task.ti: ffff88003b5e4000
[432011.971770] RIP: 0010:[<ffffffffa035ca73>]  [<ffffffffa035ca73>] dm_bufio_client_destroy+0x1b3/0x1f0 [dm_bufio]
[432011.971770] RSP: 0018:ffff88003b5e7b50  EFLAGS: 00010287
[432011.971770] RAX: 0000000000000001 RBX: ffff88001742f400 RCX: 0000000000000000
[432011.971770] RDX: 0000000000000000 RSI: ffff88003fc0f838 RDI: ffff88003fc0f838
[432011.971770] RBP: ffff88003b5e7b78 R08: 0000000000000096 R09: 0000000000000492
[432011.971770] R10: 3120746e756f6320 R11: 30207473696c202c R12: ffff88001742f448
[432011.971770] R13: 0000000000000002 R14: ffff88001742f438 R15: ffff88001742f428
[432011.971770] FS:  00007f92c2c6c700(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[432011.971770] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[432011.971770] CR2: 00007febc0be02c8 CR3: 000000003b4fa000 CR4: 00000000000006f0
[432011.971770] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[432011.971770] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[432011.971770] Stack:
[432011.971770]  ffff88000ca3c950 ffff88003c82d1a0 00000000ffffffea ffff880012fbb85c
[432011.971770]  00000000ffffffea ffff88003b5e7b90 ffffffffa03ec2a5 ffff88002b739c00
[432011.971770]  ffff88003b5e7ba8 ffffffffa044c9f8 ffff88002b739c00 ffff88003b5e7bc8
[432011.971770] Call Trace:
[432011.971770]  [<ffffffffa03ec2a5>] dm_block_manager_destroy+0x15/0x20 [dm_persistent_data]
[432011.971770]  [<ffffffffa044c9f8>] __destroy_persistent_data_objects+0x28/0x30 [dm_cache]
[432011.971770]  [<ffffffffa044e8c5>] dm_cache_metadata_abort+0x25/0x60 [dm_cache]
[432011.971770]  [<ffffffffa0447c9d>] metadata_operation_failed+0x8d/0x110 [dm_cache]
[432011.971770]  [<ffffffffa044bab6>] cache_postsuspend+0x296/0x4b0 [dm_cache]
[432011.971770]  [<ffffffff8131b67b>] ? kobject_uevent_env+0x1ab/0x620
[432011.971770]  [<ffffffff810b1600>] ? wake_up_atomic_t+0x30/0x30
[432011.971770]  [<ffffffffa00068fa>] dm_table_postsuspend_targets+0x4a/0x60 [dm_mod]
[432011.971770]  [<ffffffffa0002b23>] __dm_destroy+0x2e3/0x320 [dm_mod]
[432011.971770]  [<ffffffffa0003833>] dm_destroy+0x13/0x20 [dm_mod]
[432011.971770]  [<ffffffffa00093de>] dev_remove+0x11e/0x180 [dm_mod]
[432011.971770]  [<ffffffffa00092c0>] ? dev_suspend+0x250/0x250 [dm_mod]
[432011.971770]  [<ffffffffa0009a35>] ctl_ioctl+0x1e5/0x500 [dm_mod]
[432011.971770]  [<ffffffffa0009d63>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
[432011.971770]  [<ffffffff81211eb5>] do_vfs_ioctl+0x2d5/0x4b0
[432011.971770]  [<ffffffff812aea3e>] ? file_has_perm+0xae/0xc0
[432011.971770]  [<ffffffff81292e01>] ? wake_up_sem_queue_do+0x11/0x60
[432011.971770]  [<ffffffff81212131>] SyS_ioctl+0xa1/0xc0
[432011.971770]  [<ffffffff81696489>] system_call_fastpath+0x16/0x1b
[432011.971770] Code: 18 49 39 c4 4c 8d 70 e8 75 da b8 01 00 00 00 e9 d0 fe ff ff 0f 0b 31 f6 48 c7 c7 a0 e4 35 a0 31 c0 e8 68 29 32 e1 e9 7b ff ff ff <0f> 0b be 01 00 00 00 48 c7 c7 a0 e4 35 a0 31 c0 e8 4e 29 32 e1 
[432011.971770] RIP  [<ffffffffa035ca73>] dm_bufio_client_destroy+0x1b3/0x1f0 [dm_bufio]
[432011.971770]  RSP <ffff88003b5e7b50>
[432012.057146] ---[ end trace 7f98a93d71d141b9 ]---
[432012.058353] Kernel panic - not syncing: Fatal exception





# host-083

[432015.228371] device-mapper: space map common: unable to decrement a reference count below 0
[432015.230325] device-mapper: cache: 253:6: metadata operation 'dm_cache_set_dirty' failed: error = -22
[432015.232339] device-mapper: cache: 253:6: aborting current metadata transaction
[432015.234879] ------------[ cut here ]------------
[432015.235980] WARNING: at drivers/md/dm-bufio.c:1500 dm_bufio_client_destroy+0x1e0/0x1f0 [dm_bufio]()
[432015.238695] Modules linked in: dm_cache_smq dm_cache dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio dm_log_userspace gfs2 dlm sd_mod crc_t10dif crct10dif_generic crct10dif_common sg iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi iptable_filter i2c_piix4 ppdev pcspkr i6300esb virtio_balloon i2c_core parport_pc parport dm_multipath nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi virtio_blk virtio_net ata_piix serio_raw libata virtio_pci virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod
[432015.253728] CPU: 0 PID: 3151 Comm: clvmd Not tainted 3.10.0-511.el7.x86_64 #1
[432015.255942] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[432015.257371]  0000000000000000 0000000011a565ce ffff88002d26baf8 ffffffff81685e8c
[432015.260464]  ffff88002d26bb30 ffffffff81085820 ffff880014ab9a00 ffff880014ab9a28
[432015.262837]  0000000000000000 ffff88003a9d9818 ffff880014ab9a28 ffff88002d26bb40
[432015.265222] Call Trace:
[432015.266224]  [<ffffffff81685e8c>] dump_stack+0x19/0x1b
[432015.268214]  [<ffffffff81085820>] warn_slowpath_common+0x70/0xb0
[432015.269651]  [<ffffffff8108596a>] warn_slowpath_null+0x1a/0x20
[432015.271532]  [<ffffffffa0343aa0>] dm_bufio_client_destroy+0x1e0/0x1f0 [dm_bufio]
[432015.273207]  [<ffffffffa03552a5>] dm_block_manager_destroy+0x15/0x20 [dm_persistent_data]
[432015.275003]  [<ffffffffa039e9f8>] __destroy_persistent_data_objects+0x28/0x30 [dm_cache]
[432015.276835]  [<ffffffffa03a08c5>] dm_cache_metadata_abort+0x25/0x60 [dm_cache]
[432015.279478]  [<ffffffffa0399c9d>] metadata_operation_failed+0x8d/0x110 [dm_cache]
[432015.281910]  [<ffffffffa039dab6>] cache_postsuspend+0x296/0x4b0 [dm_cache]
[432015.283472]  [<ffffffff8131b67b>] ? kobject_uevent_env+0x1ab/0x620
[432015.285355]  [<ffffffff810b1600>] ? wake_up_atomic_t+0x30/0x30
[432015.286689]  [<ffffffffa00068fa>] dm_table_postsuspend_targets+0x4a/0x60 [dm_mod]
[432015.288358]  [<ffffffffa0002b23>] __dm_destroy+0x2e3/0x320 [dm_mod]
[432015.289773]  [<ffffffffa0003833>] dm_destroy+0x13/0x20 [dm_mod]
[432015.291123]  [<ffffffffa00093de>] dev_remove+0x11e/0x180 [dm_mod]
[432015.293458]  [<ffffffffa00092c0>] ? dev_suspend+0x250/0x250 [dm_mod]
[432015.295164]  [<ffffffffa0009a35>] ctl_ioctl+0x1e5/0x500 [dm_mod]
[432015.296523]  [<ffffffffa0009d63>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
[432015.297890]  [<ffffffff81211eb5>] do_vfs_ioctl+0x2d5/0x4b0
[432015.299789]  [<ffffffff812aea3e>] ? file_has_perm+0xae/0xc0
[432015.301117]  [<ffffffff81292e01>] ? wake_up_sem_queue_do+0x11/0x60
[432015.302498]  [<ffffffff81212131>] SyS_ioctl+0xa1/0xc0
[432015.303647]  [<ffffffff81696489>] system_call_fastpath+0x16/0x1b
[432015.304996] ---[ end trace 6dba8a4b2fdcc54c ]---
[432015.306763] device-mapper: bufio: leaked buffer 7, hold count 1, list 0
[432015.309293] ------------[ cut here ]------------
[432015.310243] kernel BUG at drivers/md/dm-bufio.c:1516!
[432015.310243] invalid opcode: 0000 [#1] SMP 
[432015.310243] Modules linked in: dm_cache_smq dm_cache dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio dm_log_userspace gfs2 dlm sd_mod crc_t10dif crct10dif_generic crct10dif_common sg iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi iptable_filter i2c_piix4 ppdev pcspkr i6300esb virtio_balloon i2c_core parport_pc parport dm_multipath nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi virtio_blk virtio_net ata_piix serio_raw libata virtio_pci virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod
[432015.310243] CPU: 0 PID: 3151 Comm: clvmd Tainted: G        W      ------------   3.10.0-511.el7.x86_64 #1
[432015.310243] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[432015.310243] task: ffff88003a16edd0 ti: ffff88002d268000 task.ti: ffff88002d268000
[432015.310243] RIP: 0010:[<ffffffffa0343a73>]  [<ffffffffa0343a73>] dm_bufio_client_destroy+0x1b3/0x1f0 [dm_bufio]
[432015.310243] RSP: 0018:ffff88002d26bb50  EFLAGS: 00010206
[432015.310243] RAX: 0000000000000001 RBX: ffff880014ab9a00 RCX: 0000000000000000
[432015.310243] RDX: 0000000000000000 RSI: ffff88003fc0f838 RDI: ffff88003fc0f838
[432015.310243] RBP: ffff88002d26bb78 R08: 0000000000000096 R09: 00000000000004a3
[432015.310243] R10: 3120746e756f6320 R11: 30207473696c202c R12: ffff880014ab9a48
[432015.310243] R13: 0000000000000002 R14: ffff880014ab9a38 R15: ffff880014ab9a28
[432015.310243] FS:  00007f60e23e0700(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[432015.310243] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[432015.310243] CR2: 00007f55c0001b50 CR3: 000000002d244000 CR4: 00000000000006f0
[432015.310243] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[432015.310243] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[432015.310243] Stack:
[432015.310243]  ffff88001ae23f80 ffff88001d8e99a0 00000000ffffffea ffff88003c86385c
[432015.310243]  00000000ffffffea ffff88002d26bb90 ffffffffa03552a5 ffff880035803800
[432015.310243]  ffff88002d26bba8 ffffffffa039e9f8 ffff880035803800 ffff88002d26bbc8
[432015.310243] Call Trace:
[432015.310243]  [<ffffffffa03552a5>] dm_block_manager_destroy+0x15/0x20 [dm_persistent_data]
[432015.310243]  [<ffffffffa039e9f8>] __destroy_persistent_data_objects+0x28/0x30 [dm_cache]
[432015.310243]  [<ffffffffa03a08c5>] dm_cache_metadata_abort+0x25/0x60 [dm_cache]
[432015.310243]  [<ffffffffa0399c9d>] metadata_operation_failed+0x8d/0x110 [dm_cache]
[432015.310243]  [<ffffffffa039dab6>] cache_postsuspend+0x296/0x4b0 [dm_cache]
[432015.310243]  [<ffffffff8131b67b>] ? kobject_uevent_env+0x1ab/0x620
[432015.310243]  [<ffffffff810b1600>] ? wake_up_atomic_t+0x30/0x30
[432015.310243]  [<ffffffffa00068fa>] dm_table_postsuspend_targets+0x4a/0x60 [dm_mod]
[432015.310243]  [<ffffffffa0002b23>] __dm_destroy+0x2e3/0x320 [dm_mod]
[432015.310243]  [<ffffffffa0003833>] dm_destroy+0x13/0x20 [dm_mod]
[432015.310243]  [<ffffffffa00093de>] dev_remove+0x11e/0x180 [dm_mod]
[432015.310243]  [<ffffffffa00092c0>] ? dev_suspend+0x250/0x250 [dm_mod]
[432015.310243]  [<ffffffffa0009a35>] ctl_ioctl+0x1e5/0x500 [dm_mod]
[432015.310243]  [<ffffffffa0009d63>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
[432015.310243]  [<ffffffff81211eb5>] do_vfs_ioctl+0x2d5/0x4b0
[432015.310243]  [<ffffffff812aea3e>] ? file_has_perm+0xae/0xc0
[432015.310243]  [<ffffffff81292e01>] ? wake_up_sem_queue_do+0x11/0x60
[432015.310243]  [<ffffffff81212131>] SyS_ioctl+0xa1/0xc0
[432015.310243]  [<ffffffff81696489>] system_call_fastpath+0x16/0x1b
[432015.310243] Code: 18 49 39 c4 4c 8d 70 e8 75 da b8 01 00 00 00 e9 d0 fe ff ff 0f 0b 31 f6 48 c7 c7 a0 54 34 a0 31 c0 e8 68 b9 33 e1 e9 7b ff ff ff <0f> 0b be 01 00 00 00 48 c7 c7 a0 54 34 a0 31 c0 e8 4e b9 33 e1 
[432015.310243] RIP  [<ffffffffa0343a73>] dm_bufio_client_destroy+0x1b3/0x1f0 [dm_bufio]
[432015.310243]  RSP <ffff88002d26bb50>
[432015.400346] ---[ end trace 6dba8a4b2fdcc54d ]---
[432015.401541] Kernel panic - not syncing: Fatal exception

Comment 4 Corey Marthaler 2016-10-05 21:12:27 UTC
Although the scenario in comment #3 was the same test scenario, it did not mix snapshot types and it also used cached _tdata volumes. Bug 1382141 has been filed for that issue instead. Please disregard comment #3.

Comment 7 RHEL Program Management 2020-12-15 07:31:25 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.