Bug 1105732 - "Attempted to decrement suspended device counter below zero" when deactivating VG containing cache volumes
Summary: "Attempted to decrement suspended device counter below zero" when deactivatin...
Keywords:
Status: CLOSED DUPLICATE of bug 1091553
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: lvm2
Version: 7.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Zdenek Kabelac
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks: 1119326 1185916
 
Reported: 2014-06-06 20:45 UTC by Corey Marthaler
Modified: 2021-09-03 12:49 UTC
CC List: 7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Cloned to: 1185916
Environment:
Last Closed: 2015-01-26 15:58:20 UTC
Target Upstream Version:
Embargoed:



Description Corey Marthaler 2014-06-06 20:45:36 UTC
Description of problem:
This may just be more fallout from bug 1091553.

lvremove -f /dev/cache_sanity/snap1
couldn't remove volume snap1

Internal error: Performing unsafe table load while 4 device(s) are known to be suspended:  (253:8) 
   device-mapper: resume ioctl on  failed: Invalid argument
   Unable to resume cache_sanity-rename_orig_A-real (253:8)
   Failed to resume rename_orig_A.
   libdevmapper exiting with 4 device(s) still suspended.

sleep 10

[root@host-034 ~]# lvs
  LV            VG            Attr       LSize   Pool          Origin                Data%
  rename_orig_A cache_sanity  owi-s-C---   4.00g rename_pool_A [rename_orig_A_corig]
  rename_pool_A cache_sanity  Cwi-s-C---   4.00g
  snap1         cache_sanity  -wi-a----- 500.00m
  snap2         cache_sanity  swi-s-s--- 500.00m               rename_orig_A           0.54

[root@host-034 ~]# lvremove -f /dev/cache_sanity/snap1
  Logical volume "snap1" successfully removed

[root@host-034 ~]# vgchange -an cache_sanity
  Attempted to decrement suspended device counter below zero.
  Attempted to decrement suspended device counter below zero.
  Attempted to decrement suspended device counter below zero.
  Attempted to decrement suspended device counter below zero.
  Attempted to decrement suspended device counter below zero.
  0 logical volume(s) in volume group "cache_sanity" now active


I'll try to narrow this down and provide more details...
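
For reference, a rough sketch of the kind of setup that leads to the lvs output above (PV names, sizes, and the exact snapshot layout are assumptions, not taken from the test logs; newer builds reject the snapshot step outright, per comment 2):

pvcreate /dev/sdb1 /dev/sdc1
vgcreate cache_sanity /dev/sdb1 /dev/sdc1
lvcreate -L 4G -n rename_orig_A cache_sanity /dev/sdb1        # origin to be cached
lvcreate -L 1G -n rename_pool_A cache_sanity /dev/sdc1        # cache data LV
lvcreate -L 8M -n rename_pool_A_meta cache_sanity /dev/sdc1   # cache metadata LV
lvconvert --type cache-pool --poolmetadata cache_sanity/rename_pool_A_meta cache_sanity/rename_pool_A
lvconvert --type cache --cachepool cache_sanity/rename_pool_A cache_sanity/rename_orig_A
lvcreate -s -L 500M -n snap1 cache_sanity/rename_orig_A       # snapshots of the cached origin
lvcreate -s -L 500M -n snap2 cache_sanity/rename_orig_A
lvremove -f /dev/cache_sanity/snap1                           # fails with the unsafe table load error
vgchange -an cache_sanity                                     # then prints the suspended-counter warnings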



Version-Release number of selected component (if applicable):
3.10.0-123.el7.x86_64
lvm2-2.02.105-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
lvm2-libs-2.02.105-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
lvm2-cluster-2.02.105-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
device-mapper-1.02.84-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
device-mapper-libs-1.02.84-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
device-mapper-event-1.02.84-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
device-mapper-event-libs-1.02.84-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
device-mapper-persistent-data-0.3.2-1.el7    BUILT: Thu Apr  3 09:58:51 CDT 2014
cmirror-2.02.105-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014

Comment 2 Jonathan Earl Brassow 2014-09-30 15:08:48 UTC
[root@bp-01 ~]# lvcreate -s -L 500M -n snap vg/lv
  Snapshot of cache LV is not yet supported.

Snapshots of cache LVs are not currently supported. Verifying that the operation is cleanly rejected is sufficient to close this bug; a new feature bug will need to be filed to add snapshot support.

I am marking this bug a duplicate of bug 1091553.

*** This bug has been marked as a duplicate of bug 1091553 ***

Comment 3 Corey Marthaler 2015-01-07 22:53:50 UTC
This message still exists and apparently doesn't require snapshots of cache volumes to hit. I'll try to look deeper into what's causing this, but could it be a well-timed interrupted remove attempt, followed by another remove?


# Test output (interrupted during remove)
Removing cache pool cache_sanity/1K_cache
^C


Jan  7 16:40:48 host-110 qarshd[4800]: Running cmdline: lvremove -f /dev/cache_sanity/1K_cache
Jan  7 16:40:49 host-110 qarshd[4800]: Sending child 4803 signal 2
Jan  7 16:41:09 host-110 multipathd: dm-5: mapname not found for 253:5
Jan  7 16:41:09 host-110 multipathd: uevent trigger error
Jan  7 16:41:09 host-110 multipathd: dm-4: mapname not found for 253:4
Jan  7 16:41:09 host-110 multipathd: uevent trigger error
Jan  7 16:41:09 host-110 multipathd: dm-3: mapname not found for 253:3
Jan  7 16:41:09 host-110 multipathd: uevent trigger error
Jan  7 16:41:09 host-110 multipathd: dm-5: remove map (uevent)
Jan  7 16:41:09 host-110 multipathd: dm-4: remove map (uevent)
Jan  7 16:41:09 host-110 multipathd: dm-5: remove map (uevent)
Jan  7 16:41:09 host-110 multipathd: dm-3: remove map (uevent)
Jan  7 16:41:09 host-110 multipathd: dm-4: remove map (uevent)
Jan  7 16:41:09 host-110 multipathd: dm-3: remove map (uevent)
Jan  7 16:41:09 host-110 multipathd: dm-2: remove map (uevent)
Jan  7 16:41:09 host-110 multipathd: dm-2: devmap not registered, can't remove
Jan  7 16:41:09 host-110 multipathd: dm-2: remove map (uevent)





[root@host-110 ~]# pvscan
  PV /dev/sdb1   VG cache_sanity    lvm2 [499.00 MiB / 486.92 MiB free]
  PV /dev/sdc1   VG cache_sanity    lvm2 [499.00 MiB / 486.95 MiB free]
  PV /dev/vda2   VG rhel_host-110   lvm2 [7.51 GiB / 40.00 MiB free]
  PV /dev/sdf1                      lvm2 [15.00 GiB]
  PV /dev/sdd1                      lvm2 [15.00 GiB]
  Total: 5 [38.48 GiB] / in use: 3 [8.48 GiB] / in no VG: 2 [30.00 GiB]

[root@host-110 ~]# vgremove cache_sanity
Do you really want to remove volume group "cache_sanity" containing 1 logical volumes? [y/n]: y
Do you really want to remove active logical volume 1K_origin? [y/n]: y
  Flushing cache for 1K_origin.
  Attempted to decrement suspended device counter below zero.
  Logical volume "1K_cache" successfully removed
  Logical volume "1K_origin" successfully removed
  Volume group "cache_sanity" successfully removed


[root@host-110 ~]# /usr/tests/sts-rhel7.1/lvm2/bin/lvm_rpms 
3.10.0-219.el7.x86_64

lvm2-2.02.114-4.el7    BUILT: Wed Jan  7 07:07:47 CST 2015
lvm2-libs-2.02.114-4.el7    BUILT: Wed Jan  7 07:07:47 CST 2015
lvm2-cluster-2.02.114-4.el7    BUILT: Wed Jan  7 07:07:47 CST 2015
device-mapper-1.02.92-4.el7    BUILT: Wed Jan  7 07:07:47 CST 2015
device-mapper-libs-1.02.92-4.el7    BUILT: Wed Jan  7 07:07:47 CST 2015
device-mapper-event-1.02.92-4.el7    BUILT: Wed Jan  7 07:07:47 CST 2015
device-mapper-event-libs-1.02.92-4.el7    BUILT: Wed Jan  7 07:07:47 CST 2015
device-mapper-persistent-data-0.4.1-2.el7    BUILT: Wed Nov 12 12:39:46 CST 2014
cmirror-2.02.114-4.el7    BUILT: Wed Jan  7 07:07:47 CST 2015

Comment 4 Zdenek Kabelac 2015-01-24 23:36:20 UTC
I'm not able to reproduce this bug with my lvm2 code.

Is this bug reproducible with the 2.02.115 build?

If so - please attach 'lvs -a' output taken before you run 'vgremove -vvvv vg', and append the vgremove debug trace.

The kernel messages from comment 3 are unrelated - they come from multipathd, and while they look odd and may need fixes of their own, they should not be related to lvm2.

Unless multipath devices are involved as PVs in your VG?

In that case, please also attach 'dmsetup table' and 'pvs' output.
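
A possible way to capture everything requested above in one pass (file names are arbitrary; substitute the affected VG name for 'cache_sanity'):

lvs -a -o +devices cache_sanity > lvs_before.txt
vgremove -vvvv cache_sanity 2> vgremove_trace.txt    # the -vvvv debug trace goes to stderr
dmsetup table > dmsetup_table.txt
pvs > pvs.txt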

Comment 5 Corey Marthaler 2015-01-25 03:04:26 UTC
Yes, I've seen this a couple of times now with the latest rpms. I'm still looking for a definitive reproducer, however.

First reproduction:
[root@host-113 ~]# lvs -a -o +devices
  LV                         VG            Attr       LSize   Pool                 Origin                Data%  Meta% Cpy%Sync Devices
  cache_2_83500              cache_2_8350  Cwims-C---   4.00g [cache_2_83500_fast] [cache_2_83500_corig] 0.04   2.83  0.00     cache_2_83500_corig(0)
  [cache_2_83500_corig]      cache_2_8350  owi-aoC---   4.00g                                                                  /dev/sde1(0)
  [cache_2_83500_fast]       cache_2_8350  Cwi---C---   1.00g                                            0.04   2.83  0.00     cache_2_83500_fast_cdata(0)
  [cache_2_83500_fast_cdata] cache_2_8350  Cwi-ao----   1.00g                                                                  /dev/sdg1(0)
  [cache_2_83500_fast_cmeta] cache_2_8350  ewi-ao----   8.00m                                                                  /dev/sdg1(256)
  [lvol0_pmspare]            cache_2_8350  ewi-------   8.00m                                                                  /dev/sdb1(0)
[root@host-113 ~]# vgremove cache_2_8350
Do you really want to remove volume group "cache_2_8350" containing 1 logical volumes? [y/n]: y
Do you really want to remove active logical volume cache_2_83500? [y/n]: y
  Attempted to decrement suspended device counter below zero.
  Logical volume "cache_2_83500_fast" successfully removed
  Logical volume "cache_2_83500" successfully removed
  Volume group "cache_2_8350" successfully removed



Second reproduction:
[root@host-119 ~]# lvs -a -o +devices
  LV                        VG            Attr       LSize   Pool                Origin               Data%  Meta% Cpy%Sync Devices
  cache_5_4770              cache_5_477   CwimsoC---   4.00g [cache_5_4770_fast] [cache_5_4770_corig] 0.00   2.83  100.00   cache_5_4770_corig(0)
  [cache_5_4770_corig]      cache_5_477   owi-aoC---   4.00g                                                                /dev/sdb1(0)
  [cache_5_4770_fast]       cache_5_477   Cwi---C---   1.00g                                          0.00   2.83  100.00   cache_5_4770_fast_cdata(0)
  [cache_5_4770_fast_cdata] cache_5_477   Cwi-ao----   1.00g                                                                /dev/sdd1(0)
  [cache_5_4770_fast_cmeta] cache_5_477   ewi-ao----   8.00m                                                                /dev/sdd1(256)
  [lvol0_pmspare]           cache_5_477   ewi-------   8.00m                                                                /dev/sdc2(0)
[root@host-119 ~]# lvchange -prw /dev/cache_5_477/cache_5_4770
  Logical volume "cache_5_4770" is already writable
[root@host-119 ~]# lvchange -pr /dev/cache_5_477/cache_5_4770
  Attempted to decrement suspended device counter below zero.
  Logical volume "cache_5_4770" changed.



3.10.0-223.el7.x86_64
lvm2-2.02.115-2.el7    BUILT: Thu Jan 22 06:09:14 CST 2015
lvm2-libs-2.02.115-2.el7    BUILT: Thu Jan 22 06:09:14 CST 2015
lvm2-cluster-2.02.115-2.el7    BUILT: Thu Jan 22 06:09:14 CST 2015
device-mapper-1.02.93-2.el7    BUILT: Thu Jan 22 06:09:14 CST 2015
device-mapper-libs-1.02.93-2.el7    BUILT: Thu Jan 22 06:09:14 CST 2015
device-mapper-event-1.02.93-2.el7    BUILT: Thu Jan 22 06:09:14 CST 2015
device-mapper-event-libs-1.02.93-2.el7    BUILT: Thu Jan 22 06:09:14 CST 2015
device-mapper-persistent-data-0.4.1-2.el7    BUILT: Wed Nov 12 12:39:46 CST 2014
cmirror-2.02.115-2.el7    BUILT: Thu Jan 22 06:09:14 CST 2015

Comment 6 Jonathan Earl Brassow 2015-01-26 15:58:20 UTC
Beginning at comment #3, this bug has changed from its original problem (inability to resume the cache device) to a more benign (warning only) issue.  The original bug has been fixed by disabling snapshots of cache LVs.  I've cloned this bug to address the second issue separately.

*** This bug has been marked as a duplicate of bug 1091553 ***

