Bug 1348283 - thin pool does not contain 'F' failed attr right away after error target is loaded and resumed
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: lvm2
Version: 7.3
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: low
Target Milestone: rc
Target Release: ---
Assignee: LVM and device-mapper development team
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-06-20 16:47 UTC by Corey Marthaler
Modified: 2021-09-03 12:41 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-06-21 11:38:54 UTC
Target Upstream Version:
Embargoed:



Description Corey Marthaler 2016-06-20 16:47:06 UTC
Description of problem:
This is the same test case used to verify bug 1021878 in rhel6.8.

In rhel7, lvm doesn't pick up the failed attribute unless a 'dmsetup status' is run.


# with or without lvmetad running

    thin_pool_autoextend_threshold = 70

[root@host-083 ~]# ps -ef | grep dmeventd
root      1206     1  0 11:32 ?        00:00:00 /usr/sbin/dmeventd -f



[...]
Swapping the current meta device table (snapper_thinp-POOL_tmeta: 0 8192 linear 8:49 2048) for an error target
dmsetup suspend snapper_thinp-POOL_tmeta
Loading in new meta device table (0 8192 error 8:49 2048)
dmsetup load snapper_thinp-POOL_tmeta --table " 0 8192 error 8:49 2048"
dmsetup resume snapper_thinp-POOL_tmeta

attr=twi-aotz--
thin pool device should have the (F)ailed attribute set

[root@host-083 ~]# lvs -a -o +devices
  LV              Attr       LSize Pool Origin Data%  Meta% Devices
  POOL            twi-aotz-- 5.00g             6.88   4.79  POOL_tdata(0)
  [POOL_tdata]    Twi-ao---- 5.00g                          /dev/sdc1(1)
  [POOL_tmeta]    ewi-ao---- 4.00m                          /dev/sdd1(0)
  [lvol0_pmspare] ewi------- 4.00m                          /dev/sdc1(0)
  origin          Vwi-a-tz-- 1.00g POOL        32.93
  other1          Vwi-a-tz-- 1.00g POOL        0.00
  other2          Vwi-a-tz-- 1.00g POOL        0.00
  other3          Vwi-a-tz-- 1.00g POOL        0.00
  other4          Vwi-a-tz-- 1.00g POOL        0.00
  other5          Vwi-a-tz-- 1.00g POOL        0.00
  snap1           Vwi-a-tz-- 1.00g POOL origin 16.63

[root@host-083 ~]# pvscan --cache
[root@host-083 ~]# lvs -a -o +devices
  LV              Attr       LSize Pool Origin Data%  Meta% Devices
  POOL            twi-aotz-- 5.00g             6.88   4.79  POOL_tdata(0)
  [POOL_tdata]    Twi-ao---- 5.00g                          /dev/sdc1(1)
  [POOL_tmeta]    ewi-ao---- 4.00m                          /dev/sdd1(0)
  [lvol0_pmspare] ewi------- 4.00m                          /dev/sdc1(0)
  origin          Vwi-a-tz-- 1.00g POOL        32.93
  other1          Vwi-a-tz-- 1.00g POOL        0.00
  other2          Vwi-a-tz-- 1.00g POOL        0.00
  other3          Vwi-a-tz-- 1.00g POOL        0.00
  other4          Vwi-a-tz-- 1.00g POOL        0.00
  other5          Vwi-a-tz-- 1.00g POOL        0.00
  snap1           Vwi-a-tz-- 1.00g POOL origin 16.63


# As soon as dmsetup status is run, it gets triggered:

[root@host-083 ~]# dmsetup status
snapper_thinp-origin: 0 2097152 thin 690688 2097151
snapper_thinp-POOL: 0 10485760 linear 
snapper_thinp-snap1: 0 2097152 thin 348672 2097151
snapper_thinp-other5: 0 2097152 thin 0 -
snapper_thinp-other4: 0 2097152 thin 0 -
snapper_thinp-other3: 0 2097152 thin 0 -
snapper_thinp-POOL-tpool: 0 10485760 thin-pool Error
snapper_thinp-POOL_tdata: 0 10485760 linear 
snapper_thinp-other2: 0 2097152 thin Fail
snapper_thinp-POOL_tmeta: 0 8192 error 
snapper_thinp-other1: 0 2097152 thin Fail

Jun 20 11:27:04 host-083 kernel: device-mapper: thin: 253:4: metadata operation 'dm_pool_commit_metadata' failed: error = -5
Jun 20 11:27:04 host-083 kernel: device-mapper: thin: 253:4: aborting current metadata transaction
Jun 20 11:27:04 host-083 kernel: device-mapper: thin: 253:4: failed to abort metadata transaction
Jun 20 11:27:04 host-083 kernel: device-mapper: thin: 253:4: switching pool to failure mode
Jun 20 11:27:04 host-083 kernel: device-mapper: thin: 253:4: metadata operation 'dm_pool_commit_metadata' failed: error = -22
Jun 20 11:27:04 host-083 kernel: device-mapper: thin: 253:4: aborting current metadata transaction
Jun 20 11:27:04 host-083 kernel: device-mapper: thin: 253:4: failed to abort metadata transaction
Jun 20 11:27:04 host-083 kernel: device-mapper: thin: 253:4: switching pool to failure mode
Jun 20 11:27:04 host-083 kernel: device-mapper: thin metadata: couldn't read superblock
Jun 20 11:27:04 host-083 kernel: device-mapper: thin: 253:4: failed to set 'needs_check' flag in metadata[  730.758665] device-mapper: thin metadata: couldn't read superblock
Jun 20 11:27:04 host-083 kernel: device-mapper: thin: 253:4: dm_pool_get_metadata_transaction_id returned -22
Jun 20 11:27:04 host-083 kernel: device-mapper: thin metadata: couldn't read superblock
Jun 20 11:27:04 host-083 kernel: device-mapper: thin: 253:4: failed to set 'needs_check' flag in metadata
Jun 20 11:27:04 host-083 kernel: device-mapper: thin: 253:4: dm_pool_get_metadata_transaction_id returned -22
Jun 20 11:27:10 host-083 lvm[1118]: WARNING: Thin pool snapper_thinp-POOL-tpool metadata is now 100.00% full.
Jun 20 11:27:10 host-083 lvm[1118]: WARNING: Thin pool snapper_thinp-POOL-tpool data is now 100.00% full.


[root@host-083 ~]# lvs -a -o +devices
  LV              Attr       LSize Pool Origin Data%  Meta% Devices
  POOL            twi-aotzF- 5.00g                          POOL_tdata(0)
  [POOL_tdata]    Twi-ao---- 5.00g                          /dev/sdc1(1)
  [POOL_tmeta]    ewi-ao---- 4.00m                          /dev/sdd1(0)
  [lvol0_pmspare] ewi------- 4.00m                          /dev/sdc1(0)
  origin          Vwi-a-tzF- 1.00g POOL
  other1          Vwi-a-tzF- 1.00g POOL
  other2          Vwi-a-tzF- 1.00g POOL
  other3          Vwi-a-tzF- 1.00g POOL
  other4          Vwi-a-tzF- 1.00g POOL
  other5          Vwi-a-tzF- 1.00g POOL
  snap1           Vwi-a-tzF- 1.00g POOL origin
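
For scripted checks, the failed state can be read from the ninth character of the Attr field, which is where 'F' appears in the output above. A minimal sketch in POSIX shell; `lv_health` and `lv_is_failed` are hypothetical helper names, not part of lvm2:

```shell
#!/bin/sh
# Hypothetical helpers (not part of lvm2): inspect an lv_attr string
# such as "twi-aotzF-" as printed in the Attr column of lvs.

# Print the 9th attribute character (the position where 'F' shows up above).
lv_health() {
    printf '%s\n' "$1" | cut -c9
}

# Succeed (exit status 0) only when that character is 'F' (failed).
lv_is_failed() {
    [ "$(lv_health "$1")" = "F" ]
}
```

On a live system this might be fed from something like `lvs --noheadings -o lv_attr snapper_thinp/POOL | tr -d ' '`, assuming the volume group is named snapper_thinp as the device-mapper names suggest.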




Version-Release number of selected component (if applicable):
3.10.0-419.el7.x86_64

lvm2-2.02.156-1.el7    BUILT: Mon Jun 13 03:05:51 CDT 2016
lvm2-libs-2.02.156-1.el7    BUILT: Mon Jun 13 03:05:51 CDT 2016
lvm2-cluster-2.02.156-1.el7    BUILT: Mon Jun 13 03:05:51 CDT 2016
device-mapper-1.02.126-1.el7    BUILT: Mon Jun 13 03:05:51 CDT 2016
device-mapper-libs-1.02.126-1.el7    BUILT: Mon Jun 13 03:05:51 CDT 2016
device-mapper-event-1.02.126-1.el7    BUILT: Mon Jun 13 03:05:51 CDT 2016
device-mapper-event-libs-1.02.126-1.el7    BUILT: Mon Jun 13 03:05:51 CDT 2016
device-mapper-persistent-data-0.6.2-0.1.rc8.el7    BUILT: Wed May  4 02:56:34 CDT 2016
cmirror-2.02.156-1.el7    BUILT: Mon Jun 13 03:05:51 CDT 2016
sanlock-3.3.0-1.el7    BUILT: Wed Feb 24 09:52:30 CST 2016
sanlock-lib-3.3.0-1.el7    BUILT: Wed Feb 24 09:52:30 CST 2016
lvm2-lockd-2.02.156-1.el7    BUILT: Mon Jun 13 03:05:51 CDT 2016


How reproducible:
Every time

Comment 2 Zdenek Kabelac 2016-06-21 11:38:54 UTC
This is actually a 'feature' (and even a major bug fix).

The longer story: older lvm2 had a hidden problem in its device scan processing, where it unintentionally called status on the device with flushing enabled. Commands like 'lvs' therefore effectively always flushed the thin-pool, forcing a thin-pool commit point and many associated actions.

lvm2 commands no longer flush the thin-pool, and lvs displays the values currently known to the kernel without flushing. This is much faster and saves a lot of disk access, at the price of possible imprecision in the presented result.

The reason we need to do it this way is that we have to avoid a deadlock, which would occur if the thin-pool flush ran out of space. Such a command would be waiting for a resize that could never happen, since it would already be holding the VG lock.

So, regarding this BZ: an lvm2 command such as lvs is not supposed to transition the pool into the 'F' state. That transition should be caused by some user write, fsync, or similar operation, and the lvm2 command should be a plain observer.

To get the matching 'dmsetup' behavior use:
dmsetup status --noflush --nolockfs
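
The observer-only check could be scripted roughly as follows. This is a sketch, not lvm2 code: `dm_failed_devices` is a hypothetical helper, and treating a status field of "Error" or "Fail" as the failure marker is an assumption based only on the 'dmsetup status' output quoted in the description.

```shell
#!/bin/sh
# Hypothetical helper: read `dmsetup status` lines on stdin and print the
# names of thin and thin-pool devices whose status field is "Error" or "Fail".
# Assumes the "name: start length target status..." layout shown in the description.
dm_failed_devices() {
    awk '
        ($4 == "thin" || $4 == "thin-pool") && ($5 == "Error" || $5 == "Fail") {
            sub(/:$/, "", $1)   # drop the trailing colon after the device name
            print $1
        }
    '
}
```

It would be used as `dmsetup status --noflush --nolockfs | dm_failed_devices`, which only reports what the kernel already knows and does not itself push the pool into the failed state.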

