Bug 1177056 - Huge metadata with thinp
Summary: Huge metadata with thinp
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: lvm2
Version: 7.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Zdenek Kabelac
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks: 1469559
 
Reported: 2014-12-24 01:35 UTC by Matěj Cepl
Modified: 2021-09-03 12:50 UTC
CC List: 8 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-11-18 17:08:55 UTC
Target Upstream Version:
Embargoed:


Attachments
compressed metadata of rhel/pool00 thinp pool (4.56 MB, application/x-xz)
2014-12-24 01:35 UTC, Matěj Cepl
output of vgcfgbackup (14.28 KB, text/plain)
2014-12-24 01:37 UTC, Matěj Cepl
crash on startup of the system with too little space for metadata (perhaps none) (1.61 MB, image/png)
2014-12-24 01:40 UTC, Matěj Cepl
/var/log/messages from the crashed system (up to the crash itself) (187.64 KB, application/x-gz)
2014-12-24 01:41 UTC, Matěj Cepl
/var/log directory from the recovery system when working on the recovery (80.00 KB, application/x-tar)
2014-12-24 01:42 UTC, Matěj Cepl

Description Matěj Cepl 2014-12-24 01:35:13 UTC
Created attachment 972607 [details]
compressed metadata of rhel/pool00 thinp pool

Description of problem:
I have lost two days recovering my LVM thinp based system (root, /home,
and swap all used to be on the thin pool; I have since moved root to a
normal LV; swap and /home are encrypted via cryptsetup), when the
metadata on the thinp pool got to 100%. Originally the metadata size was
112 MB, but even after increasing it to 436 MB, the metadata still takes
up around 90% of the metadata space. I am attaching the compressed
metadata for the thinp pool (and vgcfgbackup output for the whole LVM
system).

So the first issue is that the metadata seems to grow pretty fast. The
second problem is that the crash happened at all. If the metadata runs
out, then I would expect some normal -ENOSPC error or something of that
kind, not a complete crash of the system and an inability to boot.
Another screenshot is attached.

The third issue is that recovery is too damn hard. I really like the
idea of bug 1136979 comment 2. An admin shouldn't be asked to run
anything more complicated than something like lvconvert --repair
rhel/pool00.


Version-Release number of selected component (if applicable):
kernel-3.10.0-210.el7.x86_64
lvm2-2.02.113-1.el7.x86_64

How reproducible:
Metadata destruction happened once and it was more than enough, however
increasing chunksize to 5MB has been reproduced couple of times, when
I was wrestling with the recovery (with a very kind help of Zdeněk
Kabeláč).

Additional info:

matej@mitmanek: ~$ sudo lvs -a -o+chunksize
  LV                 VG   Attr       LSize   Pool   Origin    Data%  Meta% Chunk  
  debian             rhel Vwi-a-tz--  20.00g pool00           19.14             0 
  filemon            rhel Vwi-aotz--  20.00g pool00           21.01             0 
  home_base          rhel Vwi-aotz-- 171.64g pool00           77.04             0 
  home_bef_recovery  rhel Vwi---tz-k 171.64g pool00 home_base                   0 
  [lvol1_pmspare]    rhel ewi------- 436.00m                                    0 
  [lvol1_pmspare]    rhel ewi------- 436.00m                                    0 
  [lvol1_pmspare]    rhel ewi------- 436.00m                                    0 
  new_home_base      rhel Vwi-a-tz-- 200.00g pool00           14.44             0 
  old_pool00_t_meta0 rhel -wi-a----- 112.00m                                    0 
  old_root           rhel Vwi---tz-k  41.64g pool00                             0 
  pool00             rhel twi-a-tz-- 316.12g                  80.33  91.95 128.00k
  [pool00_tdata]     rhel Twi-ao---- 316.12g                                    0 
  [pool00_tdata]     rhel Twi-ao---- 316.12g                                    0 
  [pool00_tdata]     rhel Twi-ao---- 316.12g                                    0 
  [pool00_tmeta]     rhel ewi-ao---- 436.00m                                    0 
  [pool00_tmeta]     rhel ewi-ao---- 436.00m                                    0 
  [pool00_tmeta]     rhel ewi-ao---- 436.00m                                    0 
  root               rhel -wi-ao---- 100.00g                                    0 
  root-snapshot429   rhel Vri---tz-k  41.64g pool00 old_root                    0 
  root-snapshot453   rhel Vri---tz-k  41.64g pool00 old_root                    0 
  root-snapshot470   rhel Vri---tz-k  41.64g pool00 old_root                    0 
  root-snapshot492   rhel Vri---tz-k  41.64g pool00 old_root                    0 
  root-snapshot515   rhel Vri---tz-k  41.64g pool00 old_root                    0 
  root-snapshot539   rhel Vri---tz-k  41.64g pool00 old_root                    0 
  root-snapshot563   rhel Vri---tz-k  41.64g pool00 old_root                    0 
  root-snapshot585   rhel Vri---tz-k  41.64g pool00 old_root                    0 
  root-snapshot603   rhel Vri---tz-k  41.64g pool00 old_root                    0 
  root-snapshot604   rhel Vri---tz-k  41.64g pool00 old_root                    0 
  root-snapshot605   rhel Vri---tz-k  41.64g pool00 old_root                    0 
  root-snapshot606   rhel Vri---tz-k  41.64g pool00 old_root                    0 
  root-snapshot607   rhel Vri---tz-k  41.64g pool00 old_root                    0 
  swap_base          rhel -wi-ao----   5.86g                                    0 
  test               rhel Vwi---tz-k 121.64g pool00 home_base                   0 
  tmp                rhel -wi-a-----   4.00m                                    0 
  widle              rhel Vwi-a-tz--  50.00g pool00           61.74             0 
matej@mitmanek: ~$
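The ~92% metadata usage shown for pool00 is roughly consistent with the sizing rule of thumb in the kernel's thin-provisioning documentation (metadata bytes ~ 48 * data_size / chunk_size). A quick sketch using the figures from the lvs output above (the 48-byte constant is the documented guideline, not an exact on-disk accounting):

```shell
# Rough thin-pool metadata sizing per the kernel docs' rule of thumb:
#   metadata_bytes ~= 48 * data_dev_size / data_block_size
# Figures below are taken from the lvs output above (pool00).
data_gib=316.12   # pool00 data size
chunk_kib=128     # pool00 chunk size
awk -v d="$data_gib" -v c="$chunk_kib" 'BEGIN {
    blocks = d * 1024 * 1024 / c           # number of data chunks in the pool
    mib    = 48 * blocks / (1024 * 1024)   # guideline metadata size, MiB
    printf "estimated metadata: %.0f MiB\n", mib
}'
# prints: estimated metadata: 119 MiB
```

By this guideline the original 112 MB tmeta was already undersized for a 316 GiB pool with 128 KiB chunks, even before the dozen root-snapshot* devices are taken into account.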

Comment 1 Matěj Cepl 2014-12-24 01:37:41 UTC
Created attachment 972608 [details]
output of vgcfgbackup

Comment 2 Matěj Cepl 2014-12-24 01:40:08 UTC
Created attachment 972609 [details]
crash on startup of the system with too little space for metadata (perhaps none)

Comment 3 Matěj Cepl 2014-12-24 01:41:42 UTC
Created attachment 972610 [details]
/var/log/messages from the crashed system (up to the crash itself)

Comment 4 Matěj Cepl 2014-12-24 01:42:36 UTC
Created attachment 972611 [details]
/var/log directory from the recovery system when working on the recovery

Comment 6 Zdenek Kabelac 2014-12-25 18:53:06 UTC
(In reply to Matěj Cepl from comment #0)

Just a few comments from me on where we need to look:

A few things happened together - the monitoring daemon was not used, which is the primary guard against running out of space.

'lvconvert --repair' failed to do its work because it was rather unexpected that repairing from 112MB into a 'double-sized' new metadata volume simply resulted in still 100% used metadata - with 436MB it did start, though, as mentioned, at 90% fullness. Here we need a few words from Joe about this behaviour.

The other issue seems to be that 'swapping' metadata weirdly changes the chunksize - which seems to be a clear bug in the lvm2 code.

Comment 7 Matěj Cepl 2014-12-25 21:43:20 UTC
(In reply to Zdenek Kabelac from comment #6)
> Few things happened together - the monitoring daemon was not used - which is
> the primary guard against running out-of-space.

Just to emphasize ... I hope the monitoring daemon was used in normal production (although when looking at "pgrep -f -l event" I don't see any running dmeventd - is it run only on demand/per event?). It was missing only in the recovery image (run from the DVD installation "Troubleshooting system"), where for some reason /usr/sbin/dmeventd is not present.

Comment 8 Matěj Cepl 2014-12-25 21:46:22 UTC
(In reply to Zdenek Kabelac from comment #6)
> the monitoring daemon was not used

Actually, on my rather default system I get this:

mitmanek:~# dmevent_tool -m
rhel-old_pool00_t_meta0 not monitored
rhel-tmp not monitored
rhel-swap_base not monitored
rhel-debian not monitored
rhel-root not monitored
rhel-home_base not monitored
rhel-pool00 not monitored
home not monitored
rhel-pool00-tpool not monitored
rhel-pool00_tdata not monitored
rhel-filemon not monitored
rhel-widle not monitored
rhel-pool00_tmeta not monitored
swap not monitored
rhel-new_home_base not monitored
mitmanek:~#

I feel anxious ... am I doomed to experience the same crash again?

Also, /dev/rhel/home_base and /dev/rhel/swap_base are the underlying volumes for the encrypted home and swap volumes respectively. I don't know if it matters.

Comment 9 Zdenek Kabelac 2014-12-26 23:10:51 UTC
(In reply to Matěj Cepl from comment #8)
> (In reply to Zdenek Kabelac from comment #6)
> > the monitoring daemon was not used
> 
> Actually, on my rather default system I get this:
> 
> rhel-pool00 not monitored

It's probably worth adding a comment here:


lvm.conf

    thin_pool_autoextend_threshold = 100
    thin_pool_autoextend_percent = 20

By default this monitoring-driven autoextension is NOT enabled - the user must enable it deliberately, since it obviously requires more free space to be available in the VG.
(The threshold should be something like 75%.)

On the other hand, the tools could possibly be a bit more 'noisy' when a thin-pool is created and monitoring is not enabled.
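As a concrete illustration, here is what enabling that guard could look like (the section and option names are real lvm.conf settings; the specific threshold and percent values are only illustrative, not shipped defaults):

```
# lvm.conf, activation section - illustrative values
activation {
    monitoring = 1                        # let dmeventd watch thin pools
    thin_pool_autoextend_threshold = 75   # autoextend once 75% full (100 = off)
    thin_pool_autoextend_percent = 20     # grow the pool by 20% each time
}
```

Monitoring of an already-created pool can then be toggled with "lvchange --monitor y rhel/pool00" and inspected with "lvs -o+seg_monitor".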

Comment 10 Zdenek Kabelac 2015-01-05 13:15:53 UTC
Although we have 2 bugs-in-one here - reassigning to Joe to resolve the metadata resize weirdness.

I'll do a separate fix for '--chunksize' & lvconvert.

The metadata in the attachment are not the 'original' 112MB (that seems to be lost for now) - but even the resized 440MB shows strange results and occupies 90% of the metadata space.

Interestingly, converting 440->112->224 then shows just 50%.

Comment 11 Mike Snitzer 2015-01-05 13:56:30 UTC
(In reply to Matěj Cepl from comment #2)
> Created attachment 972609 [details]
> crash on startup of the system with too little space for metadata (perhaps
> none)

We really need the preceding thin-pool errors.

This screen grab is useless beyond telling us that XFS hit a NULL pointer (which obviously is a bug; cc'ing Eric, _but_ I really doubt enough context was provided for Eric or any other XFS developer to _really_ fix this, so they'll have to resort to trying to reproduce).

I suspect that the default 'no_space_timeout' of 60 seconds expired and the thin-pool switched to read-only mode.  At which point write IOs were returned to XFS as errors.
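The 'no_space_timeout' mentioned above is a runtime-tunable parameter of the dm_thin_pool kernel module; a small sketch of checking it (the parameter name and sysfs path are real, but the file only exists once the module is loaded):

```shell
# Read the dm-thin no_space_timeout (seconds a full pool queues IO before
# erroring out); the sysfs file appears only when dm_thin_pool is loaded.
param=/sys/module/dm_thin_pool/parameters/no_space_timeout
if [ -r "$param" ]; then
    echo "no_space_timeout: $(cat "$param")s"   # kernel default is 60
else
    echo "dm_thin_pool module not loaded"
fi
# Setting it to 0 (as root) would make a full pool queue writes
# indefinitely instead of timing out:
#   echo 0 > "$param"
```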

(In reply to Matěj Cepl from comment #3)
> Created attachment 972610 [details]
> /var/log/messages from the crashed system (upto the crash itself)

Again, there are no messages about thinp, so this messages file is useless.
We need console logging, or, if a crashdump was collected, the output of 'dmesg' from the crash utility.

I see no evidence of this being a thin-pool BUG; reassigning to zkabelac, as the default of _not_ monitoring thin pools seems very flawed. Not to mention the fix needed from comment #10.

Comment 12 Zdenek Kabelac 2015-01-30 12:20:31 UTC
chunksize is now preserved in lvconvert with upstream patch:

https://www.redhat.com/archives/lvm-devel/2015-January/msg00068.html

However, lvm2 cannot by itself deal with the problem that doubling the metadata size still leaves the metadata 100% full.

So passing back to Joe.

The 'default' monitoring mechanism is yet to be decided.

Comment 14 Jonathan Earl Brassow 2017-07-27 16:26:42 UTC
(In reply to Zdenek Kabelac from comment #12)

A couple of things we need to check on for this bug:
1) is the change in metadata size now properly handled?
2) we need to make it easier to run 'repair'

zkabelac, could you get this started and check whether #1 is still a problem?

Comment 16 Jonathan Earl Brassow 2017-10-04 00:18:46 UTC
(In reply to Jonathan Earl Brassow from comment #14)

I believe #1 is already solved (need zkabelac to confirm).

We are working hard on solving #2 - that will be a separate bug.

So, if #1 is solved, then we can close this bug with an appropriate resolution and proceed with #2 on a separate bug.

Comment 18 Zdenek Kabelac 2020-11-18 17:08:55 UTC
There have been many improvements to the thin_repair tool, so I'd guess this particular issue is already handled correctly in the current release. But it's too late for any more RHEL 7 investigation; if the issue reoccurs on RHEL 8, we need to create a new bug.

