Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1074630

Summary: system hangs during boot when root FS is a thin volume in a pool on RAID LV
Product: Red Hat Enterprise Linux 7
Component: lvm2
lvm2 sub component: Default / Unclassified
Version: 7.0
Status: CLOSED CURRENTRELEASE
Reporter: Marian Csontos <mcsontos>
Assignee: LVM and device-mapper development team <lvm-team>
QA Contact: cluster-qe <cluster-qe>
Docs Contact:
Severity: unspecified
Priority: unspecified
CC: agk, heinzm, jbrassow, lmiksik, msnitzer, nperic, prajnoha, prockai, zkabelac
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: lvm2-2.02.105-13.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-06-13 11:27:52 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
blocked pvscans and systemd (no flags)
all processes (no flags)
console with systemd debugging on (no flags)
2-blocked tasks (no flags)
2-status of all tasks (no flags)

Description Marian Csontos 2014-03-10 17:18:40 UTC
Description of problem:
system hangs during boot when root FS is a thin volume in a pool on RAID LV

Version-Release number of selected component (if applicable):
lvm2-2.02.105-*.el7

How reproducible:
100%

Steps to Reproduce:
1. create a layout with thin LV in a pool built using RAID LV
2. use anaconda to install system
3. boot the system
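The layout from step 1 can be sketched as follows (a minimal sketch; the device, VG, and LV names are examples, not the ones from the original test, and in practice anaconda builds the equivalent stack):

```shell
# Example devices - adjust to the machine under test (requires root).
vgcreate vg0 /dev/sda1 /dev/sdb1

# RAID1 LV that will serve as the thin pool's data device
lvcreate --type raid1 -m 1 -L 10G -n pool0 vg0

# Convert the RAID LV into a thin pool
lvconvert -y --thinpool vg0/pool0

# Thin volume to hold the root FS
lvcreate -V 8G -T vg0/pool0 -n root
```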

Actual results:
- system hangs during boot
- root FS is still read-only; nothing works except SysRq - traces attached

Expected results:
- clean boot

Additional info:

Comment 1 Marian Csontos 2014-03-10 17:19:37 UTC
Created attachment 872803 [details]
blocked pvscans and systemd

Comment 2 Marian Csontos 2014-03-10 17:20:29 UTC
Created attachment 872804 [details]
all processes

Comment 3 Marian Csontos 2014-03-10 17:21:35 UTC
Created attachment 872805 [details]
console with systemd debugging on

Comment 5 Marian Csontos 2014-03-13 12:18:33 UTC
Created attachment 873952 [details]
2-blocked tasks

Comment 6 Marian Csontos 2014-03-13 12:19:53 UTC
Created attachment 873953 [details]
2-status of all tasks

There are a couple of pvscans coming out of suspend.

Without lvmetad (and pvscans) this boots fine.
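For reference, disabling lvmetad for a test like this can be sketched as follows (assumes the stock RHEL 7 lvm.conf layout; the exact steps on the reporter's machine may have differed):

```shell
# Turn off the lvmetad cache so activation falls back to direct scanning
sed -i 's/use_lvmetad = 1/use_lvmetad = 0/' /etc/lvm/lvm.conf

# Keep udev from starting the daemon on demand
systemctl mask lvm2-lvmetad.socket lvm2-lvmetad.service

# Rebuild the initramfs so the setting also applies in early boot
dracut -f
```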

Comment 7 Zdenek Kabelac 2014-03-13 12:24:22 UTC
Looks like a pvscan --refresh issue, executed when lvmetad discovers the whole VG.

Comment 8 Zdenek Kabelac 2014-03-13 12:31:23 UTC
Yep - looks like this refresh is not taking the read lock for the VG - which is a problem.

Comment 9 Peter Rajnoha 2014-03-13 12:33:45 UTC
(In reply to Zdenek Kabelac from comment #7)
> Looks like a pvscan --refresh issue, executed when lvmetad discovers the whole VG.

Yup, refresh is called to fix bug #954061 (we need to refresh the tables if the major:minor changes - e.g. someone taking out the disk and then putting it back...)

Comment 10 Peter Rajnoha 2014-03-13 12:51:49 UTC
(would be also fine if we could do the refresh conditionally - only when needed - when tables need reloading, not all the time)

Comment 11 Peter Rajnoha 2014-03-13 13:16:24 UTC
So what happens is that before each autoactivation, the VG refresh is called in case the VG is already activated - this repairs all tables to cover the situation as seen in bug #954061.

BUT, at the same time we need to take the VG write lock for the VG refresh (which is not happening at the moment and this causes the bug seen here).

Well, but even if I wanted to, I could not take the VG lock, since I don't know the VG name (lvmetad does not return it in the "VG complete" message). What I have is the VGID, and I can't do any locking with VGIDs - I need to read the VG first to get the name (in this case we call vg_read_internal).

That becomes a bit tricky then...
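For context, the code path discussed here is the one triggered from the udev rule for each PV event, roughly like the following (the device name is only an example); once lvmetad reports the VG as complete, the refresh-before-autoactivation path runs:

```shell
# udev invokes the PV scan with autoactivation for the uevent's device;
# with lvmetad enabled, this is what kicks off the VG refresh discussed above.
pvscan --cache --activate ay /dev/sda2
```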

Comment 12 Zdenek Kabelac 2014-03-13 14:05:12 UTC
Maybe 'lvmetad' could be enhanced to recognize the 1st completion, and for all subsequent ones it would send 'already_complete'.

For this, lvmetad would probably need to validate that it has received already-known data (matching the content in the lvmetad db) - and once the VG has already been reported 'complete', it would start to send 'already_complete' instead.

This could be disabled when a PV gets removed or 'different' metadata are submitted to lvmetad.

This minimizes the race in parallel '--refresh' execution (though the bug is still there, since it may get dmeventd into serious trouble anyway) - but it looks like a small step forward.

Petr - any better idea?

Comment 13 Peter Rajnoha 2014-03-14 15:36:28 UTC
Upstream commits:

https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=ada47c164ae56b586fae7500be2d6aa66f5323b1
https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=ca880a4f130a0d6111613e23f926c344217581a2
https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=551b6b799867860fc6a437be5298bc74b814ab19
https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=67c539f346a8259a4100ab10196486f8793fecce
https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=816197aaabbd41dd85d7728a6fd2992c148e124f
https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=5eef269f7719089db2474b3f732f3630467d1bbc

They enhance the lvmetad protocol to include the VG name and a "changed" flag returned from lvmetad as a response to the message from the PV scan. This additional information is then used to do a VG refresh only when necessary (the PV is new/reappeared or its metadata changed compared to the previous state). So during boot we avoid numerous refreshes that also caused the bug reported here (we're not detaching and reattaching PVs during boot - it's possible, but normally it does not happen).

There were numerous refreshes because the VG was already activated in initramfs and during the udev trigger done at boot after initramfs stage, *each* PV scan was marked with the "complete" flag since the VG was already complete from initramfs, hence causing the number of refreshes to be equal to the number of PVs for which the VG was already activated before in initramfs. 

Also, there's a VG read lock used now for the refresh before autoactivation and the autoactivation itself which helps protect the VG handling in case there's a VG write lock held already (e.g. running a command that modifies metadata and running udev trigger at the same time).


Though we should still protect the refresh in some better way in general. The same problem would arise if running several refreshes at once by calling "vgchange --refresh". The solution here would be to:

  - take the VG write lock for the refresh

  - or better, take the per-LV lock for each LV being refreshed, probably taking the LV deptree into account as well (the per-LV locking is planned to also resolve bug #878948)

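The remaining race can be illustrated by running several refreshes of the same VG concurrently (the VG name is an example); until one of the locking schemes above is in place, these can interleave:

```shell
# Two unsynchronized refreshes of the same VG - the scenario that still
# needs a VG write lock (or per-LV locks) to be safe.
vgchange --refresh vg0 &
vgchange --refresh vg0 &
wait
```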
For now, I'm moving this to POST for the original problem reported (so we can add a patch for RHEL7).

Comment 14 Peter Rajnoha 2014-03-19 09:07:09 UTC
Requesting blocker for this one, since without patching we can end up with a hang at boot when more complex stacks are used (like here in this report, with root FS on thin on RAID LV).

Comment 16 Marian Csontos 2014-03-27 10:27:22 UTC
I have checked that thin-pool on RAID and other complex stacks are bootable now (excluding dm-cache - Bug 1081435).

Comment 17 Nenad Peric 2014-03-27 12:24:02 UTC
Marking this BZ verified then, based on the test which passed successfully (the one which discovered the bug in the first place).

VERIFIED based on Comment #16

Comment 18 Ludek Smid 2014-06-13 11:27:52 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.