Bug 1339210

Summary: dmeventd does not see a PV inserted after it's been started
Product: [Community] LVM and device-mapper
Component: lvm2
lvm2 sub component: dmeventd
Version: 2.02.174
Status: CLOSED CURRENTRELEASE
Severity: unspecified
Priority: unspecified
Reporter: Vratislav Podzimek <vpodzime>
Assignee: David Teigland <teigland>
QA Contact: cluster-qe <cluster-qe>
CC: agk, anprice, bmarzins, bmr, dwysocha, heinzm, jbrassow, jonathan, lvm-team, msnitzer, prajnoha, teigland, thornber, zkabelac
Flags: rule-engine: lvm-technical-solution?, rule-engine: lvm-test-coverage?
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Last Closed: 2020-12-02 22:37:00 UTC
Type: Bug
Attachments:
  journal from the point in time when the disk was plugged in
  lvm.conf

Description Vratislav Podzimek 2016-05-24 11:58:13 UTC
Created attachment 1160993 [details]
journal from the point in time when the disk was plugged in

Description of problem:
If I plug in an external USB disk containing a thin pool and thin LVs, errors are reported, and when I try to write to one of the LVs it gets unmounted, because dmeventd thinks the PV (the disk) is missing even though it has been available the whole time.
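
(As an aside, a quick way to back up the claim that the disk is present the whole time is to compare the kernel's view with the device-mapper state; a rough sketch, with /dev/sdb standing in for the USB disk, which is an assumption:)

    lsblk /dev/sdb                 # the kernel sees the disk
    blkid /dev/sdb                 # the PV signature (TYPE="LVM2_member") is intact
    dmsetup table                  # the thin-pool/thin mappings were activated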

Version-Release number of selected component (if applicable):
lvm2-2.02.150-1.fc24.x86_64

How reproducible:
100%

Steps to Reproduce:
1. prepare an external (or virtual) disk with a thin pool and thin LVs
2. plug the disk in
3. see the errors in the journal output
4. mount one of the thin LVs and try to write to it (so that more metadata needs to be allocated?)
5. see that the thin LV got unmounted
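
(A rough command-level sketch of these steps; the device, VG, and LV names and sizes are made up for illustration, not taken from this report:)

    # On the external disk (assumed /dev/sdb), prepare a thin pool and a thin LV:
    pvcreate /dev/sdb
    vgcreate usbvg /dev/sdb
    lvcreate -L 1G -T usbvg/pool
    lvcreate -V 2G -T usbvg/pool -n thin1
    mkfs.ext4 /dev/usbvg/thin1

    # Unplug the disk, then plug it back in while lvmetad and dmeventd are running:
    journalctl -f &                                 # step 3: watch the journal for errors
    mount /dev/usbvg/thin1 /mnt                     # step 4
    dd if=/dev/zero of=/mnt/fill bs=1M count=512    # force the pool to allocate
    # step 5: the thin LV ends up unmounted by dmeventd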

Actual results:
dmeventd does not see the PV

Expected results:
dmeventd sees the PV and works with it just fine

Additional info:
Setting 'use_lvmetad=0' in lvm.conf fixes the issue, but the thin LVs are not auto-activated.
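
(The workaround corresponds to this lvm.conf fragment; a sketch only, the full configuration is in the attached lvm.conf:)

    global {
        # Workaround: do not trust lvmetad's cached view of devices.
        # Side effect noted above: thin LVs are no longer auto-activated
        # when the disk appears.
        use_lvmetad = 0
    }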

Comment 1 Vratislav Podzimek 2016-05-24 11:59:14 UTC
Created attachment 1160995 [details]
lvm.conf

Comment 2 Zdenek Kabelac 2016-05-24 12:02:52 UTC
dmeventd is not 'rescanning' the device list and thinks the PV id obtained from lvmetad is simply missing.
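
(For illustration only: from the command line, "rescanning" would look roughly like the following, which refreshes lvmetad's view of a newly plugged-in device; it does not change what an already-running dmeventd has cached, and the device name is an assumption:)

    pvscan --cache /dev/sdb    # register the new device's PV with lvmetad
    pvscan --cache             # or rebuild lvmetad's cache from all devices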

Comment 3 David Teigland 2016-05-24 14:54:39 UTC
A command that will "do something" because a device is missing should check that the device is really missing by reading from disk; it should not depend on lvmetad.  This is not quite the same as "not using lvmetad", because if the command changes metadata it should still tell lvmetad about that change.

The other problem here is that dmeventd is unmounting a file system.  It should only do that if a user specifically configures it (and even then I don't think dm/lvm should be meddling with file systems, which are outside its domain).
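
(For reference, the user-configurable part of dmeventd's thin-pool handling lives in lvm.conf's activation section; a sketch with illustrative values, not taken from the attached lvm.conf:)

    activation {
        monitoring = 1                        # let dmeventd monitor thin pools
        # 100 disables autoextension; anything lower makes dmeventd grow the
        # pool when usage crosses the threshold.
        thin_pool_autoextend_threshold = 70
        thin_pool_autoextend_percent = 20
    }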

Comment 4 David Teigland 2016-05-25 20:35:27 UTC
The commit on temp branch dev-dct-toollib-scan fixes the problem by updating the dev cache (list of devices on the system) explicitly at the start of each command.  A normal command does this anyway, so there's no change in that case.  But, in the case of lvm shell or dmeventd, a new command could run without updating the dev cache.  In those cases, the command may not see the latest devices added to the system, which happens in this bz.

However, one reason to use lvm shell is to avoid repeating some per-command overhead, like updating the dev cache.  So, the commit to update dev cache at the start of each command could reduce some efficiency when using lvm shell.

It's possible that a different patch could be written that would both fix the problem and preserve the current lvm shell behavior.  This would wait until a missing device became apparent somewhere during processing, at which point the command would update the dev cache and check again for the missing device.  It's not yet clear where a good place would be to do that.
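
(A sketch of how the lvm shell case can be exercised; the prompt, command choice, and device name are illustrative:)

    lvm                 # start the long-running shell
    lvm> pvs            # run any command before the disk is plugged in;
                        # the dev cache is built here
    # ... plug in the USB disk now, e.g. /dev/sdb appears ...
    lvm> pvs            # without a dev cache update at the start of each
                        # command, this may still not see the new device
    lvm> exit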

Comment 5 Fedora End Of Life 2017-07-25 20:52:38 UTC
This message is a reminder that Fedora 24 is nearing its end of life.
Approximately two weeks from now, Fedora will stop maintaining
and issuing updates for Fedora 24. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '24'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 24 reached end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed, as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 7 Jonathan Earl Brassow 2018-04-04 20:08:25 UTC
This is related to Bug 1351777.  I suspect they will be fixed together.

Comment 8 David Teigland 2018-04-04 21:30:38 UTC
The patch I mentioned in comment 4 was never added to lvm:
https://sourceware.org/git/?p=lvm2.git;a=commitdiff;h=c391117eaa92ae5a8ae50fa2f1f81d87f406cc28

Testing with lvm shell, the new scanning code appears to fix this (that is, the issue of the dev cache not being updated with the new device; I'm not sure whether that is also what is happening in the other bz).