Created attachment 1160993 [details]
journal from the point in time when the disk was plugged in

Description of problem:
If I plug in an external USB disk with a thin pool and thin LVs, errors are reported, and when I try to write to one of the LVs, it gets unmounted because dmeventd thinks the PV (disk) is missing even though it is available the whole time.

Version-Release number of selected component (if applicable):
lvm2-2.02.150-1.fc24.x86_64

How reproducible:
100%

Steps to Reproduce:
1. prepare an external (or virtual) disk with a thin pool and thin LVs
2. plug the disk in
3. see the errors in the journal output
4. mount one of the thin LVs and try to write to it (so that more metadata needs to be allocated?)
5. see that the thin LV got unmounted

Actual results:
dmeventd does not see the PV

Expected results:
dmeventd sees the PV and works with it just fine

Additional info:
Setting 'use_lvmetad=0' in lvm.conf fixes the issue, but then the thin LVs are not auto-activated.
Created attachment 1160995 [details] lvm.conf
dmeventd is not 'rescanning' the device list, so it concludes that the PV whose ID it obtained from lvmetad is simply missing.
A command that will "do something" because a device is missing should first check that the device is really missing by reading from disk; it should not depend on lvmetad for that. This is not quite the same as "not using lvmetad", because if the command changes metadata it should still tell lvmetad about that change.

The other problem here is that dmeventd is unmounting a file system. It should only do that if a user specifically configures it (and even then I don't think dm/lvm should be meddling with file systems, which are outside its domain).
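To make the proposed check concrete, here is a minimal Python sketch of "confirm on disk before acting". It is only an illustration of the idea in this comment, not lvm2 code: `handle_missing_pv` and the `read_from_disk`, `repair`, and `notify_daemon` callbacks are hypothetical names standing in for a real disk scan, the action taken for a missing PV, and the update sent to lvmetad.

```python
from typing import Callable, Set


def handle_missing_pv(
    pv_uuid: str,
    cached_pvs: Set[str],
    read_from_disk: Callable[[], Set[str]],
    repair: Callable[[str], None],
    notify_daemon: Callable[[str], None],
) -> bool:
    """Act on a missing PV only after a disk read confirms it is missing.

    All callbacks are hypothetical stand-ins, not lvm2 functions.
    Returns True if the repair action was taken.
    """
    if pv_uuid in cached_pvs:
        # The cached view already sees the PV; nothing to do.
        return False

    # The cache says "missing": do not trust it, read the disks directly.
    if pv_uuid in read_from_disk():
        # The device is present; the cached view was merely stale.
        return False

    # Confirmed missing: act, then report the metadata change to the daemon.
    repair(pv_uuid)
    notify_daemon(pv_uuid)
    return True
```

With a stale cache and the disk actually present, the function returns without calling `repair`, which is the behaviour this comment asks for. Whether the daemon should also be refreshed in the stale case is left open here.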
The commit on temp branch dev-dct-toollib-scan fixes the problem by updating the dev cache (the list of devices on the system) explicitly at the start of each command. A normal command does this anyway, so there's no change in that case. But in the case of lvm shell or dmeventd, a new command could run without updating the dev cache. In those cases, the command may not see the latest devices added to the system, which is what happens in this bz.

However, one reason to use lvm shell is to avoid repeating some per-command overhead, like updating the dev cache. So the commit to update the dev cache at the start of each command could cost some efficiency when using lvm shell.

It's possible that a different patch could be written that would both fix the problem and preserve the current lvm shell behavior. It would wait until a missing device became apparent somewhere during processing, at which point the command would update the dev cache and check again for the missing device. It's not yet clear where a good place to do that would be.
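The two approaches discussed in this comment can be contrasted in a small Python sketch. It models the idea only and assumes nothing about lvm2's real dev cache: `DevCache`, `refresh`, `find`, and the `scan` callback are hypothetical names, and the dictionary stands in for the devices present on the system.

```python
from typing import Callable, Dict, Optional

# Hypothetical scan function: returns {pv_uuid: device_path} for the system.
ScanFn = Callable[[], Dict[str, str]]


class DevCache:
    """PV-to-device mapping kept across commands in a long-lived process."""

    def __init__(self, scan: ScanFn) -> None:
        self._scan = scan
        self._devices: Dict[str, str] = scan()
        self.scan_count = 1  # counts scans so the cost is visible

    def refresh(self) -> None:
        """Eager approach: call this at the start of every command."""
        self._devices = self._scan()
        self.scan_count += 1

    def find(self, pv_uuid: str) -> Optional[str]:
        """Lazy approach: rescan only when a PV appears to be missing."""
        path = self._devices.get(pv_uuid)
        if path is None:
            self.refresh()
            path = self._devices.get(pv_uuid)
        return path


# A disk plugged in after the cache was built is found on the first miss.
system = {"pv-aaa": "/dev/sda2"}
cache = DevCache(lambda: dict(system))
system["pv-bbb"] = "/dev/sdb1"
cache.find("pv-aaa")  # cached hit, no rescan
cache.find("pv-bbb")  # miss, triggers exactly one rescan
```

In this sketch the lazy variant keeps cached lookups cheap and pays for a rescan only on a miss, at the price of one extra scan for every PV that really is missing. Where such a retry would fit in the real command processing is the open question raised above.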
This message is a reminder that Fedora 24 is nearing its end of life. Approximately 2 (two) weeks from now Fedora will stop maintaining and issuing updates for Fedora 24. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '24'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 24 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
This is related to Bug 1351777. I suspect they will be fixed together.
The patch I mentioned in comment 4 was never added to lvm:
https://sourceware.org/git/?p=lvm2.git;a=commitdiff;h=c391117eaa92ae5a8ae50fa2f1f81d87f406cc28

Testing with lvm shell, the new scanning code appears to fix this (that is, the issue of the dev cache not being updated with the new device; I'm not sure whether that is also what is happening in the other bz).