Description of problem:
Minor issue, but could cause (maybe) significant problems if lvm metadata recovery is needed at a later time (or other unexplored possibilities).

Version-Release number of selected component (if applicable):
Atomic Host 7.2.4
atomic-1.9-4.gitff44c6a.el7.x86_64
ostree-2016.1-2.atomic.el7.x86_64

How reproducible:
Trivial, with precise timing :(

Steps to Reproduce:
1. Run 'atomic host upgrade'
2. At the same time, expand a LV or make some other LVM metadata change.

Actual results:
# atomic host upgrade
Updating from: rhel-atomic-host-ostree:rhel-atomic-host/7/x86_64/standard
99 metadata, 639 content objects fetched; 190201 KiB transferred in 229 seconds
Copying /etc changes: 39 modified, 4 removed, 157955 added
error: During /etc merge: Failed to read modified config file 'lvm/archive/.lvm_$HOSTNAME_$NUMBER_$OTHERNUMBER': No such file or directory
Exit code 1

Expected results:
Exit-code 0 and contents of /etc/lvm/ tree correctly reflect prior metadata state.

Additional info:
Found this by accident, so an acceptable resolution is: "Don't do that". Opening bug for reporting purposes and in case anyone else hits this or maybe there could be a more sinister problem here.
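The error message suggests a list-then-read race: the merge records which files in /etc changed, and one of them (a short-lived LVM temporary file) is gone by the time it is read. The following is a minimal sketch of that kind of race, not ostree's actual merge code. The function names and the file name `.lvm_host_1_2` are hypothetical and exist only for illustration.

```python
import os
import shutil
import tempfile


def snapshot_then_copy(src_dir, dst_dir, between=None):
    """List files first, copy them later; `between` runs in the gap."""
    names = sorted(os.listdir(src_dir))  # step 1: record what exists now
    if between is not None:
        between()  # stands in for another process changing src_dir
    for name in names:  # step 2: copy using the (possibly stale) list
        shutil.copy2(os.path.join(src_dir, name), os.path.join(dst_dir, name))


def demo():
    """Remove a temporary file between listing and copying."""
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
        temp_file = os.path.join(src, ".lvm_host_1_2")  # hypothetical name
        with open(temp_file, "w") as f:
            f.write("in-progress metadata\n")
        try:
            snapshot_then_copy(src, dst, between=lambda: os.remove(temp_file))
        except FileNotFoundError as exc:
            return f"merge failed: {exc.strerror}"
        return "merge succeeded"


if __name__ == "__main__":
    print(demo())
```

Any writer that creates and removes files in the copied tree during the gap can trigger the same failure, which is why the timing has to be precise.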
Any chance we could support storing this data in `/var/lib/lvm`? For more information, see https://ostree.readthedocs.io/en/latest/manual/adapting-existing/

Component: rhel-server-atomic → lvm2
Assignee: walters → lvm-team
QA Contact: atomic-bugs → cluster-qe

Comment 2 Colin Walters 2016-08-08 16:37:25 EDT

At some point we'll probably tweak the ostree process such that we only do the config merge before rebooting. This would obviate this problem as well as the "config changes I make after preparing an upgrade are gone" problem.

But there are other advantages of having lvm store state in /var - we don't copy it at all, and it's also not something administrators should edit with `vi` etc.

It was noted that LVM_SYSTEM_DIR supports changing this today, but that *also* changes the location of the human-edited /etc/lvm/lvm.conf, which should remain in /etc.
The problem here is that these backups might be needed to recover the system and so we were reluctant to put them outside the root filesystem by default. You might want to be able to read the backup before you can get access to /var. (We also considered that - recent backups at least - should go into /boot.) If you understand the risks and accept them, or have imposed a mount point layout where the issue doesn't arise, you can put the directories elsewhere by editing backup/backup_dir and/or backup/archive_dir in lvm.conf.
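For anyone who accepts the risk described above, the change amounts to two settings in the `backup` section of lvm.conf. Here is a rough sketch of a helper that rewrites them in a config file's text. The function name and the `/var/lib/lvm` target are assumptions taken from the proposal in this bug, not an lvm2 tool, and the regex only handles simple one-line `key = "value"` assignments.

```python
import re

# Matches uncommented one-line assignments such as: backup_dir = "/etc/lvm/backup"
_DIR_SETTING = re.compile(
    r'^([ \t]*)(backup_dir|archive_dir)[ \t]*=[ \t]*"[^"]*"', re.MULTILINE
)


def relocate_backup_dirs(conf_text, base="/var/lib/lvm"):
    """Return conf_text with backup_dir/archive_dir pointing under `base`."""

    def repl(match):
        indent, key = match.group(1), match.group(2)
        subdir = "backup" if key == "backup_dir" else "archive"
        return f'{indent}{key} = "{base}/{subdir}"'

    return _DIR_SETTING.sub(repl, conf_text)


if __name__ == "__main__":
    sample = (
        "backup {\n"
        "\tbackup = 1\n"
        '\tbackup_dir = "/etc/lvm/backup"\n'
        "\tarchive = 1\n"
        '\tarchive_dir = "/etc/lvm/archive"\n'
        "}\n"
    )
    print(relocate_backup_dirs(sample))
```

Commented-out lines are left alone because the pattern only allows whitespace before the key.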
Is use of the existing config settings sufficient here?
(In reply to Alasdair Kergon from comment #2)
> The problem here is that these backups might be needed to recover the system
> and so we were reluctant to put them outside the root filesystem by default.
> You might want to be able to read the backup before you can get access to
> /var. (We also considered that - recent backups at least - should go into
> /boot.)

Did you consider/is it feasible to store the backup metadata inside LVM itself? I don't know offhand how much space is in the header (on each PV right?)

The thing is that it's really common to have / on LVM (it's the Fedora/RHEL default), so no matter where we put it on the filesystem it's going to be LVM-data-in-filesystem-in-LVM, so we'll have a bootstrapping problem unless we can get it without mounting filesystems.

> If you understand the risks and accept them, or have imposed a mount point
> layout where the issue doesn't arise, you can put the directories elsewhere
> by editing backup/backup_dir and/or backup/archive_dir in lvm.conf.

In the short term, I think we could change ostree-managed systems like Atomic Host to use /var/lib/lvm by default. I'll investigate this, though if we do this we'll immediately have an upgrade problem; maybe we can add a systemd unit to migrate on boot?

Another idea that would help would be a method/API for rpm-ostree to quiesce lvmetad temporarily while it's copying /etc.
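The "migrate on boot" idea could look roughly like the sketch below. It is only an illustration of the one-time move, under the assumption that the old and new locations are /etc/lvm and /var/lib/lvm. The function name is hypothetical, and the systemd unit that would run it is not shown.

```python
import os
import shutil


def migrate_lvm_history(old_root="/etc/lvm", new_root="/var/lib/lvm"):
    """Move backup/ and archive/ files from old_root to new_root.

    Safe to run on every boot: files already present at the destination
    are never overwritten, and a second run moves nothing.
    Returns the relative paths that were moved.
    """
    moved = []
    for sub in ("backup", "archive"):
        old_dir = os.path.join(old_root, sub)
        new_dir = os.path.join(new_root, sub)
        if not os.path.isdir(old_dir):
            continue  # nothing to migrate for this subdirectory
        os.makedirs(new_dir, exist_ok=True)
        for name in sorted(os.listdir(old_dir)):
            target = os.path.join(new_dir, name)
            if os.path.exists(target):
                continue  # keep whatever is already at the new location
            shutil.move(os.path.join(old_dir, name), target)
            moved.append(os.path.join(sub, name))
    return moved
```

A real migration would also have to run before anything changes LVM metadata during that boot, and lvm.conf would need to point at the new directories.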
lvm2 stores metadata history in a ring buffer inside a PV header. By default this ring buffer is about 1MB, and it needs to hold one extra copy, so the user's maximum metadata size is limited to below 0.5MB.

It's somewhat more 'obscure' to get that history in an 'easily' readable form than to read archive data from the filesystem.

The user is 'free' to take the risk and drop archiving and backups, at the price of a more complicated 'retrieval' of the history of metadata changes.

The lvm2 filesystem archive stores metadata in a somewhat more 'readable' fashion. You could also keep a large history if you want. It all depends on the user.

We normally advise the standard archive. If the user is aware of the risk he takes, archiving and backup can be 'relatively' safely disabled.
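The size limit follows from simple arithmetic, sketched below. The 1MB figure is the approximate default quoted in the comment; the function name is hypothetical and any fixed header overhead in the metadata area is ignored.

```python
def max_metadata_bytes(area_bytes=1024 * 1024, copies=2):
    """Upper bound on one metadata copy when the ring buffer must hold
    the current copy plus one extra copy (header overhead ignored)."""
    return area_bytes // copies


if __name__ == "__main__":
    print(max_metadata_bytes())  # half of the ~1MB area
```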
It might be worth noting that in the enterprise world it's quite common to have proper backups of /etc, so it's then easy for support to get some 'history' info even without physical access to the machine. That's quite a different case from a 'single' Fedora user, who typically has no backups and needs to retrieve the metadata from the disk header anyway.

We have a long-term BZ somewhere to make this a bit easier for the user, but ATM it does require some brain capacity to go through this process (though not much: the metadata is still all stored in ASCII form; just finding the right copy and handling the ring-buffer logic makes it a bit challenging).
This is definitely a real-world issue, it shows up in our CI for example: https://aos-ci.s3.amazonaws.com/ghprb/projectatomic/rpm-ostree/105b954aac26feb7eed139c4c42fb2b8f7fa5273.21579.100/artifacts/vmcheck.log So...I'm willing to carry the changes to move the bits by default to /var/lib/lvm in our post-scripts. It'd be useful though if LVM itself owned /var/lib/lvm, even if we didn't put anything in it initially.
But what is the real-world issue here? I'm interested only in the "how to recover when something went wrong or failed" aspects. If that means scripts have to be more intelligent about the layouts they choose, then so be it. Firstly, what is the mount point layout of the system you are considering? Is /var a different filesystem from / ? If it's different, is it in the same VG or a different VG?
OR are all these things - mount point layouts and everything - entirely in the hands of each user?
My feeling is still that moving from /etc to /var can only reduce - not enhance - the recovery options available (in general, across the complete userbase) and so isn't something we should be recommending. If atomic has some extra constraints it needs to impose, then the defaults could be changed on those systems, but how would that be identified? Does it use the same rpms or specially-built ones? Is there some identifier that says "I am an atomic system" that can be detected either at package install time or at runtime?
As for laying claim to a /var/lib/ subdir for lvm, yes, we could easily do that. (What we used to have in /var got moved to /run.)
The real-world issue is that right now, OSTree makes an assumption that the only content changing in /etc is a result of human

In general we assume /var is on /, but we do want to support /var being a full tmpfs.

For now, I propose we change the defaults on OSTree-managed systems, which is currently just Atomic Host, although there will soon be an "Atomic Workstation" or "Workstation OSTree"[1].

Anyways, here's the PR: https://pagure.io/fedora-atomic/pull-request/27

It's not beautiful, but it works.

However, I think we should also enhance ostree to be more robust with things like this; I filed: https://github.com/ostreedev/ostree/issues/545

[1] https://lists.fedoraproject.org/archives/list/desktop@lists.fedoraproject.org/thread/BVHY7IIYFUIVGSUFKY4WWHDIEURBOLDX/
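One way a tree copy could be made "more robust with things like this" is to tolerate files that vanish mid-copy instead of aborting. The sketch below illustrates that policy only; it is not what ostree does or what the linked issue proposes, and the function name is hypothetical. Whether silently skipping a vanished file is acceptable is itself an assumption.

```python
import os
import shutil


def copy_tolerating_vanished(src_dir, dst_dir):
    """Copy a tree, skipping files that disappear before they can be read.

    Returns (copied, vanished) as lists of paths relative to src_dir.
    """
    copied, vanished = [], []
    for dirpath, _dirnames, filenames in os.walk(src_dir):
        rel_dir = os.path.relpath(dirpath, src_dir)
        os.makedirs(os.path.normpath(os.path.join(dst_dir, rel_dir)), exist_ok=True)
        for name in sorted(filenames):
            rel_path = os.path.normpath(os.path.join(rel_dir, name))
            try:
                shutil.copy2(os.path.join(dirpath, name),
                             os.path.join(dst_dir, rel_path))
            except FileNotFoundError:
                vanished.append(rel_path)  # removed by another process; skip it
            else:
                copied.append(rel_path)
    return copied, vanished
```

A caller could log the `vanished` list or retry those paths, rather than failing the whole upgrade with exit code 1.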
(In reply to Colin Walters from comment #12)
> The real-world issue is that right now, OSTree makes an assumption that the
> only content changing in /etc is a result of human
>
> In general we assume /var is on /, but we do want to support /var being a
> full tmpfs.
>

What I'm missing here as primary info:

Is atomic supposed to be working with 'standard' released rpm packages?

Is there an assumption that lvm2 should auto-magically look for files, archives and backups elsewhere if, for example, some ENVVAR is set?

What is actually the recommended storage for backup/archive files? (Clearly, storing such files in tmpfs doesn't make any sense at all; you could disable their creation instead.)

Is atomic assuming it will not make ANY changes to the standard config files from the rpm, i.e. /etc/lvm/lvm.conf? (All the other settings are already fully configurable; the only tricky one is the location of the config file itself, which can be controlled by LVM_SYSTEM_DIR.)

> Anyways, here's the PR:
> https://pagure.io/fedora-atomic/pull-request/27
>
> It's not beautiful, but it works.
>
> However, I think we should also enhance ostree to be more robust with things
> like this; I filed: https://github.com/ostreedev/ostree/issues/545

So you've likely changed this already in your wrapper, so is there any problem with this logic?

I can hardly imagine automatic support in lvm2, unless we provided some 'configs' for special ENVVARS in some way. We already support some cascading of configs, so there could be some 'smart' support, but there still needs to be a clear way to identify an atomic system, since lvm2 clearly cannot change its defaults just to make atomic 'happy' and break everyone else's system.
Also, is there any document describing the purpose/usage of lvm2 in Atomic? Clearly you MAY NOT use lvm2 inside a container; lvm2 is usable only at HOST level. There are a number of daemons and settings whose defaults might need changing if the target is, for example, a minimal memory footprint; a number of the daemons might be counterproductive in this 'small' environment.
This bug appears to have been reported against 'rawhide' during the Fedora 26 development cycle. Changing version to '26'.
Still applies.

@Zdenek: There's no real difference in the *purpose* of LVM for Atomic/(rpm-)ostree systems versus yum-managed. Nor ideally for usage. This bug results from a conflict between ostree creating "snapshots" of /etc versus LVM writing files there.

At some point we'll address most of the issue here with https://github.com/projectatomic/rpm-ostree/issues/40

But even then, down the line we want to better support "sealing" /etc: https://github.com/projectatomic/rpm-ostree/issues/702

And in that model, LVM still shouldn't write snapshots to /etc.

Although in practice...maybe this bug is really just greatly exacerbated by the Docker use of devicemapper, and as we migrate to overlay2, we won't be changing the LVM state, and hence we won't need to back it up?
This bug appears to have been reported against 'rawhide' during the Fedora 27 development cycle. Changing version to '27'.
(In reply to Colin Walters from comment #16)
> Still applies.
>
> At some point we'll address most of the issue here with
> https://github.com/projectatomic/rpm-ostree/issues/40

We're going to move forward with fixing issue #40 to solve this problem and others like it.
(In reply to Dusty Mabe from comment #18)
> (In reply to Colin Walters from comment #16)
> > Still applies.
> >
> > At some point we'll address most of the issue here with
> > https://github.com/projectatomic/rpm-ostree/issues/40
>
> We're going to move forward with fixing issue #40 to solve this problem and
> others like it.

Also #40 links to #545 from ostreedev/ostree, which is where the problem will likely be fixed rather than in rpm-ostree.

https://github.com/ostreedev/ostree/issues/545
This one will be fixed soon in libostree.