Bug 1366584 - storing non-human-edited data in /etc/lvm/archive clashes with ostree model
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Fedora
Classification: Fedora
Component: lvm2
Version: 27
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: LVM and device-mapper development team
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-08-12 12:00 UTC by Colin Walters
Modified: 2018-04-06 22:08 UTC

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-06 22:08:00 UTC
Type: Bug
Embargoed:




Links
System: Red Hat Bugzilla
ID: 1365297
Private: 0
Priority: unspecified
Status: CLOSED
Summary: atomic host upgrade: Failed to read modified config file lvm/archive/
Last Updated: 2023-04-26 08:41:15 UTC

Internal Links: 1365297

Description Colin Walters 2016-08-12 12:00:01 UTC
Description of problem:
Minor issue, but could cause (maybe) significant problems if lvm metadata recovery is needed at a later time (or other unexplored possibilities).

Version-Release number of selected component (if applicable):
Atomic Host 7.2.4
atomic-1.9-4.gitff44c6a.el7.x86_64
ostree-2016.1-2.atomic.el7.x86_64

How reproducible:
Trivial, with precise timing :(

Steps to Reproduce:
1. Run 'atomic host upgrade'
2. At the same time, expand a LV or make some other LVM metadata change.

Actual results:
# atomic host upgrade
Updating from: rhel-atomic-host-ostree:rhel-atomic-host/7/x86_64/standard

99 metadata, 639 content objects fetched; 190201 KiB transferred in 229 seconds
Copying /etc changes: 39 modified, 4 removed, 157955 added
error: During /etc merge: Failed to read modified config file 'lvm/archive/.lvm_$HOSTNAME_$NUMBER_$OTHERNUMBER': No such file or directory

Exit code 1


Expected results:
Exit-code 0 and contents of /etc/lvm/ tree correctly reflect prior metadata state.

Additional info:
Found this by accident, so an acceptable resolution is: "Don't do that".

Opening bug for reporting purposes and in case anyone else hits this or maybe there could be a more sinister problem here.
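
The failure above looks like a gap between scanning /etc and reading it. Below is a minimal Python sketch of that gap, not the actual ostree merge code. The function names, the temporary file name `.lvm_tmp_1_2`, and the policy of skipping vanished files are all assumptions made for illustration.

```python
import os
import shutil
import tempfile


def merge_etc(changed, src, dst):
    """Copy each path in `changed` from src to dst; return the paths that vanished.

    `changed` is a listing computed earlier, so entries may be gone by now.
    """
    vanished = []
    for rel in changed:
        target = os.path.join(dst, rel)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        try:
            shutil.copy2(os.path.join(src, rel), target)
        except FileNotFoundError:
            # The file was removed between the scan and the copy.
            vanished.append(rel)
    return vanished


def demo():
    with tempfile.TemporaryDirectory() as root:
        src = os.path.join(root, "etc")
        dst = os.path.join(root, "new-etc")
        archive = os.path.join(src, "lvm", "archive")
        os.makedirs(archive)
        # Hypothetical file names; the dotfile stands in for a short-lived LVM temp file.
        for name in ("lvm.conf", os.path.join("archive", ".lvm_tmp_1_2")):
            with open(os.path.join(src, "lvm", name), "w") as f:
                f.write("data\n")
        # Step 1: the upgrade scans /etc for modified files.
        changed = sorted(
            os.path.relpath(os.path.join(d, n), src)
            for d, _, names in os.walk(src) for n in names
        )
        # Step 2: a concurrent LVM operation removes its temporary file.
        os.remove(os.path.join(archive, ".lvm_tmp_1_2"))
        # Step 3: the merge works from the now stale listing.
        return changed, merge_etc(changed, src, dst)
```

A merge that treats the missing file as fatal stops at step 3, which matches the error shown above. Recording the vanished path and carrying on is one possible tolerant policy, but whether that is safe for LVM metadata is a separate question.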

Comment 1 Colin Walters 2016-08-12 12:03:51 UTC
Any chance we could support storing this data in `/var/lib/lvm`?  For more information, see https://ostree.readthedocs.io/en/latest/manual/adapting-existing/

Component: rhel-server-atomic → lvm2
Assignee: walters → lvm-team
QA Contact: atomic-bugs → cluster-qe
Comment 2 Colin Walters 2016-08-08 16:37:25 EDT

At some point we'll probably tweak the ostree process such that we only do the config merge before rebooting.  This would obviate this problem as well as the "config changes I make after preparing an upgrade are gone" problem.

But there are other advantages of having lvm store state in /var - we don't copy it at all, and it's also not something administrators should edit with `vi` etc.

It was noted that LVM_SYSTEM_DIR supports changing this today, but that *also* changes the configuration of the human-edited /etc/lvm/lvm.conf which should remain in /etc.
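
To illustrate why LVM_SYSTEM_DIR is too blunt here, this is a small Python sketch of how the paths move together. It assumes the default archive and backup directories are derived from the system directory, as the comment above implies; `lvm_paths` is a hypothetical helper, not part of LVM.

```python
import os
import posixpath


def lvm_paths(env=None):
    """Return the config, archive and backup paths implied by LVM_SYSTEM_DIR."""
    env = os.environ if env is None else env
    # Assumption: /etc/lvm is the default system directory.
    system_dir = env.get("LVM_SYSTEM_DIR", "/etc/lvm")
    return {
        "config": posixpath.join(system_dir, "lvm.conf"),
        "archive": posixpath.join(system_dir, "archive"),
        "backup": posixpath.join(system_dir, "backup"),
    }


print(lvm_paths({}))
print(lvm_paths({"LVM_SYSTEM_DIR": "/var/lib/lvm"}))
```

With the variable set, the human-edited lvm.conf is also looked up under /var/lib/lvm, which is the side effect described above.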

Comment 2 Alasdair Kergon 2016-08-15 19:02:57 UTC
The problem here is that these backups might be needed to recover the system and so we were reluctant to put them outside the root filesystem by default.  You might want to be able to read the backup before you can get access to /var.  (We also considered that - recent backups at least - should go into /boot.)

If you understand the risks and accept them, or have imposed a mount point layout where the issue doesn't arise, you can put the directories elsewhere by editing backup/backup_dir and/or backup/archive_dir in lvm.conf.
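
For reference, a minimal Python sketch that renders such an override. The setting names backup/backup_dir and backup/archive_dir come from the comment above; the helper name and the /var/lib/lvm target are assumptions taken from the request in this bug, not a recommended default.

```python
def backup_section(base_dir):
    """Render an lvm.conf `backup` section that keeps backups and archives under base_dir."""
    return (
        "backup {\n"
        f'    backup_dir = "{base_dir}/backup"\n'
        f'    archive_dir = "{base_dir}/archive"\n'
        "}\n"
    )


print(backup_section("/var/lib/lvm"))
```

The printed section would replace or adjust the existing `backup` section in lvm.conf. The recovery caveat above still applies if /var is not readable when the backup is needed.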

Comment 3 Alasdair Kergon 2016-08-23 22:49:50 UTC
Is use of the existing config settings sufficient here?

Comment 4 Colin Walters 2016-08-26 14:42:44 UTC
(In reply to Alasdair Kergon from comment #2)
> The problem here is that these backups might be needed to recover the system
> and so we were reluctant to put them outside the root filesystem by default.
> You might want to be able to read the backup before you can get access to
> /var.  (We also considered that - recent backups at least - should go into
> /boot.)

Did you consider, or is it feasible, to store the backup metadata inside LVM itself? I don't know offhand how much space is in the header (on each PV, right?)

The thing is that it's really common to have / on LVM (it's the Fedora/RHEL default), so no matter where we put it on the filesystem it's going to be LVM-data-in-filesystem-in-LVM, so we'll have a bootstrapping problem unless we can get it without mounting filesystems.

> If you understand the risks and accept them, or have imposed a mount point
> layout where the issue doesn't arise, you can put the directories elsewhere
> by editing backup/backup_dir and/or backup/archive_dir in lvm.conf.

In the short term, I think we could change ostree-managed systems like Atomic Host to use /var/lib/lvm by default.  I'll investigate this, though if we do this we'll immediately have an upgrade problem; maybe we can add a systemd unit to migrate on boot?
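
A rough Python sketch of what such a migrate-on-boot step might do, assuming the unit simply moves existing files and never overwrites anything already in the new location. The function name and the default paths are illustrative; the systemd unit and the matching lvm.conf change are not shown.

```python
import os
import shutil


def migrate_lvm_state(old_root="/etc/lvm", new_root="/var/lib/lvm"):
    """Move archive/ and backup/ contents from old_root to new_root; safe to run repeatedly."""
    moved = []
    os.makedirs(new_root, exist_ok=True)
    for sub in ("archive", "backup"):
        old_dir = os.path.join(old_root, sub)
        new_dir = os.path.join(new_root, sub)
        if not os.path.isdir(old_dir):
            continue  # nothing to migrate (already done, or never created)
        os.makedirs(new_dir, exist_ok=True)
        for name in sorted(os.listdir(old_dir)):
            dest = os.path.join(new_dir, name)
            if os.path.exists(dest):
                continue  # never overwrite state already in the new location
            shutil.move(os.path.join(old_dir, name), dest)
            moved.append(os.path.join(sub, name))
    return moved
```

Running it a second time moves nothing, so the step could run on every boot. Files that collide with an existing name stay in the old directory for an administrator to inspect.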

Another idea that would help would be a method/API for rpm-ostree to quiesce lvmetad temporarily while it's copying /etc

Comment 5 Zdenek Kabelac 2016-08-26 14:52:26 UTC
lvm2 stores metadata history in an lvm2 ring buffer inside the PV header.

By default this ring buffer is about 1MB, and it needs to hold 1 extra copy.
So the user's maximum metadata size is limited to below 0.5M.
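
The arithmetic behind that limit, as a tiny Python sketch. It assumes the area is exactly 1 MiB and ignores any header overhead, so the real ceiling is somewhat lower; the function name is made up for illustration.

```python
def max_metadata_bytes(area_bytes, copies=2):
    """Upper bound on one metadata copy when the area must hold `copies` of it."""
    return area_bytes // copies


print(max_metadata_bytes(1024 * 1024))  # 524288 bytes, i.e. 0.5 MiB
```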

It's somewhat more 'obscure' to get those in an 'easily' readable form than reading archive data from the filesystem.

User is 'free' to take the risk and drop archiving and backup-ing at the price of more  complicated 'retrieval' of history of metadata changes.

lvm2 filesystem archive stores metadata in a bit more 'readable' fashion.
You could also have large history if you want so...

All depends upon a user. 

We normally advise the standard archive.

If the user is aware of the risk he takes, archiving & backup can be 'relatively' safely disabled....

Comment 6 Zdenek Kabelac 2016-08-26 15:11:10 UTC
Might be worth noting: in the enterprise world it's quite common to have proper backups of /etc, so it's then easy for support to get some 'history' info even without having physical access to the machine.

It's quite different case from a 'single' Fedora user - where he typically has no backups - and needs to retrieve  metadata out of disk header anyway....

We have a long-term BZ somewhere to make this a bit easier for a user - but ATM it does require some brain capacity to go through this process..
(though not much -  metadata are still all stored in ASCII form - just finding the right one and also handling ring-buffer logic makes it a bit challenging....)
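
To give a feel for the 'finding the right one' part, here is a simplified Python sketch that picks the newest copy out of a raw buffer. It assumes each stored copy is ASCII text containing a `seqno = N` line and that copies are separated by NUL bytes. It deliberately ignores the ring-buffer wrap-around, which is the hard part mentioned above, and it works on a synthetic buffer rather than a real PV.

```python
import re

SEQNO = re.compile(rb"seqno = (\d+)")


def newest_metadata(raw):
    """Return (seqno, text) for the highest-seqno ASCII chunk in raw, or None."""
    best = None
    # Treat NUL bytes as separators between stored copies (a simplification).
    for chunk in raw.split(b"\0"):
        match = SEQNO.search(chunk)
        if match is None:
            continue
        seqno = int(match.group(1))
        if best is None or seqno > best[0]:
            best = (seqno, chunk.decode("ascii", errors="replace"))
    return best


# Synthetic stand-in for a metadata area holding two generations.
sample = b"\0\0vg0 {\nseqno = 3\n}\n\0\0vg0 {\nseqno = 4\n}\n\0"
print(newest_metadata(sample)[0])
```

A copy that wraps around the end of the buffer would be split into two chunks here and would need to be stitched back together first.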

Comment 7 Colin Walters 2016-10-21 20:40:53 UTC
This is definitely a real-world issue, it shows up in our CI for example:

https://aos-ci.s3.amazonaws.com/ghprb/projectatomic/rpm-ostree/105b954aac26feb7eed139c4c42fb2b8f7fa5273.21579.100/artifacts/vmcheck.log

So...I'm willing to carry the changes to move the bits by default to /var/lib/lvm in our post-scripts.  It'd be useful though if LVM itself owned /var/lib/lvm,  even if we didn't put anything in it initially.

Comment 8 Alasdair Kergon 2016-10-21 21:07:51 UTC
But what is the real-world issue here?

I'm interested only in the "how to recover when something went wrong or failed" aspects.  If that means scripts have to be more intelligent about the layouts they choose, then so be it.

Firstly, what is the mount point layout of the system you are considering?
Is /var a different filesystem from / ?
If it's different, is it in the same VG or a different VG?

Comment 9 Alasdair Kergon 2016-10-21 21:16:05 UTC
OR are all these things - mount point layouts and everything - entirely in the hands of each user?

Comment 10 Alasdair Kergon 2016-10-21 21:21:27 UTC
My feeling is still that moving from /etc to /var can only reduce - not enhance - the recovery options available (in general, across the complete userbase) and so isn't something we should be recommending.

If atomic has some extra constraints it needs to impose, then the defaults could be changed on those systems, but how would that be identified?  Does it use the same rpms or specially-built ones?  Is there some identifier that says "I am an atomic system" that can be detected either at package install time or at runtime?
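
On the runtime half of that question, one possible check is sketched below in Python. It relies on the /run/ostree-booted marker file that ostree creates on systems booted from an ostree deployment; that marker is not mentioned elsewhere in this bug. The helper name and the `root` parameter (only there to make it testable) are assumptions.

```python
import os


def is_ostree_booted(root="/"):
    """Return True if the ostree boot marker exists under root."""
    return os.path.exists(os.path.join(root, "run", "ostree-booted"))


print(is_ostree_booted())
```

This only answers the runtime case. It says nothing useful at package install time, because packages for ostree systems are installed during the tree compose, not on the booted host.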

Comment 11 Alasdair Kergon 2016-10-21 21:40:46 UTC
As for laying claim to a /var/lib/ subdir for lvm, yes, we could easily do that.
(What we used to have in /var got moved to /run.)

Comment 12 Colin Walters 2016-10-24 18:28:21 UTC
The real-world issue is that right now, OSTree makes an assumption that the only content changing in /etc is a result of human

In general we assume /var is on /, but we do want to support /var being a full tmpfs.

For now, I propose we change the defaults on OSTree-managed systems, which is currently just Atomic Host, although there will soon be an "Atomic Workstation" or "Workstation OSTree"[1].


Anyways, here's the PR:
https://pagure.io/fedora-atomic/pull-request/27

It's not beautiful, but it works.

However, I think we should also enhance ostree to be more robust with things like this; I filed: https://github.com/ostreedev/ostree/issues/545



[1] https://lists.fedoraproject.org/archives/list/desktop@lists.fedoraproject.org/thread/BVHY7IIYFUIVGSUFKY4WWHDIEURBOLDX/

Comment 13 Zdenek Kabelac 2016-10-25 06:52:28 UTC
(In reply to Colin Walters from comment #12)
> The real-world issue is that right now, OSTree makes an assumption that the
> only content changing in /etc is a result of human
> 
> In general we assume /var is on /, but we do want to support /var being a
> full tmpfs.
> 

What I'm missing here as primary info:

Is atomic supposed to be working with 'standard' released rpm packages ?

Is there an assumption that lvm2 should auto-magically look for files, archives and backups elsewhere if, e.g., some ENVVAR is set?

What is actually recommended storage for backup/archive files
(as clearly storing such files in tmpfs doesn't make any sense at all - 
you could then disable their creation/generation instead)

Is atomic assuming it will not make ANY changes to the standard
config files of the rpm, i.e. /etc/lvm/lvm.conf ?
(since all other settings are already fully configurable - the only
tricky one is the location of the config file itself - which can be controlled by
LVM_SYSTEM_DIR)

> Anyways, here's the PR:
> https://pagure.io/fedora-atomic/pull-request/27
> 
> It's not beautiful, but it works.
> 
> However, I think we should also enhance ostree to be more robust with things
> like this; I filed: https://github.com/ostreedev/ostree/issues/545

So you've likely changed this already in your wrapper - so is there any problem with this logic ?

I could hardly imagine some automatic support in lvm2 - unless we would provide
some 'configs'  for special ENVVARS in some way.

We already support some cascading of configs so there could be some 'smart' support - but there still needs to be a clear way to identify an atomic system,
since lvm2 clearly cannot change its defaults just to make atomic 'happy' and break everyone else's system.

Comment 14 Zdenek Kabelac 2016-10-25 06:58:40 UTC
Also is there any document describing purpose/usage of lvm2 in Atomic ?

Clearly you MAY NOT use lvm2 inside container - lvm2 is usable only at HOST level.

There are a number of daemons and settings whose defaults might need changing if the target is, e.g., a minimal memory footprint - a number of daemons might be counterproductive in this 'small' environment.

Comment 15 Fedora End Of Life 2017-02-28 10:05:30 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 26 development cycle.
Changing version to '26'.

Comment 16 Colin Walters 2017-05-10 14:09:19 UTC
Still applies.

@Zdenek: There's no real difference in the *purpose* of LVM for Atomic/(rpm-)ostree systems versus yum-managed.  Nor ideally for usage.

This bug results from a conflict between ostree creating "snapshots" of /etc versus LVM writing files there.


At some point we'll address most of the issue here with https://github.com/projectatomic/rpm-ostree/issues/40

But even then, down the line we want to better support "sealing" /etc: https://github.com/projectatomic/rpm-ostree/issues/702

And in that model, LVM still shouldn't write snapshots to /etc.

Although in practice...maybe this bug is really just greatly exacerbated by the Docker use of devicemapper, and as we migrate to overlay2, we won't be changing the LVM state, and hence we won't need to back it up?

Comment 17 Jan Kurik 2017-08-15 09:03:08 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 27 development cycle.
Changing version to '27'.

Comment 18 Dusty Mabe 2018-01-23 16:34:17 UTC
(In reply to Colin Walters from comment #16)
> Still applies.
> 
> At some point we'll address most of the issue here with
> https://github.com/projectatomic/rpm-ostree/issues/40

We're going to move forward with fixing issue #40 to solve this problem and others like it.

Comment 19 Dusty Mabe 2018-01-23 16:58:43 UTC
(In reply to Dusty Mabe from comment #18)
> (In reply to Colin Walters from comment #16)
> > Still applies.
> > 
> > At some point we'll address most of the issue here with
> > https://github.com/projectatomic/rpm-ostree/issues/40
> 
> We're going to move forward with fixing issue #40 to solve this problem and
> others like it.

Also #40 links to #545 from ostreedev/ostree, which is where the problem will likely be fixed rather than in rpm-ostree. 

https://github.com/ostreedev/ostree/issues/545

Comment 20 Colin Walters 2018-04-06 22:08:00 UTC
This one will be fixed soon in libostree.

