Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1365297

Summary:	atomic host upgrade: Failed to read modified config file lvm/archive/
Product:	Red Hat Enterprise Linux 7
Reporter:	Chris Evich <cevich>
Component:	ostree
Assignee:	Colin Walters <walters>
Status:	CLOSED WONTFIX
QA Contact:	atomic-bugs <atomic-bugs>
Severity:	low
Docs Contact:
Priority:	unspecified
Version:	7.5
CC:	agk, bmr, dornelas, heinzm, jbrassow, miabbott, msnitzer, Philippe.Lafoucriere, prajnoha, smilner, thornber, walters, zkabelac
Target Milestone:	rc
Keywords:	Extras, Reopened
Target Release:	---
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Story Points:	---
Clone Of:
Environment:
Last Closed:	2020-12-15 07:44:12 UTC
Type:	Bug
Regression:	---
Mount Type:	---
Documentation:	---
CRM:
Verified Versions:
Category:	---
oVirt Team:	---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1420851

Description Chris Evich 2016-08-08 20:30:50 UTC

Description of problem:
Minor issue, but it could cause significant problems if LVM metadata recovery is needed later, and there may be other consequences I haven't explored.

Version-Release number of selected component (if applicable):
Atomic Host 7.2.4
atomic-1.9-4.gitff44c6a.el7.x86_64
ostree-2016.1-2.atomic.el7.x86_64

How reproducible:
Trivial, with precise timing :(

Steps to Reproduce:
1. Run 'atomic host upgrade'
2. At the same time, expand a LV or make some other LVM metadata change.

Actual results:
# atomic host upgrade
Updating from: rhel-atomic-host-ostree:rhel-atomic-host/7/x86_64/standard

99 metadata, 639 content objects fetched; 190201 KiB transferred in 229 seconds
Copying /etc changes: 39 modified, 4 removed, 157955 added
error: During /etc merge: Failed to read modified config file 'lvm/archive/.lvm_$HOSTNAME_$NUMBER_$OTHERNUMBER': No such file or directory

Exit code 1


Expected results:
Exit code 0, and the contents of the /etc/lvm/ tree correctly reflect the prior metadata state.

Additional info:
Found this by accident, so an acceptable resolution is: "Don't do that".

Opening bug for reporting purposes and in case anyone else hits this or maybe there could be a more sinister problem here.

Comment 1 Colin Walters 2016-08-08 20:35:30 UTC

Hey lvm team, any chance we could support storing this data in `/var/lib/lvm`?  For more information, see https://ostree.readthedocs.io/en/latest/manual/adapting-existing/

Comment 2 Colin Walters 2016-08-08 20:37:25 UTC

At some point we'll probably tweak the ostree process so that we only do the config merge right before rebooting.  This would obviate this problem as well as the "config changes I make after preparing an upgrade are gone" problem.

But there are other advantages of having lvm store state in /var - we don't copy it at all, and it's also not something administrators should edit with `vi` etc.

Comment 3 Chris Evich 2016-08-08 21:15:30 UTC

IIRC the important difference here is that /etc is "guaranteed" to be on the / filesystem whereas /var is not.  For low-level facilities (like LVM) I can see an argument for wanting to ease the pain in a (probably) small number of cases where someone's important data is on the line.

However, this problem is also likely reproducible using any tool/service that writes/changes/moves/locks files under /etc during an update.  Which is why I'm fine with a WONTFIX / "Don't do that" (i.e. docs) resolution.  I guess it depends on how many low-level and manual-recovery options we want to enable/support on this platform.

Comment 5 Bryn M. Reeves 2016-08-09 10:25:55 UTC

Set LVM_SYSTEM_DIR:

       LVM_SYSTEM_DIR
              Directory containing lvm.conf(5) and other LVM system files.  
              Defaults to "/etc/lvm".

man 7 lvm.

Comment 6 Bryn M. Reeves 2016-08-09 10:26:30 UTC

Err, man *8* lvm..
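
For illustration only, here is a minimal Python sketch of the LVM_SYSTEM_DIR suggestion: it builds an environment in which LVM tools would look for lvm.conf(5) and their other system files outside /etc. The target path "/var/lib/lvm" is an assumption taken from the earlier comment, and the helper name lvm_env is hypothetical. Nothing here has been tested on Atomic Host, and the chosen directory would need to already contain a usable lvm.conf.

```python
import os

# Assumed location under /var; the thread only floats "/var/lib/lvm" as an idea.
ASSUMED_LVM_DIR = "/var/lib/lvm"


def lvm_env(system_dir=ASSUMED_LVM_DIR, base_env=None):
    """Return a copy of the environment with LVM_SYSTEM_DIR set.

    lvm_env is a hypothetical helper; LVM_SYSTEM_DIR itself is the
    variable documented in lvm(8).
    """
    env = dict(os.environ if base_env is None else base_env)
    env["LVM_SYSTEM_DIR"] = system_dir
    return env
```

A caller could then pass the result to a child process, for example `subprocess.run(["lvm", "version"], env=lvm_env())`, so that only that invocation is affected rather than the whole system.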

Comment 8 Chris Evich 2016-08-11 17:26:40 UTC

I think NOTABUG is fine.

Maybe we should have a feature to configure LVM_SYSTEM_DIR in the atomic image to point at /var/... ?

Otherwise it's probably more practical to simply recommend not touching storage while you're running the ostree update.  I'll see about getting this into the knowledge base in case a customer hits it.

Comment 9 Chris Evich 2016-08-11 19:14:45 UTC

Looks great, thanks Derrick.

Comment 11 Philippe Lafoucriere 2016-08-26 14:05:24 UTC

I hit this issue on two CentOS Atomic nodes today.
The root partition was recently extended because it was full:

-bash-4.2# lvs
  LV          VG   Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  docker-pool cah  twi-aot--- 64.60g             85.85  53.13
  root        cah  -wi-ao---- 10.00g
  swap        cah  -wi-ao----  5.00g

Now we have enough space:

-bash-4.2# df -h
Filesystem                                      Size  Used Avail Use% Mounted on
/dev/mapper/cah-root                             10G  7.0G  3.0G  71% /
devtmpfs                                        3.9G     0  3.9G   0% /dev
tmpfs                                           3.9G     0  3.9G   0% /dev/shm
tmpfs                                           3.9G  4.3M  3.9G   1% /run
tmpfs                                           3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/sda1                                       297M  146M  152M  49% /boot
[...]

and I'm NOT in the middle of an LVM expansion (unless I missed a command after `xfs_growfs /`):

-bash-4.2# atomic host upgrade -r
Updating from: centos-atomic-host:centos-atomic-host/7/x86_64/standard
1 metadata, 0 content objects fetched; 313 B transferred in 0 seconds
Copying /etc changes: 26 modified, 4 removed, 179400 added
error: During /etc merge: Failed to read modified config file 'lvm/archive/.lvm_atomic-test-node-2.priv.tech-angels.net_967_1993706434': No such file or directory

Thanks

Comment 12 Chris Evich 2016-08-26 19:36:30 UTC

Oh, now that is interesting!  Right: the thin pool auto-extends, so if that happens at the same time as the upgrade, you'd hit this.

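The error is a TOCTOU (time-of-check to time-of-use) race on the temp/lock file: the file is there when the /etc merge lists it as modified, but gone by the time the merge tries to read it.  Probably the code just needs to catch that error and refresh / retry the copy.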
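
To make the idea concrete, here is a rough Python sketch of a tree copy that tolerates files vanishing between listing and reading. It is not ostree's code (ostree is written in C and its /etc merge is more involved); the function name copy_tree_tolerant is hypothetical, and it takes the simpler "skip and report" route rather than a full refresh and retry.

```python
import os
import shutil


def copy_tree_tolerant(src, dst):
    """Copy files from src into dst, skipping any that vanish mid-copy.

    Returns the relative paths that were listed but gone when read.
    Assumes dst is a fresh directory (hypothetical helper, not ostree API).
    """
    vanished = []
    for root, _dirs, files in os.walk(src):
        rel_root = os.path.relpath(root, src)
        target_root = os.path.join(dst, rel_root)
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            source = os.path.join(root, name)
            target = os.path.join(target_root, name)
            try:
                # follow_symlinks=False copies a symlink as a symlink, so a
                # dangling link is not mistaken for a vanished file.
                shutil.copy2(source, target, follow_symlinks=False)
            except FileNotFoundError:
                # Listed by os.walk (check) but removed before the read (use).
                vanished.append(os.path.normpath(os.path.join(rel_root, name)))
    return vanished
```

Skipping is only reasonable for short-lived files such as LVM's temporary archive entries; a caller that needs a consistent snapshot would instead re-scan the tree and retry when the returned list is non-empty.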

So, clearly not as much of a "corner case" as I thought.  Okay, re-opening this for further investigation.

Comment 13 Micah Abbott 2016-09-22 13:34:36 UTC

I think the correct component is probably 'ostree' for this.

Comment 15 Colin Walters 2018-01-15 15:28:36 UTC

This will be fixed by https://github.com/ostreedev/ostree/issues/545

Comment 16 Steve Milner 2018-02-12 15:15:14 UTC

This is related to /etc merge before reboot.

Comment 18 RHEL Program Management 2020-12-15 07:44:12 UTC

After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.