Bug 2127118

Summary: Adjust location of aux-cache for image-based update systems
Product: Red Hat Enterprise Linux 9 Reporter: Colin Walters <walters>
Component: glibc Assignee: glibc team <glibc-bugzilla>
Status: CLOSED NOTABUG QA Contact: qe-baseos-tools-bugs
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 9.0 CC: ashankar, codonell, dj, fweimer, mnewsome, pfrankli, sipoyare
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2022-09-15 19:56:25 UTC Type: Bug

Description Colin Walters 2022-09-15 11:58:51 UTC
For image-based update systems (e.g. ostree and transitively rpm-ostree, used by RHEL for Edge and RHEL CoreOS today), one typically wants to define clear rules for which data is part of the read-only OS update stream and which is per-machine mutable state.  There is a bit on this at https://ostreedev.github.io/ostree/adapting-existing/#system-layout

The current glibc /var/cache/ldconfig/aux-cache is part of the former - it is a cache of data derived from /usr, and hence should live in /usr.  Having it be e.g. /usr/lib/ldconfig/aux-cache seems fine to me.

(I think it's possible for the ldconfig cache to also pick up entries from e.g. /usr/local - in that case, we want a separate cache at /usr/local/lib/ldconfig/aux-cache)


xref https://github.com/ostreedev/ostree-rs-ext/pull/367

Comment 1 Florian Weimer 2022-09-15 12:06:48 UTC
It's not accurate that the cache file is derived from /usr. It is derived from the paths listed in /etc/ld.so.conf, which could point outside of /usr. At least it's a true cache, in the sense that it is optional (unlike /etc/ld.so.cache, which is not actually a cache and thus misnamed).

As far as I can tell, glibc already complies with all FHS requirements around /var/cache (the main issue being that you can delete /var/cache/ldconfig at any time, and everything will still work), and as it's a true (optional) cache, it should indeed go under /var/cache. I'm not sure there is anything left to do for us here.
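
To make the point about /etc/ld.so.conf concrete, here is a minimal Python sketch that lists the directories named by an ld.so.conf-style file and reports which of them lie outside /usr. It is a simplified reading of the format (comments, blank lines, one directory per line, and `include` directives only) and not a reimplementation of ldconfig's parser. The function names `collect_ld_paths` and `outside_usr` and the `root` parameter are hypothetical, introduced only for this illustration.

```python
import glob
import os


def collect_ld_paths(conf_path, root="/", _seen=None):
    """Return the library directories named by an ld.so.conf-style file.

    Simplified: handles '#' comments, blank lines and 'include PATTERN'.
    Relative include patterns are resolved against the directory of the
    including file; absolute ones are resolved below `root`, so the
    sketch can also be pointed at an unpacked image tree.
    """
    seen = set() if _seen is None else _seen
    real = os.path.realpath(conf_path)
    if real in seen:  # guard against include loops
        return []
    seen.add(real)

    dirs = []
    with open(conf_path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.split("#", 1)[0].strip()
            if not line:
                continue
            parts = line.split(None, 1)
            if parts[0] == "include" and len(parts) == 2:
                pattern = parts[1]
                if os.path.isabs(pattern):
                    pattern = os.path.join(root, pattern.lstrip("/"))
                else:
                    pattern = os.path.join(os.path.dirname(conf_path), pattern)
                for included in sorted(glob.glob(pattern)):
                    dirs.extend(collect_ld_paths(included, root, seen))
            else:
                dirs.append(line)
    return dirs


def outside_usr(dirs):
    """Return the entries that are neither /usr nor below it."""
    return [d for d in dirs if d != "/usr" and not d.startswith("/usr/")]
```

On a live system, `outside_usr(collect_ld_paths("/etc/ld.so.conf"))` would list any configured directories that a /usr-only cache could not cover.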

Comment 2 Colin Walters 2022-09-15 12:58:07 UTC
>  It is derived from the paths listed in /etc/ld.so.conf, which could point outside of /usr.

Yes, though on a default install it doesn't.  

> As far as I can tell, glibc already complies with all FHS requirements around /var/cache (the main issue being that you can delete /var/cache/ldconfig at any time, and everything will still work), and as it's a true (optional) cache, it should indeed go under /var/cache. I'm not sure there is anything left to do for us here.

The FHS is an old document and was written before image-based update systems were popular (in some sense).  

Yes, it's a cache.  But what's crucial to note here is that because it's in /var, there's only *one copy* of it.  Whereas on an image-based system, there may be two (or more) copies of /usr.  Hence, having cache data not "lifecycle bound" with /usr can create atomicity problems.

What I'm suggesting is e.g. that we have one cache for /usr, and one for anything not in /usr.

I should be clear: This is not a critical problem today - today rpm-ostree just drops the file at build time, and I'm going to adjust the new container-native flow (https://fedoraproject.org/wiki/Changes/OstreeNativeContainer) to do so as well.  We can regenerate it per-machine on boot, or not do so.

But things would work better and be more elegant if data for /usr lived under /usr.

For another example of something that would work better if it was in /usr: OSTree supports IMA: https://ostreedev.github.io/ostree/ima/ - and a flow we want here is that the files under /usr are signed at build time.  These should not be mutated per-machine.  But a model where we regenerate /var/cache/ldconfig per-machine would inherently then not be signed.  And the data there is part of the critical "TCB" path, right?  If it's malicious data, then I could have glibc load a library I wrote (that's also not signed) in /usr/local/lib.  Whereas what we really want is to configure image-based RHEL systems (ostree, but also e.g. dm-verity and others) to *only execute signed code*, and to not allow code execution by writing to per-machine stateful partitions (/var).

Comment 3 Florian Weimer 2022-09-15 13:03:19 UTC
(In reply to Colin Walters from comment #2)
> For another example of something that would work better if it was in /usr:
> OSTree supports IMA: https://ostreedev.github.io/ostree/ima/ - and a flow we
> want here is that the files under /usr are signed at build time.  These
> should not be mutated per-machine.  But a model where we regenerate
> /var/cache/ldconfig per-machine would inherently then not be signed.  And
> the data there is part of the critical "TCB" path, right?

No, it's only used by ldconfig, to speed up the generation of /etc/ld.so.cache. Maybe that's the core of the misunderstanding? You don't actually need it at all during normal operation; its only purpose is to make ldconfig significantly faster (particularly on non-SSD storage).
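
The idea of a purely optional speed-up cache can be sketched in Python as follows. This is a conceptual illustration only, not glibc's aux-cache format or code: the class name `StatKeyedCache` is hypothetical, and keying entries on device, inode, size and change time is an assumption made for the example. The point it shows is that discarding the cache never changes results, only how much work is repeated.

```python
import os


class StatKeyedCache:
    """Memoize an expensive per-file computation, keyed by stat identity."""

    def __init__(self, compute):
        self._compute = compute  # expensive function: path -> result
        self._entries = {}
        self.misses = 0  # how often the expensive function had to run

    @staticmethod
    def _key(path):
        st = os.stat(path)
        # Assumed identity for this sketch: a changed file gets a new key.
        return (st.st_dev, st.st_ino, st.st_size, st.st_ctime_ns)

    def lookup(self, path):
        key = self._key(path)
        if key not in self._entries:
            self.misses += 1
            self._entries[key] = self._compute(path)
        return self._entries[key]

    def clear(self):
        """Dropping every entry is always safe; results are recomputed."""
        self._entries.clear()
```

Calling `clear()` here plays the role of deleting /var/cache/ldconfig: the next lookup is slower, but it returns the same answer.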

Comment 4 Colin Walters 2022-09-15 19:56:25 UTC
> No, it's only used by ldconfig, to speed up the generation of /etc/ld.so.cache. Maybe that's the core of the misunderstanding? You don't actually need it at all during normal operation; its only purpose is to make ldconfig significantly faster (particularly on non-SSD storage).

Yes, now that I look, I think you're right that I had confused this cache with /etc/ld.so.cache.
(The latter would potentially be nice to have in /usr, but it's not a big deal.)

I did
https://github.com/ostreedev/ostree-rs-ext/pull/367
which will ensure we're consistently cleaning out /var/cache entirely (along with this cache file)
as part of OS container builds.
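
As a rough illustration of that kind of build-time cleanup, the Python sketch below empties var/cache inside an unpacked root filesystem tree while keeping the directory itself. It is not the code from the pull request above; the function name `clean_var_cache` and the `rootfs` parameter are hypothetical and exist only for this example.

```python
import os
import shutil


def clean_var_cache(rootfs):
    """Remove everything under <rootfs>/var/cache, keeping the directory.

    Returns the sorted names of the top-level entries that were removed.
    """
    cache_dir = os.path.join(rootfs, "var", "cache")
    if not os.path.isdir(cache_dir):
        return []
    removed = []
    for name in sorted(os.listdir(cache_dir)):
        path = os.path.join(cache_dir, name)
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)  # real subdirectory, e.g. ldconfig/
        else:
            os.unlink(path)  # regular file or symlink
        removed.append(name)
    return removed
```

Because the aux-cache is optional, removing var/cache/ldconfig this way costs nothing beyond a slower first ldconfig run.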

Thanks for the quick reply and discussion and pointing me in the right direction!