Bug 2034360 - rpm-ostree and nss-altfiles integration is broken
Summary: rpm-ostree and nss-altfiles integration is broken
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: authselect
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Pavel Březina
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: sync-to-jira
Depends On: 2051545
Blocks: IoT F36BetaBlocker 2019052
Reported: 2021-12-20 18:54 UTC by Colin Walters
Modified: 2022-06-17 12:27 UTC

Fixed In Version: authselect-1.3.0-9.fc36
Clone Of:
Environment:
Last Closed: 2022-02-07 13:56:50 UTC
Type: Bug
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | authselect authselect pull 289 | 0 | None | open | Default to `--no-backup` if system is not initialized | 2022-01-05 17:47:07 UTC
Red Hat Issue Tracker | SSSD-4255 | 0 | None | None | None | 2022-06-17 12:27:39 UTC

Description Colin Walters 2021-12-20 18:54:23 UTC
Splitting this out from https://bugzilla.redhat.com/show_bug.cgi?id=2019052#c1


See https://github.com/coreos/fedora-coreos-tracker/issues/1051

We need nss-altfiles in /etc/nsswitch.conf for ostree based systems right now.

This is all the same as https://github.com/authselect/authselect/issues/48 etc.

Perhaps short term we can disable the script aspects of authselect.  

But let's avoid shipping this feature in Fedora 36 until this is working with ostree.  Can you take a look at this and comment?

My strawman proposal here is that rpm-ostree gains a simple way to inject this requirement.

A simple implementation of this would be detecting the presence of /usr/lib64/libnss_altfiles.so.2
or perhaps a "stamp file" like /usr/lib/nss-altfiles/required ?  (We can't rely on querying the rpm database
due to locking issues on traditional RPM and rpm-ostree explicitly denies reading it at all to scripts)

Or it could be an environment variable like `AUTHSELECT_FLAGS=nss-altfiles`.
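
The three detection options proposed above could be combined roughly as follows. This is only a sketch: the function name `needs_altfiles` and its root-prefix argument are hypothetical (the prefix exists so the check can be tried against a scratch directory), and nothing here is taken from rpm-ostree or authselect code. The paths and the `AUTHSELECT_FLAGS` variable are the ones suggested in the description.

```shell
# Hypothetical helper: decide whether nss-altfiles must stay in nsswitch.conf.
# $1 is an optional filesystem prefix (empty means the live system).
needs_altfiles() {
    root="${1:-}"
    # Option 1: the NSS module itself is installed.
    [ -e "$root/usr/lib64/libnss_altfiles.so.2" ] && return 0
    # Option 2: an explicit stamp file shipped by the OS builder.
    [ -e "$root/usr/lib/nss-altfiles/required" ] && return 0
    # Option 3: the builder exports a flag in the environment.
    case " ${AUTHSELECT_FLAGS:-} " in
        *" nss-altfiles "*) return 0 ;;
    esac
    return 1
}
```

None of the three checks queries the rpm database, which is the constraint the description calls out.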

Comment 1 Dusty Mabe 2021-12-20 20:58:19 UTC
Just for a little clarity, we opened this because we hit an issue in the FCOS rawhide stream (linked above by Colin in https://github.com/coreos/fedora-coreos-tracker/issues/1051). There are two problems that I see Colin is describing above:

1. dynamic run of authselect in %posttrans doesn't work for rpm-ostree composes
2. authselect doesn't consider nss-altfiles when generating nsswitch.conf

For 1. the problem here is that the authselect command tries to create files in /var/, which isn't allowed when doing rpm-ostree composes for good reason (see https://bugzilla.redhat.com/show_bug.cgi?id=1352154#c6). The request here is that we are able to teach authselect how to not create files in that location OR otherwise make it so we can configure things without having to run the authselect command.

For 2. I think Colin went into more detail in the original description.

Comment 2 Colin Walters 2022-01-05 18:42:41 UTC
ostree was created over 10 years ago now - I've spent more than a *decade* of my life now trying hard to make transactional, image based updates the default model for Fedora derivatives (along with other OSes).

I strongly believe that users of our operating systems should feel the confidence to perform OS updates while their laptop is low on battery, and server users should always be able to reliably roll back to a previous working configuration.
And that's been shipped - it works!  

And we're doing so while not compromising the ability to perform local modifications/overrides - it's still a Linux system.
xref https://blog.verbum.org/2019/12/23/starting-from-open-and-foss/

And we have exciting things in the pipeline, such as natively booting from container images: https://fedoraproject.org/wiki/Changes/OstreeNativeContainer

Now, as part of having transactional updates, we need to carry the "altfiles" bits for nsswitch.conf.
This isn't news to you of course, xref https://github.com/authselect/authselect/issues/48
which links to https://pagure.io/authconfig/c/78bb70e103907a47bfdcbc2dc1651d64eeb58147?branch=master
which is me fixing this bug 7 years ago.

I will continue to write what code needs to be written to keep this working, so that users who choose rpm-ostree based Fedora editions continue to have functional transactional, image based updates.

And right now, I have opened up the authselect codebase and am looking at this.
xref https://github.com/authselect/authselect/pull/288
xref https://github.com/authselect/authselect/pull/289 etc.

However, this is now something like the 4th time that authconfig/authselect has broken ostree.  All I ask really is - the next time there's some fundamental change to authselect, it'd be nice to have a direct ping about it.  Maybe even you could try it out on a Fedora CoreOS system - follow our documented build process: https://github.com/coreos/coreos-assembler/
But even just a heads-up would be nice, and one of the CoreOS team would be happy to try to test things out ourselves, work with you on design and implementation *before* it gets shipped in main Fedora.

Comment 3 Pavel Březina 2022-01-10 11:35:51 UTC
Hi, I'm sorry for the troubles, I was on PTO.

AFAIK the latest authselect release did not change much with respect to ostree. It unconditionally injects altfiles into passwd and group if an ostree system is detected. The scriptlet is located here:
https://src.fedoraproject.org/rpms/authselect/blob/rawhide/f/authselect.spec#_326
Only the sed is a little bit different (pre-F36 used the one on line 330, rawhide uses the one on line 332). Otherwise, it works as before, so the profiles on an ostree system should have altfiles enabled unconditionally.
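
To illustrate what "injecting altfiles into passwd and group" means in practice, here is a minimal sketch. It is not the sed expression from authselect.spec (which operates on authselect profile templates); the function name `inject_altfiles` is hypothetical and it works on a plain nsswitch.conf-style file, adding `altfiles` right after `files` on the two relevant lines.

```shell
# Hypothetical filter: print the file given as $1 with "altfiles" added after
# "files" on the passwd and group lines, unless altfiles is already listed.
# Other databases (hosts, services, ...) are left untouched.
inject_altfiles() {
    sed -E '/^(passwd|group):/{
/altfiles/!s/([[:space:]])files([[:space:]]|$)/\1files altfiles\2/
}' "$1"
}
```

Running the filter on its own output changes nothing further, since lines that already mention altfiles are skipped.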

Reading https://github.com/coreos/fedora-coreos-tracker/issues/1051, it looks like `grep OSTREE_VERSION` is not sufficient at this time. Why was it sufficient before and not now? Is it because authselect was not part of ostree compose but installed later by the user or anaconda and now it is part of the base ostree compose?

authselect needs to write some data. It writes it into /var/lib (_localstatedir), which should be used by applications to store their state. We can easily avoid the backup at this point, but it still needs to store a copy of the currently selected profile to be able to provide automatic updates; imho _localstatedir is the proper place for it. Is /var read-only only during the rpm transaction?

Comment 4 Dusty Mabe 2022-01-10 14:12:47 UTC
I think the thing that did change recently was that now authselect owns nsswitch.conf and the authselect command needs to run in posttrans, which is the bug we're hitting today. I think the description of https://github.com/coreos/fedora-coreos-tracker/issues/1051 lays it out pretty well.

Comment 5 Dusty Mabe 2022-01-10 14:16:38 UTC
(In reply to Pavel Březina from comment #3)
>
> <snip>
>
> authselect needs to write some data, it writes them into /var/lib
> (_localstatedir) which should be used by applications to store their state.
> We can easily avoid backup at this point, but it still needs to store copy
> of the currently selected profile to be able to provide automatic updates,
> imho _localstatedir is proper place for it. Is /var read-only only during
> rpm transaction?

Yes, Colin can correct me if I'm wrong, but /var/ is read-only during the
rpm-ostree compose. If a user manually runs authselect on a running system
then /var/ is read/write.

Comment 6 Colin Walters 2022-01-10 17:00:17 UTC
Yes, the interaction with `/var` is quite tricky here because authselect is something that can happen at both build time *and* per-machine.  Not many other things are in this set; there's e.g. systemd units as well as selinux policy.

It's important to understand we're not just making different paths read-only for fun.  The reason /var is read-only at build time is that all data shipped there is "owned" by the OS builder.
As the OS builder, we are responsible for the files we ship that live in `/usr`.  To say it again: we own them.

Notice it's actually the opposite for most processes run per machine - `/var` is writable, but `/usr` is not.  That's exactly because the idea is that all the data there is "owned" by the user.
Operating system updates should *never* put your vacation photos in `/var/home` at risk.  They should never be rolled back when OS updates are rolled back, etc.
This relates to https://blog.verbum.org/2020/08/22/immutable-%E2%86%92-reprovisionable-anti-hysteresis/

Comment 7 Pavel Březina 2022-01-11 09:56:15 UTC
(In reply to Dusty Mabe from comment #4)
> I think the thing that did change recently was that now authselect owns
> nsswitch.conf and the authselect command needs to run in posttrans, which is
> the bug we're hitting today. I think the description of
> https://github.com/coreos/fedora-coreos-tracker/issues/1051 lays it out
> pretty well.

Perhaps for someone who understands the internals. My question is, if grepping for ostree in /etc/os-release is not sufficient anymore, how can I know at this point that it runs on ostree system?

(In reply to Colin Walters from comment #6)
> Yes, the interaction with `/var` is quite tricky here because authselect is
> something that can happen at both build time *and* per-machine.  Not many
> other things are in this set; there's e.g. systemd units as well as selinux
> policy.
> 
> It's important to understand we're not just making different paths read-only
> for fun.  The reason /var is read-only at build time is that all data
> shipped there is "owned" by the OS builder.
> As the OS builder, we are responsible for the files we ship that live in
> `/usr`.  To say it again: we own them.
> 
> Notice it's actually the opposite for most processes run per machine -
> `/var` is writable, but `/usr` is not.  That's exactly because the idea is
> that all the data there is "owned" by the user.
> Operating system updates should *never* put your vacation photos in
> `/var/home` at risk.  They should never be rolled back when OS updates are
> rolled back, etc.
> This relates to
> https://blog.verbum.org/2020/08/22/immutable-%E2%86%92-reprovisionable-anti-
> hysteresis/

per-machine means that user invoked the process, build time means that it is run during rpm transaction?

build time - /usr is writable, /var read-only
per-machine - /usr is read-only, /var is writable

Do I get it right?

We can disable backups and switch to xattrs as suggested to prevent authselect writing to /var at this point. This should not be a problem. Or is there some workaround that can be applied directly to the spec file?

Comment 8 Dusty Mabe 2022-01-11 14:37:40 UTC
(In reply to Pavel Březina from comment #7)
> (In reply to Dusty Mabe from comment #4)
> > I think the thing that did change recently was that now authselect owns
> > nsswitch.conf and the authselect command needs to run in posttrans, which is
> > the bug we're hitting today. I think the description of
> > https://github.com/coreos/fedora-coreos-tracker/issues/1051 lays it out
> > pretty well.
> 
> Perhaps for someone who understands the internals. My question is, if
> grepping for ostree in /etc/os-release is not sufficient anymore, how can I
> know at this point that it runs on ostree system?

This wasn't completely obvious to me either so don't feel bad. You can use
`/run/ostree-booted` (yes, this works even when your builder isn't running
on an OSTree based system).

For clarity I'll reproduce some of the info from #1051 here:

There are actually two issues here:

 - the grep OSTREE_VERSION check won't have the desired effect because mutate-os-release won't run until after the RPM transaction is complete.
  - the /usr/bin/authselect select sssd with-silent-lastlog --force is trying to access /var/ which is marked readonly for scriptlets.
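
The difference between the two detection methods can be sketched as below. The function names and the root-prefix argument are hypothetical and only there so both checks can be compared against a scratch tree; the strings being tested are the ones quoted in this comment.

```shell
# Old check: relies on OSTREE_VERSION= being present in os-release. During a
# compose that line is only added after the RPM transaction, so a scriptlet
# running inside the transaction does not see it yet.
is_ostree_by_os_release() {
    grep -q "OSTREE_VERSION=" "${1:-}/etc/os-release" 2>/dev/null
}

# New check: relies on the marker file instead of os-release contents.
is_ostree_by_marker() {
    [ -e "${1:-}/run/ostree-booted" ]
}
```

In a tree where os-release has not been mutated yet but the marker exists, only the second function succeeds, which matches the first issue in the list above.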


I tried a patch to see if I could workaround the issue:

```diff
diff --git a/authselect.spec b/authselect.spec
index 52bfdd4..2c4b3bf 100644
--- a/authselect.spec
+++ b/authselect.spec
@@ -309,7 +309,7 @@ fi
 
 # Keep nss-altfiles for all rpm-ostree based systems.
 # See https://github.com/authselect/authselect/issues/48
-if %__grep "OSTREE_VERSION=" /etc/os-release &> /dev/null; then
+if test -e /run/ostree-booted; then
     for PROFILE in `ls %{_datadir}/authselect/default`; do
         %{_bindir}/authselect create-profile $PROFILE --vendor --base-on $PROFILE --symlink-pam --symlink-dconf --symlink=REQUIREMENTS --symlink=README &> /dev/null
 %if %{with_user_nsswitch}
@@ -323,7 +323,11 @@ fi
 # If we are upgrading from pre authselect-1.3.0 or this is a new installation
 # select the default configuration.
 if [ -f %{forcefile} ]; then
-    %{_bindir}/authselect select %{default_profile} --force &> /dev/null
+    if [ -e /run/ostree-booted ]; then
+        %{_bindir}/authselect select --nobackup %{default_profile} --force &> /dev/null
+    else
+        %{_bindir}/authselect select %{default_profile} --force &> /dev/null
+    fi
     %__rm -f %{forcefile}
 fi

```

But that only fixed the OSTREE_VERSION issue. Even with `--nobackup` I still couldn't get authselect to run with a readonly /var. We need a fix for that in authselect itself I presume.

Comment 9 Pavel Březina 2022-01-12 13:02:22 UTC
Ok, can you please try: https://koji.fedoraproject.org/koji/taskinfo?taskID=81141383

There are two changes:
1) It uses /run/ostree-booted to check if it is ostree system (if yes, altfiles is enabled unconditionally)
2) It moves files from /var/lib/authselect to /etc/authselect/.state and /var/lib/authselect/backups to /etc/authselect/backups

I'm not particularly happy about moving the files from /var to /etc, though if it works, I'll open a self-contained change for it. A fallback mechanism would be needed anyway for the case when xattrs are not supported, and the move also makes backups work, so it is the most straightforward way to make it work.

Comment 10 Pavel Březina 2022-01-12 13:56:03 UTC
Btw, the scriptlets also write to %{_localstatedir}/lib/rpm-state/%{name}.force to share state between %pre and %posttrans. This is the recommended way to share state between scriptlets. Is it allowed in ostree?

Comment 11 Timothée Ravier 2022-01-12 14:24:07 UTC
I think that you should just skip all the backup and state logic when detecting that you are running as part of an rpm-ostree compose (build). We don't need backups as they don't make sense because every run comes from a fresh installation. This would simplify the logic.

Comment 12 Jonathan Lebon 2022-01-12 15:57:56 UTC
(In reply to Pavel Březina from comment #9)
> Ok, can you please try:
> https://koji.fedoraproject.org/koji/taskinfo?taskID=81141383
> 
> There are two changes:
> 1) It uses /run/ostree-booted to check if it is ostree system (if yes,
> altfiles is enabled unconditionally)

Sounds good. I realized we had a gap related to this which is fixed by https://github.com/coreos/rpm-ostree/pull/3325.

> 2) It moves files from /var/lib/authselect to /etc/authselect/.state and
> /var/lib/authselect/backups to /etc/authselect/backups
>
> I'm not particularly happy about moving the files from /var to /etc, though
> if it works, I'll open a self-contained change for it. There would have to
> be a fallback mechanism for the case when xattrs are not supported anyway and it
> also makes backup work so it is the most straightforward way to make it work.

To expand a bit on what Timothée said, it should be fine to keep using `/var` for backups. E.g. when the user is interactively running `authselect select ...`, there's nothing preventing authselect from writing to `/var`. But in scriptlets on fresh installs there's no need to backup anything if there's nothing there yet, right? Ideally that part should be the same regardless of whether you're on an OSTree system or not. (I believe this is what the first commit in Colin's PR tries to achieve.)

For `.state`, yes I think this will need to go in `/etc` instead.

(In reply to Pavel Březina from comment #10)
> Btw, the scriptlets also write to
> %{_localstatedir}/lib/rpm-state/%{name}.force to share state between %pre
> and %posttrans. This is the recommended way to share state between
> scriptlets. Is it allowed in ostree?

Yes, this should be fine.

Wanted to say, thank you for taking a look at this and addressing these issues! If you'd like to discuss this in real time, feel free to come over to #fedora-coreos in the Libera IRC, or we can also schedule a video meeting.

Comment 13 Colin Walters 2022-01-12 17:49:52 UTC
> Btw, the scriptlets also write to %{_localstatedir}/lib/rpm-state/%{name}.force to share state between %pre and %posttrans. This is the recommended way to share state between scriptlets. Is it allowed in ostree?

Yes, this is documented RPM functionality so we carve out a special exception for it.

(I think it should be /run/rpm-state to make very clear it's not persistent and has nothing to do with user data, but...some day)
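
The %pre to %posttrans hand-off being discussed follows a simple marker-file pattern, sketched below. The function names and the state-directory argument are hypothetical (the argument stands in for %{_localstatedir}/lib/rpm-state so the pattern can be tried in a scratch directory); only the `authselect.force` file name comes from this thread.

```shell
# In %pre: record that the default profile must be forced later.
# $1 stands in for %{_localstatedir}/lib/rpm-state.
mark_force() {
    mkdir -p "$1" && : > "$1/authselect.force"
}

# In %posttrans: run the given command only if %pre left the marker, then
# remove the marker so the hand-off happens once per transaction.
consume_force() {
    state_dir="$1"; shift
    [ -f "$state_dir/authselect.force" ] || return 0
    "$@"
    rm -f "$state_dir/authselect.force"
}
```

As in the spec file quoted earlier, the marker is removed whether or not the command succeeded.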

Comment 14 Colin Walters 2022-01-12 21:26:59 UTC
Hi, I tested https://koji.fedoraproject.org/koji/taskinfo?taskID=81141383
and I've also updated my PR in https://github.com/coreos/rpm-ostree/pull/3308
and things seem to work. 

Can you push the PRs for the code corresponding to that to authselect upstream so I can more easily review it?

Comment 15 Pavel Březina 2022-01-13 11:10:52 UTC
(In reply to Colin Walters from comment #14)
> Hi, I tested https://koji.fedoraproject.org/koji/taskinfo?taskID=81141383
> and I've also updated my PR in https://github.com/coreos/rpm-ostree/pull/3308
> and things seem to work. 

If I understand it correctly, you override the scriptlet to always touch /var/lib/rpm-state/authselect.force. There are two scenarios when the file needs to be created, see my answer to Jonathan below.

> Can you push the PRs for the code corresponding to that to authselect
> upstream so I can more easily review it?

Only spec file changes are needed, if we don't do xattrs:
https://src.fedoraproject.org/fork/pbrezina/rpms/authselect/c/4c82ac047365eeeb365f48c75f3e38ea53ab1f0f?branch=rawhide

(In reply to Jonathan Lebon from comment #12)
> To expand a bit on what Timothée said, it should be fine to keep using
> `/var` for backups. E.g. when the user is interactively running `authselect
> select ...`, there's nothing preventing authselect from writing to `/var`.
> But in scriptlets on fresh installs there's no need to backup anything if
> there's nothing there yet, right? Ideally that part should be the same
> regardless of whether you're on an OSTree system or not. (I believe this is
> what the first commit in Colin's PR tries to achieve.)

The scriptlet runs 'authselect select <default profile> --force' under two situations:

1) fresh installation, no configuration exists and backup does not make sense here

2) upgrade from F35 to F36 from non-authselect configuration, in this case it is crucial to backup previous configuration so the user can restore it if needed. This must be possible without downgrading/rollback the whole system. Is this scenario supported with ostree? If not, then we need authselect to handle this.
 
> For `.state`, yes I think this will need to go in `/etc` instead.

xattrs are widely supported but there are still some file systems that may not have the support, so authselect would need to fall back to files in that case. If the fallback directory must be outside /var then it does not make much sense to implement it and I'd stick with the move to /etc. My question is:

Is it guaranteed that ostree system will always have xattrs support?

(In reply to Colin Walters from comment #2)
> But even just a heads-up would be nice, and one of the CoreOS team would be
> happy to try to test things out ourselves, work with you on design and
> implementation *before* it gets shipped in main Fedora.

Would it be possible for you to include https://copr.fedorainfracloud.org/coprs/g/authselect/nigthly in your tests?

Comment 16 Colin Walters 2022-01-13 15:15:50 UTC
> 2) upgrade from F35 to F36 from non-authselect configuration, in this case it is crucial to backup previous configuration so the user can restore it if needed. This must be possible without downgrading/rollback the whole system. Is this scenario supported with ostree? If not, then we need authselect to handle this.

In rpm-ostree systems, we build a new image server side from scratch for every update.   %post scripts are run server side.  They cannot make any decision based on the state of individual users' systems.

The way ostree works, a 3 way merge of /etc is performed.  Modified config files win, otherwise new default config files are applied.

It's also relatedly crucial to understand that each deployment has its own copy of /etc.  Which acts also as a backup.
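
The merge rule stated above ("modified config files win, otherwise new default config files are applied") can be sketched per file as follows. This is a deliberate simplification for illustration, not ostree's implementation; the function name `merge_etc_file` and its three path arguments are hypothetical.

```shell
# merge_etc_file OLD_DEFAULT NEW_DEFAULT CURRENT
# Prints the path whose content should end up in the new deployment's /etc.
merge_etc_file() {
    if cmp -s "$1" "$3"; then
        # CURRENT is identical to the old default: the new default applies.
        printf '%s\n' "$2"
    else
        # CURRENT was modified locally: the local version wins.
        printf '%s\n' "$3"
    fi
}
```

Applied to /etc/nsswitch.conf, an untouched file would be replaced by whatever the new image ships, while a locally edited one would be carried over unchanged.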

So, I think this still needs to be verified (obviously) but it should just work to upgrade from f35 to f36 on rpm-ostree systems with this.

Comment 17 Colin Walters 2022-01-13 15:36:45 UTC
Yep, tested an upgrade which went fine:

(/var/srv/walters has my local FCOS build with the patched authselect built using a patched rpm-ostree)

$ coreos-installer download -p qemu -f qcow2.xz --decompress
Downloading Fedora CoreOS stable x86_64 qemu image (qcow2.xz) and signature
> Read disk 622.7 MiB/622.7 MiB (100%)   
gpg: Signature made Tue Jan  4 16:30:50 2022 EST
gpg:                using RSA key 787EA6AE1147EEE56C40B30CDB4639719867C58F
gpg: Good signature from "Fedora (35) <fedora-35-primary>" [ultimate]
./fedora-coreos-35.20211215.3.0-qemu.x86_64.qcow2
$ cosa run --qemu-image fedora-coreos-35.20211215.3.0-qemu.x86_64.qcow2 --bind-ro /var/srv/walters,/var/srv/walters
...

[root@cosa-devsh ~]# rpm-ostree status
State: idle
AutomaticUpdatesDriver: Zincati
  DriverState: active; periodically polling for updates (last checked Thu 2022-01-13 15:28:43 UTC)
Deployments:
* fedora:fedora/x86_64/coreos/stable
                   Version: 35.20211215.3.0 (2022-01-04T18:57:51Z)
                    Commit: 30c82ee684674b9a552ffee709501f981f35f36408085f089686e43b09aeca1b
              GPGSignature: Valid signature by 787EA6AE1147EEE56C40B30CDB4639719867C58F
[root@cosa-devsh ~]# systemctl stop zincati
[root@cosa-devsh ~]# rpm-ostree rebase --experimental ostree-unverified-image:oci-archive:/var/srv/walters/builds/fcos/builds/36.20220113.dev.0/x86_64/fedora-coreos-36.20220113.dev.0-ostree.x86_64.ociarchive
Pulling manifest: ostree-unverified-image:oci-archive:/var/srv/walters/builds/fcos/builds/36.20220113.dev.0/x86_64/fedora-coreos-36.20220113.dev.0-ostree.x86_64.ociarchive
Importing: ostree-unverified-image:oci-archive:/var/srv/walters/builds/fcos/builds/36.20220113.dev.0/x86_64/fedora-coreos-36.20220113.dev.0-ostree.x86_64.ociarchive (digest: sha256:19a9b19ef9c2a45d858815b9827c89d0130b3a6b67174efec321d7e185426f34)
Downloading base layer: sha256:e059c8704f38d97a1ce159565933bfd6d1cdd8bd550c419e7b5fe713730fda98 (1.5 GB)
Staging deployment... done
Upgraded:
  ...
Downgraded:
  zram-generator 1.1.1-3.fc35 -> 1.1.1-2.fc36
Added:
  authselect-1.3.0-5.fc36.1.x86_64
  authselect-libs-1.3.0-5.fc36.1.x86_64 
  ...

[root@cosa-devsh ~]# rpm-ostree status
State: idle
AutomaticUpdatesDriver: Zincati
  DriverState: inactive
Deployments:
  ostree-unverified-image:oci-archive:/var/srv/walters/builds/fcos/builds/36.20220113.dev.0/x86_64/fedora-coreos-36.20220113.dev.0-ostree.x86_64.ociarchive
                    Digest: sha256:19a9b19ef9c2a45d858815b9827c89d0130b3a6b67174efec321d7e185426f34
                   Version: 36.20220113.dev.0 (2022-01-13T15:22:41Z)
                      Diff: 273 upgraded, 1 downgraded, 8 added

* fedora:fedora/x86_64/coreos/stable
                   Version: 35.20211215.3.0 (2022-01-04T18:57:51Z)
                    Commit: 30c82ee684674b9a552ffee709501f981f35f36408085f089686e43b09aeca1b
              GPGSignature: Valid signature by 787EA6AE1147EEE56C40B30CDB4639719867C58F
[root@cosa-devsh ~]# ls -al /etc/nsswitch.conf
-rw-r--r--. 1 root root 2159 Jan  4 19:24 /etc/nsswitch.conf

At this point, the update is queued for the next boot.  My running system is untouched.

[root@cosa-devsh ~]# systemctl reboot
...
[root@cosa-devsh ~]# ls -al /etc/nsswitch.conf 
lrwxrwxrwx. 1 root root 29 Jan 13 15:29 /etc/nsswitch.conf -> /etc/authselect/nsswitch.conf

Comment 18 Pavel Březina 2022-01-14 11:27:51 UTC
(In reply to Colin Walters from comment #16)
> > 2) upgrade from F35 to F36 from non-authselect configuration, in this case it is crucial to backup previous configuration so the user can restore it if needed. This must be possible without downgrading/rollback the whole system. Is this scenario supported with ostree? If not, then we need authselect to handle this.
> 
> In rpm-ostree systems, we build a new image server side from scratch for
> every update.   %post scripts are run server side.  They cannot make any
> decision based on the state of individual users' systems.
> 
> The way ostree works, a 3 way merge of /etc is performed.  Modified config
> files win, otherwise new default config files are applied.
> 
> It's also relatedly crucial to understand that each deployment has its own
> copy of /etc.  Which acts also as a backup.
> 
> So, I think this still needs to be verified (obviously) but it should just
> work to upgrade from f35 to f36 on rpm-ostree systems with this.

1. Fresh installation of FCOS, with initial authselect configuration
2. User runs 'authselect select winbind' to switch to winbind profiles, overwrites contents of /etc/authselect
3. New authselect update is available, which ships updated winbind profile
4. User updates the system - at this point 'authselect apply-changes' should be called to update the configuration

So if I understand it correctly:
a) since the contents of /etc/authselect are modified, it will keep them and not use what was built on the server (authselect select sssd --force from %posttrans)
b) 'authselect apply-changes' from %posttrans is never called on user host, therefore the configuration will not be updated. 

Is it correct? If yes, then clearly ostree model contradicts what authselect is solving.

Comment 19 Pavel Březina 2022-01-14 11:34:22 UTC
(In reply to Colin Walters from comment #16)
> > 2) upgrade from F35 to F36 from non-authselect configuration, in this case it is crucial to backup previous configuration so the user can restore it if needed. This must be possible without downgrading/rollback the whole system. Is this scenario supported with ostree? If not, then we need authselect to handle this.
> 
> In rpm-ostree systems, we build a new image server side from scratch for
> every update.   %post scripts are run server side.  They cannot make any
> decision based on the state of individual users' systems.
> 
> The way ostree works, a 3 way merge of /etc is performed.  Modified config
> files win, otherwise new default config files are applied.

(In reply to Colin Walters from comment #17)
> Yep, tested an upgrade which went fine:

This however assumes that the configuration was not locally modified. If it was locally modified, then it won't be overwritten by the update, right? Which is precisely what we want to do in F36, see the change page.
https://fedoraproject.org/wiki/Changes/Make_Authselect_Mandatory

Comment 20 Colin Walters 2022-01-14 14:07:24 UTC
Right.  We're simply at the first stage of having this whole thing not immediately fail in default configurations.  I'd like to proceed with getting that fixed and shipped first, because it will block a whole lot of things until it's done.


I see two further sub-threads/tasks here:

- Ensuring *client side* authselect works on new (i.e. newly provisioned) F36 rpm-ostree systems
  to choose a different profile.  This is actually something that would make a lot of sense to
  have declarative sugar for in https://github.com/coreos/butane
  In this case, to reiterate: /var and /etc is writable, /usr is not.  But I still think it
  makes sense to write backups to /etc anyways always, not /var.
- What you're talking about: Forcing authselect to win over modified config files across upgrades
  It is quite intentional today that ostree's core has *absolutely no client side scripting capability*.
  It's an image-based update system.   Instead, what we tend to do for these types of "one off transition"
  type things is put them in systemd units.  So if we truly want to have authselect forcibly win
  on modified config files on upgrade, then that's how we'd need to do it.  I personally am a bit
  skeptical of the value of this though.

Comment 21 Pavel Březina 2022-01-14 14:31:59 UTC
(In reply to Colin Walters from comment #20)
> Right.  We're simply at the first stage of having this whole thing not
> immediately fail in default configurations.  I'd like to proceed with
> getting that fixed and shipped first, because it will block a whole lot of
> things until it's done.

So, we can ignore backups on ostree and start writing the state into /etc/authselect/.state. I think we can avoid a change page for this, since this data should not be accessed by users at all. I'll leave backups in /var/lib/authselect/backups since that is user data, not configuration, and can be removed at will without affecting anything. This will make authselect work in scriptlets, does it make sense?

authselect select sssd --force needs to be run in order to provide initial configuration

authselect apply-changes is a noop here (since 'select' is run always on ostree) and it does not affect client side

> I see two further sub-threads/tasks here:
> 
> - Ensuring *client side* authselect works on new (i.e. newly provisioned)
> F36 rpm-ostree systems
>   to choose a different profile.  This is actually something that would make
> a lot of sense to
>   have declarative sugar for in https://github.com/coreos/butane
>   In this case, to reiterate: /var and /etc is writable, /usr is not.  But I
> still think it
>   makes sense to write backups to /etc anyways always, not /var.

To enable auto update of PAM and nsswitch.conf on the client side, 'authselect apply-changes' must be run after the update. Is this possible?

> - What you're talking about: Forcing authselect to win over modified config
> files across upgrades
>   It is quite intentional today that ostree's core has *absolutely no client
> side scripting capability*.
>   It's an image-based update system.   Instead, what we tend to do for these
> types of "one off transition"
>   type things is put them in systemd units.  So if we truly want to have
> authselect forcibly win
>   on modified config files on upgrade, then that's how we'd need to do it. 
> I personally am a bit
>   skeptical of the value of this though.

I think that this is not needed for ostree users, at least from what I read here. It will override unchanged configuration and keep changed configuration. This is pretty much a good thing. It would be good to document it and ask users to switch to authselect and create a custom profile for modified configs.

Comment 22 Colin Walters 2022-01-14 16:37:33 UTC
> To enable auto update of PAM and nsswitch.conf on the client side, 'authselect apply-changes' must be run after the update. Is this possible?

Yes, via a systemd unit.   Which can also be used on traditional yum systems.

Doing major changes in %post today has a lot of problems.  For one thing, actually today the %post invocation in authselect, much like many other %post scripts, directs log messages to /dev/null.
It also ignores errors.

That's...not great for trying to debug something that's making fundamental changes to your PAM configuration ;)

Now the reason we tend to do this is really because traditional yum isn't transactional - if authselect fails for some reason in the middle of a live update...well, there's no going back.
rpm-ostree can do much better here.  At some point I'd like to try to drive a change so that rpm-ostree can force all scripts to both log and error out if anything fails.
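
As a rough idea of what "log and error out" could look like at the scriptlet level, here is a small wrapper sketch. The name `run_logged` and the log-file argument are hypothetical; this is neither what authselect's scriptlets do today nor an rpm-ostree feature.

```shell
# run_logged LOGFILE CMD [ARGS...]
# Runs CMD, appends its stdout and stderr to LOGFILE instead of /dev/null,
# and returns CMD's exit status so the caller can fail instead of ignoring it.
run_logged() {
    log="$1"; shift
    printf 'running: %s\n' "$*" >> "$log"
    if "$@" >> "$log" 2>&1; then
        return 0
    else
        status=$?
        printf 'failed with status %s: %s\n' "$status" "$*" >> "$log"
        return "$status"
    fi
}
```

A scriptlet could then call something like `run_logged "$LOG" authselect select sssd --force || exit 1`, where `$LOG` is whatever log path the packager picks.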

Comment 23 Pavel Březina 2022-01-17 13:03:24 UTC
Can you share an example unit file with me that would be run once after a package update?

Would it also be possible to move 'authselect select --force' out of %post into the unit file (authselect would not be called in %post at all and we would not need to move files to /etc)? Though that would require that %{_localstatedir}/lib/rpm-state/%{name}.force is present on the client side (to detect upgrade/new installation path from %pre). It would also mean that nsswitch.conf and the PAM config are not present in the image - not sure if it is a problem or not.

Comment 24 Pavel Březina 2022-01-18 14:58:53 UTC
I chose to create a change page after all, to be on the safe side. Today is the last day for submission, hopefully it gets through.

https://fedoraproject.org/wiki/Changes/Authselect_Move_State_Files_To_Etc

Comment 25 Colin Walters 2022-01-20 19:13:26 UTC
> Can you share an example unit file with me that would be run once after a package update?

The simplest way to implement "run once" semantics is via e.g. `ConditionPathExists=!/etc/authselect/.transitioned` and then having the completion do `touch /etc/authselect/.transitioned`.

Actually in this case, it might be easiest to use `ConditionPathIsSymbolicLink=!/etc/nsswitch.conf` or so.  Then the systemd unit should be invoked when that is not a symlink, which is already a marker for authselect completion, right?

If you want to wire this together to "edge trigger" after a package change then I think it's doable by running `systemctl start authselect-transition.service` inside `%post` or so.
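
A minimal sketch of such a run-once unit, written out by a small shell helper. The marker path, the unit name `authselect-transition.service` and the `authselect apply-changes` command come from this thread; the helper name `write_transition_unit`, the `Description=` text, the `/usr/bin` paths and the `WantedBy=` target are assumptions for illustration only, not what authselect ships.

```shell
# Hypothetical helper: write a run-once unit file to the path given as $1.
write_transition_unit() {
    cat > "$1" <<'EOF'
[Unit]
Description=One-time authselect transition (illustrative sketch)
# Skip the unit entirely once the marker file exists.
ConditionPathExists=!/etc/authselect/.transitioned

[Service]
Type=oneshot
# Binary paths are assumptions.
ExecStart=/usr/bin/authselect apply-changes
# Only runs if ExecStart succeeded, so a failed run is retried on next start.
ExecStartPost=/usr/bin/touch /etc/authselect/.transitioned

[Install]
WantedBy=multi-user.target
EOF
}
```

For example, `write_transition_unit /etc/systemd/system/authselect-transition.service` would install it locally. With the symlink variant, the `ConditionPathExists=` line would be replaced by `ConditionPathIsSymbolicLink=!/etc/nsswitch.conf` and the `ExecStartPost=` line dropped.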


Anyways, I am not strongly arguing for this switch to a systemd unit.  Personally in this case, I think it works out OK to have yum-based systems do the transition via %post - and ostree systems should have it basically work for free.

Comment 26 Pavel Březina 2022-01-24 13:26:05 UTC
I'm waiting for the change page to get accepted, then I'll ask you for one more test of the final change and submit a build to rawhide.

Comment 27 Colin Walters 2022-01-24 23:21:56 UTC
OK, thanks!

Comment 28 Pavel Březina 2022-01-27 19:30:48 UTC
I'm not sure if the change will be accepted, it seems to be even more controversial than I thought.

I might have an elegant solution though, but I want to check if there isn't the same obstacle.

The problem is that /var is read-only during %post. What if we move the `authselect select --force` to %install instead and the files (in both /etc and /var) are part of the rpm? Are the contents of /var from the rpm copied to the image, or does that face the same problem?

Comment 29 Colin Walters 2022-01-31 17:58:46 UTC
I got pinged about this, so updating the status here:

Pavel has updated https://github.com/authselect/authselect/pull/291 and also https://copr.fedorainfracloud.org/coprs/g/authselect/pr291/

I tried this out, but I ended up with a system without `altfiles` in `/etc/authselect/nsswitch.conf`.

I think this is because the check for OSTREE_VERSION in /etc/os-release never actually worked (the change to do that happens only *after* the authselect bits today).
This relates to https://github.com/coreos/rpm-ostree/pull/3325

It should work to change the spec file to do `test -f /run/ostree-booted` instead.
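
A minimal sketch of that check as a shell helper. The `/run/ostree-booted` path is the one named above; the helper name `is_ostree_booted` and its optional path argument are hypothetical and exist only so the check can be tried against a scratch file.

```shell
# Hypothetical helper: succeeds when the ostree stamp file exists.
# The optional argument overrides the default path for experimentation.
is_ostree_booted() {
    test -f "${1:-/run/ostree-booted}"
}

if is_ostree_booted; then
    echo "ostree-based system: altfiles is needed in nsswitch.conf"
else
    echo "traditional system: no altfiles requirement detected"
fi
```

Unlike the OSTREE_VERSION check in /etc/os-release, this test does not depend on a file that is only updated after the authselect bits have run.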

Comment 30 Colin Walters 2022-01-31 22:04:39 UTC
OK this works for me with https://github.com/coreos/rpm-ostree/pull/3378

Comment 31 Dusty Mabe 2022-02-01 01:55:35 UTC
For the OSTREE_VERSION thing, shouldn't we fix the specfile? I could take that fix from the proposed patch in https://bugzilla.redhat.com/show_bug.cgi?id=2034360#c8 that only fixed half the problem and submit it against distgit.

Comment 32 Pavel Březina 2022-02-01 09:03:25 UTC
(In reply to Colin Walters from comment #29)
> Pavel has updated https://github.com/authselect/authselect/pull/291 and also
> https://copr.fedorainfracloud.org/coprs/g/authselect/pr291/
> 
> I tried this out, but I ended up with a system without `altfiles` in
> `/etc/authselect/nsswitch.conf`.

The reason is that I forgot to apply patches from downstream to upstream and the copr repository is built out of the pull request with upstream spec file. I added the patches now, it should correctly detect ostree and enable altfiles.

(In reply to Colin Walters from comment #30)
> OK this works for me with https://github.com/coreos/rpm-ostree/pull/3378

This should not be needed anymore.

New build is available here: https://copr.fedorainfracloud.org/coprs/g/authselect/pr291/build/3288607/

Comment 33 Colin Walters 2022-02-01 16:06:28 UTC
Hi, I've tested that latest build along with reverting https://github.com/coreos/rpm-ostree/pull/3378 and it does work.  
Pushed a PR for the revert: https://github.com/coreos/rpm-ostree/pull/3386

Thanks!

Comment 34 Pavel Březina 2022-02-03 10:58:17 UTC
Colin, can you please try https://koji.fedoraproject.org/koji/taskinfo?taskID=82330872? This is what will go to Fedora, once you test it I'll push it and build it.

Comment 35 Pavel Březina 2022-02-04 12:49:05 UTC
You can see the changes here https://src.fedoraproject.org/rpms/authselect/pull-request/16 but it's pretty much authselect master.

Comment 36 Pavel Březina 2022-02-07 13:58:24 UTC
I wanted to meet tomorrow's code completion deadline, so I went ahead and built the packages. It will be included in the next compose.

Comment 37 Colin Walters 2022-02-07 22:06:38 UTC
The rpm-ostree https://github.com/coreos/rpm-ostree/releases/tag/v2022.2 release is required for this too, which is also in rawhide/f36.

For FCOS, https://github.com/coreos/fedora-coreos-config/pull/1492  is pending to un-pin authselect there.

