Bug 2123812
Summary: | ostree installer composes fail in mock using systemd-nspawn | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Kevin Fenzi <kevin> |
Component: | distribution | Assignee: | Kevin Fenzi <kevin> |
Status: | CLOSED WORKSFORME | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | rawhide | CC: | dustymabe, jonathan, kevin, klember, lsedlar, lucab, miabbott, ngompa13, robertthomasfairley, travier, walters |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | No Doc Update | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2022-09-16 00:43:41 UTC | Type: | Bug |
Description
Kevin Fenzi
2022-09-02 16:47:07 UTC
Are there AVC denials on the host system? It may be that previously we were operating in an install_t context with cap_mac_admin, but not anymore. (Another way to check this is to compare ps axZ output of the running programs)

Now, IMO our use of mock for things like this should be considered legacy. I think we should be building things using more standard container tools - namely podman and Kubernetes. For Fedora CoreOS we have heavily invested in having our build and test tooling run as a standard container in standardized ways. For *this specific* problem, we end up launching a transient VM inside the container, because this ensures strong isolation from the host. mock privileged containers aren't doing that and are inherently going to lead to problems like this - nspawn doesn't help here.

All the koji builders are in selinux permissive mode. ;(

How tenable is it to use the old chroot mode just for this task?

What we're getting here is EINVAL, which I'm pretty sure is happening here https://github.com/torvalds/linux/blob/80e78fcce86de0288793a0ef0f6acf37656ee4cf/security/selinux/hooks.c#L3189

Crucially, I think this error isn't dependent on whether or not the system is in SELinux permissive mode, it is dependent on whether or not the caller has CAP_MAC_ADMIN: https://github.com/torvalds/linux/blob/80e78fcce86de0288793a0ef0f6acf37656ee4cf/security/selinux/hooks.c#L3136

It seems likely to me that nspawn is dropping this permission. Adding an invocation of `capsh --print` before the relevant command would likely say.

Medium term, we will make rpm-ostree work fully unprivileged - this was part of the big goal of https://github.com/coreos/rpm-ostree/issues/729 and we're pretty close, but not there yet.

So short term, I think we need to do old chroot or figure out how to get nspawn to give us the credentials.
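For anyone who wants to check the capability question without `capsh`, here is a minimal sketch that decodes the capability masks the kernel exposes in `/proc/<pid>/status`. The helper names (`parse_cap_mask`, `has_capability`, `read_own_status`) are made up for illustration, and bit 33 for CAP_MAC_ADMIN is taken from `<linux/capability.h>`; treat it as a debugging aid, not as what the builders actually ran.

```python
"""Decode capability masks from /proc/<pid>/status text (illustrative sketch)."""

CAP_MAC_ADMIN = 33  # bit number of CAP_MAC_ADMIN in <linux/capability.h>


def parse_cap_mask(status_text, field="CapEff"):
    """Return the integer bitmask stored under `field` (e.g. CapEff, CapBnd)."""
    for line in status_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == field:
            return int(value.strip(), 16)  # the kernel prints the mask in hex
    raise ValueError(f"{field} not found in status text")


def has_capability(status_text, cap_bit=CAP_MAC_ADMIN, field="CapEff"):
    """Return True if bit `cap_bit` is set in the chosen capability set."""
    return bool((parse_cap_mask(status_text, field) >> cap_bit) & 1)


def read_own_status():
    """Read the status file of the current process (Linux only)."""
    with open("/proc/self/status") as f:
        return f.read()
```

Running `has_capability(read_own_status())` inside the nspawn container, and again with `field="CapBnd"`, would show whether CAP_MAC_ADMIN is missing from the effective set, the bounding set, or both.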
(Or of course, also medium term, use podman/kubernetes which is how we should be running containers in production)

Moving back to distribution to denote that this is not short-term actionable on the (rpm-)ostree side today and must be fixed in the infrastructure invoking us (whether that's mock/koji/nspawn/etc.)

Alright. Thanks. So, currently there are only 2 places we can control this:

1) The koji tag can be set to use old-chroot or nspawn. This will apply to basically everything for that branch.

2) We can adjust site-defaults.cfg on builders. This will however apply to every branch/all things, but if we only change config_opts['nspawn_args'] it will be ignored by the non-nspawn branches.

So, we could add:

    config_opts['nspawn_args'] = ['--capability=cap_mac_admin']

But that would then apply to every build using nspawn. Is that something that would be bad to have enabled for all builds?

Alternately, perhaps we could get pungi to do ostree_installer runroot tasks with old-chroot passed to koji? Adding Lsedlar for comment on that approach. :)

This is actually interesting. Pungi always submits ostree tasks with the --new-chroot option. https://pagure.io/pungi/pull-request/411 If it no longer works, I don't see a problem with making it configurable.

ostree is always new-chroot, but ostree_installer (which is what is breaking here) is not. From what I can tell it's just using whatever default/setting koji has. So, can we adjust the ostree_installer phase to always use old chroot?

Couldn't we go new-chroot and add the nspawn args Kevin suggested in comment 5 from pungi for ostree_installer?

This would switch the option: https://pagure.io/pungi/pull-request/1636 Using new-chroot and customizing mock options is not possible from Pungi. The `koji runroot` does not have that level of granularity.

Thanks. After pondering some, I realized I can isolate the additional nspawn capability/arg to just runroot builders. So, that's a much smaller area.
So, I'd like to try that first, and if it doesn't do the trick, then go with the change in pungi... I'll know more with tomorrow's compose.

I think it would be better to move everything we can to new-chroot.

ok, that got them to compose. Can someone test that they actually work? :) today's rawhide 20220915.n.1

I gave https://dl.fedoraproject.org/pub/fedora/linux/development/rawhide/Silverblue/x86_64/iso/Fedora-Silverblue-ostree-x86_64-Rawhide-20220915.n.1.iso a quick spin and it seemed to work fine here.

Cool. Then, I think we have worked around this and don't need the pungi change after all. Many thanks to everyone for tracking things down here.

I'm late to the party but thanks a lot for fixing this one
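For reference, the nspawn workaround discussed in this bug could be sketched as a mock config fragment along these lines. This is an assumption-laden illustration, not the change that was actually deployed: mock config files are Python that runs with a `config_opts` mapping already defined, so the empty dict below is only a stand-in to keep the snippet self-contained, and `add_nspawn_arg` is a hypothetical helper. Appending instead of assigning avoids clobbering any `nspawn_args` a builder already sets.

```python
# Stand-in for the mapping mock normally provides to its config files.
config_opts = {}

# The argument quoted earlier in this bug; nothing else is assumed about nspawn.
MAC_ADMIN_ARG = '--capability=cap_mac_admin'


def add_nspawn_arg(opts, arg):
    """Append `arg` to opts['nspawn_args'] unless it is already present."""
    args = opts.setdefault('nspawn_args', [])
    if arg not in args:
        args.append(arg)
    return opts


# In a real site-defaults.cfg only the equivalent of this line would be needed.
add_nspawn_arg(config_opts, MAC_ADMIN_ARG)
```

As noted in the thread, a setting like this only affects builds that use nspawn, and limiting it to the runroot builders keeps the extra capability away from ordinary package builds.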