Bug 2075049
| Summary: | Don't default to FCOS | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Colin Walters <walters> |
| Component: | rust-coreos-installer | Assignee: | Antonio Murdaca <amurdaca> |
| Status: | CLOSED DEFERRED | QA Contact: | |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | CentOS Stream | CC: | amurdaca, bgilbert, bstinson, jwboyer, perobins, travier |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-06-28 09:42:15 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Colin Walters
2022-04-13 13:55:29 UTC
I can help throw a patch for this Colin - do we want to introduce a --distro flag and use the base URL for the given distro, if any? This way, shipping a wrapper in Fedora would be backward compatible too, and in RHEL we just error out requiring a flag or osmet?

I've only contributed a few patches to coreos-installer myself. I think I'd propose as a strawman though:
- Add a Cargo feature `fcos-default` (default to...off?)
- Add a conditional in the Fedora spec `%if !0%{rhel}` that turns it on
- Change the option processing code to have the "stream" argument under `if cfg!(feature = "fcos-default")`
- Change all code using StreamLocation::new to also be under cfg! and return an error if fcos-default is unset
(as you say, we basically error if no osmet)
I think adding --distro is a harder and more involved task, and probably raises questions around how this
intersects with Edge and requires more design. The strawman above just turns off the default for FCOS
when built for RHEL.
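
As a rough illustration of the strawman above, the following Rust sketch gates the FCOS default behind a Cargo feature and returns an error when neither osmet nor the feature is available. It assumes a `fcos-default = []` entry under `[features]` in `Cargo.toml`. The `ImageSource` type, both function names, the `"stable"` stream name, and the error text are hypothetical and are not taken from the coreos-installer source.

```rust
// Hypothetical sketch: these names are illustrative and are not
// coreos-installer's real API.

/// Where the install image would come from when no image option is given.
#[derive(Debug, PartialEq)]
enum ImageSource {
    /// Image reconstructed from osmet data shipped in the running OS image.
    Osmet,
    /// Image located through stream metadata for the named stream.
    Stream(String),
}

/// True only when the crate is built with `--features fcos-default`.
fn fcos_default_enabled() -> bool {
    cfg!(feature = "fcos-default")
}

/// Choose a default image source, or fail if this build has no default.
fn default_image_source(
    osmet_present: bool,
    fcos_default: bool,
) -> Result<ImageSource, String> {
    if osmet_present {
        return Ok(ImageSource::Osmet);
    }
    if fcos_default {
        // Assumed stream name, used here for illustration only.
        return Ok(ImageSource::Stream("stable".to_string()));
    }
    Err("no default image source in this build; pass --image-url or --image-file".to_string())
}
```

A caller would pass `fcos_default_enabled()` as the second argument, so a build without the feature fails with a clear message instead of silently installing FCOS.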
(In reply to Colin Walters from comment #2)
> I've only contributed a few patches to coreos-installer myself. I think I'd propose as a strawman though:
>
> - Add a Cargo feature `fcos-default` (default to...off?)
> - Add a conditional in the Fedora spec `%if !0%{rhel}` that turns it on
> - Change the option processing code to have the "stream" argument under `if cfg!(feature = "fcos-default")`
> - Change all code using StreamLocation::new to also be under cfg! and return an error if fcos-default is unset
> (as you say, we basically error if no osmet)
>
> I think adding --distro is a harder and more involved task, and probably raises questions around how this
agreed - wanted to triple check on that before just adding a cargo feature which is trivial compared to new flag anyway
if nobody claimed the issue already, I can work on it :)

> intersects with Edge and requires more design. The strawman above just turns off the default for FCOS when built for RHEL.

It's not that simple, unfortunately. The Fedora CoreOS docs heavily recommend using the container, which is built upstream and doesn't have the opportunity to ship differentiated wrapper scripts. Even if that weren't true, I think it's net _more_ confusing if different builds of coreos-installer have different defaults. (It's legal to use coreos-installer on e.g. Fedora to install FCOS, or RHCOS, or whatever.)

So we'd need to solve this in a unified way upstream, which I think means adding a `--distro` flag and going through a deprecation period where we warn if the flag is missing and default to `fcos`.

We'd also need to decide what non-`fcos` distros should do. I agree that there's no reasonable default image source for RHCOS. Does RHEL for Edge offer stream metadata that we can use to locate a default image?

> The Fedora CoreOS docs heavily recommend using the container, which is built upstream and doesn't have the opportunity to ship differentiated wrapper scripts.
But since the binary already defaults to FCOS, what would be the problem with the container including a wrapper?

> Even if that weren't true, I think it's net _more_ confusing if different builds of coreos-installer have different defaults.
I totally agree with this. But I'm not arguing for different defaults, but for the RHEL build to have *no* default.

> Does RHEL for Edge offer stream metadata that we can use to locate a default image?
I won't speak for them but AIUI it's really important to keep in mind that Image Builder is designed to make *custom* derived images, and...I don't think we want to try to enumerate all of those in our shipped binary right? I guess we could try to support a drop-in config file in /usr/lib and /etc or so. But it really seems simplest to just not have a default (unless osmet is detected).

>> The Fedora CoreOS docs heavily recommend using the container, which is built upstream and doesn't have the opportunity to ship differentiated wrapper scripts.
> But since the binary already defaults to FCOS, what would be the problem with the container including a wrapper?
I understood you to be proposing that the wrapper would be a downstream packaging thing, and not in the upstream repo at all. Even if we shipped it upstream, the goal is to avoid confusing non-FCOS users, right? Some of them will use the upstream container.

>> Even if that weren't true, I think it's net _more_ confusing if different builds of coreos-installer have different defaults.
> I totally agree with this. But I'm not arguing for different defaults, but for the RHEL build to have *no* default.
I understand, but I don't think that completely solves the issue here. Scripts (or users) that invoke `coreos-installer install /dev/qda` will start failing in some scenarios but not others.

> But it really seems simplest to just not have a default (unless osmet is detected).
I suppose that makes sense.
I'm not thrilled about the awkwardness of the --image-url flow, especially with the likely need for --insecure, but it does work.

Oh, there's also a corner case involving installation kargs. coreos-installer-service will happily run in OS images that don't ship osmet. I'm not sure if anyone uses that case right now, but we'd need to either disallow it or add something like a coreos.inst.distro karg.

The `download` and `list-stream` subcommands will need a --distro argument too.

In a world where OS images are signed, it'd make sense to always require --distro, even if --image-file or --image-url is specified. We can use that to select the correct verification keyring for the distro, rather than having a TLS CA situation where any distro can sign any image.

> I understood you to be proposing that the wrapper would be a downstream packaging thing, and not in the upstream repo at all. Even if we shipped it upstream, the goal is to avoid confusing non-FCOS users, right? Some of them will use the upstream container.
Ah...non-FCOS users are confused today and that's how we got here right? How would this proposal be more confusing?

But to expand on this, I am arguing for the creation of a `fedora-coreos-installer` script which does boil down to something like:

```
#!/bin/sh
exec coreos-installer --distro fcos "$@"
```

So I do agree we want --distro, but it'd *only* support "fcos" to start; we wouldn't *block* on trying to expand --distro to everything. That can be a phase 2.

And broadly speaking we'd try to re-train people to invoke `fedora-coreos-installer` if that's what they want and not just `coreos-installer`, maybe via the classic approach of "echo 'Please use fedora-coreos-installer or coreos-installer --distro fedora'; sleep 2" approach. In practice, this change might not happen for e.g. a year or more. Or maybe never. Dunno.

> I understand, but I don't think that completely solves the issue here.
> Scripts (or users) that invoke `coreos-installer install /dev/qda` will start failing in some scenarios but not others.
Well, I think I would say "the issue" of installing FCOS would be solved. Just failing (in the no-osmet case) with a useful error message I think is way, way better than installing FCOS in the case of Edge or RHCOS. Could it be better? Yes definitely.

> I suppose [no default] makes sense. I'm not thrilled about the awkwardness of the --image-url flow, especially with the likely need for --insecure, but it does work.
Agree.

> We can use that to select the correct verification keyring for the distro, rather than having a TLS CA situation where any distro can sign any image.
I like this goal, but managing keyrings and such is a whole big task that I wouldn't want to block on versus the IMO simple approach of just not having a default for the CentOS/RHEL coreos-installer binary.

> Ah...non-FCOS users are confused today and that's how we got here right? How would this proposal be more confusing?
Rather than behaving in a confusing but consistent way, we'd behave confusingly only in some flows, which is even more confusing.

> So I do agree we want --distro, but it'd *only* support "fcos" to start; we wouldn't *block* on trying to expand --distro to everything. That can be a phase 2.
Right, I'm on board with that, but see the keyring discussion below.

>> I understand, but I don't think that completely solves the issue here. Scripts (or users) that invoke `coreos-installer install /dev/qda` will start failing in some scenarios but not others.
> Well, I think I would say "the issue" of installing FCOS would be solved. Just failing (in the no-osmet case) with a useful error message I think is way, way better than installing FCOS in the case of Edge or RHCOS.
Anyone who accidentally installs FCOS is not using the program correctly; they're failing to specify the image they want to install.
(Or, in some of the historical cases, we've broken something in the OS image.) I agree that that's confusing, and should be improved, but it's basically a papercut. You're proposing to fix that papercut by actively breaking FCOS users (in some flows) who are using the program as intended. I agree that we probably need to do that, but we need a proper deprecation period and we need to do it consistently in all flows.

>> We can use that to select the correct verification keyring for the distro, rather than having a TLS CA situation where any distro can sign any image.
> I like this goal, but managing keyrings and such is a whole big task that I wouldn't want to block on versus the IMO simple approach of just not having a default for the CentOS/RHEL coreos-installer binary.
Right, I'm not proposing that we do the keyring management stuff now. What I'm proposing is that we consider always requiring --distro, even in cases where that currently wouldn't change our behavior, so that we don't have to make another breaking change later. I'm not 100% sold on the idea, but I think it's worth thinking about.

Moving discussion upstream https://github.com/coreos/coreos-installer/issues/1225
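
For readers following the `--distro` proposal from this thread, here is a minimal Rust sketch of how the pieces might fit together: a flag that accepts only `fcos` in the first phase, a deprecation warning when the flag is missing, and a per-distro keyring lookup. Every name in it (`Distro`, `parse_distro`, `resolve_distro`, `keyring_for`, the warning text, and the keyring name) is hypothetical. It is not code from coreos-installer and not necessarily what was decided upstream.

```rust
// Hypothetical sketch: these types and functions are illustrative and are
// not part of coreos-installer.

/// Distros the flag understands. Only `fcos` exists in this first phase.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Distro {
    Fcos,
}

/// Parse the value given to a `--distro` flag.
fn parse_distro(name: &str) -> Result<Distro, String> {
    match name {
        "fcos" => Ok(Distro::Fcos),
        other => Err(format!("unsupported distro: {}", other)),
    }
}

/// Resolve the distro for one invocation.
/// Returns the distro plus an optional deprecation warning to print.
fn resolve_distro(flag: Option<&str>) -> Result<(Distro, Option<String>), String> {
    match flag {
        Some(name) => Ok((parse_distro(name)?, None)),
        // Deprecation period: keep the old default, but warn about it.
        None => Ok((
            Distro::Fcos,
            Some("--distro not specified; defaulting to fcos (deprecated)".to_string()),
        )),
    }
}

/// Pick the verification keyring for a distro. The name is a placeholder.
fn keyring_for(distro: Distro) -> &'static str {
    match distro {
        Distro::Fcos => "fcos-keyring",
    }
}
```

Once the deprecation period ends, the `None` arm would return an error instead of a default, which is the "always require --distro" behavior raised in the last comments.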