Bug 1908047
Summary: | dnf downgrade does not work on Stream | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 8 | Reporter: | Phil Perry <phil> |
Component: | distribution | Assignee: | Brian Stinson <bstinson> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Release Test Team <release-test-team-automation> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | CentOS Stream | CC: | abokovoy, ajb, bstinson, carl, davide, donhoover, gianluca.cecchi, havard, james.antill, jwboyer, klaas, leonfauster, markus.falb, mattdm, me, mharri, mihai, ngompa13, pasik, pasteur, peter.georg, plarsen, ralloway, ricardo.barberis, swardle, toracat |
Target Milestone: | rc | ||
Target Release: | 8.4 | ||
Hardware: | All | ||
OS: | Linux | ||
Last Closed: | 2021-05-06 02:02:45 UTC | Type: | Bug |
Description
Phil Perry
2020-12-15 18:17:53 UTC
Is it even possible with our current tooling to preserve old builds previously pushed out when regenerating the repository? As far as I know, we don't even have this in Fedora because of technical limitations. Is there a way we can do this in CentOS Stream to begin with?

(In reply to Neal Gompa from comment #1)
> Is it even possible with our current tooling to preserve old builds previously pushed out when regenerating the repository?

Yes, it's technically possible. RHEL does this.

> As far as I know, we don't even have this in Fedora because of technical limitations. Is there a way we can do this in CentOS Stream to begin with?

I'm not aware of what those limitations would be outside of storage and being unkind to mirrors.

Would those concerned with this be willing to draft a proposal for us to look at? Off the top of my head, I can think of the following concerns:

1) Rollback applicability is a per-RPM item. The existence of multiple NVRs allows for the possibility of rollback, but it really depends on each update as to whether that will be successful. People need to know this isn't a magic solution and the real solution is to fix the broken RPMs with a newer update.

2) Mirror concerns. More RPMs is more load, storage, etc.

3) How many NVRs would you expect? My initial guess would be N and N-1 (e.g. foo-1.0-1.el8 and foo-1.0-2.el8). That makes the utility of this possible, albeit within a shorter time window for frequently updated packages.

I'm sure the Stream team has other concerns at the implementation level, because it would require significant work to make sure the build process and tagging is handled in a way that could accommodate this. Having a proposal to look at would help that team scope that work.

Thanks Josh for your reply.

(In reply to Josh Boyer from comment #2)
> 3) How many NVRs would you expect? My initial guess would be N and N-1 (e.g. foo-1.0-1.el8 and foo-1.0-2.el8).
> That makes the utility of this possible, albeit within a shorter time window for frequently updated packages.

If I may reply to this point. I would say that RHEL is able to keep all previous packages available. In the past CentOS Linux has kept all packages from within a point release directly available, and packages from all previous point releases available in an archive type (vault) repository. So we've been doing it for the last 15 years. This is a regression on that expected behaviour. I don't care what fedora does or doesn't do, this is not fedora.

As Stream has no 'point' releases, keeping all previous package versions makes most sense. At the very least it needs to be usable, which means users MUST have the ability to roll back packages if they experience issues until the bug is able to be reported, triaged, fixed, tested and released. In the meantime, numerous further package releases may happen in a fast moving CI environment like Stream which may or may not fix the particular bug in question. Think kernel packages which may eventually receive nightly builds. It may be N+10 that eventually receives the fix, and if the user wants to update and test each new package, and then roll back to the last working package, N+1 doesn't cut it.

If this isn't possible with current infrastructure, then the infrastructure is also broken and needs fixing.

Can we get this escalated to those who are able to fix it?

(In reply to Phil Perry from comment #3)
> Thanks Josh for your reply.
>
> (In reply to Josh Boyer from comment #2)
> > 3) How many NVRs would you expect? My initial guess would be N and N-1 (e.g. foo-1.0-1.el8 and foo-1.0-2.el8). That makes the utility of this possible, albeit within a shorter time window for frequently updated packages.
>
> If I may reply to this point. I would say that RHEL is able to keep all previous packages available.
> In the past CentOS Linux has kept all packages from within a point release directly available, and packages from all previous point releases available in an archive type (vault) repository. So we've been doing it for the last 15 years. This is a regression on that expected behaviour. I don't care what fedora does or doesn't do, this is not fedora.

Eh, it's slightly semantics. I agree you lack the ability to do rollbacks. What CentOS Linux has is different from what Stream would have/need under the same context. For example, CentOS Linux gets updates roughly every 6 months, which means twice a year you get a new NVR (plus any fixes released in the interim). With Stream, there may be a much faster update cadence to a subset of the packages which increases the retention requirements significantly. The problem spaces are not the same.

> As Stream has no 'point' releases, keeping all previous package versions makes most sense. At the very least it needs to be usable, which means users

I don't think you want that. First, it's not particularly useful over time. Even internally, we don't keep builds that are unreleased in some manner because literal per build retention is unnecessary. Also, as you add more NVRs to repos you increase the amount of metadata required for that repo.

For example, today Stream has:

Stream BaseOS primary.xml.gz of ~1MB and other.xml.gz of ~1.4MB
Stream AppStream primary.xml.gz of ~1.6MB and other.xml.gz of ~1MB

on RHEL 8 CDN, those equivalent files are:

BaseOS primary.xml.gz of ~14MB and other.xml.gz of ~278MB
AppStream primary.xml.gz of ~6MB and other.xml.gz of ~80MB

That's quite a bump and those files are downloaded every time the dnf cache times out. The RHEL numbers only capture updates actually sent to customers. It doesn't include builds that were only part of development. Stream would include builds that pass testing but aren't what wind up shipping to customers, increasing the overall size.
Adding all builds ever shipped in Stream is going to make that worse.

> MUST have the ability to roll back packages if they experience issues until the bug is able to be reported, triaged, fixed, tested and released. In the meantime, numerous further package releases may happen in a fast moving CI environment like Stream which may or may not fix the particular bug in question. Think kernel packages which may eventually receive nightly builds.

The kernel receives more than nightly builds. The important part here is that it doesn't land in Stream until it has been tested. That's a key element. While testing isn't going to catch everything, it's not about the number of builds done. It's about the ones that make it through testing that are still possibly buggy. We should focus on that for this proposal.

> It may be N+10 that eventually receives the fix, and if the user wants to update and test each new package, and then roll back to the last working package, N+1 doesn't cut it.

It's worth pointing out the kernel is probably a really bad example because you can install multiple kernels and do rollback that way. Let's use glibc or something else instead.

Also, I think we should consider that there are alternative forms of rollback available for systems that are configured appropriately. Doing it at the RPM level is technically possible in some cases, but using tools like ostree or Boom (https://www.redhat.com/en/blog/boom-booting-rhel-lvm-snapshots) might be a more robust solution and I'd advocate for those where applicable.

> If this isn't possible with current infrastructure, then the infrastructure is also broken and needs fixing.
>
> Can we get this escalated to those who are able to fix it?

If your proposal is "ship everything ever built" or even "ship everything ever released in Stream", that's going to be harder to pull off. I really don't think you want either though given some of the points above.
That kind of repo sprawl costs *every* Stream user for what they may never need. Can we start with something smaller than that and see how it works out?

(In reply to Josh Boyer from comment #4)
> (In reply to Phil Perry from comment #3)
> > Thanks Josh for your reply.
> >
> > (In reply to Josh Boyer from comment #2)
> > > 3) How many NVRs would you expect? My initial guess would be N and N-1 (e.g. foo-1.0-1.el8 and foo-1.0-2.el8). That makes the utility of this possible, albeit within a shorter time window for frequently updated packages.
> >
> > If I may reply to this point. I would say that RHEL is able to keep all previous packages available. In the past CentOS Linux has kept all packages from within a point release directly available, and packages from all previous point releases available in an archive type (vault) repository. So we've been doing it for the last 15 years. This is a regression on that expected behaviour. I don't care what fedora does or doesn't do, this is not fedora.
>
> Eh, it's slightly semantics. I agree you lack the ability to do rollbacks. What CentOS Linux has is different from what Stream would have/need under the same context. For example, CentOS Linux gets updates roughly every 6 months, which means twice a year you get a new NVR (plus any fixes released in the interim). With Stream, there may be a much faster update cadence to a subset of the packages which increases the retention requirements significantly. The problem spaces are not the same.
>
> > As Stream has no 'point' releases, keeping all previous package versions makes most sense. At the very least it needs to be usable, which means users
>
> I don't think you want that. First, it's not particularly useful over time. Even internally, we don't keep builds that are unreleased in some manner because literal per build retention is unnecessary.
> Also, as you add more NVRs to repos you increase the amount of metadata required for that repo.

Sure, we only need packages that are released to remain available.

> For example, today Stream has:
>
> Stream BaseOS primary.xml.gz of ~1MB and other.xml.gz of ~1.4MB
> Stream AppStream primary.xml.gz of ~1.6MB and other.xml.gz of ~1MB
>
> on RHEL 8 CDN, those equivalent files are:
>
> BaseOS primary.xml.gz of ~14MB and other.xml.gz of ~278MB
> AppStream primary.xml.gz of ~6MB and other.xml.gz of ~80MB
>
> That's quite a bump and those files are downloaded every time the dnf cache times out. The RHEL numbers only capture updates actually sent to customers. It doesn't include builds that were only part of development. Stream would include builds that pass testing but aren't what wind up shipping to customers, increasing the overall size.
>
> Adding all builds ever shipped in Stream is going to make that worse.

I totally get that. There are workarounds, like for example how CentOS Linux currently archives off at point release time. Stream could do the same - move older packages to an archive, but importantly still configurably available within yum for those who may still need them.

> > MUST have the ability to roll back packages if they experience issues until the bug is able to be reported, triaged, fixed, tested and released. In the meantime, numerous further package releases may happen in a fast moving CI environment like Stream which may or may not fix the particular bug in question. Think kernel packages which may eventually receive nightly builds.
>
> The kernel receives more than nightly builds. The important part here is that it doesn't land in Stream until it has been tested. That's a key element. While testing isn't going to catch everything, it's not about the number of builds done. It's about the ones that make it through testing that are still possibly buggy. We should focus on that for this proposal.
>
> > It may be N+10 that eventually receives the fix, and if the user wants to update and test each new package, and then roll back to the last working package, N+1 doesn't cut it.
>
> It's worth pointing out the kernel is probably a really bad example because you can install multiple kernels and do rollback that way. Let's use glibc or something else instead.

Acknowledged, but you get the point.

> Also, I think we should consider that there are alternative forms of rollback available for systems that are configured appropriately. Doing it at the RPM level is technically possible in some cases, but using tools like ostree or Boom (https://www.redhat.com/en/blog/boom-booting-rhel-lvm-snapshots) might be a more robust solution and I'd advocate for those where applicable.

The issue here is that there is a regression in the behavior of dnf in Stream over RHEL/CentOS Linux. I'm sure there are 100 ways we could all work around the problem, but it would be better if we didn't cause the problem in the first place.

> > If this isn't possible with current infrastructure, then the infrastructure is also broken and needs fixing.
> >
> > Can we get this escalated to those who are able to fix it?
>
> If your proposal is "ship everything ever built" or even "ship everything ever released in Stream", that's going to be harder to pull off. I really don't think you want either though given some of the points above. That kind of repo sprawl costs *every* Stream user for what they may never need. Can we start with something smaller than that and see how it works out?

It's not really my place to make proposals. I simply pointed out the issue that the expected behaviour of dnf is broken on Stream. One obvious solution is to ship everything (released) as RHEL or CentOS Linux does.
If there are other back end solutions that seamlessly allow 'dnf downgrade' and 'dnf install foo-%{old_version}-%{old_release}' to work as it currently works on RHEL / CentOS Linux, that would be great.

+1 to have some "back end solutions that seamlessly allow 'dnf downgrade' and 'dnf install foo-%{old_version}-%{old_release}' to work as it currently works on RHEL / CentOS Linux"

How about having some "Archive" repo a la CentOS Vault?

It'd be disabled by default so as not to bother the majority of users the majority of the time, but available to do a "dnf/yum --enablerepo=Archive downgrade foo" if needed.

This repo could keep everything released, or N-10 versions, or whatever RH and Stream users consider sensible.

(In reply to Ricardo J. Barberis from comment #7)
> How about having some "Archive" repo a la CentOS Vault?
>
> It'd be disabled by default so as not to bother the majority of users the majority of the time, but available to do a "dnf/yum --enablerepo=Archive downgrade foo" if needed.
>
> This repo could keep everything released, or N-10 versions, or whatever RH and Stream users consider sensible.

Yes, as proposed in Comment 5, this would certainly provide a solution to the issue and works well in current CentOS Linux 8 as you state.

In the absence of any other solutions being put forward, can we get a commitment to move ahead with the proposed solution? Can you action this @Josh, or do we need someone else involved here?

I have this captured for me and my team to work on. I'll post updates here.

+1 to "dnf/yum --enablerepo=Archive downgrade foo"

(In reply to Phil Perry from comment #3)
> Thanks Josh for your reply.
>
> (In reply to Josh Boyer from comment #2)
> > 3) How many NVRs would you expect? My initial guess would be N and N-1 (e.g. foo-1.0-1.el8 and foo-1.0-2.el8). That makes the utility of this possible, albeit within a shorter time window for frequently updated packages.
>
> If I may reply to this point.
> I would say that RHEL is able to keep all previous packages available. In the past CentOS Linux has kept all packages from within a point release directly available, and packages from all previous point releases available in an archive type (vault) repository. So we've been doing it for the last 15 years. This is a regression on that expected behaviour. I don't care what fedora does or doesn't do, this is not fedora.

It matters because CentOS Stream uses the *same* tooling as Fedora to compose and release. Having worked on said tooling, I'm familiar with how it works and a number of the problems it has.

> As Stream has no 'point' releases, keeping all previous package versions makes most sense. At the very least it needs to be usable, which means users MUST have the ability to roll back packages if they experience issues until the bug is able to be reported, triaged, fixed, tested and released. In the meantime, numerous further package releases may happen in a fast moving CI environment like Stream which may or may not fix the particular bug in question. Think kernel packages which may eventually receive nightly builds. It may be N+10 that eventually receives the fix, and if the user wants to update and test each new package, and then roll back to the last working package, N+1 doesn't cut it.
>
> If this isn't possible with current infrastructure, then the infrastructure is also broken and needs fixing.
>
> Can we get this escalated to those who are able to fix it?

It probably makes sense that once things are fixed up to do this, we'd want a sliding window between the main repos and the vault/archive repo.

The challenge we faced with this was we lacked the ability to merge modular metadata due to the old version of createrepo_c on our CentOS 7 compose machine. We have recently deployed a new CentOS Stream 8 compose machine with createrepo_c 0.16.2, which includes that feature [0].
I've updated our staging scripts to merge new composes rather than replacing the previous content. We've started using that process and now some packages and modules have multiple versions available.

# dnf --quiet list --available --showduplicates openssh
Available Packages
openssh.x86_64 8.0p1-5.el8 baseos
openssh.x86_64 8.0p1-7.el8 baseos
openssh.x86_64 8.0p1-8.el8 baseos

# dnf module info python39 | grep Version
Version : 8050020210330215900
Version : 8050020210422164837

We do not yet know what our retention strategy will be or how we will prune older content to reclaim space. Regardless of that, going forward users can expect that at least some content will be eligible to be downgraded.

[0] https://github.com/rpm-software-management/createrepo_c/commit/69382e396d4c041ef4e5f130edde90338233f82d
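
For anyone thinking about the still-open retention question, here is a rough sketch of the "sliding window" idea discussed in this bug: keep the newest few builds of each package in the main repo and move older ones to a disabled-by-default archive repo. This is not how the CentOS Stream compose scripts work. The function name `split_by_retention`, the `keep` parameter and the input format are all made up for illustration, and the sketch assumes builds are supplied oldest-first rather than doing real RPM version comparison.

```python
from collections import defaultdict


def split_by_retention(builds, keep=2):
    """Split released builds into a main set and an archive set.

    builds: iterable of (package_name, version_release) tuples, listed in the
            order they were released (oldest first). Hypothetical input
            format; no RPM version comparison is performed here.
    keep:   how many of the newest builds per package stay in the main repo
            (keep=2 corresponds to the "N and N-1" suggestion).
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")

    # Group builds by package name, preserving release order.
    by_name = defaultdict(list)
    for name, version_release in builds:
        by_name[name].append(version_release)

    main, archive = {}, {}
    for name, releases in by_name.items():
        main[name] = releases[-keep:]      # newest `keep` builds
        archive[name] = releases[:-keep]   # everything older
    return main, archive


if __name__ == "__main__":
    # The openssh builds shown in the dnf output above.
    released = [
        ("openssh", "8.0p1-5.el8"),
        ("openssh", "8.0p1-7.el8"),
        ("openssh", "8.0p1-8.el8"),
    ]
    main, archive = split_by_retention(released, keep=2)
    print("main:   ", main)
    print("archive:", archive)
```

A larger `keep` widens the rollback window at the cost of bigger repo metadata, which is the trade-off raised earlier in this thread.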