Bug 1767351 - Cannot upgrade to Fedora 32: Modules blocking the upgrade path [NEEDINFO]
Summary: Cannot upgrade to Fedora 32: Modules blocking the upgrade path
Keywords:
Status: ON_QA
Alias: None
Product: Fedora
Classification: Fedora
Component: dnf-plugins-extras
Version: 32
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Assignee: Jaroslav Mracek
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: AcceptedPreviousReleaseBlocker
Depends On:
Blocks: BetaBlocker, F32BetaBlocker 1804564
Reported: 2019-10-31 09:30 UTC by Miro Hrončok
Modified: 2020-03-16 22:04 UTC (History)
CC List: 22 users

Fixed In Version: dnf-plugins-extras-4.0.8-2.fc30
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1804564 (view as bug list)
Environment:
Last Closed: 2020-03-06 02:11:41 UTC
Type: Bug
awilliam: needinfo? (jmracek)


Attachments
/var/log from the failed test (2.94 MB, application/octet-stream)
2020-03-06 22:39 UTC, Adam Williamson

Description Miro Hrončok 2019-10-31 09:30:39 UTC
This is a follow-up bug for Fedora 32: bz1747408 and bz1762751 were explicitly worked around for upgrades to Fedora 31 only.

The Fedora 31 workaround was expected to be "hackish" because it was needed fast. I'm opening this bug to track the proper solution.

tl;dr problem:

Users who have activated a modular stream (often without intending to) cannot properly upgrade to a new Fedora version where that modular stream is no longer available.

I'm proposing this as a beta blocker, so we don't need to invent new "hackish" workarounds during beta.

Comment 1 Stephen Gallagher 2019-11-11 13:57:26 UTC
+1 blocker

Comment 2 Stephen Gallagher 2019-11-11 16:40:38 UTC
FESCo voted in today's meeting to declare this a blocker for Fedora 32 Beta release. (+7, 0, -0).

Comment 3 Miro Hrončok 2020-01-29 09:16:36 UTC
We are 4 weeks from Beta Freeze, 7 weeks from Beta Release. I doubt that Josh (default assignee for the distribution component) is actually working on this.

Re-assigning to nobody. CCing Ben, the program manager.

Comment 4 Ben Cotton 2020-01-29 12:41:09 UTC
Reassigning to the development team lead for Modularity.

Comment 5 Miro Hrončok 2020-01-29 12:50:13 UTC
Thanks Ben.

Daniel, if you want to pick my brain about how we could resolve this, feel free to ping me over IRC etc.

Comment 6 Ben Cotton 2020-02-11 17:18:52 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 32 development cycle.
Changing version to 32.

Comment 7 Jaroslav Mracek 2020-02-19 07:46:43 UTC
I am proposing to extend the hack originally applied only for libgit2 and Fedora 31. The new hack will reset all modules when the target releasever is 31 or 32 (patch: https://github.com/rpm-software-management/dnf-plugins-extras/pull/174). The hack is applied only for system-upgrade. Anyone who wants to manage modules manually can still do so with other commands, including offline-upgrade or offline-distrosync.


Auto switching to use defaults again and reset the rest of modules:
dnf system-upgrade download --releasever=32
dnf system-upgrade reboot


How to switch modules during system-upgrade manually (nodejs:8 to nodejs:12) and keep other modules unchanged:
dnf module reset nodejs
dnf module enable nodejs:12 --releasever=32
dnf offline-distrosync download --releasever=32
dnf offline-distrosync reboot


Anyway, we need a long-term solution. I would recommend an approach similar to the one used for packages: obsoletes. That means the maintainer of a module will provide an upgrade path (and will be responsible for delivering a functional one), and dnf will follow it when the user allows it.
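
The obsoletes-style idea in this comment could look roughly like the sketch below. It is only an illustration of the concept, not something dnf implements: `STREAM_OBSOLETES`, `AVAILABLE_STREAMS`, and `plan_stream_upgrade` are hypothetical names, and the stream data is made up (only the nodejs:8 to nodejs:12 switch is taken from the example above).

```python
# Hypothetical model: module maintainers publish "stream X is replaced by
# stream Y" records, similar in spirit to RPM Obsoletes. None of these names
# exist in dnf or libdnf.

# (module, old_stream) -> new_stream, as declared by the module maintainer
STREAM_OBSOLETES = {
    ("nodejs", "8"): "12",
}

# Streams assumed to exist in the target release (illustrative data only)
AVAILABLE_STREAMS = {
    "nodejs": {"12"},
}


def plan_stream_upgrade(enabled, obsoletes, available, follow_obsoletes=True):
    """Return {module: stream or None} for the target release.

    None means "reset the module", i.e. fall back to the distribution default.
    """
    plan = {}
    for module, stream in enabled.items():
        target_streams = available.get(module, set())
        if stream in target_streams:
            # The stream still exists in the target release: keep it.
            plan[module] = stream
            continue
        replacement = obsoletes.get((module, stream))
        if follow_obsoletes and replacement in target_streams:
            # The maintainer provided an upgrade path and the user allows it.
            plan[module] = replacement
        else:
            # No usable upgrade path: reset the module.
            plan[module] = None
    return plan


if __name__ == "__main__":
    enabled = {"nodejs": "8", "maven": "3.5"}
    print(plan_stream_upgrade(enabled, STREAM_OBSOLETES, AVAILABLE_STREAMS))
```

In this model, a stream without a declared upgrade path falls back to a reset, which is what the proposed hack does for every module.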

Comment 8 Miro Hrončok 2020-02-19 16:33:20 UTC
I agree that resetting all modules on distro upgrade is a good path here:

 - most users most likely got modules by implicit action without their knowledge and they just want the defaults
 - there is still a way to preserve the modules for the users who care (and we should document that in the common bugs page and in the how to upgrade documentation)

In fact, given the state of things, I believe this should be the behavior until a proper "upgrade path" exists, hence I don't think the code should contain an "if releasever is 31/32" check. If it ends up like that, expect another blocker from me for Fedora 33. If instead this is applied unconditionally, we are good and the "optimal solution" can be invented without deadline-based stress, because the problem would have a very limited scope (affecting only users who opted in for some modules).
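
The difference between the conditional and the unconditional reset can be shown with a tiny sketch. `should_reset_modules` is a hypothetical helper written for this illustration; it is not the plugin's code.

```python
def should_reset_modules(target_releasever, conditional_on=None):
    """Decide whether system-upgrade resets all module streams.

    conditional_on: set of releasever strings the reset is limited to,
    or None for the unconditional behaviour argued for in this comment.
    """
    if conditional_on is None:
        return True
    return str(target_releasever) in conditional_on


if __name__ == "__main__":
    # A check limited to 31/32 silently stops covering the next release.
    print(should_reset_modules("33", conditional_on={"31", "32"}))
    print(should_reset_modules("33"))
```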

Comment 9 Jaroslav Mracek 2020-02-26 11:15:41 UTC
(In reply to Miro Hrončok from comment #8)
> I agree that resetting all modules on distro upgrade is a good path here:
> 
>  - most users most likely got modules by implicit action without their
> knowledge and they just want the defaults
>  - there is still a way to preserve the modules for the users who care (and
> we should document that in the common bugs page and in the how to upgrade
> documentation)
> 
> In fact, given the state of things, I believe this should be the behavior
> until a proper "upgrade path" exists, hence I don't think the code should
> have "if releasever is 31/32" in the code. If it ends up like that, expect
> another blocker from me for Fedora 33. While if this is applied
> unconditionally, we are good and the "optimal solution" can be invented
> without deadline based stress, because the problem would have  very limited
> scope (affecting only users who opted in for some modules).

I believe that for Fedora 33 we will have an alternative solution, but I am removing the releasever condition anyway.

Comment 10 Zbigniew Jędrzejewski-Szmek 2020-02-26 12:17:40 UTC
Yep, this looks good.

Comment 11 Adam Williamson 2020-02-26 17:33:44 UTC
Just a note: due to the issue with one package from an old slf4j build being in the 'maven' module which was a stream default in F30/F31, this problem affects FreeIPA server upgrades from F30/F31 to F32. See https://bugzilla.redhat.com/show_bug.cgi?id=1801882 . Removing all default streams from F32 fixed that problem for *new F32 installs*, but upgrades from F30/F31 still run into it and presumably will until this is implemented.

Comment 12 Fedora Update System 2020-02-28 01:27:02 UTC
dnf-plugins-extras-4.0.9-3.fc32 has been pushed to the Fedora 32 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-8e06529b39

Comment 13 Adam Williamson 2020-02-29 23:31:57 UTC
Doesn't this need to be backported to F30 and F31 to fix the problem? The module disablement happens pre-upgrade, right?

I did run this through the openQA tests and it does not seem to fix the problem on upgrade from F31 to F32: an upgrade from F31 to F32 run with this package included in the F32 repo set still hits the slf4j problem.

Comment 15 František Zatloukal 2020-03-02 12:21:56 UTC
dnf-plugins-extras-4.0.8-2.fc31 fixes the issue - system with modular maven/slf4j present can now upgrade just fine and maven module stream gets reset.

Comment 16 Miroslav Suchý 2020-03-04 14:58:27 UTC
For the record, I made this change to the fedora-upgrade(8) script:
https://github.com/xsuchy/fedora-upgrade/commit/2057ac121edcc73daeb06b6ce3e3121832bf7b41

Comment 17 Adam Williamson 2020-03-04 17:43:53 UTC
I've edited the F32 update to *not* be marked as fixing this bug, so we don't unnecessarily push that through freeze, and don't close this bug by doing that when we need to push to F30 and F31 to fix this bug.

Also, changing to AcceptedPreviousReleaseBlocker, as that's what this is (we need to push fixes to 'previous releases' - F30 and F31 - to resolve the blocker problem in F32).

Comment 18 Fedora Update System 2020-03-04 17:44:40 UTC
FEDORA-2020-4d8d435767 has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2020-4d8d435767

Comment 19 Adam Williamson 2020-03-04 17:45:09 UTC
F31 update has gone stable, so I've marked the F30 update as fixing the bug. When it goes stable we can close this.

Comment 20 Adam Williamson 2020-03-04 19:06:23 UTC
Um.

I'm still seeing this problem in F31 - F32 upgrade tests *with* dnf-plugins-extras 4.0.8-2.fc31 :(

https://openqa.fedoraproject.org/tests/532914# is an example. That's a FreeIPA server upgrade test. You can see the DNF log in https://openqa.fedoraproject.org/tests/532914/file/upgrade_run-dnf.log . From that you can see that we install dnf-plugin-system-upgrade and dnf-plugins-extras 4.0.8-2.fc31 , then we install FreeIPA (which pulls in slf4j from a module - it's shown as coming from updates-modular repo), then we upgrade the system to F32. In the upgrade transaction we still see slf4j - and some bits that depend on it which are critical to FreeIPA - being removed due to module issues:

 Problem: package pki-base-java-10.7.3-6.fc32.noarch requires slf4j-jdk14, but none of the providers can be installed
  - problem with installed package pki-base-java-10.7.3-3.fc31.noarch
  - package slf4j-jdk14-1.7.30-1.fc32.noarch requires mvn(org.slf4j:slf4j-api) = 1.7.30, but none of the providers can be installed
  - slf4j-jdk14-1.7.25-8.fc31.noarch does not belong to a distupgrade repository
  - pki-base-java-10.7.3-3.fc31.noarch does not belong to a distupgrade repository
  - package slf4j-1.7.30-1.fc32.noarch is filtered out by modular filtering

The test first tries 'dnf -y --releasever=32 system-upgrade download' - which fails due to the error - then does 'dnf -y --releasever=32 --allowerasing system-upgrade download', which succeeds but removes the affected packages and breaks FreeIPA.

Comment 21 Fedora Update System 2020-03-06 02:11:41 UTC
dnf-plugins-extras-4.0.8-2.fc30 has been pushed to the Fedora 30 stable repository. If problems still persist, please make note of it in this bug report.

Comment 22 Adam Williamson 2020-03-06 02:27:47 UTC
Given #20, re-opening this. I'm definitely still seeing this in upgrades where 4.0.8-2 is used.

Comment 23 Adam Williamson 2020-03-06 02:28:18 UTC
Frantisek, how did you test exactly?

Comment 24 František Zatloukal 2020-03-06 06:30:44 UTC
(In reply to Adam Williamson from comment #23)
> Frantisek, how did you test exactly?

I am afraid I've already wiped the VM (don't have enough space on HDD :( ), but from memory, it was something like:
- installed maven (which enabled/installed slf4j as a module)
- upgraded to f32
- checked that slf4j remained installed

Comment 25 Jaroslav Mracek 2020-03-06 08:06:23 UTC
The issue is valid. Here is the patch https://github.com/rpm-software-management/libdnf/pull/910

Comment 26 Jaroslav Mracek 2020-03-06 10:17:14 UTC
I also created a test: https://github.com/rpm-software-management/ci-dnf-stack/pull/802

Comment 28 Adam Williamson 2020-03-06 15:49:25 UTC
Thanks! For the record, we figured out Frantisek's test 'passed' because he only installed slf4j but not slf4j-jdk14 . Both packages need to be installed for a problem to show up on upgrade (because slf4j is in the maven module but slf4j-jdk14 is not, and the non-modular slf4j-jdk14 package is newer than the slf4j in the maven module so they cannot be installed together). If you have just slf4j installed you will not observe a dependency issue on upgrade.
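
The reason both packages are needed to reproduce the problem can be sketched with a toy model of modular filtering. This is a strong simplification and not how libdnf's solver works: the `Pkg` class and both functions are hypothetical, the dependency on `mvn(org.slf4j:slf4j-api) = 1.7.30` is modelled as a dependency on `slf4j`, and the version of the modular slf4j build is an assumption (the thread only says it is older).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pkg:
    name: str
    version: str
    requires: tuple = ()  # (name, exact_version) pairs; a simplification


# Target-release packages. The real slf4j-jdk14 requires
# mvn(org.slf4j:slf4j-api) = 1.7.30; it is modelled here as requiring slf4j.
F32_REPO = [
    Pkg("slf4j", "1.7.30"),
    Pkg("slf4j-jdk14", "1.7.30", requires=(("slf4j", "1.7.30"),)),
]

# Content of the enabled maven stream. The version is an assumption.
MAVEN_STREAM = [Pkg("slf4j", "1.7.25")]


def visible_packages(repo, enabled_module_pkgs):
    """Hide non-modular packages whose name is shipped by an enabled stream."""
    shadowed = {p.name for p in enabled_module_pkgs}
    return [p for p in repo if p.name not in shadowed] + list(enabled_module_pkgs)


def can_upgrade(installed_names, repo, enabled_module_pkgs):
    """True if every installed name has a candidate whose requires are met."""
    visible = visible_packages(repo, enabled_module_pkgs)
    provided = {(p.name, p.version) for p in visible}
    for name in installed_names:
        candidates = [p for p in visible if p.name == name]
        if not any(all(r in provided for r in p.requires) for p in candidates):
            return False
    return True


if __name__ == "__main__":
    print(can_upgrade({"slf4j"}, F32_REPO, MAVEN_STREAM))
    print(can_upgrade({"slf4j", "slf4j-jdk14"}, F32_REPO, MAVEN_STREAM))
    print(can_upgrade({"slf4j", "slf4j-jdk14"}, F32_REPO, []))
```

With only slf4j installed the model finds a candidate, so no dependency issue shows up; with slf4j-jdk14 also installed it fails until the module is reset (modelled here as an empty stream list).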

Comment 29 Adam Williamson 2020-03-06 16:24:50 UTC
do both updates need to be installed together for the bug to be fixed? or is the libdnf update for the PackageKit version of this bug?

Comment 30 Adam Williamson 2020-03-06 22:38:19 UTC
Unfortunately, we still have problems :/

I tweaked openQA to test an upgrade with those new packages. It now makes it through the `system-upgrade download` phase without needing `--allowerasing` and without errors about slf4j, but when we do `system-upgrade reboot`, the actual upgrade process fails. You see the system boot then almost immediately reboot back to F31. The logs from the failed upgrade attempt show:

Mar 06 14:31:10 ipa001.domain.local dnf[637]: Unable to match package: xerces-j2-2.12.0-4.fc32.noarch fedora
Mar 06 14:31:10 ipa001.domain.local dnf[637]: Unable to match package: xml-commons-apis-1.4.01-29.fc32.noarch fedora
Mar 06 14:31:10 ipa001.domain.local dnf[637]: Unable to match package: slf4j-1.7.30-1.fc32.noarch fedora
Mar 06 14:31:10 ipa001.domain.local dnf[637]: Unable to match package: httpcomponents-client-4.5.10-2.fc32.noarch fedora
Mar 06 14:31:10 ipa001.domain.local dnf[637]: Unable to match package: httpcomponents-core-4.4.12-2.fc32.noarch fedora
Mar 06 14:31:10 ipa001.domain.local dnf[637]: Unable to match package: apache-commons-cli-1.4-8.fc32.noarch fedora
Mar 06 14:31:10 ipa001.domain.local dnf[637]: Unable to match package: apache-commons-codec-1.13-2.fc32.noarch fedora
Mar 06 14:31:10 ipa001.domain.local dnf[637]: Unable to match package: apache-commons-io-1:2.6-8.fc32.noarch fedora
Mar 06 14:31:10 ipa001.domain.local dnf[637]: Unable to match package: xml-commons-resolver-1.2-29.fc32.noarch fedora
Mar 06 14:31:10 ipa001.domain.local dnf[637]: Unable to match package: apache-commons-logging-1.2-20.fc32.noarch fedora
Mar 06 14:31:10 ipa001.domain.local dnf[637]: Unable to match package: xalan-j2-2.7.2-2.fc32.noarch fedora

at a guess this may be similar to the issue I fixed before:

https://github.com/rpm-software-management/dnf-plugins-extras/pull/163

a difference between the transaction calculations at 'download' and 'install' phase?
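
For anyone digging through the attached logs, the "Unable to match package" lines above can be grouped by repo with a short script. This is just a convenience sketch; the regular expression assumes the message format shown in this comment (package spec followed by repo id) and plain text without terminal escape codes.

```python
import re
from collections import defaultdict

# Matches the message format seen above:
#   "Unable to match package: <spec> <repo>"
UNMATCHED_RE = re.compile(
    r"Unable to match package: (?P<spec>\S+) (?P<repo>\S+)\s*$"
)


def unmatched_by_repo(log_lines):
    """Group unmatched package specs by the repo id reported in each line."""
    result = defaultdict(list)
    for line in log_lines:
        match = UNMATCHED_RE.search(line)
        if match:
            result[match.group("repo")].append(match.group("spec"))
    return dict(result)


if __name__ == "__main__":
    sample = [
        "Mar 06 14:31:10 ipa001.domain.local dnf[637]: "
        "Unable to match package: slf4j-1.7.30-1.fc32.noarch fedora",
        "Mar 06 14:31:10 ipa001.domain.local dnf[637]: "
        "Unable to match package: apache-commons-io-1:2.6-8.fc32.noarch fedora",
    ]
    for repo, specs in unmatched_by_repo(sample).items():
        print(repo, specs)
```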

Comment 31 Adam Williamson 2020-03-06 22:39:15 UTC
Created attachment 1668224
/var/log from the failed test

Comment 32 Adam Williamson 2020-03-06 22:48:53 UTC
so, yeah, we're definitely in that same area of the code. The fact that the repo name is in the error message shows that we are at the point where the upgrade plugin tries to reapply the list of packages to be *installed*, not the list of packages to be *removed*:

        for repo_id, pkg_spec_list in self.state.install_packages.items():
            for pkgspec in pkg_spec_list:
                try:
                    self.base.install(pkgspec, reponame=repo_id)
                except dnf.exceptions.MarkingError:
                    msg = _('Unable to match package: %s')
                    logger.info(msg, self.base.output.term.bold(pkgspec + " " + repo_id))
                    errs.append(pkgspec)

I can't figure out off the top of my head *why* this is failing, though.

Comment 33 Adam Williamson 2020-03-07 00:55:32 UTC
so, one theory: a difference I can think of to the previous case here (where we reset the libgit2 module on upgrade) is that, in that case, the libgit2 default stream was dropped *on the source releases too*.

That's not the case here. All default module streams have been dropped for F32, but F31 still has default module streams. maven is still one of them.

I *think* the problem may be that when we 'reset' the maven module in the upgrade download transaction, we wind up with it being considered 'disabled' for the purpose of that download transaction (i.e. the default F32 config), but then when we reboot to do the 'upgrade' transaction, the maven module is still considered to be 'enabled' (i.e. the default F31 config). That would explain why it can't be matched in the fedora repo - because it's filtered out by the module config.

I cannot yet get a handle on exactly where and when the module stream defaults are actually read, or on what exactly the 'reset' operation run during the download phase *does* in permanent terms. Like all dnf stuff, it's spread across four different bits (dnf-plugins-extras, dnf, libdnf, modulemd) and very difficult to figure out if you don't already know how it works...

Comment 34 Miro Hrončok 2020-03-07 01:11:41 UTC
(In reply to Adam Williamson from comment #33)
> so, one theory: a difference I can think of to the previous case here (where
> we reset the libgit2 module on upgrade) is that, in that case, the libgit2
> default stream was dropped *on the source releases too*.
> 
> That's not the case here. All default module streams have been dropped for
> F32, but F31 still has default module streams. maven is still one of them.
> 
> I *think* the problem may be that when we 'reset' the maven module in the
> upgrade download transaction, we wind up with it being considered 'disabled'
> for the purpose of that download transaction (i.e. the default F32 config),
> but then when we reboot to do the 'upgrade' transaction, the maven module is
> still considered to be 'enabled' (i.e. the default F31 config). That would
> explain why it can't be matched in the fedora repo - because it's filtered
> out by the module config.

According to bz1807832 even no-longer-default-but-default-on-GA modular streams are considered as default by dnf, so as long as libgit2 had a default stream on Fedora 30 GA (I don't know how to check), it seems this would bite as well, all things considered.

Comment 35 Adam Williamson 2020-03-07 01:24:15 UTC
Dunno about that.

Some observations whose significance I'm not sure of yet:

1. /var/lib/dnf/system-upgrade.json shows '"target_releasever": "32"', and system_upgrade.py `pre_configure_upgrade` does `self.base.conf.releasever = self.state.target_releasever`, but the /var/log/dnf.log section for the failed upgrade attempt still shows the releasever as 31:

2020-03-07T00:52:25Z INFO --- logging initialized ---
...
2020-03-07T00:52:25Z DDEBUG Command: dnf system-upgrade upgrade 
2020-03-07T00:52:25Z DDEBUG Installroot: /
2020-03-07T00:52:25Z DDEBUG Releasever: 31
...
2020-03-07T00:52:25Z DDEBUG Base command: system-upgrade
2020-03-07T00:52:25Z DDEBUG Extra commands: ['system-upgrade', 'upgrade']
...
2020-03-07T00:52:25Z DEBUG User-Agent: constructed: 'libdnf (Fedora 31; generic; Linux.x86_64)'
...
2020-03-07T00:52:27Z INFO Unable to match package: slf4j-1.7.30-1.fc32.noarch fedora

2. /var/lib/dnf/system-upgrade.json shows '"module_platform_id": null' .
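
The two state fields mentioned in these observations can be pulled out of the file with a few lines of Python. This sketch assumes only that the file is JSON containing the `target_releasever` and `module_platform_id` keys quoted above; nothing else about its layout is assumed, and `summarize_upgrade_state` is a hypothetical helper name.

```python
import json


def summarize_upgrade_state(state_text):
    """Extract the two fields discussed above from system-upgrade.json content."""
    state = json.loads(state_text)
    return {
        "target_releasever": state.get("target_releasever"),
        "module_platform_id": state.get("module_platform_id"),
    }


if __name__ == "__main__":
    # Values as reported in this comment; on a real system, read the text
    # from /var/lib/dnf/system-upgrade.json instead.
    sample = '{"target_releasever": "32", "module_platform_id": null}'
    print(summarize_upgrade_state(sample))
```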

Comment 36 Adam Williamson 2020-03-07 02:40:34 UTC
OK, so, a bit more info. On observation 1 - the 'Releasever' is logged there before `pre_configure_upgrade` runs, so it makes sense that it shows 31. That's *probably* not the problem.

Now, I threw some logging about the state of the maven module into various places. It looks like this:

        maveninfo = module_base._get_info(["maven"]).splitlines()
        mavenstream = [line for line in maveninfo if 'Stream' in line]
        logger.info('\n'.join(mavenstream))

so we're basically logging just the 'Stream' lines from the same function that backs 'dnf module info maven', wherever we run that. I stuck it in three places:

1. system_upgrade.py `run_download` right after it calls `module_base.reset(["*"])`
2. system_upgrade.py `run_upgrade` near the start, after `disable_blanking()`
3. base.py `resolve`, right after the interesting bit marked "auto-enable module streams based on installed RPMs"

If you do 'system-upgrade clean' then 'system-upgrade download' then 'system-upgrade reboot', this is what the logs show:

* First, during 'system-upgrade download' (just before "Starting dependency resolution"), we hit the `run_download` instance. At that point no maven stream is enabled. That is as expected.
* Second, later during 'system-upgrade download' (just after "Finished dependency resolution"), we hit the `resolve` instance. At that point no maven stream is enabled. That's good - I added the log there because I suspected that "auto-enable module streams" code was turning the module back on again, but it seems like it isn't?
* Third, after we run 'system-upgrade reboot', during the failed upgrade attempt, we hit the `run_upgrade` instance. At this point, the maven 3.5 stream *is* shown as enabled, and right after that, the slf4j error is logged:

2020-03-07T02:30:19Z INFO XXX IN UPGRADE RUN
2020-03-07T02:30:19Z INFO Stream           : 3.5 [e] [a]
Stream           : 3.6
2020-03-07T02:30:19Z INFO Unable to match package: slf4j-1.7.30-1.fc32.noarch fedora

so, it definitely looks like the problem is that, somehow, the maven stream gets enabled again during the 'upgrade' transaction. I haven't yet figured out how or why, though.
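
The 'Stream' lines logged above can also be checked programmatically. The sketch below parses text in the format shown in this comment and reports which streams carry the `[e]` flag, which is the flag read as "enabled" in the analysis above. `stream_states` and `enabled_streams` are hypothetical helpers, not part of dnf.

```python
def stream_states(module_info_text):
    """Map stream name -> set of flags, from 'Stream' lines of module info text."""
    states = {}
    for line in module_info_text.splitlines():
        if not line.startswith("Stream"):
            continue
        # Line format: "Stream           : 3.5 [e] [a]"
        _, _, value = line.partition(":")
        parts = value.split()
        if not parts:
            continue
        stream, flags = parts[0], parts[1:]
        states[stream] = {flag.strip("[]") for flag in flags}
    return states


def enabled_streams(module_info_text):
    """Return the sorted list of streams flagged [e]."""
    states = stream_states(module_info_text)
    return sorted(s for s, flags in states.items() if "e" in flags)


if __name__ == "__main__":
    logged = "Stream           : 3.5 [e] [a]\nStream           : 3.6"
    print(enabled_streams(logged))
```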

Comment 37 Adam Williamson 2020-03-07 02:53:33 UTC
Hmm, well, a fairly obvious fix works: just duplicate the module reset logic in `run_upgrade`. I don't know if that's the most correct fix, but...it does work. I will do a build with the patch updated to do that, and put it in the update.
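
Why duplicating the reset helps can be illustrated with a toy model. It rests on an assumption that the thread has not confirmed: that the reset done during the download phase only affects that dnf process, so the process started after reboot sees the F31 module state again. All names below (`FakeModuleState`, `match_package`, and the two `run_*` functions) are hypothetical stand-ins, not the plugin's real code.

```python
class FakeModuleState:
    """Stand-in for the module state a fresh dnf process sees on the F31 system."""

    def __init__(self, persisted_enabled):
        # Assumption: each process starts from the state stored on disk.
        self.enabled = dict(persisted_enabled)

    def reset_all(self):
        # Assumption: this only affects the current process.
        self.enabled.clear()


def match_package(name, state, module_owned):
    """A repo package matches only if no enabled module claims its name."""
    return not any(name in module_owned.get(mod, ()) for mod in state.enabled)


def run_download(persisted, packages, module_owned):
    """Return the install list computed during the download phase."""
    state = FakeModuleState(persisted)
    state.reset_all()  # corresponds to the reset done in run_download
    return [p for p in packages if match_package(p, state, module_owned)]


def run_upgrade(persisted, install_list, module_owned, reset_again):
    """Return the packages that cannot be matched during the upgrade phase."""
    state = FakeModuleState(persisted)  # new process after reboot
    if reset_again:
        state.reset_all()  # the duplicated reset described in this comment
    return [p for p in install_list if not match_package(p, state, module_owned)]


if __name__ == "__main__":
    persisted = {"maven": "3.5"}
    owned = {"maven": {"slf4j"}}
    install_list = run_download(persisted, ["slf4j", "xalan-j2"], owned)
    print(run_upgrade(persisted, install_list, owned, reset_again=False))
    print(run_upgrade(persisted, install_list, owned, reset_again=True))
```

In the model, the install list computed at download time includes slf4j, the upgrade phase without a second reset cannot match it, and repeating the reset makes the two phases agree.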

Comment 38 Adam Williamson 2020-03-07 18:38:54 UTC
OK, so I edited the libdnf updates and added the new dnf-plugins-extras builds to them, as it seems the libdnf and dnf-plugins-extras changes are both intended for this bug, it makes sense for them to be in the same update (not separate updates). That has obsoleted the separate dnf-plugins-extras updates, and everything should be good now. openQA tests are showing FreeIPA upgrades succeeding, so my fix seems to be working.

Fedora 30: https://bodhi.fedoraproject.org/updates/FEDORA-2020-02ee4b1a1c
Fedora 31: https://bodhi.fedoraproject.org/updates/FEDORA-2020-717d521d35
Fedora 32 (probably unnecessary as all module stream defaults are disabled on F32 anyway): https://bodhi.fedoraproject.org/updates/FEDORA-2020-ba034f2c34

to test, install the update on the 'from' release *before* you run any dnf system-upgrade commands.

Comment 39 amatej 2020-03-07 22:26:41 UTC
Yes, the libdnf and dnf-plugins-extras changes are both intended for this bug, I should have put them into one update, my bad.

Thank you for fixing it.

Comment 40 Adam Williamson 2020-03-09 15:49:57 UTC
Is there an upstream branch or anything where I should submit the modified patch as a PR? Or are we solely keeping track of it downstream? I just don't want it to get lost if someone re-generates the patch set or something. Thanks!

Comment 41 Fedora Update System 2020-03-13 03:57:21 UTC
FEDORA-2020-02ee4b1a1c has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2020-02ee4b1a1c

Comment 42 Fedora Update System 2020-03-13 03:57:23 UTC
FEDORA-2020-02ee4b1a1c has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2020-02ee4b1a1c

Comment 43 Fedora Update System 2020-03-14 02:20:51 UTC
PackageKit-1.1.12-8.fc30, dnf-plugins-extras-4.0.8-4.fc30, libdnf-0.43.1-5.fc30 has been pushed to the Fedora 30 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-02ee4b1a1c

Comment 44 Fedora Update System 2020-03-16 22:04:32 UTC
PackageKit-1.1.12-8.fc30, dnf-plugins-extras-4.0.8-4.fc30, libdnf-0.43.1-5.fc30 has been pushed to the Fedora 30 stable repository. If problems still persist, please make note of it in this bug report.

