Bug 2419512 - GNOME Software update after clean install stalls for ten minutes then errors out since dnf5-5.3.0.0-2.fc44 landed (new rpm keys) [NEEDINFO]
Keywords:
Status: ON_QA
Alias: None
Product: Fedora
Classification: Fedora
Component: dnf5
Version: rawhide
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: rpm-software-management
QA Contact:
URL:
Whiteboard: openqa
Depends On:
Blocks: BetaBlocker, F44BetaBlocker
 
Reported: 2025-12-05 19:38 UTC by Adam Williamson
Modified: 2026-02-13 07:13 UTC

Fixed In Version: gnome-software-50~beta-3.fc45 gnome-software-50~beta-3.fc44
Clone Of:
Environment:
Last Closed:
Type: Bug
Embargoed:
amatej: needinfo? (mblaha)


Attachments
journal from an affected run (706.02 KB, text/plain)
2025-12-05 19:42 UTC, Adam Williamson
backtrace from dnf5daemon-server when gnome-software is stuck (207.16 KB, text/plain)
2025-12-05 20:29 UTC, Adam Williamson
backtrace from gnome-software when it's stuck (152.53 KB, text/plain)
2025-12-05 20:30 UTC, Adam Williamson

Description Adam Williamson 2025-12-05 19:38:03 UTC
Going through Rawhide openQA test failures, I noticed that the GNOME Software update test has failed every run since Fedora-Rawhide-20251106.n.0. The obvious change in that compose is dnf5:

Package:      dnf5-5.3.0.0-2.fc44
Old package:  dnf5-5.2.17.0-2.fc44

gnome-software did not change.

The failure is always the same. After we do our preparatory steps to ensure an update is available (downgrading a single package to a dummy build with a very low version, so the current official build of it will be available as an update), we run GNOME Software, go to the Updates page, click Refresh, wait for the refresh, then click the Download button. The Download button is immediately swapped to a "Cancel" button with a small progress bar about 30% full at the bottom, and then...it stays like that. The progress bar never moves. The operation never completes. We never see the "Restart & Update..." button appear.

Weirdly, this is only affecting compose tests, not update tests. We run the same test on updates and that one is passing. I can't find a single instance of this failure on an update test.

The obvious difference is that on compose tests, we have one test which runs an install from the compose's Workstation live image, runs through gnome-initial-setup, then uploads a disk image of the installed system. This test boots from that disk image and immediately runs the update test procedure. On update tests, the test boots from a disk image created by virt-install which is updated every two weeks, sets up a side repository containing the packages under test, runs a `dnf -y --refresh update`, reboots, *then* runs the update test procedure.

I'll attach the journal from a failed test.

Comment 1 Adam Williamson 2025-12-05 19:42:11 UTC
Created attachment 2117685
journal from an affected run

Here's the journal from an affected run, but there's not a lot in it. The test clicked 'Refresh' around 02:20:05, which looks like it produces a little flurry of activity ending at 02:20:07. Then the test clicked 'Download' at 02:20:24, but there appears to be nothing at all in the journal (or any other log files, I checked) at this time.

I'll see if I can reproduce this manually and find anything out that way.

Comment 2 Adam Williamson 2025-12-05 20:27:08 UTC
I reproduced this easily, without any side repo stuff. I just ran an install from a slightly older Workstation live image - Fedora-Workstation-Live-Rawhide-20251128.n.0.x86_64.iso - booted the installed system, ran Software, and tried to update. In fact, the first time I ran it, it got stuck at a "Refreshing data" page without me doing anything. After I cancelled that and ran it again, it behaved like openQA: I hit the refresh button, then I hit Download, and it got stuck. Tried again, same thing.

I attached gdb to both the gnome-software and dnf5daemon-server processes and got backtraces of both; I'll attach them. While I was getting the dnf5daemon-server backtrace, gnome-software exited/crashed and there was an error in the journal:

Dec 05 12:21:39 ibm-p8-kvm-03-guest-02.virt.pnr.lab.eng.rdu2.redhat.com dnf5daemon-server[4217]: Error sending D-Bus reply to org.rpm.dnf.v0.rpm.Rpm:list() call: [System.Error.ENOMSG] Failed to send D-Bus message (No message of desired type)

Not sure if that's significant.

Comment 3 Adam Williamson 2025-12-05 20:27:53 UTC
Proposing as an F44 Beta blocker as this seems trivially reproducible, and violates Beta criterion "The installed system must be able appropriately to install, remove, and update software with the default tool for the relevant software type in all release-blocking desktops (e.g. default graphical package manager)."

Comment 4 Adam Williamson 2025-12-05 20:29:03 UTC
Created attachment 2117688
backtrace from dnf5daemon-server when gnome-software is stuck

Comment 5 Adam Williamson 2025-12-05 20:30:37 UTC
Created attachment 2117691
backtrace from gnome-software when it's stuck

Comment 6 Adam Williamson 2025-12-05 20:40:37 UTC
Oh, hey, I just saw that if I leave it in this state for *ten minutes*, it eventually clears. I get an "Unable to download updates" error, with the details:

"Failed to run transaction: offline rpm transaction test failed with code 6."

There's nothing in the journal or dnf5.log. The available update list remains the same and the button goes back to being a Download button. If I click it, the same cycle happens.

Comment 7 Petr Pisar 2025-12-08 09:56:10 UTC
Why do you think this is caused by DNF5? dnf5-5.3.0.0-2.fc44 has been in the F44 repository since 2025-11-05.

A few days ago I did a dnf5-5.3.0.0-3.fc44 update <https://bodhi.fedoraproject.org/updates/FEDORA-2025-c2c0380ce6>, and fedora-ci.koji-build.installability.functional reported failures on update like this:

Transaction failed: Signature verification failed.
OpenPGP check for package "libdnf5-plugin-rhsm-5.3.0.0-3.fc44.x86_64" (/var/cache/libdnf5/_dnf_local-71c913707df56d1b/packages/libdnf5-plugin-rhsm-5.3.0.0-3.fc44.x86_64.rpm) from repo "_dnf_local" has failed: The package is not signed.

Your error message "Failed to run transaction: offline rpm transaction test failed with code 6." mentions code 6. 6 stands for the ERROR_GPG_CHECK constant in the libdnf5::rpm::Transaction::TransactionRunResult enum. I suspect it is the same issue, probably triggered by <https://fedoraproject.org/wiki/Changes/Enforcing_signature_checking_by_default>.

Comment 8 Petr Pisar 2025-12-08 10:02:07 UTC
Do you run updates on unsigned packages? Do you install them from a repository or from a file?

dnf5-5.3.0.0-3.fc44 fixes disabling signature verification if the --no-gpgchecks option is passed. Can you try your test again with that build?
Strangely, the first rpm build implementing the Fedora change is 6.0.0-2, but that has not yet been tagged into F44.

Comment 9 Adam Williamson 2025-12-08 16:15:35 UTC
> Why do you think this is caused by DNF5? dnf5-5.3.0.0-2.fc44 has been in the F44 repository since 2025-11-05.

From the first line of the original description:

"I noticed that the GNOME Software update test has failed every run since Fedora-Rawhide-20251106.n.0."

> Do you run updates on unsigned packages? Do you install them from a repository or from a file?

The downgraded package we install to ensure an update is available is unsigned, but the package being upgraded *to* is signed.

Comment 10 Milan Crha 2026-02-03 14:01:57 UTC
I tried both installing a plain .rpm and going through the .repo file from [1] (where I only changed it to enable the repo), and it worked fine with `dnf5-5.3.0.0-7.fc44.x86_64` (+/- [2], but that's minor): no problem updating with gnome-software using dnf5daemon-server under the hood, nor with `dnf update`. I had patched gnome-software with changes for bug #2392057, even though that bug was not about signed packages.

Either it has been fixed in the meantime, or I did something "wrong" and so did not trigger the problem.

[1] https://fedorapeople.org/groups/qa/openqa-repos/
[2] https://github.com/rpm-software-management/dnf5/issues/2595

Comment 11 Milan Crha 2026-02-03 14:14:07 UTC
I looked into the backtrace. Thread 20 of gnome-software is running an update: it is downloading packages for an offline update and is currently waiting for the transaction to finish. Thread 2 calls the `list` method of the `org.rpm.dnf.v0.rpm.Rpm` interface. The other threads seemed idle or uninteresting.

The daemon backtrace shows it waiting for confirmation of the key import, specifically for the keys `/etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-rawhide-x86_64` and `/etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-45-x86_64`. It does that in two threads.

gnome-software is supposed to react to the key import confirmation signal. I'm looking into whether it does.

Comment 12 Milan Crha 2026-02-03 14:28:32 UTC
Okay, the prompt is somehow blocked in gnome-software (my user is a sudoer, so I should not be asked, but that may be unrelated). When I click the "Cancel" button, the dialog to confirm import of the key shows up. Cancelling the dialog itself does not return gnome-software to normal.

I reproduced it by removing the Fedora keys from RPM with:

   rpmkeys -l

then

   rpmkeys --delete HEXHEXHEX

for each listed key.

Your test case took a package signed with either the rawhide or the f45 key, while those I tried here used the f44 key.

I'm not moving this to gnome-software yet; please give me some time to investigate what failed on the gnome-software side and whether anything is needed on the dnf5 side.

Comment 13 Milan Crha 2026-02-03 15:09:47 UTC
gnome-software receives the D-Bus signal about key import confirmation during the transaction run, as expected. It then wants to figure out whether the key comes from an installed repository, so it runs a `list` command on the same RPM (D-Bus) proxy on which the transaction is running, and under the same session (this is Thread 2 in the gnome-software backtrace I mentioned in comment #11).

With dnf5-5.2.17.0-2.fc44.x86_64, the build in koji immediately before 5.3.0.0-2, the daemon can run the list with no problem: it returns the package information for the key file, then gnome-software sees that it comes from the repository and tells dnf5daemon-server that the key can be imported, without bothering a (sudo) user.

With the newer dnf5, the D-Bus proxy appears to be locked: it does not allow other operations to run (at least not `list`) while it waits for the response to the key import confirmation. As a result, the call from gnome-software to list the packages that provide the file starves on the dnf5daemon-server side.

Can you (dnf5) do anything about it? Should I (gnome-software) do anything about it? (For example, open a new session and check the package there, instead of in the "busy" session. The downside could be the limited session count, as I recall you keep that number very low.)
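
[Editor's illustration] The starvation described in this comment can be modelled with a small, self-contained sketch. It is a toy model only, not dnf5daemon or gnome-software code: `ToySession`, `run_transaction`, `list_packages`, the timeouts, and the `"key-owning-package"` name are all hypothetical, and the single per-session lock is an assumption based on the behaviour described above.

```python
import threading


class ToySession:
    """Hypothetical stand-in for one daemon session (not the real dnf5 API)."""

    def __init__(self):
        self._lock = threading.Lock()          # assumed per-session lock
        self.awaiting_key = threading.Event()  # set once key confirmation is pending
        self.key_answer = threading.Event()    # set when the client answers

    def run_transaction(self, give_up_after):
        # Holds the session lock for the whole transaction, including the
        # wait for the client's answer about the key import.
        with self._lock:
            self.awaiting_key.set()
            return self.key_answer.wait(timeout=give_up_after)

    def list_packages(self, timeout):
        # A read-only query that still needs the same session lock.
        if not self._lock.acquire(timeout=timeout):
            return None                        # starved: the lock is held
        try:
            return ["key-owning-package"]      # placeholder result
        finally:
            self._lock.release()


def simulate_same_session():
    """Client asks who owns the key on the session that is mid-transaction."""
    session = ToySession()
    result = {}
    worker = threading.Thread(
        target=lambda: result.update(ok=session.run_transaction(give_up_after=1.0))
    )
    worker.start()
    session.awaiting_key.wait()                 # the "confirm key import" signal
    owner = session.list_packages(timeout=0.2)  # query on the busy session
    if owner is not None:
        session.key_answer.set()                # would confirm the import
    worker.join()
    return owner, result["ok"]


if __name__ == "__main__":
    print(simulate_same_session())
```

In this model the query never gets the lock, so the client never answers and the transaction only ends when its own wait times out, which mirrors the long stall followed by an error reported earlier in this bug.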

Comment 14 Adam Williamson 2026-02-03 17:37:43 UTC
Thanks a lot for looking into this, Milan.

Comment 15 amatej 2026-02-04 13:10:40 UTC
I haven't 100% confirmed this but I am convinced the deadlock is caused by https://github.com/rpm-software-management/dnf5/pull/2448.

I would like to say that a read-only `list` should be possible while a transaction is waiting on key import confirmation, but it would be best to discuss this with @mblaha.

Comment 16 Milan Crha 2026-02-04 15:16:20 UTC
> I haven't 100% confirmed this but I am convinced the deadlock is caused by...

It looks like that, I agree. That change makes sense, especially given the thread unsafety of the libraries mentioned in the pull request.

If I read it correctly, it is possible to work around "the problem" by opening a new session and running the list in it (the lock seems to be per session). I do not know the consequences or how it works under the hood, but I guess that when some of the libsolv/libdnf5 files are shared between sessions it can still cause trouble, such as a concurrent write to the same file from both sessions?

You probably do not want to read the libsolv/libdnf5 files while another thread writes to them, but I do not know the code.

Comment 17 Milan Crha 2026-02-06 08:09:11 UTC
As this is sort of important, I added a patch to the dnf5 plugin to use a temporary session when searching for the source of the RPM key that is to be imported. It fixes the starvation on the dnf5daemon side. I think it is a good change; the locking for thread safety on the dnf5daemon side makes sense.

The relevant builds are gnome-software-50~beta-3.fc45 gnome-software-50~beta-3.fc44

I'm keeping this open in case you want to do anything on the dnf5daemon side, but feel free to move this to gnome-software and close it.
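
[Editor's illustration] A minimal sketch of the temporary-session idea from this comment, under stated assumptions. It is not the actual gnome-software patch: `ToyDaemon`, `ToySession`, `temporary_session`, `lookup_key_owner`, the session limit of 2, and the `"key-owning-package"` name are hypothetical and only model a per-session lock plus a capped session count.

```python
import threading
from contextlib import contextmanager


class ToySession:
    """Hypothetical session with its own lock (assumed per-session locking)."""

    def __init__(self):
        self.lock = threading.Lock()

    def list_packages(self, timeout):
        if not self.lock.acquire(timeout=timeout):
            return None                    # starved: this session is busy
        try:
            return ["key-owning-package"]  # placeholder result
        finally:
            self.lock.release()


class ToyDaemon:
    """Hypothetical daemon that caps the number of open sessions."""

    def __init__(self, max_sessions):
        self.max_sessions = max_sessions
        self.open_sessions = 0

    def open_session(self):
        if self.open_sessions >= self.max_sessions:
            raise RuntimeError("session limit reached")
        self.open_sessions += 1
        return ToySession()

    def close_session(self, session):
        self.open_sessions -= 1


@contextmanager
def temporary_session(daemon):
    # Open a short-lived session and always close it again, so the lookup
    # does not permanently consume one of the few available sessions.
    session = daemon.open_session()
    try:
        yield session
    finally:
        daemon.close_session(session)


def lookup_key_owner(daemon):
    with temporary_session(daemon) as temp:
        return temp.list_packages(timeout=0.2)


def demo():
    daemon = ToyDaemon(max_sessions=2)
    busy = daemon.open_session()
    busy.lock.acquire()  # transaction holds the lock, awaiting key confirmation
    stuck = busy.list_packages(timeout=0.1)  # same session: starves
    owner = lookup_key_owner(daemon)         # temporary session: answers
    open_after = daemon.open_sessions        # only the busy session remains
    busy.lock.release()
    return stuck, owner, open_after


if __name__ == "__main__":
    print(demo())
```

Closing the temporary session in a `finally` block is the design point here: with a low session limit, a lookup that leaked its session could itself block later operations.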
