Bug 1629340 - PackageKit update crashes at end of transaction with "TransactionItem state is not set: grub2-tools-1:2.02-57.fc29.x86_64"
Summary: PackageKit update crashes at end of transaction with "TransactionItem state i...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: libdnf
Version: 29
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: rpm-software-management
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: AcceptedBlocker
Duplicates: 1599185 1608685
Depends On:
Blocks: F29BetaBlocker
Reported: 2018-09-15 01:47 UTC by Adam Williamson
Modified: 2019-03-08 06:44 UTC (History)

Fixed In Version: libdnf-0.19.1-3.fc29
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1632527 (view as bug list)
Environment:
Last Closed: 2018-09-20 22:35:41 UTC


Attachments
backtrace of the crash (from pkcon) (33.37 KB, text/plain)
2018-09-15 21:31 UTC, Adam Williamson
screencast of the bug happening with dnf 3.5.1 and libdnf 0.19.1 (clean Beta RC3 Workstation live install) (273.32 KB, application/octet-stream)
2018-09-18 16:30 UTC, Adam Williamson


Links
System ID Priority Status Summary Last Updated
Red Hat Bugzilla 1599185 None CLOSED dnf update crashed with exception: TransactionItem state is not set 2019-07-31 11:03:05 UTC
Red Hat Bugzilla 1601877 None CLOSED RuntimeError: TransactionItem not found for key: grub2-tools-minimal 2019-07-31 11:03:05 UTC
Red Hat Bugzilla 1603148 None CLOSED [abrt] dnf: endTransaction(): transaction.py:742:endTransaction:RuntimeError: C++ std::exception: TransactionItem state ... 2019-07-31 11:03:05 UTC
Red Hat Bugzilla 1608685 None CLOSED [abrt] dnf: endTransaction(): transaction.py:742:endTransaction:RuntimeError: C++ std::exception: TransactionItem state ... 2019-07-31 11:03:05 UTC
Red Hat Bugzilla 1622449 None CLOSED _transaction.Swdb_endTransaction(): RuntimeError: ... TransactionItem state is not set: 2019-07-31 11:03:06 UTC
Red Hat Bugzilla 1642796 None CLOSED PackageKit terminated before end of offline update: TransactionItem state is not set (when any multiarch package is inst... 2019-07-31 11:03:06 UTC


Description Adam Williamson 2018-09-15 01:47:22 UTC
I wanted to test the DNF 3.5.1 update[1] to see if we should pull it into F29 Beta, so I built a live image containing those packages. That worked. I ran an install and boot of that live image. That worked. Then I tried running an offline update from the installed system. The update process got to 97% and then seemed to get stuck. I left it for over half an hour, then shut down and rebooted the system. Looking at the logs from the update boot, I see this:

Sep 14 17:56:38 localhost.localdomain packagekitd[639]: terminate called after throwing an instance of 'std::runtime_error'
Sep 14 17:56:38 localhost.localdomain packagekitd[639]:   what():  TransactionItem state is not set: grub2-tools-1:2.02-57.fc29.x86_64
Sep 14 17:56:38 localhost.localdomain systemd[1]: packagekit.service: Main process exited, code=killed, status=6/ABRT
Sep 14 17:56:38 localhost.localdomain systemd[1]: packagekit.service: Failed with result 'signal'.

Unfortunately, it does not seem like the crash was actually captured by coredumpctl or abrt.

This error looks a lot like one that was claimed fixed in dnf a while ago:

https://bugzilla.redhat.com/show_bug.cgi?id=1603148 , marked dupe of:
https://bugzilla.redhat.com/show_bug.cgi?id=1601877

It also looks similar to a couple other reports:

https://bugzilla.redhat.com/show_bug.cgi?id=1622449 (for which the commit claimed as a fix is in 0.19.1, so that shouldn't be the problem here)
https://bugzilla.redhat.com/show_bug.cgi?id=1599185 (an older report which has not apparently been followed up or closed)
https://bugzilla.redhat.com/show_bug.cgi?id=1608685 (another report involving grub from July, so may be a dupe of one of the others)

I will see if this reproduces on a second try, and also see if it happens on upgrade from the RC1 image (which had dnf 3.2.0).

[1]: https://bodhi.fedoraproject.org/updates/FEDORA-2018-f16a71bc92

Comment 1 Adam Williamson 2018-09-15 03:36:37 UTC
Happened again, exactly the same way, on the second try. Will now test RC1.

Note the update that seems to have the trouble is from https://koji.fedoraproject.org/koji/buildinfo?buildID=1143964 (which is on the system after install) to https://koji.fedoraproject.org/koji/buildinfo?buildID=1144121 (which is currently in updates-testing).

Comment 2 Adam Williamson 2018-09-15 04:09:07 UTC
Beta RC1 (with dnf 3.2.0 and libdnf 0.17.0) behaves just the same :(

Proposing this as a Beta blocker, as a violation of "The installed system must be able appropriately to install, remove, and update software with the default tool for the relevant software type in all release-blocking desktops (e.g. default graphical package manager). This includes downloading of packages to be installed/updated." - GNOME Software offline update is the 'default tool' for Workstation, and with current u-t packages, this seems to reliably fail and cause a hung update.

I'll do some tests with plain dnf (not g-s) next.

Comment 3 Adam Williamson 2018-09-15 05:33:28 UTC
Just booting the installed system fresh and doing 'dnf update grub2*' doesn't hit the problem (that transaction completes fine).

Comment 4 Adam Williamson 2018-09-15 19:45:34 UTC
Is it possible PackageKit needs a change along the lines of https://github.com/rpm-software-management/dnf/pull/1134 ?

Comment 5 Adam Williamson 2018-09-15 20:48:57 UTC
`pkcon update` (from a clean Workstation live install) crashes the same way.

Comment 6 Adam Williamson 2018-09-15 21:31:46 UTC
Created attachment 1483579 [details]
backtrace of the crash (from pkcon)

Comment 7 Adam Williamson 2018-09-16 15:53:28 UTC
Just 'pkcon update grub2-tools' (from the clean installed live image) also reproduces the crash.

Comment 8 Adam Williamson 2018-09-17 19:12:42 UTC
dmach, can you please have someone look at this on the libdnf end? PackageKit maintainer says it is crashing in swdb, and he has no idea how swdb works. Thanks! We need an evaluation/fix for this urgently as go/no-go is on Thursday.

Comment 9 Geoffrey Marr 2018-09-17 20:00:15 UTC
Discussed during the 2018-09-17 blocker review meeting: [1]

The decision to classify this bug as an "AcceptedBlocker" was made as it violates the following criteria:

"The installed system must be able appropriately to install, remove, and update software with the default tool for the relevant software type in all release-blocking desktops (e.g. default graphical package manager)" for Workstation, with current repo state.

[1] https://meetbot.fedoraproject.org/fedora-blocker-review/2018-09-17/f29-blocker-review.2018-09-17-16.02.txt

Comment 10 Adam Williamson 2018-09-17 21:49:08 UTC
Note, it seems like the 'interesting' thing about grub2-tools is that it obsoletes old versions of...itself:

[adamw@adam pki-core ((pki-core-10.6.6-1.fc29) %)]$ rpm -q grub2-tools
grub2-tools-2.02-58.fc29.x86_64
[adamw@adam pki-core ((pki-core-10.6.6-1.fc29) %)]$ rpm -q --obsoletes grub2-tools
grub2-tools < 1:2.02-58.fc29
[adamw@adam pki-core ((pki-core-10.6.6-1.fc29) %)]$

I don't know why it does that. But that's the obvious suspect for the 'odd' condition that causes this.
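
A minimal sketch of the check done by eye above, assuming the `rpm -q --obsoletes` output has already been captured as a list of text lines. The helper name `self_obsoletes` is hypothetical and is not part of rpm, dnf or libdnf; it only looks at the name in front of the comparison operator.

```python
def self_obsoletes(name, obsoletes_lines):
    """Return the Obsoletes entries whose target is the package itself."""
    hits = []
    for line in obsoletes_lines:
        parts = line.split()
        # The first token of an Obsoletes entry is the obsoleted package name.
        if parts and parts[0] == name:
            hits.append(line.strip())
    return hits


# The line pasted above from `rpm -q --obsoletes grub2-tools`.
lines = ["grub2-tools < 1:2.02-58.fc29"]
print(self_obsoletes("grub2-tools", lines))
```

An empty result would mean the package only obsoletes other package names, which is the more usual case.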

Comment 11 Adam Williamson 2018-09-17 22:00:52 UTC
CCing pjones for the slightly odd thing grub2 does here. It's not really _wrong_ per se, and package managers should certainly cope with it, but it seems unusual and I can't see what the point of it is.

Comment 12 Adam Williamson 2018-09-17 22:15:53 UTC
Ah, so I think these obsoletes: showed up when there was a subpackage split in grub2.

The generic scenario is like this. Say you have package 'foo', version 1.0-1, and you want to split it into 'foo' and 'foo-extras' in 2.0-1. You want systems that already have 'foo' installed to get both 'foo' and 'foo-extras' when they update (to make sure they don't lose anything), but you don't want 'foo' to actually depend on 'foo-extras' going forward. In that case, what you have to do is make both foo-2.0-1 and foo-extras-2.0-1 obsolete the old foo, e.g.:

Obsoletes: foo < 2.0-1

That's one reason you'd have 'foo' obsolete an old version of itself. It seems grub2 did some splits like this in the past. I'd expect these obsoletes to be versioned specifically to cover the actual point in the package history where the splits occurred, and it seems like at first they actually were, but in https://src.fedoraproject.org/rpms/grub2/c/ecef1ed7b50ed05b65574c8b8815d7ae66e5a0a9 , for some reason, a lot of the Obsoletes: were rejigged and they all wound up being "< %{evr}" instead of "< (some specific version)".

If I'm right that this crash happens any time we are upgrading some package to a new version which Obsoletes the currently-installed version, then it is a real problem, because it will happen any time anyone is doing one of these subpackage splits properly.
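
To make the foo / foo-extras scenario concrete, here is a toy model of it. It is only a sketch: the `Pkg` class and the `obsoleters` helper are hypothetical, versions are plain tuples rather than real RPM epoch-version-release comparison, and none of this is how libdnf or libsolv actually resolves a transaction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pkg:
    name: str
    version: tuple
    # Each entry (name, version) stands for "Obsoletes: name < version".
    obsoletes: tuple = ()


def obsoleters(installed, available):
    """Available packages whose Obsoletes match the installed package."""
    return [
        pkg for pkg in available
        if any(name == installed.name and installed.version < version
               for name, version in pkg.obsoletes)
    ]


old_foo = Pkg("foo", (1, 0, 1))
new_foo = Pkg("foo", (2, 0, 1), obsoletes=(("foo", (2, 0, 1)),))
foo_extras = Pkg("foo-extras", (2, 0, 1), obsoletes=(("foo", (2, 0, 1)),))

pulled_in = obsoleters(old_foo, [new_foo, foo_extras])
print([pkg.name for pkg in pulled_in])
```

In this model the new 'foo' relates to the installed 'foo' in two ways at once: it is a plain upgrade (same name, higher version) and it also obsoletes it. That double relationship is the 'odd' condition suspected above.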

Comment 13 Adam Williamson 2018-09-17 23:03:04 UTC
So PackageKit's code here is really very simple: it more or less amounts to getting the package IDs to be updated using dnf_utils_find_package_ids, then calling hy_goal_upgrade_to on each one, then running the transaction. That's basically all it does. So if something like what was done in dnf needs to be done for this case, it feels to me like it ought to be done *in libdnf*.

Comment 14 Daniel Mach 2018-09-18 08:56:14 UTC
I believe this was fixed in libdnf-0.19.0:
a46e66c5 [transaction] Avoid adding duplicates via Transaction::addItem().

(Duplicates in the transaction/history meant that the rpm transaction callbacks set the state of only the first item; the remaining occurrences of the same item were left unchanged, which ended in the exception "TransactionItem state is not set".)
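
The failure mode described in the parenthesis can be illustrated with a toy model. This is a sketch under stated assumptions: `Item`, `mark_done` and `end_transaction` are hypothetical names, not the libdnf swdb API, and the "first match only" lookup is an assumption about how the callbacks behaved.

```python
class Item:
    def __init__(self, nevra):
        self.nevra = nevra
        self.state = None  # stays unset until a callback reports an outcome


def mark_done(items, nevra):
    """Mimic a callback that updates only the first matching item."""
    for item in items:
        if item.nevra == nevra:
            item.state = "DONE"
            return


def end_transaction(items):
    """Refuse to finish while any item still has no state."""
    for item in items:
        if item.state is None:
            raise RuntimeError(
                f"TransactionItem state is not set: {item.nevra}")


nevra = "grub2-tools-1:2.02-57.fc29.x86_64"
items = [Item(nevra), Item(nevra)]  # the same package added twice
mark_done(items, nevra)

try:
    end_transaction(items)
except RuntimeError as exc:
    print(exc)
```

With a single entry per package the final check passes, which is what avoiding duplicates when adding items is meant to guarantee.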

You mentioned an upgrade from dnf 3.2, which used libdnf 0.17.
The issue cannot be fixed as the upgrade is performed by the old dnf and libdnf.
If you upgrade dnf and libdnf first and then upgrade the rest, everything should work as expected.

To me, it's CLOSED/CURRENTRELEASE already.

Comment 15 Adam Williamson 2018-09-18 15:26:32 UTC
...can you please read more closely? I already explicitly pointed out that commit (and the bug report that led to it) and said it is *not* the problem, as this was tested with DNF 3.5.1 and libdnf 0.19.1. I *later* tested with DNF 3.2.0 and libdnf 0.17 to check whether the bug was new or not; it is not, the same bug happens in both.

You can reproduce this for yourself, quite easily, by getting the RC3 Workstation live:

https://kojipkgs.fedoraproject.org/compose/29/Fedora-29-20180916.0/compose/Workstation/x86_64/iso/Fedora-Workstation-Live-x86_64-29_Beta-1.3.iso

installing it, and from the installed system, running 'pkcon update grub2-tools'. As you do so, you can check for yourself that it includes DNF 3.5.1 and libdnf 0.19.1.

Comment 16 Adam Williamson 2018-09-18 16:30:03 UTC
Created attachment 1484424 [details]
screencast of the bug happening with dnf 3.5.1 and libdnf 0.19.1 (clean Beta RC3 Workstation live install)

Here is a *screencast* of me reproducing the bug on Beta RC3 Workstation live with DNF 3.5.1 and libdnf 0.19.1.

Comment 18 Adam Williamson 2018-09-18 23:33:55 UTC
Fix works in a quick test here. I'm going to fire a build so we can put it in an RC4, as we're on a very tight time frame for Beta. If it turns out to fail CI or be bad in some other way, we can just throw away RC4.

Comment 19 Fedora Update System 2018-09-19 00:19:37 UTC
anaconda-29.24.3-1.fc29 dnf-3.5.1-1.fc29 dnf-plugins-core-3.0.3-1.fc29 libdnf-0.19.1-3.fc29 lorax-29.12-2.fc29 python-blivet-3.1.0-2.fc29 has been submitted as an update to Fedora 29. https://bodhi.fedoraproject.org/updates/FEDORA-2018-f16a71bc92

Comment 20 Fedora Update System 2018-09-20 16:16:23 UTC
anaconda-29.24.3-1.fc29, dnf-3.5.1-1.fc29, dnf-plugins-core-3.0.3-1.fc29, libdnf-0.19.1-3.fc29, lorax-29.12-2.fc29, python-blivet-3.1.0-2.fc29 has been pushed to the Fedora 29 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-f16a71bc92

Comment 21 Fedora Update System 2018-09-20 22:35:41 UTC
anaconda-29.24.3-1.fc29, dnf-3.5.1-1.fc29, dnf-plugins-core-3.0.3-1.fc29, libdnf-0.19.1-3.fc29, lorax-29.12-2.fc29, python-blivet-3.1.0-2.fc29 has been pushed to the Fedora 29 stable repository. If problems still persist, please make note of it in this bug report.

Comment 22 Daniel Mach 2019-03-08 06:43:16 UTC
*** Bug 1599185 has been marked as a duplicate of this bug. ***

Comment 23 Daniel Mach 2019-03-08 06:44:10 UTC
*** Bug 1608685 has been marked as a duplicate of this bug. ***

