Description of problem: When verifying bug 1181501, I stumbled upon other unexpected packagekit errors. I played with the system for a while, and found out that roughly 50% of all my packagekit install or update operations fail with some kind of transaction error. When I wanted to reproduce it properly and gather all important logs, I saw a different error and it seems my rpm database got completely corrupted. I have no idea whether all those errors have a common root cause or not, I'll report everything I see in this report and we can dissect it with the help of developers. This is the corruption issue (issue 1): [kparal@localhost ~]$ pkcon install htop Resolving [=========================] Starting [=========================] Testing changes [=========================] Finished [=========================] Installing [=========================] Waiting for authentication [=========================] Querying [=========================] Downloading packages [=========================] Testing changes [=========================] Installing packages [=========================] Finished [=========================] [kparal@localhost ~]$ pkcon remove htop Resolving [=========================] Querying [=========================] Testing changes [=========================] Finished [=========================] Removing [=========================] Waiting for authentication [=========================] Finished [=========================] Fatal error: Error -1 running transaction [kparal@localhost ~]$ pkcon remove htop Resolving [=========================] Finished [=========================] Removing [=========================] Waiting for authentication [=========================] Finished [=========================] Fatal error: failed to find package htop [kparal@localhost ~]$ pkcon install htop Resolving [=========================] Loading cache [=========================] Finished [=========================] Fatal error: htop-1.0.3-4.fc21.x86_64 is already installed [kparal@localhost ~]$ sudo yum remove htop error: db5 error(-30973) from dbenv->open: BDB0087 DB_RUNRECOVERY: Fatal error, run database recovery error: cannot open Packages index using db5 - (-30973) error: cannot open Packages database in /var/lib/rpm CRITICAL:yum.main: Error: rpmdb open failed [kparal@localhost ~]$ rpm -q htop htop-1.0.3-4.fc21.x86_64 [kparal@localhost ~]$ Version-Release number of selected component (if applicable): fully updated F21, with libhif-0.1.7-3.fc21 installed PackageKit-1.0.3-4.fc21.x86_64 rpm-4.12.0.1-4.fc21.x86_64 libhif-0.1.7-3.fc21.x86_64 hawkey-0.5.2-1.fc21.x86_64 librepo-1.7.11-1.fc21.x86_64 dnf-0.6.3-2.fc21.noarch yum-3.4.3-153.fc21.noarch libdb-5.3.28-9.fc21.x86_64 How reproducible: not sure about this particular corruption issue, but if I run pkcon install and pkcon remove in a loop, I receive an error after just a few rounds, reliably. I always reset VM to the last snapshot and try again, to always run on a fresh and non-broken system. Steps to Reproduce: 1. install clean F21 in a VM 2. fully update, include libhif-0.1.7-3.fc21 to have the latest and greatest stuff (note: I have seen similar errors even with older libhif) 3. make a snapshot 4. pick some package, e.g. htop (it's small) 5. run pkcon install htop 6. run pkcon remote htop 7. repeat steps 5 -6 in a loop until you get an error (that should be under a minute)
Created attachment 980832 [details] issue 1 - journal
Created attachment 980833 [details] rpm -qa output
I just reproduced this exactly same problem (issue 1; I guess I'm being very lucky today) on a completely different computer while using virtualbox instead of libvirt/kvm for my VM. That rules out hardware issues or virtualization issues with the host.
This is a different type of error - cannot create index (issue 2): [kparal@localhost ~]$ C=1; while echo "Run $C ..." && pkcon install htop && pkcon remove htop; do C=$(($C + 1)); done Run 1 ... Resolving [=========================] Starting [=========================] Testing changes [=========================] Finished [=========================] Installing [=========================] Waiting for authentication [=========================] Downloading packages [=========================] Testing changes [=========================] Installing packages [=========================] Finished [=========================] Fatal error: cannot create index for htop;1.0.3-4.fc21;x86_64;installed [kparal@localhost ~]$ This is a very common issue, if PK fails, this is a very frequent error.
Created attachment 980853 [details] issue 2 - journal
I have tried to bisect recent version of packages and find when it started to occur. This is my reproducer: C=1; while echo "Run $C ..." && pkcon install htop && pkcon remove htop; do C=$(($C + 1)); done Usually I see an error in the first 1-4 runs. If I see 20 transactions without an error, I consider that setup working. F21 GOLD works. It contains: dnf-0.6.1-1.fc21.noarch hawkey-0.5.1-1.fc21.x86_64 libdb-5.3.28-9.fc21.x86_64 libhif-0.1.6-1.fc21.x86_64 librepo-1.7.5-2.fc21.x86_64 libsolv-0.6.4-3.fc21.x86_64 PackageKit-1.0.3-2.fc21.x86_64 rpm-4.12.0.1-3.fc21.x86_64 yum-3.4.3-153.fc21.noarch F21 fully updated today doesn't work. It contains: dnf-0.6.3-2.fc21.noarch hawkey-0.5.2-1.fc21.x86_64 libdb-5.3.28-9.fc21.x86_64 libhif-0.1.7-1.fc21.x86_64 librepo-1.7.11-1.fc21.x86_64 libsolv-0.6.4-3.fc21.x86_64 PackageKit-1.0.3-4.fc21.x86_64 rpm-4.12.0.1-4.fc21.x86_64 yum-3.4.3-153.fc21.noarch The minimal change to trigger the bug is to take F21 GOLD and update these packages: libhif-0.1.6-1.fc21.x86_64 -> libhif-0.1.7-1.fc21.x86_64 librepo-1.7.5-2.fc21.x86_64 -> librepo-1.7.11-1.fc21.x86_64 PackageKit-1.0.3-2.fc21.x86_64 -> PackageKit-1.0.3-4.fc21.x86_64 If I update just libhif or just PackageKit, I don't see the bug. Only when both of them are updated, I see the bug. I need to update librepo as well because otherwise PackageKit crashes instantly on boot. Or you can do it the other way round, take fully updated F21 and downgrade just PackageKit or just libhif. In both cases I suddenly can't reproduce the bug. So it really seems to be related to the combination of PackageKit 1.0.3-4 + libhif 0.1.7-1. That doesn't mean that the bug can't originate from a completely different package, but these updates seem to trigger it while the previous ones didn't.
And another very common error - Error -1 running transaction (issue 3): [kparal@localhost ~]$ C=1; while echo "Run $C ..." && pkcon install htop && pkcon remove htop; do C=$(($C + 1)); done Run 1 ... <snip> Run 2 ... Resolving [=========================] Loading cache [=========================] Testing changes [=========================] Finished [=========================] Installing [=========================] Waiting for authentication [=========================] Querying [=========================] Downloading packages [=========================] Installing packages [=========================] Finished [=========================] Fatal error: Error -1 running transaction [kparal@localhost ~]$
Created attachment 980872 [details] issue 3 - journal
Richard created a new libhif which should hopefully fix this. It's in this repo: http://people.freedesktop.org/~hughsient/fedora/ Unfortunately PackageKit fails to start with it, even when updated from Richard's repo: packagekitd[1297]: Failed to load the backend: opening module hif failed : /usr/lib64/packagekit-backend/libpk_backend_hif.so: undefined symbol: hif_rc_to_gerror So, waiting for Richard to fix it.
Repo updated, sorry for the brain-fart.
Still bad with libhif-0.1.8-0.112.20150116git.fc21.x86_64 , using kparal's reproducer: failed on run 3 with the 'Fatal error: cannot create index for htop;1.0.3-4.fc21;x86_64;install' error.
Repo updated with https://github.com/hughsie/libhif/commit/c17515c0a5e33b81ee15fdfbdce38930f91a6631
I still see the same issue: [kparal@localhost ~]$ C=1; while echo "Run $C ..." && pkcon install htop && pkcon remove htop; do C=$(($C + 1)); done Run 1 ... Resolving [=========================] Starting [=========================] Testing changes [=========================] Finished [=========================] Installing [=========================] Waiting for authentication [=========================] Querying [=========================] Downloading packages [=========================] Testing changes [=========================] Installing packages [=========================] Finished [=========================] Resolving [=========================] Testing changes [=========================] Finished [=========================] Removing [=========================] Waiting for authentication [=========================] Requesting data [=========================] Finished [=========================] Run 2 ... Resolving [=========================] Loading cache [=========================] Testing changes [=========================] Finished [=========================] Installing [=========================] Querying [=========================] Downloading packages [=========================] Testing changes [=========================] Installing packages [=========================] Finished [=========================] Fatal error: cannot create index for htop;1.0.3-4.fc21;x86_64;installed [kparal@localhost ~]$ rpm -q libhif libhif-0.1.8-0.116.20150117git.fc21.x86_64 [kparal@localhost ~]$ In journal: Jan 19 09:04:20 localhost.localdomain PackageKit[1297]: uid 1000 is trying to obtain org.freedesktop.packagekit. Jan 19 09:04:20 localhost.localdomain PackageKit[1297]: new install-packages transaction /49_dbcdbcac scheduled Jan 19 09:04:20 localhost.localdomain PackageKit[1297]: uid 1000 obtained auth for org.freedesktop.packagekit.pa Jan 19 09:04:23 localhost.localdomain packagekitd[1297]: mapping pread: Bad file descriptor Jan 19 09:04:23 localhost.localdomain PackageKit[1297]: in /49_dbcdbcac for install-packages package htop;1.0.3- Jan 19 09:04:23 localhost.localdomain PackageKit[1297]: install-packages transaction /49_dbcdbcac from uid 1000
Note, I updated just libhif from Richard's utopia repo, not PackageKit (that seems to have broken deps with libappstream-glib).
I've found out that matching libappstream-glib is built in rawhide: http://koji.fedoraproject.org/koji/buildinfo?buildID=604510 so I used it and updated PackageKit as well. Unfortunately it still fails for me: Run 7 ... Resolving [=========================] Loading cache [=========================] Testing changes [=========================] Finished [=========================] Installing [=========================] Querying [=========================] Downloading packages [=========================] Testing changes [=========================] Installing packages [=========================] Finished [=========================] Fatal error: cannot create index for htop;1.0.3-4.fc21;x86_64;installed [kparal@localhost ~]$ rpm -q libhif libhif-0.1.8-0.116.20150117git.fc21.x86_64 [kparal@localhost ~]$ rpm -q PackageKit PackageKit-1.0.4-0.1851.20150117git.fc21.x86_64 [kparal@localhost ~]$ Journal says: Jan 19 09:30:23 localhost.localdomain PackageKit[1298]: new install-packages transaction /101_bacdaeea scheduled from uid 1000 Jan 19 09:30:23 localhost.localdomain PackageKit[1298]: uid 1000 obtained auth for org.freedesktop.packagekit.package-install Jan 19 09:30:25 localhost.localdomain packagekitd[1298]: mapping pread: Bad file descriptor Jan 19 09:30:25 localhost.localdomain PackageKit[1298]: in /101_bacdaeea for install-packages package htop;1.0.3-4.fc21;x86_64;fedora was installing for uid 1000 Jan 19 09:30:25 localhost.localdomain PackageKit[1298]: install-packages transaction /101_bacdaeea from uid 1000 finished with failed after 1848ms Jan 19 09:30:25 localhost.localdomain packagekitd[1298]: error: rpmdb: BDB0151 fsync: Unknown error -1 Jan 19 09:30:25 localhost.localdomain packagekitd[1298]: error: db5 error(-1) from db->close: BDB0092 Unknown error: -1 Jan 19 09:30:25 localhost.localdomain packagekitd[1298]: error: rpmdb: BDB0151 fsync: Unknown error -1 Jan 19 09:30:25 localhost.localdomain packagekitd[1298]: error: rpmdb: BDB0164 close: Bad file descriptor Jan 19 09:30:25 localhost.localdomain packagekitd[1298]: error: rpmdb: Packages: Bad file descriptor Jan 19 09:30:25 localhost.localdomain packagekitd[1298]: error: db5 error(-1) from db->close: BDB0092 Unknown error: -1 Jan 19 09:30:25 localhost.localdomain polkitd[681]: Unregistered Authentication Agent for unix-process:4765:7321 (system bus name :1.129, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.UTF-8) (disconnected from bus)
commit 783f1f9dc8aa904a14ba24ac2c337d19f64e2bbc Author: Richard Hughes <richard> Date: Mon Jan 19 15:03:40 2015 +0000 Do not attempt to read from a closed file descriptor This fixes a hard-to-trigger and even-harder-to-debug crash when installing and then removing the same package. This will fix errors in the journal like: error: rpmdb: Packages: Bad file descriptor Part of the yumdb core design is that the package 'key' is the SHA hash of the rpmdb package header. By setting this during the transaction we do not have to open the rpmdb when writing out the yumdb entry. It appears that hy_sack_load_system_repo() -> repo_add_repodata() -> repodata_empty() -> repodata_freedata() -> close() calls on the fd that we're trying to share between different threads. The pread() and pwrite() system calls were chosen in libsolv to make this possible, so we're not trying to do anything unsupported and crazy. We've got the rpm 'Header' already during the transaction, so it's easy to fix by just extracting it there rather than looking it up once the transaction has completed. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1183010
PackageKit-1.0.4-1.fc21,libhif-0.1.8-1.fc21 has been submitted as an update for Fedora 21. https://admin.fedoraproject.org/updates/PackageKit-1.0.4-1.fc21,libhif-0.1.8-1.fc21
With the latest bits - from #c17 - I can no longer trigger any crashes or outright error messages, but it seems after a while PK gets confused about the current status of the package: Run 3 ... Resolving [=========================] Testing changes [=========================] Finished [=========================] Installing [=========================] Waiting in queue [=========================] Querying [=========================] Downloading packages [=========================] Testing changes [=========================] Finished [=========================] Installing packages [=========================] Resolving [=========================] Querying [=========================] Testing changes [=========================] Finished [=========================] Removing [=========================] Waiting for authentication [=========================] Finished [=========================] Run 4 ... Resolving [=========================] Testing changes [=========================] Finished [=========================] Fatal error: htop-1.0.3-4.fc21.x86_64 is already installed [root@localhost test]# rpm -q htop package htop is not installed that is, run 3's install/uninstall cycle completed, but then run 4's 'install' command claimed htop was already installed; rpm -q shows it is not. This doesn't look like it needs to hold up the update, but still seems to be not perfectly fixed yet.
That's a serious issue, but I suspect it's unrelated to this bug. PackageKit has occasionally had trouble understanding what's installed and what isn't for quite some time.
(In reply to Adam Williamson (Red Hat) from comment #18) > that is, run 3's install/uninstall cycle completed, but then run 4's > 'install' command claimed htop was already installed So, this looks to be some kind of cache-invalidation issue. After a remove/install PackageKit is supposed to invalidate any sack cache with @system included. Can anyone that can repduce this issue please grab a: sudo /usr/libexec/packagekitd --verbose --disable-timer --backend=hif trace -- I then need the *whole* log right to the start. Also, we should probably open a new bug for this issue, it's unrelated to any of the fixes. Thanks. Richard
libhif-0.1.8-1.fc21, PackageKit-1.0.4-1.fc21 has been pushed to the Fedora 21 stable repository. If problems still persist, please make note of it in this bug report.
For reference of anyone with broken offline updates, you can install the update with: su -c 'yum update --advisory=FEDORA-2015-0921' or: sudo yum update --advisory=FEDORA-2015-0921 At present the update is syncing to mirrors so it may not be immediately available; if the above doesn't install anything, wait a few hours and try again. Sorry again for the trouble.