Description of problem: While removing the openlmi-providers package, yum hung, and got this crash. [stef@stef ~]$ sudo yum remove openlmi-realmd openlmi-providers Loaded plugins: auto-update-debuginfo, langpacks, refresh-packagekit Resolving Dependencies --> Running transaction check ---> Package openlmi-providers.x86_64 0:0.0.22-1.fc19 will be erased --> Processing Dependency: libopenlmicommon.so.0()(64bit) for package: openlmi-providers-devel-0.0.22-1.fc19.x86_64 --> Processing Dependency: openlmi-providers(x86-64) = 0.0.22-1.fc19 for package: openlmi-providers-devel-0.0.22-1.fc19.x86_64 ---> Package openlmi-realmd.x86_64 0:0.0.22-1.fc19 will be erased --> Running transaction check ---> Package openlmi-providers-devel.x86_64 0:0.0.22-1.fc19 will be erased --> Finished Dependency Resolution Dependencies Resolved ================================================================================ Package Arch Version Repository Size ================================================================================ Removing: openlmi-providers x86_64 0.0.22-1.fc19 @updates-testing 69 k openlmi-realmd x86_64 0.0.22-1.fc19 @updates-testing 139 k Removing for dependencies: openlmi-providers-devel x86_64 0.0.22-1.fc19 @updates-testing 31 k Transaction Summary ================================================================================ Remove 2 Packages (+1 Dependent package) Installed size: 240 k Is this ok [y/N]: y Downloading packages: Running transaction check Running transaction test Transaction test succeeded Running transaction Erasing : openlmi-providers-devel-0.0.22-1.fc19.x86_64 1/3 Erasing : openlmi-realmd-0.0.22-1.fc19.x86_64 2/3 Erasing : openlmi-providers-0.0.22-1.fc19.x86_64 3/3 <<<<< just hangs here >>>>> Version-Release number of selected component: openlmi-providers-0.0.21-1.fc19 Additional info: reporter: libreport-2.1.4 cmdline: python /usr/libexec/openlmi-cimmof -n root/cimv2 delete /usr/share/openlmi-providers/30_LMI_Jobs.mof dso_list: python-libs-2.7.4-4.fc19.x86_64 executable: /usr/libexec/openlmi-cimmof kernel: 3.9.0-301.fc19.x86_64 runlevel: N 5 uid: 0 Truncated backtrace: expatbuilder.py:223:parseString:ExpatError: no element found: line 1, column 0 Traceback (most recent call last): File "/usr/libexec/openlmi-cimmof", line 339, in <module> main() File "/usr/libexec/openlmi-cimmof", line 332, in main objs = get_objects_from_mofs(args.cimmof, args.namespace, *args.mof) File "/usr/libexec/openlmi-cimmof", line 82, in get_objects_from_mofs parsed_dom = dom.parseString(xml_cleanup(out)) File "/usr/lib64/python2.7/xml/dom/minidom.py", line 1931, in parseString return expatbuilder.parseString(string) File "/usr/lib64/python2.7/xml/dom/expatbuilder.py", line 940, in parseString return builder.parseString(string) File "/usr/lib64/python2.7/xml/dom/expatbuilder.py", line 223, in parseString parser.Parse(string, True) ExpatError: no element found: line 1, column 0 Local variables in innermost frame: self: <xml.dom.expatbuilder.ExpatBuilderNS instance at 0x2709560> parser: <pyexpat.xmlparser object at 0x2711130> string: ''
Created attachment 747651 [details] File: backtrace
Created attachment 747652 [details] File: core_backtrace
Created attachment 747653 [details] File: environ
This caused this on my machine once i killed yum: [sudo] password for stef: BDB2053 Freeing read locks for locker 0x624: 23055/140092637361984 BDB2053 Freeing read locks for locker 0x625: 23055/140092637361984 BDB2053 Freeing read locks for locker 0x626: 23055/140092637361984 BDB2053 Freeing read locks for locker 0x627: 23055/140092637361984 BDB2053 Freeing read locks for locker 0x628: 23055/140092637361984 BDB2053 Freeing read locks for locker 0x629: 23055/140092637361984 BDB2053 Freeing read locks for locker 0x62a: 23055/140092637361984 BDB2053 Freeing read locks for locker 0x62b: 23055/140092637361984 BDB2053 Freeing read locks for locker 0x62c: 23055/140092637361984 BDB2053 Freeing read locks for locker 0x62d: 23055/140092637361984 BDB2053 Freeing read locks for locker 0x62e: 23055/140092637361984 BDB2053 Freeing read locks for locker 0x62f: 23055/140092637361984 BDB2053 Freeing read locks for locker 0x64c: 23126/140454044719104 BDB2053 Freeing read locks for locker 0x64d: 23126/140454044719104 BDB2053 Freeing read locks for locker 0x64e: 23126/140454044719104 BDB2053 Freeing read locks for locker 0x64f: 23126/140454044719104 BDB2053 Freeing read locks for locker 0x612: 23055/140092637361984 BDB2053 Freeing read locks for locker 0x622: 23055/140092637361984 error: rpmdb: BDB0113 Thread/process 23055/140092637361984 failed: BDB2052 locker has write locks error: db5 error(-30973) from dbenv->failchk: BDB0087 DB_RUNRECOVERY: Fatal error, run database recovery error: cannot open Packages index using db5 - (-30973) error: cannot open Packages database in /var/lib/rpm CRITICAL:yum.main: Error: rpmdb open failed
Managed to get through that, so on my end everything is fine. But obviously not a bug we'd want others to run into. [stef@stef ~]$ sudo yum remove openlmi-realmd openlmi-providers error: rpmdb: BDB0113 Thread/process 23055/140092637361984 failed: BDB1507 Thread died in Berkeley DB library error: db5 error(-30973) from dbenv->failchk: BDB0087 DB_RUNRECOVERY: Fatal error, run database recovery error: cannot open Packages index using db5 - (-30973) error: cannot open Packages database in /var/lib/rpm CRITICAL:yum.main: Error: rpmdb open failed [stef@stef ~]$ sudo rpm --rebuilddb [stef@stef ~]$ sudo yum remove openlmi-realmd openlmi-providers Loaded plugins: auto-update-debuginfo, langpacks, refresh-packagekit No Match for argument: openlmi-realmd Resolving Dependencies There are unfinished transactions remaining. You might consider running yum-complete-transaction first to finish them. --> Running transaction check ---> Package openlmi-providers.x86_64 0:0.0.22-1.fc19 will be erased --> Finished Dependency Resolution Dependencies Resolved ================================================================================ Package Arch Version Repository Size ================================================================================ Removing: openlmi-providers x86_64 0.0.22-1.fc19 @updates-testing 69 k Transaction Summary ================================================================================ Remove 1 Package Installed size: 69 k Is this ok [y/N]: y Downloading packages: Running transaction check Running transaction test Transaction test succeeded Running transaction Warning: RPMDB altered outside of yum. Erasing : openlmi-providers-0.0.22-1.fc19.x86_64 1/1 warning: file /usr/share/openlmi-providers/30_LMI_Jobs.mof: remove failed: No such file or directory warning: file /usr/share/openlmi-providers/05_LMI_Qualifiers.mof: remove failed: No such file or directory warning: file /usr/share/openlmi-providers: remove failed: No such file or directory warning: file /usr/share/doc/openlmi-providers-0.0.22/README: remove failed: No such file or directory warning: file /usr/share/doc/openlmi-providers-0.0.22/COPYING: remove failed: No such file or directory warning: file /usr/share/doc/openlmi-providers-0.0.22: remove failed: No such file or directory warning: file /usr/libexec/openlmi-register-pegasus: remove failed: No such file or directory warning: file /usr/libexec/openlmi-cimmof: remove failed: No such file or directory warning: file /usr/lib64/libopenlmicommon.so.0.0.1: remove failed: No such file or directory warning: file /usr/lib64/libopenlmicommon.so.0: remove failed: No such file or directory warning: file /usr/bin/openlmi-mof-register: remove failed: No such file or directory Verifying : openlmi-providers-0.0.22-1.fc19.x86_64 1/1 Removed: openlmi-providers.x86_64 0:0.0.22-1.fc19
jsynacek and jsafrane told me today they see it in DNF too. Guys, I can not reproduce it with dnf-0.3.5. This version is handling the RPM callbacks slightly differently than Yum so it is both possible I was unlucky and didn't get the hang or DNF doesn't suffer from this. What DNF version did you use? (0.3.5 is only available in rawhide now, sorry) Speculation: to me it looks like one of the openlmi-providers' RPM hooks crashes and Yum waits to read from a file descriptor for an input that never arrives.
I just managed to reproduce it with plain 'rpm -e openlmi-providers'. I'll look into it.
Ok, what seems to reproduce this issue for me every time is the following: (I'm testing on a rawhide machine, but I can speculate that this could happen on F19 as well.) 1) grab an openlmi-providers package (in my case openlmi-providers-0.0.21-3.fc20) from koji or build your own from fedora git 2) uninstall openlmi-providers if it's already installed on your system (this may trigger the problem too) # rpm -evvvv openlmi-providers 3) install openlmi-providers # rpm -ivvvv openlmi-providers-0.0.21-3.fc20 Either step 2 or 3 always triggers the problem. Here is my output of step 2: (ssh) root@rawhide-virt /home/jsynacek # rpm -evvvv openlmi-providers D: loading keyring from pubkeys in /var/lib/rpm/pubkeys/*.key D: couldn't find any keys in /var/lib/rpm/pubkeys/*.key D: loading keyring from rpmdb D: opening db environment /var/lib/rpm cdb:0x401 D: opening db index /var/lib/rpm/Packages 0x400 mode=0x0 D: locked db index /var/lib/rpm/Packages D: opening db index /var/lib/rpm/Name 0x400 mode=0x0 D: read h# 1052 Header SHA1 digest: OK (ee9a5785940e76344d2734c2156c88f1b88191bd) D: added key gpg-pubkey-de7f38bd-501f4964 to keyring D: Using legacy gpg-pubkey(s) from rpmdb D: read h# 1647 Header SHA1 digest: OK (e198855ffdd4089cda19e3f53be5defee4ebcdc3) D: ========== --- openlmi-providers-0.0.21-3.fc20 x86_64/linux 0x2 D: opening db index /var/lib/rpm/Requirename 0x400 mode=0x0 D: ========== recording tsort relations D: ========== tsorting packages (order, #predecessors, #succesors, depth) D: 0 0 0 1 -openlmi-providers-0.0.21-3.fc20.x86_64 D: erasing packages D: closed db index /var/lib/rpm/Requirename D: closed db index /var/lib/rpm/Name D: closed db index /var/lib/rpm/Packages D: closed db environment /var/lib/rpm D: opening db environment /var/lib/rpm cdb:0x401 D: opening db index /var/lib/rpm/Packages (none) mode=0x42 D: sanity checking 1 elements D: running pre-transaction scripts D: computing 11 file fingerprints D: opening db index /var/lib/rpm/Name (none) mode=0x42 D: opening db index /var/lib/rpm/Basenames (none) mode=0x42 D: opening db index /var/lib/rpm/Group (none) mode=0x42 D: opening db index /var/lib/rpm/Requirename (none) mode=0x42 D: opening db index /var/lib/rpm/Providename (none) mode=0x42 D: opening db index /var/lib/rpm/Conflictname (none) mode=0x42 D: opening db index /var/lib/rpm/Obsoletename (none) mode=0x42 D: opening db index /var/lib/rpm/Triggername (none) mode=0x42 D: opening db index /var/lib/rpm/Dirnames (none) mode=0x42 D: opening db index /var/lib/rpm/Installtid (none) mode=0x42 D: opening db index /var/lib/rpm/Sigmd5 (none) mode=0x42 D: opening db index /var/lib/rpm/Sha1header (none) mode=0x42 Preparing packages... D: computing file dispositions D: 0x0000fc03 4096 9934623 2910224 / D: ========== +++ openlmi-providers-0.0.21-3.fc20 x86_64-linux 0x2 D: read h# 1647 Header SHA1 digest: OK (e198855ffdd4089cda19e3f53be5defee4ebcdc3) D: erase: openlmi-providers-0.0.21-3.fc20 has 11 files D: %preun(openlmi-providers-0.0.21-3.fc20.x86_64): scriptlet start D: %preun(openlmi-providers-0.0.21-3.fc20.x86_64): execv(/bin/sh) pid 23093 + '[' 0 -eq 0 ']' + /usr/bin/openlmi-mof-register --just-mofs unregister /usr/share/openlmi-providers/30_LMI_Jobs.mof D: %preun(openlmi-providers-0.0.21-3.fc20.x86_64): waitpid(23093) rc 23093 status 0 openlmi-providers-0.0.21-3.fc20.x86_64 D: erase 100644 1 ( 0, 0) 4918 /usr/share/openlmi-providers/30_LMI_Jobs.mof D: erase 100644 1 ( 0, 0) 1058 /usr/share/openlmi-providers/05_LMI_Qualifiers.mof D: erase 040755 2 ( 0, 0) 4096 /usr/share/openlmi-providers D: erase 100644 1 ( 0, 0) 7851 /usr/share/doc/openlmi-providers-0.0.21/README D: erase 100644 1 ( 0, 0) 24428 /usr/share/doc/openlmi-providers-0.0.21/COPYING D: erase 040755 2 ( 0, 0) 4096 /usr/share/doc/openlmi-providers-0.0.21 D: erase 100755 1 ( 0, 0) 2491 /usr/libexec/openlmi-register-pegasus D: erase 100755 1 ( 0, 0) 13263 /usr/libexec/openlmi-cimmof D: erase 100755 1 ( 0, 0) 11200 /usr/lib64/libopenlmicommon.so.0.0.1 D: erase 120777 1 ( 0, 0) 25 /usr/lib64/libopenlmicommon.so.0 D: erase 100755 1 ( 0, 0) 5856 /usr/bin/openlmi-mof-register D: %postun(openlmi-providers-0.0.21-3.fc20.x86_64): scriptlet start D: %postun(openlmi-providers-0.0.21-3.fc20.x86_64): execv(/sbin/ldconfig) pid 23143 D: %postun(openlmi-providers-0.0.21-3.fc20.x86_64): waitpid(23143) rc 23143 status 0 D: --- h# 1647 openlmi-providers-0.0.21-3.fc20.x86_64 << hangs here >> Also, running strace on the rpm process results in: (ssh) root@rawhide-virt /home/jsynacek # strace -p 23092 Process 23092 attached futex(0x7fbdd800e924, FUTEX_WAIT, 1, NULL << hangs here >> After attaching gdb to the rpm process: (ssh) root@rawhide-virt ~ # gdb -p 23092 GNU gdb (GDB) Fedora (7.5.91.20130407-18.fc20) Copyright (C) 2013 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-redhat-linux-gnu". For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>. Attaching to process 23092 Reading symbols from /usr/bin/rpm...Reading symbols from /usr/lib/debug/bin/rpm.debug...done. done. Reading symbols from /lib64/librpm.so.3...Reading symbols from /usr/lib/debug/usr/lib64/librpm.so.3.1.0.debug...done. done. Loaded symbols for /lib64/librpm.so.3 Reading symbols from /lib64/librpmio.so.3...Reading symbols from /usr/lib/debug/usr/lib64/librpmio.so.3.1.0.debug...done. done. Loaded symbols for /lib64/librpmio.so.3 Reading symbols from /lib64/libselinux.so.1...Reading symbols from /usr/lib/debug/usr/lib64/libselinux.so.1.debug...done. done. Loaded symbols for /lib64/libselinux.so.1 Reading symbols from /lib64/libcap.so.2...BDB2053 Freeing read locks for locker 0xe1: 23446/140072383457280 BDB2053 Freeing read locks for locker 0xe2: 23446/140072383457280 BDB2053 Freeing read locks for locker 0xe3: 23446/140072383457280 BDB2053 Freeing read locks for locker 0xe4: 23446/140072383457280 << hangs here as well >> This looks like a strange locking problem in libdb to me. Also, one even stranger observation and also a *workaround* is that instead of running the debugger, trying to query rpm results only in the "... Freeing read locks ..." lines and the original hung rpm process then finishes without errors. (ssh) root@rawhide-virt ~ # rpm -q whatever BDB2053 Freeing read locks for locker 0x12e: 23626/140565356640256 BDB2053 Freeing read locks for locker 0x12f: 23626/140565356640256 BDB2053 Freeing read locks for locker 0x130: 23626/140565356640256 BDB2053 Freeing read locks for locker 0x131: 23626/140565356640256 package whatever is not installed << after this, the original hanging rpm process finishes with rc == 0 >>
Panu, could you please take a look at the comment above if you see anything suspicious?
Created attachment 748669 [details] strace
Looks like a dupe of bug 860500. Are there any crashes or other anomalies in /var/log/messages during the transaction?
Definitely looks like it. There are no traces in /var/log/messages when I use rpm, but there were some abrtd entries when I used dnf to update/remove the package.
Doh... reading from the start, a crash in /usr/libexec/openlmi-cimmof is why this bug exists in the first place :) And concurrent abrtd entries in logs pretty much confirms this. In some circumstances, when abrt runs during an rpm transaction, it can apparently leave stale locks behind which then stalls the transaction. I'm currently building rpm-4.11.0.1-4 for rawhide which checks for stale locks more often than before, which should help with the symptoms (ie rpm/yum/dnf hanging up). Figuring out the actual disease is another thing however, I haven't been able to reproduce these hangs on my own systems.
I don't know if this might help or not, but if I remove the %postun -p /sbin/ldconfig line from openlmi-providers package, the problem goes away.
Running ldconfig (or not) shouldn't affect anything really. Unless of course its crashing (or other otherwise triggering) abrt (or something else) that leaves stale rpmdb locks behind. I'm pushing an update shortly that should help the case by checking for those stale locks more frequently than only on db open.
rpm-4.11.0.1-2.fc19 has been submitted as an update for Fedora 19. https://admin.fedoraproject.org/updates/rpm-4.11.0.1-2.fc19
Description of problem: This crash occurred during a yum update from openlmi-providers-0.0.21-1.fc19.x86_64 to openlmi-providers-0.0.22-1.fc19.x86_64 Version-Release number of selected component: openlmi-providers-0.0.21-1.fc19 Additional info: reporter: libreport-2.1.4 cmdline: python /usr/libexec/openlmi-cimmof -n root/cimv2 delete /usr/share/openlmi-storage/base.mof /usr/share/openlmi-storage/luks.mof /usr/share/openlmi-storage/partitions.mof /usr/share/openlmi-storage/storage.mof dso_list: PyXML-0.8.4-28.fc18.x86_64 executable: /usr/libexec/openlmi-cimmof kernel: 3.9.2-301.fc19.x86_64 runlevel: N 5 uid: 0 Truncated backtrace: expatbuilder.py:223:parseString:ExpatError: no element found: line 1, column 0 Traceback (most recent call last): File "/usr/libexec/openlmi-cimmof", line 339, in <module> main() File "/usr/libexec/openlmi-cimmof", line 332, in main objs = get_objects_from_mofs(args.cimmof, args.namespace, *args.mof) File "/usr/libexec/openlmi-cimmof", line 82, in get_objects_from_mofs parsed_dom = dom.parseString(xml_cleanup(out)) File "/usr/lib64/python2.7/site-packages/_xmlplus/dom/minidom.py", line 1925, in parseString return expatbuilder.parseString(string) File "/usr/lib64/python2.7/site-packages/_xmlplus/dom/expatbuilder.py", line 942, in parseString return builder.parseString(string) File "/usr/lib64/python2.7/site-packages/_xmlplus/dom/expatbuilder.py", line 223, in parseString parser.Parse(string, True) ExpatError: no element found: line 1, column 0 Local variables in innermost frame: self: <xml.dom.expatbuilder.ExpatBuilderNS instance at 0x11c95a8> parser: <pyexpat.xmlparser object at 0x1164980> string: ''
Package rpm-4.11.0.1-2.fc19: * should fix your issue, * was pushed to the Fedora 19 testing repository, * should be available at your local mirror within two days. Update it with: # su -c 'yum update --enablerepo=updates-testing rpm-4.11.0.1-2.fc19' as soon as you are able to. Please go to the following url: https://admin.fedoraproject.org/updates/FEDORA-2013-9458/rpm-4.11.0.1-2.fc19 then log in and leave karma (feedback).
rpm-4.11.0.1-2.fc19 has been pushed to the Fedora 19 stable repository. If problems still persist, please make note of it in this bug report.