Bug 962750 - Uninstalling openlmi-providers-0.0.22 hung yum during transaction
Uninstalling openlmi-providers-0.0.22 hung yum during transaction
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: openlmi-providers (Show other bugs)
19
x86_64 Unspecified
unspecified Severity unspecified
: ---
: ---
Assigned To: Radek Novacek
Fedora Extras Quality Assurance
abrt_hash:899f64eb3053e827b68ec137e1d...
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2013-05-14 07:31 EDT by Stef Walter
Modified: 2016-11-30 19:31 EST (History)
12 users (show)

See Also:
Fixed In Version: rpm-4.11.0.1-2.fc19
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-05-31 23:22:34 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
File: backtrace (997 bytes, text/plain)
2013-05-14 07:31 EDT, Stef Walter
no flags Details
File: core_backtrace (578 bytes, text/plain)
2013-05-14 07:32 EDT, Stef Walter
no flags Details
File: environ (2.19 KB, text/plain)
2013-05-14 07:32 EDT, Stef Walter
no flags Details
strace (1.89 MB, text/plain)
2013-05-16 04:54 EDT, Jan Synacek
no flags Details

  None (edit)
Description Stef Walter 2013-05-14 07:31:53 EDT
Description of problem:
While removing the openlmi-providers package, yum hung, and got this crash.

[stef@stef ~]$ sudo yum remove openlmi-realmd openlmi-providers
Loaded plugins: auto-update-debuginfo, langpacks, refresh-packagekit
Resolving Dependencies
--> Running transaction check
---> Package openlmi-providers.x86_64 0:0.0.22-1.fc19 will be erased
--> Processing Dependency: libopenlmicommon.so.0()(64bit) for package: openlmi-providers-devel-0.0.22-1.fc19.x86_64
--> Processing Dependency: openlmi-providers(x86-64) = 0.0.22-1.fc19 for package: openlmi-providers-devel-0.0.22-1.fc19.x86_64
---> Package openlmi-realmd.x86_64 0:0.0.22-1.fc19 will be erased
--> Running transaction check
---> Package openlmi-providers-devel.x86_64 0:0.0.22-1.fc19 will be erased
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package                   Arch     Version            Repository          Size
================================================================================
Removing:
 openlmi-providers         x86_64   0.0.22-1.fc19      @updates-testing    69 k
 openlmi-realmd            x86_64   0.0.22-1.fc19      @updates-testing   139 k
Removing for dependencies:
 openlmi-providers-devel   x86_64   0.0.22-1.fc19      @updates-testing    31 k

Transaction Summary
================================================================================
Remove  2 Packages (+1 Dependent package)

Installed size: 240 k
Is this ok [y/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Erasing    : openlmi-providers-devel-0.0.22-1.fc19.x86_64                 1/3 
  Erasing    : openlmi-realmd-0.0.22-1.fc19.x86_64                          2/3 
  Erasing    : openlmi-providers-0.0.22-1.fc19.x86_64                       3/3 
<<<<< just hangs here >>>>>


Version-Release number of selected component:
openlmi-providers-0.0.21-1.fc19

Additional info:
reporter:       libreport-2.1.4
cmdline:        python /usr/libexec/openlmi-cimmof -n root/cimv2 delete /usr/share/openlmi-providers/30_LMI_Jobs.mof
dso_list:       python-libs-2.7.4-4.fc19.x86_64
executable:     /usr/libexec/openlmi-cimmof
kernel:         3.9.0-301.fc19.x86_64
runlevel:       N 5
uid:            0

Truncated backtrace:
expatbuilder.py:223:parseString:ExpatError: no element found: line 1, column 0

Traceback (most recent call last):
  File "/usr/libexec/openlmi-cimmof", line 339, in <module>
    main()
  File "/usr/libexec/openlmi-cimmof", line 332, in main
    objs = get_objects_from_mofs(args.cimmof, args.namespace, *args.mof)
  File "/usr/libexec/openlmi-cimmof", line 82, in get_objects_from_mofs
    parsed_dom = dom.parseString(xml_cleanup(out))
  File "/usr/lib64/python2.7/xml/dom/minidom.py", line 1931, in parseString
    return expatbuilder.parseString(string)
  File "/usr/lib64/python2.7/xml/dom/expatbuilder.py", line 940, in parseString
    return builder.parseString(string)
  File "/usr/lib64/python2.7/xml/dom/expatbuilder.py", line 223, in parseString
    parser.Parse(string, True)
ExpatError: no element found: line 1, column 0

Local variables in innermost frame:
self: <xml.dom.expatbuilder.ExpatBuilderNS instance at 0x2709560>
parser: <pyexpat.xmlparser object at 0x2711130>
string: ''
Comment 1 Stef Walter 2013-05-14 07:31:56 EDT
Created attachment 747651 [details]
File: backtrace
Comment 2 Stef Walter 2013-05-14 07:32:00 EDT
Created attachment 747652 [details]
File: core_backtrace
Comment 3 Stef Walter 2013-05-14 07:32:02 EDT
Created attachment 747653 [details]
File: environ
Comment 4 Stef Walter 2013-05-14 07:40:33 EDT
This caused this on my machine once i killed yum:

[sudo] password for stef: 
BDB2053 Freeing read locks for locker 0x624: 23055/140092637361984
BDB2053 Freeing read locks for locker 0x625: 23055/140092637361984
BDB2053 Freeing read locks for locker 0x626: 23055/140092637361984
BDB2053 Freeing read locks for locker 0x627: 23055/140092637361984
BDB2053 Freeing read locks for locker 0x628: 23055/140092637361984
BDB2053 Freeing read locks for locker 0x629: 23055/140092637361984
BDB2053 Freeing read locks for locker 0x62a: 23055/140092637361984
BDB2053 Freeing read locks for locker 0x62b: 23055/140092637361984
BDB2053 Freeing read locks for locker 0x62c: 23055/140092637361984
BDB2053 Freeing read locks for locker 0x62d: 23055/140092637361984
BDB2053 Freeing read locks for locker 0x62e: 23055/140092637361984
BDB2053 Freeing read locks for locker 0x62f: 23055/140092637361984
BDB2053 Freeing read locks for locker 0x64c: 23126/140454044719104
BDB2053 Freeing read locks for locker 0x64d: 23126/140454044719104
BDB2053 Freeing read locks for locker 0x64e: 23126/140454044719104
BDB2053 Freeing read locks for locker 0x64f: 23126/140454044719104
BDB2053 Freeing read locks for locker 0x612: 23055/140092637361984
BDB2053 Freeing read locks for locker 0x622: 23055/140092637361984
error: rpmdb: BDB0113 Thread/process 23055/140092637361984 failed: BDB2052 locker has write locks
error: db5 error(-30973) from dbenv->failchk: BDB0087 DB_RUNRECOVERY: Fatal error, run database recovery
error: cannot open Packages index using db5 -  (-30973)
error: cannot open Packages database in /var/lib/rpm
CRITICAL:yum.main:

Error: rpmdb open failed
Comment 5 Stef Walter 2013-05-14 07:44:13 EDT
Managed to get through that, so on my end everything is fine. But obviously not a bug we'd want others to run into.

[stef@stef ~]$ sudo yum remove openlmi-realmd openlmi-providers
error: rpmdb: BDB0113 Thread/process 23055/140092637361984 failed: BDB1507 Thread died in Berkeley DB library
error: db5 error(-30973) from dbenv->failchk: BDB0087 DB_RUNRECOVERY: Fatal error, run database recovery
error: cannot open Packages index using db5 -  (-30973)
error: cannot open Packages database in /var/lib/rpm
CRITICAL:yum.main:

Error: rpmdb open failed
[stef@stef ~]$ sudo rpm --rebuilddb
[stef@stef ~]$ sudo yum remove openlmi-realmd openlmi-providers
Loaded plugins: auto-update-debuginfo, langpacks, refresh-packagekit
No Match for argument: openlmi-realmd
Resolving Dependencies
There are unfinished transactions remaining. You might consider running yum-complete-transaction first to finish them.
--> Running transaction check
---> Package openlmi-providers.x86_64 0:0.0.22-1.fc19 will be erased
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package               Arch       Version            Repository            Size
================================================================================
Removing:
 openlmi-providers     x86_64     0.0.22-1.fc19      @updates-testing      69 k

Transaction Summary
================================================================================
Remove  1 Package

Installed size: 69 k
Is this ok [y/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Warning: RPMDB altered outside of yum.
  Erasing    : openlmi-providers-0.0.22-1.fc19.x86_64                       1/1 
warning: file /usr/share/openlmi-providers/30_LMI_Jobs.mof: remove failed: No such file or directory
warning: file /usr/share/openlmi-providers/05_LMI_Qualifiers.mof: remove failed: No such file or directory
warning: file /usr/share/openlmi-providers: remove failed: No such file or directory
warning: file /usr/share/doc/openlmi-providers-0.0.22/README: remove failed: No such file or directory
warning: file /usr/share/doc/openlmi-providers-0.0.22/COPYING: remove failed: No such file or directory
warning: file /usr/share/doc/openlmi-providers-0.0.22: remove failed: No such file or directory
warning: file /usr/libexec/openlmi-register-pegasus: remove failed: No such file or directory
warning: file /usr/libexec/openlmi-cimmof: remove failed: No such file or directory
warning: file /usr/lib64/libopenlmicommon.so.0.0.1: remove failed: No such file or directory
warning: file /usr/lib64/libopenlmicommon.so.0: remove failed: No such file or directory
warning: file /usr/bin/openlmi-mof-register: remove failed: No such file or directory
  Verifying  : openlmi-providers-0.0.22-1.fc19.x86_64                       1/1 

Removed:
  openlmi-providers.x86_64 0:0.0.22-1.fc19
Comment 6 Ales Kozumplik 2013-05-16 03:16:17 EDT
jsynacek and jsafrane told me today they see it in DNF too.

Guys, I can not reproduce it with dnf-0.3.5. This version is handling the RPM callbacks slightly differently than Yum so it is both possible I was unlucky and didn't get the hang or DNF doesn't suffer from this. What DNF version did you use? (0.3.5 is only available in rawhide now, sorry)

Speculation: to me it looks like one of the openlmi-providers' RPM hooks crashes and Yum waits to read from a file descriptor for an input that never arrives.
Comment 7 Jan Synacek 2013-05-16 03:18:08 EDT
I just managed to reproduce it with plain 'rpm -e openlmi-providers'. I'll look into it.
Comment 8 Jan Synacek 2013-05-16 04:44:01 EDT
Ok, what seems to reproduce this issue for me every time is the following:

(I'm testing on a rawhide machine, but I can speculate that this could happen on F19 as well.)

1) grab an openlmi-providers package (in my case openlmi-providers-0.0.21-3.fc20) from koji or build your own from fedora git

2) uninstall openlmi-providers if it's already installed on your system (this may trigger the problem too)

# rpm -evvvv openlmi-providers

3) install openlmi-providers

# rpm -ivvvv openlmi-providers-0.0.21-3.fc20

Either step 2 or 3 always triggers the problem.

Here is my output of step 2:

(ssh) root@rawhide-virt /home/jsynacek # rpm -evvvv openlmi-providers
D: loading keyring from pubkeys in /var/lib/rpm/pubkeys/*.key
D: couldn't find any keys in /var/lib/rpm/pubkeys/*.key
D: loading keyring from rpmdb
D: opening  db environment /var/lib/rpm cdb:0x401
D: opening  db index       /var/lib/rpm/Packages 0x400 mode=0x0
D: locked   db index       /var/lib/rpm/Packages
D: opening  db index       /var/lib/rpm/Name 0x400 mode=0x0
D:  read h#    1052 Header SHA1 digest: OK (ee9a5785940e76344d2734c2156c88f1b88191bd)
D: added key gpg-pubkey-de7f38bd-501f4964 to keyring
D: Using legacy gpg-pubkey(s) from rpmdb
D:  read h#    1647 Header SHA1 digest: OK (e198855ffdd4089cda19e3f53be5defee4ebcdc3)
D: ========== --- openlmi-providers-0.0.21-3.fc20 x86_64/linux 0x2
D: opening  db index       /var/lib/rpm/Requirename 0x400 mode=0x0
D: ========== recording tsort relations
D: ========== tsorting packages (order, #predecessors, #succesors, depth)
D:     0    0    0    1   -openlmi-providers-0.0.21-3.fc20.x86_64
D: erasing packages
D: closed   db index       /var/lib/rpm/Requirename
D: closed   db index       /var/lib/rpm/Name
D: closed   db index       /var/lib/rpm/Packages
D: closed   db environment /var/lib/rpm
D: opening  db environment /var/lib/rpm cdb:0x401
D: opening  db index       /var/lib/rpm/Packages (none) mode=0x42
D: sanity checking 1 elements
D: running pre-transaction scripts
D: computing 11 file fingerprints
D: opening  db index       /var/lib/rpm/Name (none) mode=0x42
D: opening  db index       /var/lib/rpm/Basenames (none) mode=0x42
D: opening  db index       /var/lib/rpm/Group (none) mode=0x42
D: opening  db index       /var/lib/rpm/Requirename (none) mode=0x42
D: opening  db index       /var/lib/rpm/Providename (none) mode=0x42
D: opening  db index       /var/lib/rpm/Conflictname (none) mode=0x42
D: opening  db index       /var/lib/rpm/Obsoletename (none) mode=0x42
D: opening  db index       /var/lib/rpm/Triggername (none) mode=0x42
D: opening  db index       /var/lib/rpm/Dirnames (none) mode=0x42
D: opening  db index       /var/lib/rpm/Installtid (none) mode=0x42
D: opening  db index       /var/lib/rpm/Sigmd5 (none) mode=0x42
D: opening  db index       /var/lib/rpm/Sha1header (none) mode=0x42
Preparing packages...
D: computing file dispositions
D: 0x0000fc03     4096      9934623      2910224 /
D: ========== +++ openlmi-providers-0.0.21-3.fc20 x86_64-linux 0x2
D:  read h#    1647 Header SHA1 digest: OK (e198855ffdd4089cda19e3f53be5defee4ebcdc3)
D:     erase: openlmi-providers-0.0.21-3.fc20 has 11 files
D: %preun(openlmi-providers-0.0.21-3.fc20.x86_64): scriptlet start
D: %preun(openlmi-providers-0.0.21-3.fc20.x86_64): execv(/bin/sh) pid 23093
+ '[' 0 -eq 0 ']'
+ /usr/bin/openlmi-mof-register --just-mofs unregister /usr/share/openlmi-providers/30_LMI_Jobs.mof
D: %preun(openlmi-providers-0.0.21-3.fc20.x86_64): waitpid(23093) rc 23093 status 0
openlmi-providers-0.0.21-3.fc20.x86_64
D: erase      100644  1 (   0,   0)  4918 /usr/share/openlmi-providers/30_LMI_Jobs.mof
D: erase      100644  1 (   0,   0)  1058 /usr/share/openlmi-providers/05_LMI_Qualifiers.mof
D: erase      040755  2 (   0,   0)  4096 /usr/share/openlmi-providers
D: erase      100644  1 (   0,   0)  7851 /usr/share/doc/openlmi-providers-0.0.21/README
D: erase      100644  1 (   0,   0) 24428 /usr/share/doc/openlmi-providers-0.0.21/COPYING
D: erase      040755  2 (   0,   0)  4096 /usr/share/doc/openlmi-providers-0.0.21
D: erase      100755  1 (   0,   0)  2491 /usr/libexec/openlmi-register-pegasus
D: erase      100755  1 (   0,   0) 13263 /usr/libexec/openlmi-cimmof
D: erase      100755  1 (   0,   0) 11200 /usr/lib64/libopenlmicommon.so.0.0.1
D: erase      120777  1 (   0,   0)    25 /usr/lib64/libopenlmicommon.so.0
D: erase      100755  1 (   0,   0)  5856 /usr/bin/openlmi-mof-register
D: %postun(openlmi-providers-0.0.21-3.fc20.x86_64): scriptlet start
D: %postun(openlmi-providers-0.0.21-3.fc20.x86_64): execv(/sbin/ldconfig) pid 23143
D: %postun(openlmi-providers-0.0.21-3.fc20.x86_64): waitpid(23143) rc 23143 status 0
D:   --- h#    1647 openlmi-providers-0.0.21-3.fc20.x86_64
<< hangs here >>

Also, running strace on the rpm process results in:

(ssh) root@rawhide-virt /home/jsynacek # strace -p 23092
Process 23092 attached
futex(0x7fbdd800e924, FUTEX_WAIT, 1, NULL
<< hangs here >>

After attaching gdb to the rpm process:

(ssh) root@rawhide-virt ~ # gdb -p 23092
GNU gdb (GDB) Fedora (7.5.91.20130407-18.fc20)
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Attaching to process 23092
Reading symbols from /usr/bin/rpm...Reading symbols from /usr/lib/debug/bin/rpm.debug...done.
done.
Reading symbols from /lib64/librpm.so.3...Reading symbols from /usr/lib/debug/usr/lib64/librpm.so.3.1.0.debug...done.
done.
Loaded symbols for /lib64/librpm.so.3
Reading symbols from /lib64/librpmio.so.3...Reading symbols from /usr/lib/debug/usr/lib64/librpmio.so.3.1.0.debug...done.
done.
Loaded symbols for /lib64/librpmio.so.3
Reading symbols from /lib64/libselinux.so.1...Reading symbols from /usr/lib/debug/usr/lib64/libselinux.so.1.debug...done.
done.
Loaded symbols for /lib64/libselinux.so.1
Reading symbols from /lib64/libcap.so.2...BDB2053 Freeing read locks for locker 0xe1: 23446/140072383457280
BDB2053 Freeing read locks for locker 0xe2: 23446/140072383457280
BDB2053 Freeing read locks for locker 0xe3: 23446/140072383457280
BDB2053 Freeing read locks for locker 0xe4: 23446/140072383457280
<< hangs here as well >>

This looks like a strange locking problem in libdb to me. Also, one even  stranger observation and also a *workaround* is that instead of running the debugger, trying to query rpm results only in the "... Freeing read locks ..." lines and the original hung rpm process then finishes without errors.

(ssh) root@rawhide-virt ~ # rpm -q whatever
BDB2053 Freeing read locks for locker 0x12e: 23626/140565356640256
BDB2053 Freeing read locks for locker 0x12f: 23626/140565356640256
BDB2053 Freeing read locks for locker 0x130: 23626/140565356640256
BDB2053 Freeing read locks for locker 0x131: 23626/140565356640256
package whatever is not installed
<< after this, the original hanging rpm process finishes with rc == 0 >>
Comment 9 Jan Synacek 2013-05-16 04:46:32 EDT
Panu, could you please take a look at the comment above if you see anything suspicious?
Comment 10 Jan Synacek 2013-05-16 04:54:50 EDT
Created attachment 748669 [details]
strace
Comment 11 Panu Matilainen 2013-05-16 05:24:20 EDT
Looks like a dupe of bug 860500. Are there any crashes or other anomalies in /var/log/messages during the transaction?
Comment 12 Jan Synacek 2013-05-16 05:33:39 EDT
Definitely looks like it. There are no traces in /var/log/messages when I use rpm, but there were some abrtd entries when I used dnf to update/remove the package.
Comment 13 Panu Matilainen 2013-05-16 05:45:08 EDT
Doh... reading from the start, a crash in /usr/libexec/openlmi-cimmof is why this bug exists in the first place :) And concurrent abrtd entries in logs pretty much confirms this.

In some circumstances, when abrt runs during an rpm transaction, it can apparently leave stale locks behind which then stalls the transaction. I'm currently building rpm-4.11.0.1-4 for rawhide which checks for stale locks more often than before, which should help with the symptoms (ie rpm/yum/dnf hanging up).  Figuring out the actual disease is another thing however, I haven't been able to reproduce these hangs on my own systems.
Comment 14 Jan Synacek 2013-05-16 06:10:00 EDT
I don't know if this might help or not, but if I remove the

%postun -p /sbin/ldconfig

line from openlmi-providers package, the problem goes away.
Comment 15 Panu Matilainen 2013-05-28 03:21:35 EDT
Running ldconfig (or not) shouldn't affect anything really. Unless of course its crashing (or other otherwise triggering) abrt (or something else) that leaves stale rpmdb locks behind. I'm pushing an update shortly that should help the case by checking for those stale locks more frequently than only on db open.
Comment 16 Fedora Update System 2013-05-28 03:26:50 EDT
rpm-4.11.0.1-2.fc19 has been submitted as an update for Fedora 19.
https://admin.fedoraproject.org/updates/rpm-4.11.0.1-2.fc19
Comment 17 Stephen Gallagher 2013-05-28 08:09:59 EDT
Description of problem:
This crash occurred during a yum update from openlmi-providers-0.0.21-1.fc19.x86_64 to openlmi-providers-0.0.22-1.fc19.x86_64

Version-Release number of selected component:
openlmi-providers-0.0.21-1.fc19

Additional info:
reporter:       libreport-2.1.4
cmdline:        python /usr/libexec/openlmi-cimmof -n root/cimv2 delete /usr/share/openlmi-storage/base.mof /usr/share/openlmi-storage/luks.mof /usr/share/openlmi-storage/partitions.mof /usr/share/openlmi-storage/storage.mof
dso_list:       PyXML-0.8.4-28.fc18.x86_64
executable:     /usr/libexec/openlmi-cimmof
kernel:         3.9.2-301.fc19.x86_64
runlevel:       N 5
uid:            0

Truncated backtrace:
expatbuilder.py:223:parseString:ExpatError: no element found: line 1, column 0

Traceback (most recent call last):
  File "/usr/libexec/openlmi-cimmof", line 339, in <module>
    main()
  File "/usr/libexec/openlmi-cimmof", line 332, in main
    objs = get_objects_from_mofs(args.cimmof, args.namespace, *args.mof)
  File "/usr/libexec/openlmi-cimmof", line 82, in get_objects_from_mofs
    parsed_dom = dom.parseString(xml_cleanup(out))
  File "/usr/lib64/python2.7/site-packages/_xmlplus/dom/minidom.py", line 1925, in parseString
    return expatbuilder.parseString(string)
  File "/usr/lib64/python2.7/site-packages/_xmlplus/dom/expatbuilder.py", line 942, in parseString
    return builder.parseString(string)
  File "/usr/lib64/python2.7/site-packages/_xmlplus/dom/expatbuilder.py", line 223, in parseString
    parser.Parse(string, True)
ExpatError: no element found: line 1, column 0

Local variables in innermost frame:
self: <xml.dom.expatbuilder.ExpatBuilderNS instance at 0x11c95a8>
parser: <pyexpat.xmlparser object at 0x1164980>
string: ''
Comment 18 Fedora Update System 2013-05-28 12:15:21 EDT
Package rpm-4.11.0.1-2.fc19:
* should fix your issue,
* was pushed to the Fedora 19 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing rpm-4.11.0.1-2.fc19'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-9458/rpm-4.11.0.1-2.fc19
then log in and leave karma (feedback).
Comment 19 Fedora Update System 2013-05-31 23:22:34 EDT
rpm-4.11.0.1-2.fc19 has been pushed to the Fedora 19 stable repository.  If problems still persist, please make note of it in this bug report.

Note You need to log in before you can comment on or make changes to this bug.