Bug 476195

Summary: yum doesn't lock the rpmdb RW constantly, so rpm can remove installed packages that yum knows about (yum traceback)
Product: Red Hat Enterprise Linux 5
Reporter: Petr Sklenar <psklenar>
Component: yum
Assignee: James Antill <james.antill>
Status: CLOSED ERRATA
QA Contact:
Severity: medium
Docs Contact:
Priority: medium
Version: 5.4
CC: bperkins, herrold, jhutar, pmatilai
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Because yum does not lock rpmdb, other applications can make changes to the package database at the same time that yum does. Previously, changes made to rpmdb by another application could cause yum to crash; for example, if an application removed data about a package and yum then attempted to retrieve this data. Now, if yum discovers that data it needs to complete a transaction is missing from the rpmdb, yum will exit safely and avoid crashing.
Story Points: ---
Clone Of:
: 591488 (view as bug list) Environment:
Last Closed: 2009-09-02 07:33:44 UTC Type: ---
Bug Depends On:    
Bug Blocks: 591488    

Description Petr Sklenar 2008-12-12 12:20:33 UTC
Description of problem:
During rpm installation, I sometimes see a yum traceback.

Version-Release number of selected component (if applicable):

[root@dhcp-lab-158 rpm_test1]# rpm -qa yum*
yum-security-1.1.16-13.el5
yum-3.2.19-18.el5
yum-rhn-plugin-0.5.3-30.el5
yum-updatesd-0.9-2.el5
yum-utils-1.1.16-13.el5
yum-metadata-parser-1.1.2-2.el5
[root@dhcp-lab-158 rpm_test1]# rpm -qa rpm*
rpm-4.4.2.3-9.el5
rpm-libs-4.4.2.3-9.el5
rpm-devel-4.4.2.3-9.el5
rpm-python-4.4.2.3-9.el5
rpm-build-4.4.2.3-9.el5
rpm-apidocs-4.4.2.3-9.el5
rpm-debuginfo-4.4.2.3-9.el5
[root@dhcp-lab-158 rpm_test1]# uname -a
Linux dhcp-lab-158.englab.brq.redhat.com 2.6.18-125.el5 #1 SMP Mon Dec 1 17:46:51 EST 2008 ppc64 ppc64 ppc64 GNU/Linux


How reproducible:
sometimes


Steps to Reproduce:
1. In the background: for i in `seq 1 1000`; do rpm -e foo-package; rpm -i foo-package.rpm; done
2. yum update

  
Actual results:
#  yum update
Loaded plugins: rhnplugin, security
This system is not registered with RHN.
RHN support will be disabled.
Skipping security plugin, no data
Setting up Update Process
Resolving Dependencies
Skipping security plugin, no data
--> Running transaction check
---> Package ppc64-utils.ppc 0:0.11-10.el5 set to be updated
---> Package kernel.ppc64 0:2.6.18-126.el5 set to be installed
---> Package kernel-headers.ppc 0:2.6.18-126.el5 set to be updated
Traceback (most recent call last):
  File "/usr/bin/yum", line 29, in ?
    yummain.user_main(sys.argv[1:], exit_code=True)
  File "/usr/share/yum-cli/yummain.py", line 229, in user_main
    errcode = main(args)
  File "/usr/share/yum-cli/yummain.py", line 145, in main
    (result, resultmsgs) = base.buildTransaction() 
  File "/usr/lib/python2.4/site-packages/yum/__init__.py", line 647, in buildTransaction
    (rescode, restring) = self.resolveDeps()
  File "/usr/lib/python2.4/site-packages/yum/depsolve.py", line 704, in resolveDeps
    for po, dep in self._checkFileRequires():
  File "/usr/lib/python2.4/site-packages/yum/depsolve.py", line 902, in _checkFileRequires
    for name, flag, evr in pkg.requires:
  File "/usr/lib/python2.4/site-packages/yum/packages.py", line 411, in <lambda>
    requires = property(fget=lambda self: self.returnPrco('requires'))
  File "/usr/lib/python2.4/site-packages/yum/packages.py", line 1002, in returnPrco
    self._populatePrco()
  File "/usr/lib/python2.4/site-packages/yum/packages.py", line 1016, in _populatePrco
    hdr = self._get_hdr()
  File "/usr/lib/python2.4/site-packages/yum/rpmsack.py", line 57, in _get_hdr
    return mi.next()
StopIteration


Expected results:
yum update should wait for the rpm lock, then install the packages successfully.

Additional info:
I tried that only on ppc with RHEL 5U3 snapshot5

Comment 1 James Antill 2008-12-12 15:49:28 UTC
Is this only happening on one machine?
Why do you think it's something to do with the rpm lock?
Can you put "print self" just before the above return value?

AFAICS this is saying that we got a particular installed rpm X from index XI, but then when we later try to get some more data from index XI ... rpm says it's bad.

What does "yum --version" say?

Panu, any ideas?

Comment 2 James Antill 2008-12-12 15:57:23 UTC
I'd guess that this is related to:

 https://bugzilla.redhat.com/show_bug.cgi?id=476188

Comment 3 Petr Sklenar 2008-12-12 17:02:47 UTC
It happens on every machine I have now (all ppc; maybe all arches have the same bug).

I think that bug 476188 has nothing in common with this one, because I cannot reproduce bug 476188 on my ppc machines.


Anyway, I have now tried this bug (476195) on
Linux squad7-lp1.rhts.bos.redhat.com 2.6.18-126.el5 #1 SMP Mon Dec 8 18:35:09 EST 2008 ppc64 ppc64 ppc64 GNU/Linux

one my terminal: for i in `seq 1 1000`; do rpm -e rhts-rh-devel ;rpm -ivh /root/rhts-rh-devel-3.4-20081205.0.noarch.rpm;done

second terminal:
I have old version of package file, and yum should upgrade to latest file:

# yum update
Loaded plugins: rhnplugin, security
This system is not registered with RHN.
RHN support will be disabled.
Skipping security plugin, no data
Setting up Update Process
Resolving Dependencies
Skipping security plugin, no data
--> Running transaction check
---> Package file.ppc 0:4.17-15 set to be updated
Traceback (most recent call last):
  File "/usr/bin/yum", line 29, in ?
    yummain.user_main(sys.argv[1:], exit_code=True)
  File "/usr/share/yum-cli/yummain.py", line 229, in user_main
    errcode = main(args)
  File "/usr/share/yum-cli/yummain.py", line 145, in main
    (result, resultmsgs) = base.buildTransaction() 
  File "/usr/lib/python2.4/site-packages/yum/__init__.py", line 647, in buildTransaction
    (rescode, restring) = self.resolveDeps()
  File "/usr/lib/python2.4/site-packages/yum/depsolve.py", line 704, in resolveDeps
    for po, dep in self._checkFileRequires():
  File "/usr/lib/python2.4/site-packages/yum/depsolve.py", line 902, in _checkFileRequires
    for name, flag, evr in pkg.requires:
  File "/usr/lib/python2.4/site-packages/yum/packages.py", line 411, in <lambda>
    requires = property(fget=lambda self: self.returnPrco('requires'))
  File "/usr/lib/python2.4/site-packages/yum/packages.py", line 1002, in returnPrco
    self._populatePrco()
  File "/usr/lib/python2.4/site-packages/yum/packages.py", line 1016, in _populatePrco
    hdr = self._get_hdr()
  File "/usr/lib/python2.4/site-packages/yum/rpmsack.py", line 57, in _get_hdr
    return mi.next()
StopIteration
[root@squad7-lp1 ~]#
--------------------------
try it: squad7-lp1.rhts.bos.redhat.com

Comment 4 James Antill 2008-12-12 18:29:23 UTC
> one my terminal: for i in `seq 1 1000`; do rpm -e rhts-rh-devel ;rpm -ivh
> /root/rhts-rh-devel-3.4-20081205.0.noarch.rpm;done

> second terminal:
> I have old version of package file, and yum should upgrade to latest file:

 Ok, this makes sense now ... what is happening is that you insert FOO into the DB, and it has index 123, then rpm removes that from the DB, then yum requests the data at index 123 (but it isn't there anymore).

 I'm not sure we can fix this for 5.3 ... a cheap/obvious fix is to lock the rpmdb as soon as we have loaded a single package from it, until we've finished with it completely. This however would be a major "regression" in lots of ways (for instance C-c wouldn't work again) (I'm not 100% sure which other yum releases had that bug ... did you try 3.2.8? -- I'm also pretty sure we said we'd fix C-c at various points).
 Doing something else requires realizing that the package has disappeared and doing something about it ... doing what is non-obvious though.
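
The sequence described above can be sketched with a toy in-memory database. This is only an illustration under stated assumptions: ToyRpmDb and its install/erase/match methods are hypothetical names, not the rpm bindings, and the real rpmdb assigns instance numbers in its own way.

```python
# Toy model only: ToyRpmDb is a hypothetical stand-in, not the rpm API.
import itertools


class ToyRpmDb:
    """Maps instance numbers to header dicts, like a much simplified rpmdb."""

    def __init__(self):
        self._headers = {}
        self._next_instance = itertools.count(1)

    def install(self, name):
        # Every install gets a fresh instance number.
        instance = next(self._next_instance)
        self._headers[instance] = {"name": name}
        return instance

    def erase(self, instance):
        del self._headers[instance]

    def match(self, instance):
        # Return an iterator over matching headers (zero or one here).
        hdr = self._headers.get(instance)
        return iter([hdr] if hdr is not None else [])


db = ToyRpmDb()

# "yum" remembers the instance number of an installed package ...
cached = db.install("foo")

# ... then "rpm -e foo; rpm -i foo" runs in another terminal.
db.erase(cached)
reinstalled = db.install("foo")

# Looking up the remembered instance now finds nothing.
try:
    next(db.match(cached))
    stale_lookup = "found"
except StopIteration:
    stale_lookup = "StopIteration"

print(cached, reinstalled, stale_lookup)
```

In this model the package is installed again, but under a new instance number, so a lookup by the remembered number comes back empty, which matches the StopIteration at the bottom of the traceback.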

Comment 5 James Antill 2008-12-12 18:48:57 UTC
 Ok, here's a simple testcase:

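#! /usr/bin/python -tt

from rpmUtils.transaction import initReadOnlyTransaction
import time

# Name of an installed package to watch; erase it in another window.
__pkg__ = 'xgalaxy'

# Open the rpmdb read-only, as yum does for most of its lifetime.
ts =  initReadOnlyTransaction(root='/')

# Look the package up by name and remember its rpmdb instance number.
mi = ts.dbMatch('name', __pkg__)
hdr = mi.next()
print hdr['name'], hdr['epoch'], hdr['version'], hdr['release'], hdr['arch']
idx = mi.instance()
print "Index:", idx
while True:
    time.sleep(1)
    # Refetch the header by instance number (tag 0 is the packages index).
    # Once the package has been erased, mi.next() raises StopIteration.
    mi = ts.dbMatch(0, idx)
    hdr = mi.next()
    print hdr['name'], hdr['epoch'], hdr['version'], hdr['release'], hdr['arch']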

...if you run the above you'll see you have a read-only lock on the rpmdb (ts above), which is what yum has for most of its lifetime ... however, you can still run rpm -e in another window.

 Probably the best part about this is that if you run the testcase as a user, it doesn't fail!

Comment 6 James Antill 2008-12-12 18:51:11 UTC
 Panu, options?

Comment 7 James Antill 2008-12-12 18:55:15 UTC
 I know why this has started happening in 3.2.19 though ... before that we kept the original hdr around (which was a huge drag on memory).
 As a quick fix for 5.3 we could just undo that.
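
To illustrate the trade-off mentioned here, the following is a rough sketch and not yum's actual code. EagerPkg and LazyPkg are hypothetical classes, a plain dict stands in for the rpmdb, and the toy lookup fails with KeyError where the real code hits StopIteration.

```python
# Hypothetical sketch: plain dicts stand in for the rpmdb and its headers.
rpmdb = {123: {"name": "foo", "requires": ["libbar"]}}


class EagerPkg:
    """Older style: keep the whole header in memory for the object's life."""

    def __init__(self, db, instance):
        self.hdr = db[instance]  # costs memory, but survives later erasure

    @property
    def requires(self):
        return self.hdr["requires"]


class LazyPkg:
    """Newer style: remember only the instance and refetch when needed."""

    def __init__(self, db, instance):
        self._db = db
        self._instance = instance

    @property
    def requires(self):
        # Re-read from the database each time; KeyError if the record is gone.
        return self._db[self._instance]["requires"]


eager = EagerPkg(rpmdb, 123)
lazy = LazyPkg(rpmdb, 123)

del rpmdb[123]  # another process erases the package

print(eager.requires)  # still answers, from a copy that is now stale

try:
    lazy.requires
    lazy_result = "ok"
except KeyError:
    lazy_result = "missing"
print(lazy_result)
```

The eager object keeps answering from its stored header, so the missing package goes unnoticed, while the lazy object fails as soon as it goes back to the database.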

Comment 8 James Antill 2008-12-12 19:02:51 UTC
 Saying that I think it's very likely that with the old behavior we just fail silently in weird ways... for instance searching for the prco data will not find the data for the pkg that is gone. As will getting packages by name (unless we are "completely_loaded").
 I'd also be interested in what rpm does on the transaction if we tell it to upgrade things that aren't there anymore/etc.

Comment 9 Jan Hutař 2008-12-15 08:25:45 UTC
I do not think we need to fix this in 5.3. It would be better to fix it upstream first.

I was able to reproduce on x86_64 F10 using original reproducer with:

yum-3.2.20-3.fc10.noarch
rpm-4.6.0-0.rc1.8.x86_64

Comment 10 Panu Matilainen 2008-12-15 10:31:52 UTC
Welcome to the world of "rpmdb concurrent access"... Unless you grab a
write-lock on the rpmdb, you're not forbidding others from making changes
behind your back (and OTOH grabbing an exclusive lock has other side-effects).

Catch the exception from mi.next() (would be a good idea to do it anyway),
raise "Please kindly stop pulling the carpet from beneath me" PackageSackError
and abort?

> Probably the best part about this is that if you run the testcase as a user,
> it doesn't fail!

That's because it runs with "private" locking which is about as good as no
locking at all, and can/will run into much uglier problems, see bug 444044.

> I'd also be interested in what rpm does on the transaction if we tell it to
> upgrade things that aren't there anymore/etc.

It'll (quietly) notice the record it was supposed to erase was no longer there
and skip it. Return code from the transaction will show error(s) but that's
all, IIRC. That's essentially what happens in (otherwise unrelated) bug 348971.

Comment 11 James Antill 2009-03-25 06:28:07 UTC
 The fix we'll be doing is just the simple try/except and raise PackageSackError.
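
A minimal sketch of what such a try/except could look like, assuming nothing about the actual patch: PackageSackError is defined locally as a stand-in for yum's exception of the same name, and get_hdr, its arguments, and the error message are hypothetical.

```python
class PackageSackError(Exception):
    """Local stand-in for yum's exception of the same name."""


def get_hdr(match_iter, pkg):
    """Return the first header from match_iter, or raise PackageSackError."""
    try:
        return next(match_iter)
    except StopIteration:
        # The record vanished between the first lookup and this one.
        raise PackageSackError(
            "rpmdb changed underneath us: %s is no longer installed" % pkg
        )


# An empty iterator plays the role of a match on an erased package.
try:
    get_hdr(iter([]), "foo-1.0-1.noarch")
    message = None
except PackageSackError as exc:
    message = str(exc)

print(message)
```

With this shape the caller sees one well-defined error it can report before exiting, instead of a bare StopIteration escaping from a property lookup deep in the dependency solver.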

Comment 16 Ruediger Landmann 2009-09-01 13:12:40 UTC
Release note added. If any revisions are required, please set the 
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

New Contents:
Because yum does not lock rpmdb, other applications can make changes to the package database at the same time that yum does. Previously, changes made to rpmdb by another application could cause yum to crash; for example, if an application removed data about a package and yum then attempted to retrieve this data. Now, if yum discovers that data it needs to complete a transaction is missing from the rpmdb, yum will exit safely and avoid crashing.

Comment 17 errata-xmlrpc 2009-09-02 07:33:44 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-1419.html