Description of problem:
rpm is "out of available locker entries"

Version-Release number of selected component (if applicable):
rpm-4.3.2-8.x86_64.rpm
rpm-libs-4.3.2-8.x86_64.rpm
db4-4.2.52-6.x86_64.rpm

How reproducible:
# rpm -qa
rpmdb: Lock table is out of available locker entries
error: db4 error(22) from db->close: Invalid argument
error: cannot open Packages index using db3 - Cannot allocate memory (12)
error: cannot open Packages database in /var/lib/rpm
# rpm -rebuilddb
rpmdb: Lock table is out of available locker entries
error: db4 error(22) from db->close: Invalid argument
error: cannot open Packages index

Steps to Reproduce:
I was installing packages and rpm started complaining:
1038  yum install automake15
1042  yum install libgnomeprintui22-devel
1044  yum install db4-devel
1045  yum install gcc-g77
1048  yum install evolution-devel gmp-devel gd-devel
1050  yum install slang-devel gpm-devel
1051  yum install XFree86-devel libpng-devel libjpeg-devel libungif-devel libtiff-devel zlib-devel gettext
... wedged

Actual results:
rpmdb: Lock table is out of available locker entries

Expected results:
World Peace.

Additional info:
See also: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=89981
Hmm, congratulations! Never seen this problem before.

Are you running your installs concurrently or sequentially? Can you reproduce the problem? Are there other programs trying to access /var/lib/rpm/Packages et al?

If you can reproduce, could you run

    cd /var/lib/rpm
    /usr/lib/rpm/rpmdb_stat -CA

to display what locks are active?

Amen, "world peace". Gotta get rid of Bush first ;-)
Those installs can't be running concurrently unless he's overriding the yum lock file. And if he's overriding the yum lock file, then it's his mess and he can keep it.
Created attachment 104668 [details]
cd /var/lib/rpm && /usr/lib/rpm/rpmdb_stat -CA
Hi Jeff:

The installs should be happening sequentially. I didn't have any installs running from cron and was just matching missing packages for builds of fedora.us srpms in between doing rpmbuilds.

There are no programs accessing /var/lib/rpm/Packages so far as I know:

[root@www rpm]# fuser /var/lib/rpm/Packages
[root@www rpm]#

/usr/lib/rpm/rpmdb_stat -CA has plenty of output; I'll attach a typescript. There do appear to be many "READ 1 HELD" entries.

The bug is still very reproducible:

# rpm -qa
rpmdb: Lock table is out of available locker entries

Hi Seth:

I don't know about the yum lockfile, so I really doubt I've overridden it. I do rsync from duke.edu and createrepo for local use in a cron job for a later manual yum update, but that's all. This should be very generic yum use against the mirror of the duke repo only. The fedora.us use was just trying to build srpms from a wget.
Created attachment 104672 [details]
output of rpmdb_stat

This happened to me too. I was just doing my normal 'yum update' (actually 'yum --exclude=mod_perl\* update'). Here is the output from rpmdb_stat.
Is this x86_64 too? What kernel? What glibc?
No, i686. kernel-2.6.8-1.541. Pretty much the latest Rawhide. Currently running glibc-2.3.3-63, but this happened while running the previous one (glibc-2.3.3-60.i686.rpm) and updating to the newer one.

tom
Seth's question was for Tom, I'm sure. But I thought I'd mention that the initially reported machine (x86_64) has never been booted since installing 2.4.21-4.ELsmp. glibc should be 2.3.3-62. The latest was missing i386 glibc-common in the x86_64 directories.

yum.conf details:
exclude=lvm* kernel* mkinitrd* *debuginfo* modutil* initscripts* udev*

Additional temporary excludes on that day for dealing with current issues (missing packages, mirrors syncing, ...):
hal* mod_perl* glibc-* gtk2*
Hmmm, I note glibc in the upgrade. Upgrading glibc changes the libraries that supply the posix mutexes used by db4 internal to rpm. Have you seen the behavior in an upgrade transaction that did not include glibc?
BTW, the rpmdb_stat -CA output clearly displays way too many locks held, which explains why the locking table is full. I dunno whether that is because yum is/was not closing match iterators, or because the glibc upgrade re-dealt the names of locks so that existing locks did not get properly released.
The only way I noticed it this time was an endless supply of

    rpmdb: Lock table is out of available locker entries

messages. I haven't noticed any unusual issues updating glibc before, and the current development cache shows 10 glibc packages (2.3.3-46 -> 2.3.3-63).

Two things I did notice:

1. After 'recreating' /var/lib/rpm (by initializing the db, then doing an 'rpm -ivh --nodeps --justdb' on the current packages in /var/cache/yum/developemnt/packages/, followed by a 'yum update'), the resulting /var/lib/rpm is about 60% the size of the corrupted one (76MB vs. 130MB).

2. I do get the 'yum exits after downloading headers, after starting transaction' problem reported elsewhere. I see this every few days or so.

Is it possible that in the case of #2, yum exits before the locks are freed?
Created attachment 104688 [details]
output of /usr/lib/rpm/rpmdb_stat -CA

I ran 'yum' this morning to pull down the new sed package. Trying to get some additional data, I ran 'strace -o foo yum update'. It hung after doing the install, but before 'completing the installation':

[root@fedora ~]# strace -o /tmp/strace.yum yum update
Setting up Update Process
Setting up Repo: development
repomd.xml                100% |=========================| 1.1 kB    00:00
Reading repository metadata in from local files
primary.xml.gz            100% |=========================| 925 kB    00:02
developmen: ################################################## 3422/3422
Resolving Dependencies
sed-4.1.2-3.i386.rpm      100% |=========================|  10 kB    00:00
Dependencies Resolved
[u] sed.i386 0:4.1.2-3 - user
Is this ok [y/N]: y
sed-4.1.2-3.i386.rpm      100% |=========================| 165 kB    00:00
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
sed 100 % done 1/2

After killing the python process, I'm now left with a bunch of rpmdb locks, shown in the attached.

Two questions:

1. Could this indicate that when yum terminates unexpectedly, it may leave some locks behind?
2. Is there any simple way to flush these locks?

thanks,
tom
Hi Tom,

Not being an expert in db3/4 or rpm, I'd take what I say with a grain of salt. I am not associated with Red Hat. The following was just a blind hack based upon what I read in https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=89981

After backing up /var/lib I ran the db_recover program that comes with db4-utils:

    db_recover -h /var/lib/rpm/

It didn't report much, but yum upgrade, rpm -qa, ... all worked fine after this. I couldn't pretend to explain if it's right or wrong or even recommended, but it did let me get back to what I was doing.

[root@www RPMS]# db_recover -h /var/lib/rpm/
[root@www RPMS]# yum check-update
[root@www RPMS]# yum update
[root@www RPMS]# rpm -qa
[root@www RPMS]# rpm --rebuilddb
[root@www RPMS]# (cd /var/lib/rpm/ && /usr/lib/rpm/rpmdb_stat -CA)
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Lock region parameters
locker table size: 1031, object table size: 1031, obj_off: 646776,
osynch_off: 0, locker_off: 630272, lsynch_off: 0, need_dd: 0
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Conflict matrix
0 0 0 0 0
0 0 1 0 0
0 1 1 1 1
0 0 0 0 0
0 0 1 0 1
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Locks grouped by lockers
Locker   Mode      Count Status  ----------------- Object ---------------
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Locks grouped by object
Locker   Mode      Count Status  ----------------- Object ---------------
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Memory free list
0x2a956b5008: 198248
[root@www RPMS]#

I then did an rpm --rebuilddb just for medicinal purposes.
Wow... this worked like a charm. It fixed both my 'new' (and erroneously locked) db AND the 'old' one that had the original problems. So I can 'recover' the db that really reflects my system! Now I just have to update the db with the updates I made since the failure.

Thanks!
tom
[Thanks to taj, I've recreated and restored my rpmdb.]

If I start with a 'clean' rpmdb and then run 'yum check-update', the rpmdb ends up with what appear to be held locks. Repeated executions of 'yum check-update' make the list of held locks longer.

I show here the output of the following sequence:

/usr/lib/rpm/rpmdb_stat -CA
yum check-update   /* output deleted */
/usr/lib/rpm/rpmdb_stat -CA
yum check-update   /* output deleted */
/usr/lib/rpm/rpmdb_stat -CA

It certainly looks like I'm getting more and more 'straggler' locks. Is this what should be expected?

thanks,
tom

[As taj indicates, you can clear these locks out by running 'db_recover -h /var/lib/rpm'. After following his instructions, the resulting /var/lib/rpm directory is about 35MB smaller than it was when it was croaking... Could this be due to 6 months of locks?]

------------------------------------------------
[root@fedora rpm]# /usr/lib/rpm/rpmdb_stat -CA
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Lock region parameters
locker table size: 1031, object table size: 1031, obj_off: 442080,
osynch_off: 0, locker_off: 433824, lsynch_off: 0, need_dd: 0
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Conflict matrix
0 0 0 0 0
0 0 1 0 0
0 1 1 1 1
0 0 0 0 0
0 0 1 0 1
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Locks grouped by lockers
Locker   Mode      Count Status  ----------------- Object ---------------
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Locks grouped by object
Locker   Mode      Count Status  ----------------- Object ---------------
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Memory free list
0xf6e2b004: 129812

<<<< SNIP: run of 'yum check-update' >>>>

[root@fedora rpm]# /usr/lib/rpm/rpmdb_stat -CA
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Lock region parameters
locker table size: 1031, object table size: 1031, obj_off: 442080,
osynch_off: 0, locker_off: 433824, lsynch_off: 0, need_dd: 0
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Conflict matrix
0 0 0 0 0
0 0 1 0 0
0 1 1 1 1
0 0 0 0 0
0 0 1 0 1
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Locks grouped by lockers
Locker   Mode      Count Status  ----------------- Object ---------------
       9 dd= 0 locks held 1    write locks 0
       9 READ          1 HELD    (4caaa 302 4160b779 3f46 0) handle        0
       a dd= 0 locks held 0    write locks 0
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Locks grouped by object
Locker   Mode      Count Status  ----------------- Object ---------------
       9 READ          1 HELD    (4caaa 302 4160b779 3f46 0) handle        0
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Memory free list
0xf6e2b004: 129812
[root@fedora rpm]#

<<<< SNIP: run of 'yum check-update' >>>>

[root@fedora rpm]# /usr/lib/rpm/rpmdb_stat -CA
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Lock region parameters
locker table size: 1031, object table size: 1031, obj_off: 442080,
osynch_off: 0, locker_off: 433824, lsynch_off: 0, need_dd: 0
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Conflict matrix
0 0 0 0 0
0 0 1 0 0
0 1 1 1 1
0 0 0 0 0
0 0 1 0 1
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Locks grouped by lockers
Locker   Mode      Count Status  ----------------- Object ---------------
       9 dd= 0 locks held 1    write locks 0
       9 READ          1 HELD    (4caaa 302 4160b779 3f46 0) handle        0
       a dd= 0 locks held 0    write locks 0
       f dd= 0 locks held 1    write locks 0
       f READ          1 HELD    (4caaa 302 4160b779 3f46 0) handle        0
      10 dd= 0 locks held 0    write locks 0
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Locks grouped by object
Locker   Mode      Count Status  ----------------- Object ---------------
       9 READ          1 HELD    (4caaa 302 4160b779 3f46 0) handle        0
       f READ          1 HELD    (4caaa 302 4160b779 3f46 0) handle        0
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Memory free list
0xf6e2b004: 129812
[root@fedora rpm]#
Okay, I've added explicit deletion of any/all transaction sets I have open. I'm not quite sure why the exit of the python interpreter doesn't delete the object and clean up the lockers, but anyway. You should be able to cvs update to yum-cvs and get these changes.
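The mechanism being fixed here can be illustrated with a self-contained Python sketch. This is not yum's actual code and the class names (FakeTransactionSet, FakeCursor, LOCK_TABLE) are invented stand-ins for rpm's transaction sets, db4 cursors, and the shared lock region; the point is only that locks tied to live objects accumulate until something explicitly closes them, so yum now deletes its transaction sets rather than relying on interpreter teardown:

```python
LOCK_TABLE = []  # stand-in for the shared db4 lock region in /var/lib/rpm


class FakeCursor:
    """Stand-in for a db4 cursor; acquiring one registers a lock."""

    def __init__(self):
        self.lock = object()
        LOCK_TABLE.append(self.lock)

    def close(self):
        if self.lock is not None:
            LOCK_TABLE.remove(self.lock)
            self.lock = None


class FakeTransactionSet:
    """Stand-in for a transaction set: cursors are chained to it."""

    def __init__(self):
        self.cursors = []

    def db_match(self):
        cur = FakeCursor()
        self.cursors.append(cur)
        return cur

    def close(self):
        for cur in self.cursors:
            cur.close()
        self.cursors = []


# A long-lived reference (e.g. a module-level cache) means the objects are
# never finalized while the program runs, so locks pile up:
cache = [FakeTransactionSet() for _ in range(3)]
for ts in cache:
    ts.db_match()
print(len(LOCK_TABLE))   # 3 locks held

# Explicit deletion, as now done in yum, releases them deterministically:
for ts in cache:
    ts.close()
cache.clear()
print(len(LOCK_TABLE))   # 0
```

If the process is instead killed mid-transaction (as in comment 13's strace run), close() never runs and the stand-in locks would persist, which mirrors the stragglers rpmdb_stat shows.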
yum-2.1.5-1 'fixes' the problem: repeated execution of 'yum check-update' now leaves a 'clean' rpmdb lock state. And doing a complete 'yum update' on today's Rawhide also leaves a clean lock state. Thanks Seth!
This problem appears resolved by cleaning up transactions.

Seth: Transactions share a common db underneath, which is refcounted. All open cursors (cursors are what hold locks) are also chained to the rpmdb, and are closed when a ts is deleted, which triggers a db close, which also closes the cursors that hold locks.

So I smell an mi leak, not a ts leak, somewhere in yum. Deleting a match iterator will nuke attached locks just like deleting ts objects will. Note that there is nothing whatsoever wrong with closing unused ts objects, which will release cursors/locks, but nuking an mi after use to delete locks is perhaps a better (because more precise) fix.
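The refcounting chain described above can be sketched in a few lines of Python. Again these classes (SharedDB, TransactionSet, MatchIterator) are simplified stand-ins, not rpm's real bindings: the db is refcounted by both ts and mi objects, and its locks are only released when the last reference closes, so a leaked match iterator keeps locks alive even after the ts is closed:

```python
class SharedDB:
    """Stand-in for the refcounted rpmdb shared by ts and mi objects."""

    def __init__(self):
        self.refs = 0
        self.locks = []          # stand-in for the db4 lock table

    def ref(self):
        self.refs += 1

    def unref(self):
        self.refs -= 1
        if self.refs == 0:       # last reference gone: the db really
            self.locks.clear()   # closes, releasing all cursor locks


class TransactionSet:
    """Holds one reference on the shared db."""

    def __init__(self, db):
        self.db = db
        db.ref()

    def close(self):
        self.db.unref()


class MatchIterator:
    """A cursor chained to the db; creating one takes a READ lock."""

    def __init__(self, db):
        self.db = db
        db.ref()
        db.locks.append('READ')

    def close(self):
        self.db.unref()


db = SharedDB()
ts = TransactionSet(db)
mi = MatchIterator(db)   # leaked by the caller: never closed

ts.close()
print(db.locks)          # ['READ'] -- the leaked mi keeps the db open

mi.close()
print(db.locks)          # [] -- closing the mi finally drops the lock
```

This is why closing the mi directly is the more precise fix: it releases exactly the cursor holding the stale READ lock, instead of depending on the ts teardown to happen to drop the last db reference.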