Description of problem:
After 'rpm --rebuilddb --verbose' I got a segfault and a core (I just turned cores on). With data from rpm-debuginfo, gdb produces the following:

Core was generated by `/usr/lib/rpm/rpmd --rebuilddb --verbose'.
Program terminated with signal 11, Segmentation fault.
#0  __memp_fget_rpmdb (dbmfp=0x9c55788, pgnoaddr=0xbfefbbac, flags=0, addrp=0xbfefbb88) at ../db/dist/../mp/mp_fget.c:190
190     ../db/dist/../mp/mp_fget.c: No such file or directory.
        in ../db/dist/../mp/mp_fget.c
(gdb) where
#0  __memp_fget_rpmdb (dbmfp=0x9c55788, pgnoaddr=0xbfefbbac, flags=0, addrp=0xbfefbb88) at ../db/dist/../mp/mp_fget.c:190
#1  0x003c8510 in __db_goff_rpmdb (dbp=0x9c55488, dbt=0x9c5899c, tlen=12052, pgno=6916, bpp=0x9c55914, bpsz=0x9c5591c) at ../db/dist/../db/db_overflow.c:147
#2  0x003cfe4d in __db_ret_rpmdb (dbp=0x9c55488, h=0xb7c935c4, indx=11, dbt=0x9c5899c, memp=0x9c55914, memsize=0x9c5591c) at ../db/dist/../db/db_ret.c:50
#3  0x003bb115 in __db_c_get_rpmdb (dbc_arg=0x9c558c8, key=0x9c58984, data=0x9c5899c, flags=<value optimized out>) at ../db/dist/../db/db_cam.c:778
#4  0x003c15f6 in __db_c_get_pp_rpmdb (dbc=0x9c558c8, key=0x9c58984, data=0x9c5899c, flags=18) at ../db/dist/../db/db_iface.c:1741
#5  0x00351706 in db3cget (dbi=0x9c54f30, dbcursor=0x5704db86, key=0x9c58984, data=0x9c5899c, flags=1459936134) at db3.c:612
#6  0x0034d333 in rpmdbNextIterator (mi=0x9c58968) at rpmdb.h:591
#7  0x0034ee04 in rpmdbRebuild (prefix=0x9c41f30 "/", ts=0x9c53cd8, hdrchk=0x160830 <headerCheck>) at rpmdb.c:3854
#8  0x00184af6 in rpmtsRebuildDB (ts=0x9c53cd8) at rpmts.c:209
#9  0x08049822 in main (argc=3, argv=Cannot access memory at address 0x5704db8a
) at ./rpmqv.c:633
#10 0x00a3cf2c in __libc_start_main () from /lib/libc.so.6
#11 0x080490c1 in _start ()
(gdb)

Locations like "../db/dist/../mp/mp_fget.c:190" are somewhat nasty to look at, but it is possible to find the file outside of gdb. The code in question looks like this:

        /* Search the hash chain for the page. */
retry:  st_hsearch = 0;
        MUTEX_LOCK(dbenv, &hp->hash_mutex);
        for (bhp = SH_TAILQ_FIRST(&hp->hash_bucket, __bh);
            bhp != NULL; bhp = SH_TAILQ_NEXT(bhp, hq, __bh)) {
                ++st_hsearch;
-- bomb! -->    if (bhp->pgno != *pgnoaddr || bhp->mf_offset != mf_offset)
                        continue;

and gdb prints

(gdb) p bhp
$1 = (BH *) 0x5704db86
(gdb) p *bhp
Cannot access memory at address 0x5704db86
(gdb) p pgnoaddr
$2 = (db_pgno_t *) 0xbfefbbac
(gdb) p bhp->pgno
Cannot access memory at address 0x5704dbfa

Trying to access memory which was already freed?

Version-Release number of selected component (if applicable):
rpm-4.4.2-32

How reproducible:
The next attempt of --rebuilddb succeeded, but I tried that because I got a segfault from yum during an installation, and maybe this was really an rpm fault?
The segfault is likely the result of bad data, which is likely corrected by --rebuilddb.
> The segfault is likely the result of bad data ... These "bad data" were produced by nothing else but rpm and an attempt to correct that resulted in a segfault. Luckily the condition did not persist.
BTW - the segfault in 'yum update' mentioned in the report is now bug 215184. Not much information there, I am afraid, beyond the nasty result. It happened when all new packages were already installed and yum was supposed to do all the cleanups, so it left me with a pile of duplicates.
rpm (and Berkeley DB) relies on shared POSIX mutexes for locking to ensure data integrity. There's a rash of recent rpmdb problems, dunno the cause .... blame rpm, which has not changed for over a year, certainly not the rpmdb code. YMMV.

A --dupes option can be added to rpm with this line in /etc/popt:

rpm alias --dupes --qf '%|SOURCERPM?{%{name}.%{arch}}:{%|ARCH?{%{name}}:{%{name}-%{version}}|}| \n' --pipe "sort | uniq -d" \
    --POPTdesc=$"list duplicated packages"

Invoke as rpm -qa --dupes.
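
For anyone who would rather not edit /etc/popt, the `sort | uniq -d` step of that alias can be sketched in Python. This is only a rough equivalent under stated assumptions: the helper names `duplicated` and `installed_name_arch` are made up for illustration, and the query format is simplified to `%{name}.%{arch}`, so it does not reproduce the SOURCERPM/ARCH conditionals of the alias (entries without an arch may therefore show up differently).

```python
import subprocess
from collections import Counter


def duplicated(lines):
    """Return every entry that occurs more than once, sorted.

    Roughly what `sort | uniq -d` does: one line per duplicated entry.
    Blank lines and surrounding whitespace are ignored.
    """
    counts = Counter(line.strip() for line in lines if line.strip())
    return sorted(entry for entry, n in counts.items() if n > 1)


def installed_name_arch():
    """Ask rpm for name.arch of every installed package.

    Assumes an `rpm` binary on PATH; the simplified query format is an
    assumption of this sketch, not the format used by the popt alias.
    """
    result = subprocess.run(
        ["rpm", "-qa", "--qf", "%{name}.%{arch}\\n"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.splitlines()
```

On a machine with rpm installed, `print("\n".join(duplicated(installed_name_arch())))` would list the candidates; packages that are legitimately installed in several versions (kernels, for instance) will appear too.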
BTW, doing rm -f /var/lib/rpm/__db* before --rebuilddb --verbose would have eliminated a corrupt cache.
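
A minimal sketch of that `rm -f /var/lib/rpm/__db*` step in Python, for anyone who wants to see exactly which files go away. The function name `clear_db_cache` is hypothetical, and the sketch assumes no other rpm or yum process is using the database while it runs. It touches only the `__db*` environment files, never Packages or the indexes, and it does not run --rebuilddb itself.

```python
from pathlib import Path


def clear_db_cache(dbpath="/var/lib/rpm"):
    """Delete the Berkeley DB environment files (__db*) under dbpath.

    Returns the names of the files that were removed, sorted.
    Only regular files whose names start with "__db" are touched.
    """
    removed = []
    for entry in sorted(Path(dbpath).glob("__db*")):
        if entry.is_file():
            entry.unlink()
            removed.append(entry.name)
    return removed
```

Run as root, `clear_db_cache()` followed by `rpm --rebuilddb --verbose` from the shell corresponds to the sequence suggested above. Copying /var/lib/rpm somewhere safe first costs little.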
In common with a few users, it seems, I'm finding rpm and yum very unstable under FC6. Just now:

# rpm -ivh /home/imc/rpmbuild/RPMS/i386/xli-1.17.0-6.fc6.i386.rpm
Preparing... Segmentation fault (core dumped)

But where's my core file?

# ls -l core
ls: core: No such file or directory
# ulimit -c unlimited
> But where's my core file?

If you have 'ulimit -c' set to 'unlimited' then your core file will really have a name like core.<process_id>, so try 'ls -l core*'. Also, the process which dumped core may be a child running in a different directory, so the core may be somewhere else (maybe /?). To look for all possible core files try, with a current updatedb, the following:

locate -r '/core\.[1-9]'

This may have a few wrong hits but not too many.
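
If the locate database is stale, the same search can be sketched in Python. This is an illustration only: `core_limit` and `find_cores` are made-up names, and the pattern (a file called `core` or `core.<digits>`) is an assumption about how cores are named on the machine in question.

```python
import os
import re
import resource

# Matches "core" and "core.<pid>", but not e.g. "core.txt" or "corefile".
CORE_NAME = re.compile(r"^core(\.\d+)?$")


def core_limit():
    """Return the soft core-size limit of this process, or "unlimited"."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    return "unlimited" if soft == resource.RLIM_INFINITY else soft


def find_cores(root):
    """Walk the tree under root and return sorted paths that look like core files."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if CORE_NAME.match(name):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

Something like `find_cores("/")` run as root covers the "maybe /?" case, at the cost of walking the whole filesystem. As with locate, a few wrong hits are possible.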
If you give me a ptr to a core using -ivv and the packages involved, I'll diagnose the segfault. Be forewarned: almost all segfaults in rpm are caused by bad data.
Segfaults and loss of data are likely due to removing an rpmdb environment without correcting other problems in the rpmdb.

FYI: Most rpmdb "hangs" are now definitely fixed by purging stale read locks when opening a database environment in rpm-4.4.8-0.4. There's more to do, but I'm quite sure that a large class of problems with symptoms of "hang" are now corrected. Detecting damage by verifying when needed is well automated in rpm-4.4.8-0.4. Automatically correcting all possible damage is going to take more work, but a large class of problems is likely already fixed in rpm-4.4.8-0.8 as well.

UPSTREAM
*** This bug has been marked as a duplicate of 213963 ***