From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; 0.8.1) Gecko/20010323

I upgraded a stock redhat 6.2 box with the 6.2 updates from redhat, including glibc 2.1.3-22 and rpm 4.0.2-6x. after the upgrade this behavior started. the box is a dell poweredge 1400, 2-processor pIII-933, running 2.2.16-3smp.

Reproducible: Always

Steps to Reproduce:
1. repeatedly perform an rpm operation that changes the database:
   rpm -e rpmlint (behavior below)
2. try rebuilding the database:
   rpm --rebuilddb (segfaults only)

Actual Results:
The following behavior happens on *every other run*: it complains

error: cannot open Depends index using db1 - File exists (17)

sometimes i also got

error: cannot open Depends index using db1 - Invalid argument (22)

then, *every time*, it segfaults.

here is an ls -l of /var/lib/rpm:

[root@fracas /tmp]# ls -l /var/lib/rpm
total 30723
-rw-r--r-- 1 root root    16384 Apr 8 12:43 conflictsindex.rpm
-rw-r--r-- 1 root root 10489856 Apr 8 12:43 fileindex.rpm
-rw-r--r-- 1 root root    16384 Apr 8 12:43 groupindex.rpm
-rw-r--r-- 1 root root    36864 Apr 8 12:43 nameindex.rpm
-rw-r--r-- 1 root root 17843592 Apr 8 12:43 packages.rpm
-rw-r--r-- 1 root root    94208 Apr 8 12:43 providesindex.rpm
-rw-r--r-- 1 root root  5902336 Apr 8 12:43 requiredby.rpm
-rw-r--r-- 1 root root    16384 Apr 8 12:43 triggerindex.rpm

here is a strace:

SYS_gettimeofday(0xbffff0e0, 0, 1, 0x081a7278, 8) = 0
SYS__newselect(4, 0xbffff06c, 0, 0, 0xbffff064) = 1
SYS_gettimeofday(0x081a7330, 0, 1, 0x081a7278, 0x30000000) = 0
SYS_read(3, "", 24740) = 24740
SYS_gettimeofday(0xbffff0e0, 0, 1, 0x081a7278, 24740) = 0
SYS_stat("/usr/doc/qmail-1.03+patches", 0xbfffe16c) = 0
SYS_gettimeofday(0x081a7330, 0, 0x081a7278, 0x081a7278, 0x00daafd8) = 0
SYS_lseek(3, 0x00daafd8, 0, 0x081a7278, 0x00daafd8) = 0x00daafd8
SYS_gettimeofday(0xbffff188, 0, 0x081a7278, 0x081a7278, 0x00daafd8) = 0
SYS__newselect(4, 0xbffff06c, 0, 0, 0xbffff064) = 1
SYS_gettimeofday(0x081a7330, 0, 1, 0x081a7278, 0x081a22f8) = 0
SYS_read(3, "", 8) = 8
SYS_gettimeofday(0xbffff0e0, 0, 1, 0x081a7278, 8) = 0
SYS_mmap(0xbffff108, 136144, 139264, 0x08187d40, 136144) = 0x40136000
SYS__newselect(4, 0xbffff06c, 0, 0, 0xbffff064) = 1
SYS_gettimeofday(0x081a7330, 0, 1, 0x081a7278, 0x36000000) = 0
SYS_read(3, "", 136128) = 136128
SYS_gettimeofday(0xbffff0e0, 0, 1, 0x081a7278, 136128) = 0
SYS_mmap(0xbffff0a0, 136144, 139264, 0x08188040, 136144) = 0x40158000
SYS_munmap(0x40136000, 139264) = 0
SYS_mmap(0xbffff214, 136144, 139264, 0x08187fb0, 136144) = 0x40136000
SYS_munmap(0x40158000, 139264) = 0
SYS_stat("/usr/doc/gnome-core-1.2.4", 0xbfffe170) = 0
SYS_gettimeofday(0x081a7330, 0, 0x081a7278, 0x081a7278, 0x00dc6498) = 0
SYS_lseek(3, 0x00dc6498, 0, 0x081a7278, 0x00dc6498) = 0x00dc6498
SYS_gettimeofday(0xbffff188, 0, 0x081a7278, 0x081a7278, 0x00dc6498) = 0
SYS__newselect(4, 0xbffff06c, 0, 0, 0xbffff064) = 1
SYS_gettimeofday(0x081a7330, 0, 1, 0x081a7278, 0x081a22f8) = 0
SYS_read(3, "", 8) = 8
SYS_gettimeofday(0xbffff0e0, 0, 1, 0x081a7278, 8) = 0
--- SIGSEGV (Segmentation fault) ---
+++ killed by SIGSEGV +++

i also tried doing rpm --rebuilddb, but that segfaults after grinding for a while, apparently after leaving partial results in /var/rpmrebuilddb.<n>:

[root@fracas lib]# ls -l rpmrebuilddb.1187/
total 7367
-rw-r--r-- 1 root root 1343488 Apr 8 12:00 Basenames
-rw-r--r-- 1 root root   12288 Apr 8 12:00 Conflictname
-rw-r--r-- 1 root root   12288 Apr 8 12:00 Group
-rw-r--r-- 1 root root   24576 Apr 8 12:00 Name
-rw-r--r-- 1 root root 6152192 Apr 8 12:00 Packages
-rw-r--r-- 1 root root   45056 Apr 8 12:00 Providename
-rw-r--r-- 1 root root   98304 Apr 8 12:00 Requirename
-rw-r--r-- 1 root root   12288 Apr 8 12:00 Triggername
[root@fracas lib]#
i managed to get a backtrace of the segfault when doing rpm -e rpmlint. i could also provide a core dump, if you want that. a brief inspection reveals that version and release are both NULL, thus the segfault. further probing wasn't too revealing...

m.

[root@fracas rpm-4.0.2]# ./rpm --rcfile /usr/lib/rpm/rpmrc -e rpmlint
error: cannot open Depends index using db1 - File exists (17)
Segmentation fault (core dumped)
[root@fracas rpm-4.0.2]# gdb ./rpm /core
Core was generated by `./rpm --rcfile /usr/lib/rpm/rpmrc -e rpmlint'.
Program terminated with signal 11, Segmentation fault.
#0 0x806430f in providePackageNVR (h=0x82e71c8) at misc.c:787
787 pEVR = p = alloca(21 + strlen(version) + 1 + strlen(release) + 1);
(gdb) backtrace
#0 0x806430f in providePackageNVR (h=0x82e71c8) at misc.c:787
#1 0x807ab99 in doGetRecord (pkgs=0x819e588, offset=14443672) at db1.c:145
#2 0x807ae07 in db1cget (dbi=0x8199610, dbcursor=0xffffffff, keyp=0xbffff2d0, keylen=0xbffff2d4, datap=0xbffff2d8, datalen=0xbffff2dc, flags=0) at db1.c:258
#3 0x8068557 in dbiGet (dbi=0x8199610, dbcursor=0xffffffff, keypp=0xbffff2d0, keylenp=0xbffff2d4, datapp=0xbffff2d8, datalenp=0xbffff2dc, flags=0) at rpmdb.c:191
#4 0x806a3ea in XrpmdbNextIterator (mi=0x81d0910, f=0x815e662 "rpmdb.c", l=2009) at rpmdb.c:1370
#5 0x806b96a in rpmdbFindFpList (rpmdb=0x8199588, fpList=0x81c6a20, matchList=0x81d0870, numItems=38) at rpmdb.c:2062
#6 0x8076aa0 in rpmRunTransactions (ts=0x819e680, notify=0, notifyData=0x0, okProbs=0x0, newProbs=0xbffff8ac, transFlags=0, ignoreSet=RPMPROB_FILTER_NONE) at transaction.c:1658
#7 0x806e078 in rpmErase (rootdir=0x81588ac "/", argv=0x8194ac0, transFlags=0, interfaceFlags=0) at rpminstall.c:592
#8 0x804d9f3 in main (argc=5, argv=0xbffff9f4) at rpm.c:1089
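For readers following the backtrace above: frame #0 calls strlen() on version and release, and strlen(NULL) is undefined behavior that typically crashes just like this. The following is only a minimal sketch of the kind of guard that avoids it, not the actual change in rpm CVS. The function name build_evr is hypothetical, and treating the 21 extra bytes as room for an optional numeric epoch prefix is an assumption; the source only shows the size expression.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of rpm: build "[epoch:]version-release".
 * Returns NULL instead of crashing when version or release is missing,
 * which is the state of the damaged header in the backtrace.
 * The caller owns the returned string and must free() it.
 */
char *build_evr(const int *epoch, const char *version, const char *release)
{
    /* Damaged header: refuse to build anything so the caller can skip it. */
    if (version == NULL || release == NULL)
        return NULL;

    /* Same size expression as misc.c:787; the 21 bytes are assumed to
     * cover a decimal epoch plus the ':' separator. */
    size_t len = 21 + strlen(version) + 1 + strlen(release) + 1;
    char *evr = malloc(len);
    if (evr == NULL)
        return NULL;

    if (epoch != NULL)
        snprintf(evr, len, "%d:%s-%s", *epoch, version, release);
    else
        snprintf(evr, len, "%s-%s", version, release);
    return evr;
}
```

A caller would check the return value and skip the record when it is NULL, much as rpm 3 reports a bad record and skips it. The sketch uses malloc rather than alloca so that the result can outlive the function.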
Your database probably had problems before you upgraded to rpm-4.0.2, and the db1 handling of damaged data in rpm-4.0.2 is not as good as in rpm-3.0.x.

Work around by doing a rebuilddb with rpm-3.0.5:

1) get a copy of the rpm-3.0.5 packages

2) unpack in a temp directory
   mkdir -p /var/tmp/xxx
   cd /var/tmp/xxx
   rpm2cpio rpm-3.0.5-*.rpm | cpio -dim

3) rebuild to eliminate the damaged record using rpm-3.0.5
   cd /var/tmp/xxx
   ./bin/rpm --rebuilddb

4) rebuild to convert from db1 to db3 using rpm-4.0.2
   /bin/rpm --rebuilddb

Does that fix your problem?
Unfortunately, not. I rebuilt the DB with rpm 3.0.6 and it skipped a record that was giving problems, but rebuilding with rpm 4 still segfaults in the same place. do i need some way of telling rpm 4 to rebuild from the rpm 3 indexes?

[root@fracas bin]# ./rpm --rebuilddb
record number 14443672 in database is bad -- skipping.
[root@fracas bin]# /bin/rpm --rebuilddb
Segmentation fault
OK, you still have damaged records in your database, but the damage is to the header itself, not the chain. Deleting the record (by erasing the package) should fix it.

No, you don't have to do anything special with the indexes (i.e. everything in /var/lib/rpm except Packages or packages.rpm), as the damage is in packages.rpm.

You can detect the damaged record by

1) Run "rpm --rebuilddb -vv" using rpm-4.0.2, and note the last package before the segfault.

2) Run "rpm --rebuilddb -vv" using rpm-3.0.6, and note the package with the damage. Erase that package.

3) Rebuild to convert from db1 to db3 using rpm-4.0.2
   /bin/rpm --rebuilddb

Note: You might want to clean up the /var/lib/rpmrebuild* directories left after the segfaults.

FWIW, the patch to rpm-4.0.2 to avoid the segfault is pretty trivial and is already in rpm CVS on the rpm-4_0 branch, if you want it.
so i followed the instructions from the last note -

1) i rebuilt with rpm 4.0.2 and noted the last package mentioned before the segfault.

2) i then rebuilt with rpm 3.0.6 - it generated no errors.

3) i then removed the package from 1) and tried to rebuild with 4.0.2 again, but it still segfaulted.

i repeated this 2-3 times, but rpm 4 still segfaults when rebuilding the DB.

so, if there are no issues with this, i'm planning on downgrading to rpm 3 and using it, rather than using a cvs snapshot version of rpm 4. will using rpm 3 cause any compatibility issues and/or further database mangling?
If you don't have a "fix" yet, reopen this bug with a pointer to a tarball of your database

   cd /var/lib
   tar czvf /tmp/rpmdb.tar.gz rpm

and I'll see what's up.