Bug 1519148
| Summary: | rpm crashes with core dump | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Stefano Biagiotti <stefano.biagiotti> | ||||||
| Component: | nss-softokn | Assignee: | Packaging Maintenance Team <packaging-team-maint> | ||||||
| Status: | CLOSED DUPLICATE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||
| Severity: | high | Docs Contact: | |||||||
| Priority: | unspecified | ||||||||
| Version: | 26 | CC: | dueno, elio.maldonado.batiz, ignatenko, kengert, mjw, packaging-team-maint, pmatilai, pmoravco, rrelyea, vmukhame | ||||||
| Target Milestone: | --- | ||||||||
| Target Release: | --- | ||||||||
| Hardware: | i686 | ||||||||
| OS: | Linux | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2017-12-02 10:10:31 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Attachments: |
Created attachment 1360779 [details]
Core dump file
Tar up /var/lib/rpm just in case. You might get past the crash with --nosignature --nodigest options (applicable to all rpm commands). My first guess would be some sort of mismatch with nss packages (something left over from f25 or so) - what does 'rpm -qa --nosignature --nodigest|grep ^nss' say?

Thank you Panu.
# rpm -qa --nosignature --nodigest|grep ^nss
nss-pem-1.0.3-3.fc26.i686
nss-softokn-3.33.0-1.1.fc26.i686
nss-util-3.33.0-1.0.fc26.i686
nss-sysinit-3.33.0-1.0.fc26.i686
nss-3.33.0-1.0.fc26.i686
nss-softokn-freebl-3.33.0-1.1.fc26.i686
nss-tools-3.33.0-1.0.fc26.i686
The --nosignature --nodigest workaround is great, I can use rpm now. Is there something similar for dnf? I tried dnf --nogpgcheck but it still crashes...

Okay so package names seem ok, you might want to run 'rpm -V --nodigest --nosignature' on the above NSS packages to see if the content matches. If NSS is broken then that's likely to crash too, but that's information too. And you can of course try --rebuilddb with those options too (but do back up /var/lib/rpm first). You can also try booting from a rescue image to see if that can access the database. The thing is, the rpm version between f25 and f26 is exactly the same, so the likely cause is something in the upgrade gone bad and we just need to find out what it is, not that this is a regular bug in rpm.

You can achieve --nodigest --nosignature globally for *librpm* with something like
echo '%__vsflags 0xffffff' > /etc/rpm/macros.nonss
...BUT that doesn't stop dnf from using crypto hashes for other purposes (if it's using NSS) so there's no guarantee that helps at all. Also you don't want to install/erase/upgrade anything in this state except to replace broken packages, once identified as such.

The nss* packages seem ok.
# rpm -V --nodigest --nosignature nss-pem
BDB2053 Freeing read locks for locker 0x2e: 8387/3080648448
BDB2053 Freeing read locks for locker 0x2f: 8387/3080648448
BDB2053 Freeing read locks for locker 0x30: 8387/3080648448
# echo $?
0
# rpm -V --nodigest --nosignature nss-softokn
# rpm -V --nodigest --nosignature nss-util
# rpm -V --nodigest --nosignature nss-sysinit
# rpm -V --nodigest --nosignature nss
# rpm -V --nodigest --nosignature nss-softokn-freebl
# rpm -V --nodigest --nosignature nss-tools
The --rebuilddb option doesn't work even with --nodigest --nosignature.
# rpm --nodigest --nosignature --rebuilddb
Illegal instruction (core dumped)
# echo $?
132
I checked whether any extra packages from F25 are still installed, but no luck.
# echo '%__vsflags 0xffffff' > /etc/rpm/macros.nonss
# dnf list extras
Illegal instruction (core dumped)
I finally tried to reboot with vmlinuz-0-rescue-9d1a9a8112444a7aa55e25001fda650d (old kernel 4.8.12 from F25), but rpm doesn't work anyway.
# rpm -q rpm
Illegal instruction (core dumped)

Right, in that case it seems you have genuine rpmdb corruption, time for disaster recovery: http://rpm.org/user_doc/db_recovery.html
If you can attach the corrupted backup here (compress please) or upload it someplace else, I'd be interested in taking a look to see if there's anything to learn from the post-mortem (whether something could be done to detect the corruption etc).

Oh and you'll want to remove that /etc/rpm/macros.nonss before you forget it's there :)

The "RPM Database Recovery" commands are below. Rpm still crashes on --rebuilddb.
# cd /var/lib/rpm
# /usr/lib/rpm/rpmdb_stat -CA
Default locking region information:
75 Last allocated locker ID
0x7fffffff Current maximum unused locker ID
5 Number of lock modes
5 Initial number of locks allocated
0 Initial number of lockers allocated
5 Initial number of lock objects allocated
0 Maximum number of locks possible
0 Maximum number of lockers possible
0 Maximum number of lock objects possible
5 Current number of locks allocated
3 Current number of lockers allocated
5 Current number of lock objects allocated
1 Number of lock object partitions
1031 Size of object hash table
0 Number of current locks
3 Maximum number of locks at any one time
1 Maximum number of locks in any one bucket
0 Maximum number of locks stolen by for an empty partition
0 Maximum number of locks stolen for any one partition
0 Number of current lockers
3 Maximum number of lockers at any one time
0 Number of current lock objects
3 Maximum number of lock objects at any one time
1 Maximum number of lock objects in any one bucket
0 Maximum number of objects stolen by for an empty partition
0 Maximum number of objects stolen for any one partition
282 Total number of locks requested
282 Total number of locks released
0 Total number of locks upgraded
50 Total number of locks downgraded
0 Lock requests not available due to conflicts, for which we waited
0 Lock requests not available due to conflicts, for which we did not wait
0 Number of deadlocks
0 Lock timeout value
0 Number of locks that have timed out
0 Transaction timeout value
0 Number of transactions that have timed out
232KB Region size
0 The number of partition locks that required waiting (0%)
0 The maximum number of times any partition lock was waited for (0%)
0 The number of object queue operations that required waiting (0%)
0 The number of locker allocations that required waiting (0%)
0 The number of region locks that required waiting (0%)
1 Maximum hash bucket length
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Lock REGINFO information:
Environment Region type
1 Region ID
__db.001 Region name
0xb7b0c000 Region address
0xb7b0c068 Region allocation head
0xb7b0c3dc Region primary address
0 Region maximum allocation
0 Region allocated
Region allocations: 33 allocations, 0 failures, 0 frees, 1 longest
Allocations by power-of-two sizes:
1KB 30
2KB 1
4KB 0
8KB 0
16KB 1
32KB 0
64KB 0
128KB 1
256KB 0
512KB 0
1024KB 0
REGION_SHARED Region flags
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Lock region parameters:
2 Lock region region mutex [0/702 0% 22270/3086506688]
131 locker table size
1031 object table size
1484 obj_off
96500 locker_off
0 need_dd
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Lock conflict matrix:
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Locks grouped by lockers:
Locker Mode Count Status ----------------- Object ---------------
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Locks grouped by object:
Locker Mode Count Status ----------------- Object ---------------
# rm __db*
rm: remove regular file '__db.001'? y
rm: remove regular file '__db.002'? y
rm: remove regular file '__db.003'? y
[root@bolzano rpm]# /usr/lib/rpm/rpmdb_verify Packages
BDB5105 Verification of Packages succeeded.
[root@bolzano rpm]# echo $?
0
[root@bolzano rpm]# rpm -qa 1> /dev/null
[root@bolzano rpm]# echo $?
0
[root@bolzano rpm]# rpm -v --rebuilddb
Illegal instruction (core dumped)

Removed the macros.nonss file.

Don't know if this can help, but the only F25 packages left are kernels and hawkey.
# rpm --nodigest --nosignature -qa | grep fc25
kernel-PAE-core-4.12.8-200.fc25.i686
kernel-PAE-4.12.8-200.fc25.i686
kernel-PAE-modules-4.12.8-200.fc25.i686
kernel-PAE-core-4.13.15-100.fc25.i686
kernel-PAE-4.13.15-100.fc25.i686
hawkey-0.6.4-3.fc25.i686
kernel-PAE-modules-4.13.15-100.fc25.i686

Hmm, didn't remember the recovery process only tells you to dump|load in case db_verify returns errors.
Which makes sense I guess, but something you might want to try anyway:
# mv Packages Packages.orig
# /usr/lib/rpm/rpmdb_dump Packages.orig | /usr/lib/rpm/rpmdb_load Packages
# rpmdb --rebuilddb
Other than that, I'll need a copy of your rpmdb (Packages file should be sufficient)

Created attachment 1361487 [details]
/var/lib/rpm/* files
So... there's absolutely nothing wrong with your rpmdb, AFAICT. That is, no segfaults or other mishaps here. So it suggests something locally broken, but in that case running from a rescue image should work. Oh BTW, speaking of that - did you run rpm from the actual rescue image with --root /path/to/sysmount (sorry, don't remember the exact path)? If you chroot into the system image then it's just the same brokenness obviously.

Hardware problems are always a possibility, but e.g. bad RAM would probably manifest in other ways than just rpm signature/digest checking, and the timing with the system upgrade to a different version... Is this a physical system or some virtualized image or such?

I'm a bit out of ideas at the moment, but since rpm -V works you can always try 'rpm -Va --nosignature --nodigest' to see if something comes up.

It is a physical system with an old 32-bit Athlon CPU.
# LANG=en rpm -Va --nosignature --nodigest
.M....... /var/cache/man
missing /usr/lib/systemd/system-preset/85-display-manager.preset
S.5....T. c /etc/postfix/main.cf
S.5....T. c /etc/sysconfig/chronyd
....L.... c /etc/pam.d/fingerprint-auth
....L.... c /etc/pam.d/password-auth
....L.... c /etc/pam.d/postlogin
....L.... c /etc/pam.d/smartcard-auth
....L.... c /etc/pam.d/system-auth
..5....T. /var/lib/selinux/targeted/active/commit_num
S.5....T. /var/lib/selinux/targeted/active/file_contexts
.......T. /var/lib/selinux/targeted/active/homedir_template
S.5....T. /var/lib/selinux/targeted/active/policy.kern
.......T. /var/lib/selinux/targeted/active/seusers
.......T. /var/lib/selinux/targeted/active/users_extra
.M....... /var/log/audit
S.5....T. c /etc/sysconfig/lm_sensors
.......T. c /etc/yum.repos.d/fedora-cisco-openh264.repo
S.5....T. c /etc/yum.repos.d/fedora-updates.repo
S.5....T. c /etc/yum.repos.d/fedora.repo
S.5....T. c /etc/smartmontools/smartd.conf
S.5....T. c /etc/aliases

Here's what the coredump says:
Core was generated by `rpm -q fedora-release'.
Program terminated with signal SIGILL, Illegal instruction.
#0 0xb748cb2a in SHA1_Update (ctx=0x19a7908,
dataIn=0xb7f12320 <rpm_header_magic> "\216\255\350\001", len=8)
at sha_fast.c:101
101 ctx->size += len;
(gdb) bt
#0 0xb748cb2a in SHA1_Update (ctx=0x19a7908,
dataIn=0xb7f12320 <rpm_header_magic> "\216\255\350\001", len=8)
at sha_fast.c:101
#1 0xb762a047 in NSC_DigestUpdate (hSession=2,
pPart=0xb7f12320 <rpm_header_magic> "\216\255\350\001", ulPartLen=8)
at pkcs11c.c:1795
#2 0xb7ab6bdd in PK11_DigestOp (context=0x19a7750,
in=0xb7f12320 <rpm_header_magic> "\216\255\350\001", inLen=8)
at pk11cxt.c:783
#3 0xb7ea399e in rpmDigestUpdate (ctx=ctx@entry=0x1996068,
data=0xb7f12320 <rpm_header_magic>, len=len@entry=8) at digest_nss.c:166
#4 0xb7ee6aa7 in headerSigVerify (buf=0xbfbedc24, dataStart=<optimized out>,
pe=0xbfbedd34, rdl=<optimized out>, ril=<optimized out>,
dl=<optimized out>, il=<optimized out>, vsflags=<optimized out>,
keyring=<optimized out>) at package.c:246
#5 headerVerify (keyring=keyring@entry=0x19a0ca0,
vsflags=vsflags@entry=983040, uh=uh@entry=0x19a6760, uc=<optimized out>,
uc@entry=4024, msg=<optimized out>, msg@entry=0xbfbedd2c) at package.c:380
#6 0xb7ee6fda in headerCheck (ts=0x19a0aa0, uh=0x19a6760, uc=4024,
msg=0xbfbedd2c) at package.c:416
#7 0xb7ed0281 in miVerifyHeader (uhlen=4024, uh=0x19a6760, mi=0x19a5548)
at rpmdb.c:1508
#8 rpmdbNextIterator (mi=mi@entry=0x19a5548) at rpmdb.c:1590
#9 0xb7f028ec in loadKeyringFromDB (ts=0x19a0aa0) at rpmts.c:321
#10 loadKeyring (ts=0x19a0aa0) at rpmts.c:376
#11 0xb7f02587 in rpmtsInitIterator (ts=ts@entry=0x19a0aa0,
rpmtag=rpmtag@entry=2, keyp=keyp@entry=0xbfbef641, keylen=keylen@entry=0)
at rpmts.c:175
#12 0xb7eeb119 in initQueryIterator (ts=ts@entry=0x19a0aa0,
arg=0xbfbef641 "fedora-release", qva=<optimized out>, qva=<optimized out>)
at query.c:479
#13 0xb7eebd6f in rpmcliArgIter (ts=ts@entry=0x19a0aa0,
qva=qva@entry=0xb7f2d5c0 <rpmQVKArgs>, argv=argv@entry=0x19763f8)
at query.c:558
---Type <return> to continue, or q <return> to quit---
#14 0xb7eebf9a in rpmcliQuery (ts=0x19a0aa0, qva=0xb7f2d5c0 <rpmQVKArgs>,
argv=0x19763f8) at query.c:595
#15 0x004ac31d in main (argc=3, argv=0xbfbef044) at rpmqv.c:305
(gdb)
So it's not a segfault or such but an illegal instruction. Which leads to a different trail altogether, sorry for not digging out the backtrace as the first thing.
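For readers following along: a shell reports a process killed by a signal as exit status 128 plus the signal number, so the status 132 seen earlier in this report corresponds to signal 4, which is SIGILL on Linux and matches the "Signal: 4 (ILL)" line from coredumpctl. The following is only a minimal sketch of that decoding, assuming a POSIX-style shell; `describe_exit_status` is a hypothetical helper name and not part of rpm or any other tool mentioned here.

```shell
#!/bin/sh
# describe_exit_status STATUS
# Hypothetical helper: interpret a shell exit status. Statuses above 128
# conventionally mean "terminated by signal (STATUS - 128)".
describe_exit_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        # kill -l NUMBER prints the signal name without the SIG prefix
        echo "killed by SIG$(kill -l $((status - 128)))"
    else
        echo "exited with status $status"
    fi
}

# The status reported after 'rpm --nodigest --nosignature --rebuilddb'
describe_exit_status 132
```

The sketch does not validate its input, so a status whose remainder is not a valid signal number will make `kill -l` print an error instead of a name.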
It seems that NSS requires SSE2 nowadays:
https://bugzilla.mozilla.org/show_bug.cgi?id=1400603
...which an old 32bit system might not have. What does /proc/cpuinfo look like on that system?
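A quick way to answer that question without reading the whole file is to look for the flag on the "flags" line of /proc/cpuinfo. This is a rough sketch, assuming an x86 Linux system where that line exists; `has_cpu_flag` is a hypothetical helper name introduced here for illustration only.

```shell
#!/bin/sh
# has_cpu_flag FLAG [CPUINFO_FILE]
# Hypothetical helper: succeeds if FLAG appears as a whole word on the
# first "flags" line of CPUINFO_FILE (default: /proc/cpuinfo).
has_cpu_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    # Split the flag list into one word per line, then match exactly,
    # so that "sse" does not count as "sse2".
    grep -m1 '^flags' "$file" | cut -d: -f2 | tr ' ' '\n' | grep -qxF "$flag"
}

if has_cpu_flag sse2; then
    echo "sse2: present"
else
    echo "sse2: absent"
fi
```

Matching whole words matters here because a CPU can advertise sse without sse2, and a plain substring search for "sse" would not distinguish the two.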
And sure enough, it seems NSS 3.33 which you have installed does require SSE2, a newish requirement that was then eliminated from 3.34 - from https://developer.mozilla.org/en-US/docs/Mozilla/Projects/NSS/NSS_3.34_release_notes:
> - libfreebl no longer requires SSE2 instructions.
There's a test update for NSS 3.34 available at https://bodhi.fedoraproject.org/updates/FEDORA-2017-552febe596 which I'd expect to sort out the problem. Only you'll need to manually download all the relevant components. Do NOT resort to --nodeps when upgrading those packages!

No SSE2 indeed.
# cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 10
model name : AMD Athlon(TM) XP 2800+
stepping : 0
cpu MHz : 2083.071
cache size : 512 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
apicid : 0
initial apicid : 0
fdiv_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow cpuid 3dnowprefetch vmmcall
bugs : fxsave_leak sysret_ss_attrs
bogomips : 4166.14
clflush size : 32
cache_alignment : 32
address sizes : 34 bits physical, 32 bits virtual
power management: ts

I managed to download the nss*.i686 packages and install them all together without --nodeps. IT WORKS! Thank you so much Panu, you rock!
# LANG=en rpm --nosignature --nodigest -U -v -h *.rpm
Preparing... ################################# [100%]
Updating / installing...
1:nss-util-3.34.0-1.0.fc26 ################################# [ 8%]
2:nss-softokn-freebl-3.34.0-1.0.fc2################################# [ 17%]
3:nss-softokn-3.34.0-1.0.fc26 ################################# [ 25%]
4:nss-sysinit-3.34.0-1.0.fc26 ################################# [ 33%]
5:nss-3.34.0-1.0.fc26 ################################# [ 42%]
6:nss-tools-3.34.0-1.0.fc26 ################################# [ 50%]
Cleaning up / removing...
7:nss-tools-3.33.0-1.0.fc26 ################################# [ 58%]
8:nss-sysinit-3.33.0-1.0.fc26 ################################# [ 67%]
9:nss-3.33.0-1.0.fc26 ################################# [ 75%]
10:nss-softokn-3.33.0-1.1.fc26 ################################# [ 83%]
11:nss-softokn-freebl-3.33.0-1.1.fc2################################# [ 92%]
12:nss-util-3.33.0-1.0.fc26 ################################# [100%]
# rpm -qa | grep ^nss
nss-softokn-freebl-3.34.0-1.0.fc26.i686
nss-pem-1.0.3-3.fc26.i686
nss-sysinit-3.34.0-1.0.fc26.i686
nss-util-3.34.0-1.0.fc26.i686
nss-3.34.0-1.0.fc26.i686
nss-tools-3.34.0-1.0.fc26.i686
nss-softokn-3.34.0-1.0.fc26.i686

*** This bug has been marked as a duplicate of bug 1482798 ***
Description of problem:
Rpm and rpmdb don't work; they crash and produce a core dump.

Version-Release number of selected component (if applicable):
# LANG=en rpm -v
RPM version 4.13.0.2
...

How reproducible:
Every time I try to access the rpm database using rpm or rpmdb.

Actual results:
# LANG=en rpm -q fedora-release
BDB2053 Freeing read locks for locker 0x2374: 26283/3079137600
BDB2053 Freeing read locks for locker 0x2375: 26283/3079137600
BDB2053 Freeing read locks for locker 0x2376: 26283/3079137600
Illegal instruction (core dumped)

Description:
After an upgrade from F25 to F26 (with dnf system-upgrade), I can't use rpm anymore. Dnf and rpmdb don't work either.
# LANG=en rpmdb --rebuilddb
Illegal instruction (core dumped)
I tried to remove the __db.NNN files in /var/lib/rpm, but nothing changed: rpm still crashes and the __db.NNN files are recreated. I hope someone can help me at least with a workaround, because I currently can't use rpm or dnf at all.

# coredumpctl -S 2017-11-30 dump
PID: 27165 (rpm)
UID: 0 (root)
GID: 0 (root)
Signal: 4 (ILL)
Timestamp: Thu 2017-11-30 10:26:47 CET (37s ago)
Command Line: rpm -q fedora-release
Executable: /usr/bin/rpm
Control Group: /user.slice/user-0.slice/session-5.scope
Unit: session-5.scope
Slice: user-0.slice
Session: 5
Owner UID: 0 (root)
Boot ID: a9b8cf8abb434a02b34362534a0937be
Machine ID: 9d1a9a8112444a7aa55e25001fda650d
Hostname: bolzano
Storage: /var/lib/systemd/coredump/core.rpm.0.a9b8cf8abb434a02b34362534a0937be.27165.1512034007000000.lz4
Message: Process 27165 (rpm) of user 0 dumped core.

Stack trace of thread 27165:
#0 0x00000000b748cb2a SHA1_Update (libfreeblpriv3.so)
#1 0x00000000b762a047 NSC_DigestUpdate (libsoftokn3.so)
#2 0x00000000b7ab6bdd PK11_DigestOp (libnss3.so)
#3 0x00000000b7ea399e rpmDigestUpdate (librpmio.so.7)
#4 0x00000000b7ee6aa7 headerVerify (librpm.so.7)
#5 0x00000000b7ee6fda headerCheck (librpm.so.7)
#6 0x00000000b7ed0281 rpmdbNextIterator (librpm.so.7)
#7 0x00000000b7f028ec loadKeyring (librpm.so.7)
#8 0x00000000b7f02587 rpmtsInitIterator (librpm.so.7)
#9 0x00000000b7eeb119 initQueryIterator.isra.0 (librpm.so.7)
#10 0x00000000b7eebd6f rpmcliArgIter (librpm.so.7)
#11 0x00000000b7eebf9a rpmcliQuery (librpm.so.7)
#12 0x00000000004ac31d main (rpm)
#13 0x00000000b789d5b3 __libc_start_main (libc.so.6)
#14 0x00000000004ac669 _start (rpm)
Refusing to dump core to tty (use shell redirection or specify --output).

# coredumpctl -S 2017-11-30 dump
PID: 27256 (rpmdb)
UID: 0 (root)
GID: 0 (root)
Signal: 4 (ILL)
Timestamp: Thu 2017-11-30 10:28:01 CET (3s ago)
Command Line: rpmdb --rebuilddb
Executable: /usr/bin/rpmdb
Control Group: /user.slice/user-0.slice/session-5.scope
Unit: session-5.scope
Slice: user-0.slice
Session: 5
Owner UID: 0 (root)
Boot ID: a9b8cf8abb434a02b34362534a0937be
Machine ID: 9d1a9a8112444a7aa55e25001fda650d
Hostname: bolzano
Storage: /var/lib/systemd/coredump/core.rpmdb.0.a9b8cf8abb434a02b34362534a0937be.27256.1512034081000000.lz4
Message: Process 27256 (rpmdb) of user 0 dumped core.

Stack trace of thread 27256:
#0 0x00000000b7510b2a SHA1_Update (libfreeblpriv3.so)
#1 0x00000000b76ae047 NSC_DigestUpdate (libsoftokn3.so)
#2 0x00000000b7b3abdd PK11_DigestOp (libnss3.so)
#3 0x00000000b7f2799e rpmDigestUpdate (librpmio.so.7)
#4 0x00000000b7f6aaa7 headerVerify (librpm.so.7)
#5 0x00000000b7f6afda headerCheck (librpm.so.7)
#6 0x00000000b7f54281 rpmdbNextIterator (librpm.so.7)
#7 0x00000000b7f564a8 rpmdbRebuild (librpm.so.7)
#8 0x00000000b7f879c6 rpmtsRebuildDB (librpm.so.7)
#9 0x00000000004c1080 main (rpmdb)
#10 0x00000000b79215b3 __libc_start_main (libc.so.6)
#11 0x00000000004c1125 _start (rpmdb)
Refusing to dump core to tty (use shell redirection or specify --output).