Bug 1484139

Summary: libdb DB_VERSION_MISMATCH errors in rawhide
Product: [Fedora] Fedora Reporter: Dusty Mabe <dustymabe>
Component: rpm Assignee: Packaging Maintenance Team <packaging-team-maint>
Status: CLOSED DUPLICATE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 27 CC: ignatenko, kardos.lubos, mjw, packaging-team-maint, pmatilai, slaznick, vmukhame
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-08-23 13:28:22 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
docker dnf update log none
lorax runroot log none

Description Dusty Mabe 2017-08-22 19:46:46 UTC
Description of problem:

I see libdb errors from the latest runs of lorax for atomic host installer in rawhide: 

```
command output:
error: db5 error(-30969) from dbenv->open: BDB0091 DB_VERSION_MISMATCH: Database environment version mismatch
error: cannot open Packages index using db5 -  (-30969)
error: cannot open Packages database in /var/lib/rpm
error: db5 error(-30969) from dbenv->open: BDB0091 DB_VERSION_MISMATCH: Database environment version mismatch
error: cannot open Packages index using db5 -  (-30969)
error: cannot open Packages database in /var/lib/rpm
```

This is causing lorax runs to fail for fedora atomic host: https://koji.fedoraproject.org/koji/taskinfo?taskID=21404991 (see attachment for runroot.log)



You can also get DB errors when running dnf update from the latest rawhide docker container (see attachment):

```

BDB1539 Build signature doesn't match environment
failed loading RPMDB
```

Version-Release number of selected component (if applicable):

rpm-4.13.90-0.git14002.6.fc28.x86_64



How reproducible:

I'll monitor the lorax runs from atomic host. If you run dnf in a rawhide docker container you can see them today:


Steps to Reproduce:
1. docker run --rm -it registry.fedoraproject.org/fedora:rawhide /bin/bash
2. dnf update glibc --nogpgcheck
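
For monitoring, a log can be scanned for the error signatures quoted above. This is a minimal sketch and not part of the original report: the function name `has_rpmdb_error` is hypothetical, and the patterns are simply the three strings seen in the output above.

```shell
# Hypothetical helper: read a log on stdin and succeed (exit 0) if it
# contains one of the rpmdb error signatures quoted in this report.
has_rpmdb_error() {
    grep -E -q 'BDB0091 DB_VERSION_MISMATCH|BDB1539 Build signature|failed loading RPMDB'
}
```

For example, the reproducer could be run non-interactively (assuming dnf's `-y` flag to skip the confirmation prompt) as `docker run --rm registry.fedoraproject.org/fedora:rawhide dnf update -y glibc --nogpgcheck 2>&1 | has_rpmdb_error && echo reproduced`.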

Comment 1 Dusty Mabe 2017-08-22 19:47:32 UTC
Created attachment 1316778 [details]
docker dnf update log

Comment 2 Dusty Mabe 2017-08-22 19:48:43 UTC
Created attachment 1316787 [details]
lorax runroot log

Comment 3 Dusty Mabe 2017-08-22 19:49:22 UTC
The rpm version from the docker run was `rpm-4.13.0.1-31.fc27.x86_64`. The rpm version from the lorax run was `rpm-4.13.90-0.git14002.6.fc28.x86_64`.

Comment 4 Panu Matilainen 2017-08-23 07:03:27 UTC
Well yes, you can't mix and match different rpm versions at will.

'rm -f /var/lib/rpm/__db.*' or more generally 'rpm --rebuilddb' is required before accessing the rpmdb with a different rpm version.
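
A minimal sketch of the first option, assuming the default rpmdb location from the error output above. The helper name `clear_rpmdb_env` is hypothetical; it only removes the `__db.*` environment files and leaves `Packages` and the indexes untouched.

```shell
# Hypothetical helper: remove the Berkeley DB environment files (__db.*)
# from an rpmdb directory. Defaults to /var/lib/rpm when no argument is given.
clear_rpmdb_env() {
    dbdir="${1:-/var/lib/rpm}"
    # Only the __db.* files are removed; the Packages database
    # and its indexes are not touched.
    rm -f "$dbdir"/__db.*
}
```

Run as root, `clear_rpmdb_env` with no argument acts on `/var/lib/rpm`; `rpm --rebuilddb` remains the more general alternative.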

Comment 5 Dusty Mabe 2017-08-23 13:14:56 UTC
(In reply to Panu Matilainen from comment #4)
> Well yes, you can't mix and match different rpm versions at will.
> 
> 'rm -f /var/lib/rpm/__db.*' or more generally 'rpm --rebuilddb' is required
> before accessing the rpmdb with a different rpm version.

Hey Panu. These happen to be two completely different cases where we saw the same error. The lorax run is run in Fedora's build system. The docker command can be run on your laptop (I provided that so you could possibly reproduce).

We were not mixing rpm versions on the same system. I was just reporting two different cases and the rpm versions that happened to exist on those systems.

Comment 6 Panu Matilainen 2017-08-23 13:28:22 UTC
Oh okay, misunderstanding on my part then, not having one of my brightest days I guess ;)

...and on a closer look, I see this involves "dnf update glibc" too, which makes it a dupe of bug 1465809.

*** This bug has been marked as a duplicate of bug 1465809 ***

Comment 7 Dusty Mabe 2017-08-23 14:32:56 UTC
This is causing lorax failures, which means we don't get install media for ostree. How do we get that fixed?

Comment 8 Panu Matilainen 2017-08-24 05:15:38 UTC
The resolution of bug 1394862 is exposing all sorts of lurkers in the dark corners, so it comes down to analyzing who's breaking the rules and figuring out how to deal with it.

The issue is that when glibc or libdb are updated, all the rpmdb handles that were open before the update started must be closed before you can open any new ones. This includes not just the naughty packages running rpm from inside their scriptlets during install/upgrade/erase, but also any other rpm queries.

Now, I don't know the damnedest thing about lorax, but looking at the log it seems like the latter: runtime-postinstall.tmpl is running an rpm query after the "install" to create the image, but the database environment is busy. My guess is that lorax isn't closing its dnf instance before the post-install phase, or something to that effect.
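
One way to check for a lingering handle, sketched under assumptions: the helper `list_rpmdb_holders` is hypothetical, it relies on Linux `/proc`, and it only sees open file descriptors (not memory mappings), so an empty result does not prove the environment is free.

```shell
# Hypothetical helper: print "PID path" for every open file descriptor
# that points into the given rpmdb directory (default /var/lib/rpm).
# Only descriptors readable by the current user are seen.
list_rpmdb_holders() {
    dbdir="${1:-/var/lib/rpm}"
    for fd in /proc/[0-9]*/fd/*; do
        # Skip descriptors that vanished or that we may not inspect.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            "$dbdir"/*)
                pid=${fd#/proc/}            # e.g. "1234/fd/9"
                printf '%s %s\n' "${pid%%/*}" "$target"
                ;;
        esac
    done
}
```

If this prints a PID between the install and the post-install query, that process would be a candidate for the handle that was never closed.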

Comment 9 Panu Matilainen 2017-08-24 10:58:32 UTC
FWIW, I can't reproduce the docker glibc update failure here, despite it clearly being the same exact image and version (based on digest etc).

Comment 10 Standa Laznicka 2017-09-04 14:07:11 UTC
I was able to reproduce this today; the reproducer is very simple.

What I did:
1. docker run --rm -ti fedora:27 bash
2. dnf upgrade -y glibc

Result:
"""
Fedora 27 - x86_64 - Test Updates                                      168 kB/s | 3.1 MB     00:19    
Fedora 27 - x86_64 - Updates                                           8.2 kB/s | 373  B     00:00    
Fedora 27 - x86_64                                                     8.7 MB/s |  66 MB     00:07    
Last metadata expiration check: 0:00:00 ago on Mon Sep  4 14:04:55 2017.
Dependencies resolved.
=======================================================================================================
 Package                        Arch                Version                  Repository           Size
=======================================================================================================
Upgrading:
 glibc                          x86_64              2.26-6.fc27              fedora              3.6 M
 glibc-common                   x86_64              2.26-6.fc27              fedora              872 k
 glibc-langpack-en              x86_64              2.26-6.fc27              fedora              302 k
 libcrypt-nss                   x86_64              2.26-6.fc27              fedora               64 k

Transaction Summary
=======================================================================================================
Upgrade  4 Packages

Total download size: 4.8 M
Downloading Packages:
(1/4): glibc-langpack-en-2.26-6.fc27.x86_64.rpm                        1.3 MB/s | 302 kB     00:00    
(2/4): libcrypt-nss-2.26-6.fc27.x86_64.rpm                             1.2 MB/s |  64 kB     00:00    
(3/4): glibc-common-2.26-6.fc27.x86_64.rpm                             2.0 MB/s | 872 kB     00:00    
(4/4): glibc-2.26-6.fc27.x86_64.rpm                                    3.9 MB/s | 3.6 MB     00:00    
-------------------------------------------------------------------------------------------------------
Total                                                                  1.7 MB/s | 4.8 MB     00:02     
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                                                               1/1 
  Upgrading        : glibc-common-2.26-6.fc27.x86_64                                               1/8 
  Upgrading        : glibc-langpack-en-2.26-6.fc27.x86_64                                          2/8 
  Running scriptlet: glibc-2.26-6.fc27.x86_64                                                      3/8 
  Upgrading        : glibc-2.26-6.fc27.x86_64                                                      3/8 
  Running scriptlet: glibc-2.26-6.fc27.x86_64                                                      3/8 
  Upgrading        : libcrypt-nss-2.26-6.fc27.x86_64                                               4/8 
  Running scriptlet: libcrypt-nss-2.26-6.fc27.x86_64                                               4/8 
  Cleanup          : libcrypt-nss-2.26-4.fc27.x86_64                                               5/8 
  Running scriptlet: libcrypt-nss-2.26-4.fc27.x86_64                                               5/8 
  Cleanup          : glibc-2.26-4.fc27.x86_64                                                      6/8 
  Running scriptlet: glibc-2.26-4.fc27.x86_64                                                      6/8 
  Cleanup          : glibc-langpack-en-2.26-4.fc27.x86_64                                          7/8 
  Cleanup          : glibc-common-2.26-4.fc27.x86_64                                               8/8 
BDB1539 Build signature doesn't match environment
failed loading RPMDB
The downloaded packages were saved in cache until the next successful transaction.
You can remove cached packages by executing 'dnf clean packages'.
"""

Comment 11 Panu Matilainen 2017-09-04 15:25:19 UTC
And whatever the reason, I still can't reproduce that.

However if you can consistently reproduce it, please try updating rpm first, ie:

1. docker run --rm -ti fedora:27 bash
2. dnf upgrade -y rpm
3. dnf upgrade -y glibc

The rpm update should bring in version 4.13.90-0.git14000.8 which has some band-aid for this issue.

Comment 12 Standa Laznicka 2017-09-07 13:43:58 UTC
I should have added ^ to See Also, thanks.

Btw: your proposed workaround does not fix the issue.