Bug 2084334
| Summary: | certmonger startup very slow using default NSS sqlite database backend [rhel-8.7.0] | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Tilman Kranz <kranz> | ||||
| Component: | nss | Assignee: | Bob Relyea <rrelyea> | ||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | BaseOS QE Security Team <qe-baseos-security> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | medium | ||||||
| Version: | 8.5 | CC: | cllang, rcritten, rrelyea, ssorce | ||||
| Target Milestone: | rc | Keywords: | Triaged, ZStream | ||||
| Target Release: | --- | Flags: | pm-rhel: mirror+ | ||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | nss-3.79.0-7.el8_6 | Doc Type: | Bug Fix | ||||
| Doc Text: |
Cause:
When upgrading dbm databases that hold many certificates with private keys, the resulting sqlite database becomes extremely slow to access. This is because the sqlite db will contain extra, unnecessary Trust objects for these certs.
Consequence:
Accessing the resulting sqlite database becomes extremely slow
Fix:
1) this patch speeds up accessing trust objects that don't affect the actual trust values.
2) fixes dbm so that it no longer creates the extra trust objects for certs that have private keys.
Result:
Access to these sqlite databases is now faster. Customers can get even faster results by re-upgrading the databases from the original dbm after the patch has been applied.
|
Story Points: | --- | ||||
| Clone Of: | |||||||
| : | 2097811 2097816 2097900 | Environment: | |||||
| Last Closed: | 2023-06-05 16:45:32 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | |||||||
| Bug Blocks: | 2097811, 2097816, 2097900 | ||||||
| Attachments: |
|
||||||
|
Description
Tilman Kranz
2022-05-11 21:56:16 UTC
The issue seems to be with a database that is migrated from dbm to sqlite.

The attached script generates a self-signed CA and 100 server certificates. To run it:

$ mkdir /tmp/nssdb
$ bash gencert dbm:/tmp/nssdb
$ echo httptest > /tmp/nssdb/passwd

Listing all the keys takes less than a second:

$ time certutil -K -d dbm:/tmp/nssdb -f /tmp/nssdb/passwd
real 0m0.559s
user 0m0.444s
sys 0m0.086s

Upgrade it to sqlite:

$ certutil -d sql:/tmp/nssdb/ -N -f /tmp/nssdb/passwd -@ /tmp/nssdb/passwd

Same listing of keys:

$ time certutil -K -d dbm:/tmp/nssdb -f /tmp/nssdb/passwd
real 0m46.905s
user 0m45.400s
sys 0m0.177s

Now if we create the database directly as sqlite the timing is more in line with dbm:

$ mkdir /tmp/nssdb2
$ bash gencert sql:/tmp/nssdb2
$ echo httptest > /tmp/nssdb2/passwd

And list the keys:

$ time certutil -K -d sql:/tmp/nssdb2 -f /tmp/nssdb2/passwd
real 0m0.742s
user 0m0.581s
sys 0m0.032s

Also worth mentioning that generating the sqlite database using gencert takes significantly longer than the dbm database. It's plausible that entropy on this VM is simply exhausted.

Reproduced with nss-3.67.0-6.el8_4.x86_64

Created attachment 1878803
Script to generate a CA and 100 server certificates
Changing component to nss for review.

This is almost certainly caused by the cache thrashing bug introduced when we added integrity to AES. The issue is that the key for the decrypt and the key for the integrity check are different, and they would throw each other out of the cache, so you ended up doing the PBE for every key. (The issue is seen with databases with large numbers of private keys.)

Does this happen on RHEL-9? If not, it should be fixed in the next NSS rebase next month. Rob, can you verify that this does not happen on fedora? (I also think RHEL-9 has the appropriate patches as well.)

bob

dbm isn't allowed in Fedora since, I think, Fedora 32 or 33.

$ certutil -N -d dbm:/tmp/nssdb
certutil: function failed: SEC_ERROR_LEGACY_DATABASE: The certificate/key database is in an old, unsupported format.

Oh, I thought that it was just that the database on sqlite was being slow. Hmmm. If you copy the dbm-upgraded database to rhel-9 or fedora, is it still slow? There is an upstream bug that was fixed where, if you have 100 or so keys, sqlite was really slow listing them. The fix for this is not in RHEL-8. I wonder why we aren't tripping over this when you create the database in sqlite?

bob

It takes about 4s to list the 100 keys from the same database using nss-3.71.0-1.fc33.x86_64

$ time certutil -K -d sql:/tmp/nssdb/ -f /tmp/nssdb/passwd
real 0m4.155s
user 0m4.102s
sys 0m0.031s

Thanks Rob. That looks like there may be multiple issues, the main one being cache thrashing.

Slight error in comment 1:

> Upgrade it to sqlite:
>
> $ certutil -d sql:/tmp/nssdb/ -N -f /tmp/nssdb/passwd -@ /tmp/nssdb/passwd
>
> Same listing of keys:
>
> $ time certutil -K -d dbm:/tmp/nssdb -f /tmp/nssdb/passwd

This last line should be:

$ time certutil -K -d sql:/tmp/nssdb -f /tmp/nssdb/passwd

> Also worth mentioning that generating the sqlite database using gencert takes significantly longer than the dbm database.
> It's plausible that entropy on this VM is simply exhausted.
No, keygen against the sql database definitely takes longer; I can see that in both the rhel-8 certutil and my current upstream certutil.
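
The cache thrashing suspected earlier in this thread can be illustrated with a toy model. This is only a sketch and not NSS code: the class `DerivedKeyCache`, the `purpose` labels, the PBKDF2 parameters, and the use of the label as a salt are all assumptions made for illustration. It shows how a one-entry cache is evicted on every lookup when two different derived keys (one for decrypt, one for the integrity check) are requested alternately, so the password-based derivation runs for every key.

```python
from collections import OrderedDict
import hashlib


class DerivedKeyCache:
    """Toy LRU cache of password-derived keys (hypothetical, not NSS code)."""

    def __init__(self, slots: int):
        self.slots = slots
        self._entries = OrderedDict()  # purpose -> derived key
        self.derivations = 0           # counts the expensive PBE-like calls

    def derive(self, password: bytes, purpose: bytes) -> bytes:
        if purpose in self._entries:
            # Cache hit: refresh recency and skip the expensive derivation.
            self._entries.move_to_end(purpose)
            return self._entries[purpose]
        # Cache miss: run the expensive derivation and remember the result.
        self.derivations += 1
        key = hashlib.pbkdf2_hmac("sha256", password, purpose, 1000)
        self._entries[purpose] = key
        if len(self._entries) > self.slots:
            self._entries.popitem(last=False)  # evict least recently used
        return key


def count_derivations(num_keys: int, slots: int) -> int:
    """Alternate decrypt/integrity lookups once per private key."""
    cache = DerivedKeyCache(slots)
    for _ in range(num_keys):
        cache.derive(b"httptest", b"decrypt")
        cache.derive(b"httptest", b"integrity")
    return cache.derivations


if __name__ == "__main__":
    print("one slot: ", count_derivations(100, slots=1))
    print("two slots:", count_derivations(100, slots=2))
```

In this model a single slot forces a derivation on every lookup, while two slots need only one derivation per distinct key; the real NSS cache layout may differ.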
So the issue is that the CERTDB_USER bit in the trust fools the legacydb (dbm) into presenting trust objects that are actually empty trust objects. Since NSS checks the integrity of trust objects if you've logged in (which you have to do to display the keys), it takes quite some time to display each cert.

There are two fixes:
1) We can skip the integrity check if the value we are checking is the value we would default to if there wasn't any trust value (which is what you get when the integrity check fails). This speeds up the listing of the databases with these dead trust values by about 10x.
2) Fix dbm to correctly skip cert trust objects with the CERTDB_USER bit and nothing else. This will fix the case that created the bad databases, but won't fix the displaying of the bad databases.

NSS 3.79 shipped today, so it won't be upstreamed in time to patch this there. We'll carry the patch until the next release of NSS.

RHEL 8.7 contains nss-3.79.0-10.el8_6. |
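
The reasoning behind the first fix above can be sketched in a few lines. This is a hypothetical model, not the actual NSS patch: `DEFAULT_TRUST`, `sign`, `read_trust`, and the HMAC-based check are stand-ins chosen for illustration. The idea is that when a failed integrity check would fall back to the default trust value anyway, a stored value equal to that default gives the same answer whether the check passes or fails, so the expensive check can be skipped.

```python
import hashlib
import hmac

# Hypothetical fallback used when no trust is stored or the check fails.
DEFAULT_TRUST = 0


def sign(key: bytes, value: int) -> bytes:
    """Stand-in for the integrity signature over a trust value."""
    return hmac.new(key, value.to_bytes(4, "big"), hashlib.sha256).digest()


def read_trust(value: int, mac: bytes, key: bytes, stats: dict) -> int:
    """Return the effective trust value, verifying only when it matters."""
    if value == DEFAULT_TRUST:
        # Pass or fail, the result would be DEFAULT_TRUST: skip the check.
        stats["skipped"] += 1
        return DEFAULT_TRUST
    stats["checked"] += 1
    if hmac.compare_digest(sign(key, value), mac):
        return value
    return DEFAULT_TRUST  # tampered or unsigned value: fall back


if __name__ == "__main__":
    key = b"illustrative-key"
    stats = {"skipped": 0, "checked": 0}
    # 100 "dead" trust objects, as left behind by the dbm upgrade.
    for _ in range(100):
        read_trust(DEFAULT_TRUST, b"", key, stats)
    print(stats)
```

Non-default values still go through the check in this sketch, so a tampered trust value continues to fall back to the default.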