Bug 2106452
| Summary: | softhsm2: Unable to create cert: Private key not found | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Rob Crittenden <rcritten> |
| Component: | pki-core | Assignee: | Endi Sukma Dewata <edewata> |
| Status: | CLOSED ERRATA | QA Contact: | idm-cs-qe-bugs |
| Severity: | unspecified | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 9.1 | CC: | ckelley, edewata, gkimetto, jmagne, mrhodes, pcech, skhandel |
| Target Milestone: | rc | Keywords: | Triaged |
| Target Release: | 9.2 | Flags: | pm-rhel: mirror+ |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | pki-core-11.3.0-0.2.beta1.el9 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 2184506 | Environment: | |
| Last Closed: | 2023-05-09 07:43:41 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 2184506 | | |
| Attachments: | | | |

Description
Rob Crittenden
2022-07-12 17:30:45 UTC
Just FYI, this is mainly caused by a permission problem. Initially pkispawn creates the key as root, so the key file in SoftHSM is owned by root too. Later pkispawn tries to find the key again as pkiuser, so it cannot find it. This problem only affects SoftHSM; a hardware-based HSM should not have this problem.

Yes, permissions are the problem. New files within the token are created as root:root rather than pkiuser:pkiuser. https://www.dogtagpki.org/wiki/SoftHSM indicates that this worked with self-signed CAs at one time. IPA developers in the recent past were able to get past this point, as was another IPA user: https://magnus-k-karlsson.blogspot.com/2019/08/installing-dogtag-on-fedora-30-with.html

After adding a chown -R of the softhsm token directory after each certutil -A (more or less), the installation proceeds. I'm prompted several times for the token password to import a certificate into the token database during the pkispawn. Deployment fails because the Server-Cert private key is not in the token database. I haven't yet investigated this, though the blog user reported that he had to force Server-Cert to remain in the NSS database.

Yes, this used to work in the past, but it has been broken since 2019, likely due to changes in the installation process. I'm not sure whether SoftHSM is officially supported, but for sure it has not been officially tested in RHEL, so it never got fixed. The workaround is probably to use a two-step installation process and fix the permissions manually, or to add SoftHSM-specific code into pkispawn to fix the permissions. The proper fix is probably to execute the commands that create the keys and certs in pkispawn as pkiuser so the SoftHSM files will get created with the right permissions.

I've managed to hack nssdb.py enough so that the certs and keys are generated and installed into the token and permissions are fine. pkispawn succeeds and the CA is able to start.
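For anyone applying the manual permission workaround discussed above, a minimal Python sketch of the check and fix might look like the following. It is not the code that went into pkispawn or nssdb.py; the function names `find_wrong_owner` and `fix_owner` are hypothetical, and the token directory has to be supplied by the caller.

```python
import os


def find_wrong_owner(token_dir, uid, gid):
    """Return every path under token_dir (inclusive) not owned by uid:gid."""
    paths = [token_dir]
    for root, dirs, files in os.walk(token_dir):
        paths.extend(os.path.join(root, name) for name in dirs + files)
    mismatched = []
    for path in paths:
        st = os.lstat(path)  # lstat: report on symlinks themselves
        if (st.st_uid, st.st_gid) != (uid, gid):
            mismatched.append(path)
    return mismatched


def fix_owner(token_dir, uid, gid):
    """Equivalent of 'chown -R uid:gid token_dir'; normally requires root."""
    for path in find_wrong_owner(token_dir, uid, gid):
        os.chown(path, uid, gid, follow_symlinks=False)
```

To use it, look up the pkiuser uid and gid (for example with `pwd.getpwnam("pkiuser")`) and pass the SoftHSM token directory, which is whatever `directories.tokendir` points to in softhsm2.conf. Running `find_wrong_owner` after each import step shows which token files were created as root:root.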
All the certs/keys are stored in the token which is something that wasn't working in 2019 so there's some progress.
Port 8080 works but the TLS port throws a trace in the journal. Seems it can't apply the cert and/or key to a connection.
SEVERE: Error running socket processor
java.lang.RuntimeException: Unable to configure certificate and key on model SSL PRFileDesc proxy: SEC_ERROR_NO_MEMORY (-8173)
at org.mozilla.jss.ssl.javax.JSSEngine.getServerTemplate(JSSEngine.java:1106)
at org.mozilla.jss.ssl.javax.JSSEngineReferenceImpl.createBufferFD(JSSEngineReferenceImpl.java:332)
at org.mozilla.jss.ssl.javax.JSSEngineReferenceImpl.init(JSSEngineReferenceImpl.java:262)
at org.mozilla.jss.ssl.javax.JSSEngineReferenceImpl.beginHandshake(JSSEngineReferenceImpl.java:646)
at org.apache.tomcat.util.net.SecureNioChannel.processSNI(SecureNioChannel.java:334)
at org.apache.tomcat.util.net.SecureNioChannel.handshake(SecureNioChannel.java:154)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1708)
at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
at org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1191)
at org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.base/java.lang.Thread.run(Thread.java:833)
The cert is readable by the CA and passes the CA selfsign tests. A manual verification looks like:
# runuser -u pkiuser -- certutil -V -u V -d /etc/pki/pki-tomcat/alias -n 'softhsm_token:Server-Cert cert-pki-ca' -e
Enter Password or Pin for "NSS Certificate DB":
Enter Password or Pin for "softhsm_token":
certutil: certificate is valid
I set the log level of the token to DEBUG and it's doing some work like creating a session and doing some minor decryption but it logs no errors. strace against the CA confirms that the CA is able to access the files. It's unfortunate that softhsm doesn't log the PKCS#11 API calls directly as it might be easier to see what is going on.
I've tried in rawhide as well but still have the 11.2.0 beta2 bits because I haven't updated to python 3.11 yet.
I'm not sure how to trace things from here.
It fails the same way using the unsupported selfserv tool, so the problem lies either in NSS or softhsm2:
# /usr/lib64/nss/unsupported-tools/selfserv -d /etc/pki/pki-tomcat/alias -n 'softhsm_token:Server-Cert cert-pki-ca' -p 9443 -v
Enter Password or Pin for "softhsm_token":
selfserv: SSL_ConfigServerCert returned error -8173: security library: memory allocation failure.

Using gdb I traced this within NSS to copying the private key (PK11_CopyKey) for Server-Cert from softhsm. The C_CopyObject call returns CKR_ARGUMENTS_BAD, which is unexpected, but I think what they mean is that the private key cannot be retrieved.

I set the token in my ini file to internal and pkispawn was successful up to the point where the CA was restarted. The problem was that server.xml included the softhsm token name in its Connector definition. After removing that, the CA starts fine and is accessible from clients.

I have standalone installation and IPA server installations working for my simple use-case. My changes don't cover every possible use of nssdb.NSSDatabase(), so it's likely I missed something. Ideally all calls to nssdb.NSSDatabase() would include user and group so that a proper chown can be done on the files and directories being created, and the effective uid/gid can be set when executing calls, in conjunction with prefixing commands with runuser -u <user> --. By doing this the token file ownership is all fine.

I've also seen a second problem. The pki nss-import-cert command always fails importing the caSigningCert certificate into the token with SEC_ERROR_TOKEN_NOT_LOGGED_IN. This is an issue in certutil AFAICT. I tried working around the error by calling certutil directly, once to import the cert and trust into the NSS softokn and once to import it into the softhsm token, but it still fails. I ignore the failure.

I have a hard time tracing the python/java boundaries, as sometimes it seems like python code calls the pki command, which calls more python.
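The idea of prefixing commands with runuser so that token files are created with the right owner can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the actual change made to nssdb.py; the helper name `as_user` is hypothetical.

```python
import shlex


def as_user(cmd, user=None):
    """Return cmd prefixed with 'runuser -u <user> --' when a user is given.

    With user=None the command is returned unchanged, so callers that
    already run as the target user are not affected.
    """
    if user is None:
        return list(cmd)
    return ["runuser", "-u", user, "--"] + list(cmd)


# Example: the manual certificate verification from earlier in this report.
verify_cmd = as_user(
    ["certutil", "-V", "-u", "V",
     "-d", "/etc/pki/pki-tomcat/alias",
     "-n", "softhsm_token:Server-Cert cert-pki-ca",
     "-e"],
    user="pkiuser",
)
print(shlex.join(verify_cmd))
```

The resulting list can be handed to `subprocess.run()` by a process running as root. Building the argument list in one place keeps the user switch consistent across every certutil or pki invocation.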
I don't know if this will affect all HSMs or just these. Another part of my workaround was to use the same password for the NSS database as the token database per comment#0. Otherwise lots more invasive changes are needed.

Additional changes by Rob in the master branch:
- https://github.com/dogtagpki/pki/commit/3a79ffd87991482004a1ee39ca9abca5db19e55b
- https://github.com/dogtagpki/pki/commit/a4c2f89b07cd39c8b375aa24fa66bc71cca5e84c

Additional changes by Rob in the master branch:
- https://github.com/dogtagpki/pki/commit/7e05fa4a011cdcdd0f1656dd8acc114011c2db47
- https://github.com/dogtagpki/pki/commit/67872a70be88f0430b4a0f3c2c1ed2f328def766

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: pki-core security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2023:2293