With today's update to bind-9.18.6 in Rawhide, FreeIPA server deployment fails. The system logs show bind failing in early startup on a crypto error:

Aug 30 13:55:33 ipa002.test.openqa.fedoraproject.org named[8008]: starting BIND 9.18.6 (Stable Release) <id:>
....
Aug 30 13:55:33 ipa002.test.openqa.fedoraproject.org named[8008]: adjusted limit on open files from 524288 to 1048576
Aug 30 13:55:33 ipa002.test.openqa.fedoraproject.org named[8008]: found 2 CPUs, using 2 worker threads
Aug 30 13:55:33 ipa002.test.openqa.fedoraproject.org named[8008]: using 2 UDP listeners per interface
Aug 30 13:55:33 ipa002.test.openqa.fedoraproject.org named[8008]: EVP_PKEY_fromdata_init failed (crypto failure)
Aug 30 13:55:33 ipa002.test.openqa.fedoraproject.org named[8008]: error:03000096:digital envelope routines::operation not supported for this keytype:crypto/evp/pmeth_gn.c:354:
Aug 30 13:55:33 ipa002.test.openqa.fedoraproject.org named[8008]: initializing DST: crypto failure
Aug 30 13:55:33 ipa002.test.openqa.fedoraproject.org named[8008]: exiting (due to fatal error)

This does not happen with the previous bind package, bind-9.18.5-2.fc38. I've requested that releng untag this update; otherwise all Rawhide update tests (and daily compose tests) will fail on this issue.
This looks like one of DSA/RSA/DH being disabled by openssl 3.x and the default crypto-policies, thus denying access to the specific method.
The failure happens in one of the dst__openssl*_init() calls below:

isc_result_t
dst_lib_init(isc_mem_t *mctx, const char *engine) {
    isc_result_t result;

    REQUIRE(mctx != NULL);
    REQUIRE(!dst_initialized);

    UNUSED(engine);

    memset(dst_t_func, 0, sizeof(dst_t_func));
    RETERR(dst__hmacmd5_init(&dst_t_func[DST_ALG_HMACMD5]));
    RETERR(dst__hmacsha1_init(&dst_t_func[DST_ALG_HMACSHA1]));
    RETERR(dst__hmacsha224_init(&dst_t_func[DST_ALG_HMACSHA224]));
    RETERR(dst__hmacsha256_init(&dst_t_func[DST_ALG_HMACSHA256]));
    RETERR(dst__hmacsha384_init(&dst_t_func[DST_ALG_HMACSHA384]));
    RETERR(dst__hmacsha512_init(&dst_t_func[DST_ALG_HMACSHA512]));
    RETERR(dst__openssl_init(engine));
    RETERR(dst__openssldh_init(&dst_t_func[DST_ALG_DH]));
    RETERR(dst__opensslrsa_init(&dst_t_func[DST_ALG_RSASHA1],
                                DST_ALG_RSASHA1));
    RETERR(dst__opensslrsa_init(&dst_t_func[DST_ALG_NSEC3RSASHA1],
                                DST_ALG_NSEC3RSASHA1));
    RETERR(dst__opensslrsa_init(&dst_t_func[DST_ALG_RSASHA256],
                                DST_ALG_RSASHA256));
    RETERR(dst__opensslrsa_init(&dst_t_func[DST_ALG_RSASHA512],
                                DST_ALG_RSASHA512));
    RETERR(dst__opensslecdsa_init(&dst_t_func[DST_ALG_ECDSA256]));
    RETERR(dst__opensslecdsa_init(&dst_t_func[DST_ALG_ECDSA384]));
#ifdef HAVE_OPENSSL_ED25519
    RETERR(dst__openssleddsa_init(&dst_t_func[DST_ALG_ED25519]));
#endif /* ifdef HAVE_OPENSSL_ED25519 */
#ifdef HAVE_OPENSSL_ED448
    RETERR(dst__openssleddsa_init(&dst_t_func[DST_ALG_ED448]));
#endif /* ifdef HAVE_OPENSSL_ED448 */

#if HAVE_GSSAPI
    RETERR(dst__gssapi_init(&dst_t_func[DST_ALG_GSSAPI]));
#endif /* HAVE_GSSAPI */

    dst_initialized = true;
    return (ISC_R_SUCCESS);

out:
    /* avoid immediate crash! */
    dst_initialized = true;
    dst_lib_destroy();
    return (result);
}
Dmitriy, could you please help us here? Please see the bug itself.
I think it might be this part of bind code which attempts to recover from RSASHA1 and NSEC3RSASHA1 being disabled:

commit f3a0dac0573d21887ee0fa262b2c3a75466a538b
Author: Mark Andrews <marka>
Date:   Tue Mar 22 16:16:57 2022 +1100

    Check that we can verify a signature at initialisation time

    Fedora 33 doesn't support RSASHA1 in future mode. There is no easy
    check for this other than by attempting to perform a verification
    using known good signatures. We don't attempt to sign with RSASHA1
    as that would not work in FIPS mode. RSASHA1 is verify only.

    The test vectors were generated using OpenSSL 3.0 and
    util/gen-rsa-sha-vectors.c. Rerunning will generate a new set of
    test vectors as the private key is not preserved.

    e.g.
        cc util/gen-rsa-sha-vectors.c -I /opt/local/include \
            -L /opt/local/lib -lcrypto

    (cherry picked from commit cd3f00874f63a50954cebb78edac8f580a27c0de)

....
....
 isc_result_t
 dst__opensslrsa_init(dst_func_t **funcp, unsigned char algorithm) {
+    isc_result_t result;
+
     REQUIRE(funcp != NULL);

-    UNUSED(algorithm);
+    result = check_algorithm(algorithm);

-    if (*funcp == NULL) {
-        *funcp = &opensslrsa_functions;
+    if (result == ISC_R_SUCCESS) {
+        if (*funcp == NULL) {
+            *funcp = &opensslrsa_functions;
+        }
+    } else if (result == ISC_R_NOTIMPLEMENTED) {
+        result = ISC_R_SUCCESS;
     }
-    return (ISC_R_SUCCESS);
+
+    return (result);
 }

If check_algorithm() does not return ISC_R_NOTIMPLEMENTED or ISC_R_SUCCESS, we'd fail the whole initialization. In this case we get DST_R_OPENSSLFAILURE returned (in check_algorithm()):

    ...
    status = EVP_PKEY_fromdata_init(ctx);
    if (status != 1) {
        DST_RET(dst__openssl_toresult2("EVP_PKEY_fromdata_init",
                                       DST_R_OPENSSLFAILURE));
    }
    ...
The code expects the EVP_PKEY_fromdata_init() call to succeed and only expects a failure when doing an actual EVP operation:

    [.. EVP initialization code above ..]
    /*
     * Check that we can verify the signature.
     */
    if (EVP_DigestInit_ex(evp_md_ctx, type, NULL) != 1 ||
        EVP_DigestUpdate(evp_md_ctx, "test", 4) != 1 ||
        EVP_VerifyFinal(evp_md_ctx, sig, len, pkey) != 1)
    {
        DST_RET(ISC_R_NOTIMPLEMENTED);
    }

So bind's check_algorithm() expects that EVP_PKEY_fromdata_init() would still work for an algorithm that would be blocked later by openssl. And openssl simply no longer allows even initializing PKEY data for it. I guess a simple fix would be to treat DST_R_OPENSSLFAILURE similarly to ISC_R_NOTIMPLEMENTED in the dst__openssl*_init() functions (not only dst__opensslrsa_init()).
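The suggestion above can be sketched in isolation. This is only an illustration, not BIND code: the sk_ types, the sk_rsa_init() name and the way the check result is passed in as a parameter are all hypothetical stand-ins for isc_result_t, dst_func_t and check_algorithm().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for BIND's result codes; the values are arbitrary. */
typedef enum {
    SK_R_SUCCESS,
    SK_R_NOTIMPLEMENTED,
    SK_R_OPENSSLFAILURE,
    SK_R_NOMEMORY
} sk_result_t;

/* Hypothetical stand-in for dst_func_t and opensslrsa_functions. */
typedef struct {
    const char *name;
} sk_func_t;

static sk_func_t sk_rsa_functions = { "rsa" };

/*
 * 'check' plays the role of the value returned by check_algorithm().
 * Both "not implemented" and "OpenSSL failure" leave the slot empty
 * but let initialization continue; any other error is still fatal.
 */
static sk_result_t
sk_rsa_init(sk_func_t **funcp, sk_result_t check) {
    assert(funcp != NULL);

    if (check == SK_R_SUCCESS) {
        if (*funcp == NULL) {
            *funcp = &sk_rsa_functions;
        }
        return SK_R_SUCCESS;
    }
    if (check == SK_R_NOTIMPLEMENTED || check == SK_R_OPENSSLFAILURE) {
        return SK_R_SUCCESS;
    }
    return check;
}
```

The trade-off is that a swallowed failure leaves the algorithm slot NULL, so the daemon starts but silently loses that algorithm.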
Looks like the error in the original log shouldn't be caused by SHA1 stuff.

error:03000096:digital envelope routines::operation not supported for this keytype:crypto/evp/pmeth_gn.c:354:

is probably related to the lack of the necessary key management, according to the code at https://github.com/openssl/openssl/blob/56233ba8574c01b3912cf662335fedaabc7faec2/crypto/evp/pmeth_gn.c#L339-L356

Could you please provide more details about the algorithm causing the failure?
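For reference, the numeric code in that log line can be taken apart by hand. A minimal sketch, assuming the OpenSSL 3.0 error-code layout (library number above bit 23, reason in the low 23 bits, top bit reserved for system errors) and assuming the header values ERR_LIB_EVP = 6 and EVP_R_OPERATION_NOT_SUPPORTED_FOR_THIS_KEYTYPE = 150; all SK_/sk_ names are made up for this example and nothing here calls OpenSSL itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror OpenSSL 3.0's err.h / evperr.h; hypothetical names. */
#define SK_ERR_LIB_OFFSET   23
#define SK_ERR_LIB_MASK     0xFFu
#define SK_ERR_REASON_MASK  0x7FFFFFu
#define SK_ERR_SYSTEM_FLAG  0x80000000u
#define SK_ERR_LIB_EVP      6
#define SK_EVP_R_OPERATION_NOT_SUPPORTED_FOR_THIS_KEYTYPE 150

/* Library number of a non-system error code. */
static int
sk_err_lib(uint32_t code) {
    assert((code & SK_ERR_SYSTEM_FLAG) == 0);
    return (int)((code >> SK_ERR_LIB_OFFSET) & SK_ERR_LIB_MASK);
}

/* Reason number of a non-system error code. */
static int
sk_err_reason(uint32_t code) {
    assert((code & SK_ERR_SYSTEM_FLAG) == 0);
    return (int)(code & SK_ERR_REASON_MASK);
}

/* True if the code is the EVP "not supported for this keytype" error. */
static bool
sk_is_unsupported_keytype(uint32_t code) {
    return sk_err_lib(code) == SK_ERR_LIB_EVP &&
           sk_err_reason(code) ==
               SK_EVP_R_OPERATION_NOT_SUPPORTED_FOR_THIS_KEYTYPE;
}
```

Under these assumptions, 0x03000096 decodes to library 6 (EVP, printed as "digital envelope routines") and reason 0x96 = 150, which matches the text of the log message.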
Oh, I did expect this change would improve things, not break them. I tried this change on RHEL9 before merging and it seemed it worked fine. Is SHA1 disabled in rawhide already?
Yes, it is. But I suspect the problem is different. I'd test whether commenting the pkcs11 engine out of the config (as a temporary workaround) resolves the situation. If I understand correctly, the key you create is a legacy one, so it doesn't have a keymgmt and EVP_PKEY_fromdata_init fails. If I'm wrong, we need to investigate more. If I'm correct, we will have to deal with the PKCS11 stuff somehow.
Well, we cannot disable the PKCS11 engine because otherwise bind would not see the keys stored there by the IPA DNSSEC helpers. If you want to try, comment them out in /usr/share/ipa/bind.openssl.cryptopolicy.cnf.template before running ipa-server-install.
https://github.com/openssl/openssl/issues/19102 is the upstream issue. I kindly ask Alexander and Petr to watch it.
Thanks Dmitry for the pointers. I think it just needs a mapping from the OpenSSL error to one of BIND's internal error codes, one that means the operation is intentionally disabled in the crypto library. There is a similar existing entry in the to_result() function [1]. Unless there is a need to change this behaviour in openssl, I think this can be fixed in the bind component only. We just need bind to recognize that this was not a runtime error in openssl, but a signal from the OpenSSL policy that this action is not (and is not going to be) supported. Detecting that is the purpose of the check_algorithm function, where it fails anyway.

1. https://github.com/isc-projects/bind9/blob/main/lib/dns/openssl_link.c#L135
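A rough sketch of the kind of mapping meant here, kept independent of both BIND and OpenSSL. The sk_to_result() function, the sk_result_t codes and the numeric constants (assumed to correspond to ERR_LIB_EVP and EVP_R_OPERATION_NOT_SUPPORTED_FOR_THIS_KEYTYPE in OpenSSL 3.0) are illustrative assumptions, not the actual to_result() implementation.

```c
#include <assert.h>

/* Hypothetical stand-ins for BIND result codes; values are arbitrary. */
typedef enum {
    SK_R_SUCCESS,
    SK_R_DISABLED,
    SK_R_OPENSSLFAILURE
} sk_result_t;

/* Assumed OpenSSL 3.0 values; hypothetical names. */
#define SK_ERR_LIB_EVP 6
#define SK_EVP_R_OPERATION_NOT_SUPPORTED_FOR_THIS_KEYTYPE 150

/*
 * Translate an already decoded (library, reason) pair into a result code.
 * A policy-style refusal becomes "disabled"; everything else keeps the
 * caller's fallback, which must itself be an error code.
 */
static sk_result_t
sk_to_result(int lib, int reason, sk_result_t fallback) {
    assert(fallback != SK_R_SUCCESS);

    if (lib == SK_ERR_LIB_EVP &&
        reason == SK_EVP_R_OPERATION_NOT_SUPPORTED_FOR_THIS_KEYTYPE)
    {
        return SK_R_DISABLED;
    }
    return fallback;
}
```

A caller in the style of check_algorithm() could then treat the "disabled" result as non-fatal and keep every other failure fatal.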
Just a draft that maps the current error code to ISC_R_DISABLED, which is in turn recognized by the check_algorithm function.
I'm afraid it's the wrong decision - RSA keys are supposed to work, we just can't check it this way...
I have made a scratch build with the change from comment #12. It seems to help in my case. Could you explain in more detail why it is a bad decision? Does the way openssl reports a disabled algorithm need to change? https://koji.fedoraproject.org/koji/taskinfo?taskID=91464191
Upstream has introduced a change that works normally with providers but works badly when we use an engine and a provider simultaneously. And if I understand correctly, you refuse to use RSA keys based on this check. Am I wrong?
No, of course not, that would not be acceptable. It tests each DNSSEC algorithm number. Hence it does not check RSA as a whole, but RSA algorithms 5, 7, 8 and 10. It should detect algorithms 5 and 7 as disabled, but algorithms 8 and 10 have to work in any case. If that were the result, then it would indeed need a change in OpenSSL.

I verified it and you are correct. It disables all RSA algorithms this way, so it cannot validate anything, because the root key is RSA of course.

(gdb) p dst_t_func
$1 = {0x0, 0x0, 0x7f1ed0e181c0 <openssldh_functions>, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7f1ed0e18100 <opensslecdsa_functions>, 0x7f1ed0e18100 <opensslecdsa_functions>, 0x7f1ed0e18040 <openssleddsa_functions>, 0x7f1ed0e18040 <openssleddsa_functions>, 0x0 <repeats 140 times>, 0x7f1ed0e17880 <hmacmd5_functions>, 0x0, 0x0, 0x7f1ed0e1a540 <gssapi_functions>, 0x7f1ed0e177c0 <hmacsha1_functions>, 0x7f1ed0e17700 <hmacsha224_functions>, 0x7f1ed0e17640 <hmacsha256_functions>, 0x7f1ed0e17580 <hmacsha384_functions>, 0x7f1ed0e174c0 <hmacsha512_functions>, 0x0 <repeats 90 times>}
(gdb) p dst_t_func[8]
$2 = (dst_func_t *) 0x0
(gdb) p dst_t_func[10]
$3 = (dst_func_t *) 0x0
(gdb) p dst_t_func[7]
$4 = (dst_func_t *) 0x0
(gdb) p dst_t_func[5]
$5 = (dst_func_t *) 0x0
(gdb) p dst_t_func[13]
$6 = (dst_func_t *) 0x7f1ed0e18100 <opensslecdsa_functions>
(gdb) p dst_t_func[12]
$7 = (dst_func_t *) 0x0
(gdb) p dst_t_func[11]
$8 = (dst_func_t *) 0x0
(gdb) p dst_t_func[15]
$9 = (dst_func_t *) 0x7f1ed0e18040 <openssleddsa_functions>
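The same check as the gdb session can be written as a small sanity test over a function table. This is a sketch only: sk_func_t and the table are hypothetical stand-ins for dst_func_t and dst_t_func, while the algorithm numbers are the standard DNSSEC algorithm numbers used in this comment (5, 7, 8, 10 for RSA, 13 for ECDSA P-256).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* DNSSEC algorithm numbers; SK_ALG_COUNT is just the sketch's table size. */
enum {
    SK_ALG_RSASHA1 = 5,
    SK_ALG_NSEC3RSASHA1 = 7,
    SK_ALG_RSASHA256 = 8,
    SK_ALG_RSASHA512 = 10,
    SK_ALG_ECDSA256 = 13,
    SK_ALG_COUNT = 32
};

/* Hypothetical stand-in for dst_func_t. */
typedef struct {
    const char *name;
} sk_func_t;

/*
 * Algorithms 5 and 7 (SHA-1 based) may legitimately be disabled by policy,
 * but 8 and 10 must be registered, otherwise RSA-signed zones such as the
 * root cannot be validated.
 */
static bool
sk_modern_rsa_registered(sk_func_t *const *table, size_t len) {
    assert(table != NULL);
    assert(len > SK_ALG_RSASHA512);

    return table[SK_ALG_RSASHA256] != NULL &&
           table[SK_ALG_RSASHA512] != NULL;
}
```

With a table shaped like the gdb dump above (slots 8 and 10 empty, slot 13 filled) the function returns false, which is the condition that makes validation impossible.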
Note, it seems this failure really happens only on Rawhide - the bind update for F37 passed tests: https://openqa.fedoraproject.org/tests/overview?distri=fedora&version=37&groupid=2&build=Update-FEDORA-2022-710b831bc0 so some difference between Rawhide and F37 must be involved here. I'm not sure what, though. Could this actually be related to https://bugzilla.redhat.com/show_bug.cgi?id=2117859 - the bug in openssl-pkcs11-0.4.12-2 (which we have in Rawhide, but not F37)?
(In reply to Adam Williamson from comment #17)
> Note, it seems this failure really happens only on Rawhide - the bind update
> for F37 passed tests:
> https://openqa.fedoraproject.org/tests/overview?distri=fedora&version=37&groupid=2&build=Update-FEDORA-2022-710b831bc0
> so some difference between Rawhide and F37 must be involved here. I'm not
> sure what, though. Could this actually be related to
> https://bugzilla.redhat.com/show_bug.cgi?id=2117859 - the bug in
> openssl-pkcs11-0.4.12-2 (which we have in Rawhide, but not F37)?

I thought it would be a difference in openssl, but found there is no openssl build for f38 yet. So they share the same binary, which means there is no openssl difference between f37 and rawhide. So yes, openssl-pkcs11 remains a very good candidate for the difference.

My f37 instance has openssl-pkcs11-0.4.12-1.fc37.x86_64
rawhide instance has openssl-pkcs11-0.4.12-2.fc37.x86_64
yeah, that's basically intentional: we're holding the update out of f37 because of https://bugzilla.redhat.com/show_bug.cgi?id=2117859 . What I can do later today is run a special test of the bind update for F37 with openssl-pkcs11-0.4.12-2.fc37 included, and see if that makes it fail. If it does, then that would definitely mean openssl-pkcs11-0.4.12-2.fc37 triggers this problem as well as 2117859.
I'm sorry, but let me repeat. This issue may be caused by a specific change in openssl-pkcs11, but mixing providers and engines in OpenSSL 3.0 is bad practice, especially when they implement the same algorithm.
I have cloned bug #2123076 from RHEL9 to track OpenSSL provider support. But I doubt we would have it ready for Fedora 38, let alone for anything before it. But I think bind's code does not attempt to mix providers and engines. It uses an engine, sure. When it does, it stores all private keys in the engine and none should be stored the other way. But we need to keep FreeIPA working, and a PKCS11 provider is not there (yet) to satisfy those needs. If we can improve engine usage, then we will. But general phrases like "use a provider" won't help much.
I have tested downgrading to:

openssl-pkcs11-0.4.12-1.fc37.x86_64
bind-9.18.6-1.fc38.x86_64

The named service at least starts fine with it. At first glance validation is working, even on SHA-1 based signatures. Marking bug #2117859 dependent.
@jjelen just noticed that all support for the PKCS11 engine has been removed. The release notes link is a bit misleading, but the removal was mentioned there [1] and I overlooked it. It was explicitly removed in commit 60535fc5 [2].

[1] https://downloads.isc.org/isc/bind9/9.18.6/doc/arm/html/notes.html#removed-feature
[2] https://gitlab.isc.org/isc-projects/bind9/commit/60535fc5f7ccee58c641a96fe52d9b15c192698b
That's the same link I mentioned to you in bug #2117342 and on IRC yesterday. So I am not sure what the next steps would be here or what the question for me is right now. Shall we revert that commit in bind to keep bind working with pkcs11 engines? Shall we mark it unsupported? I think it is quite late for such a drastic change.
I think we should revert it. Given that F36 with the same openssl version works fine with the previous bind version, the engine is working for us. When the openssl-pkcs11 provider is ready, we can migrate to it.
Adding a link to MR !5385, which contains the responsible commit.
I have started experiments on the branch engine_pkcs11-revert [1] against the development release. It seems that just reverting the engine disabling is not enough. OpenSSL cannot cope with EVP_PKEY_fromdata initialization after an engine was set and used. I have found multiple issues, but ended up again at a null ctx->keymgmt, which fails the check_algorithm() check. We found it should work fine with the RSA_* calls as used in the v9_16 branch. But no simple change made the code branch for OpenSSL >= 3.0 work with engines.

1. https://gitlab.isc.org/pemensik/bind9/-/commits/feature/main/engine_pkcs11-revert
I have made a test build [1], and it seems to work better. I have also built it as a copr build in the pemensik/bind repository [2]. After updating to that version and running these commands:
- rndc managed-keys destroy
- systemctl restart named
resolution started working. It should also allow the upstream system tests keyfromlabel and engine_pkcs11 to pass when the proper configuration environment is passed.

1. https://koji.fedoraproject.org/koji/taskinfo?taskID=91804993
2. https://copr.fedorainfracloud.org/coprs/pemensik/bind/
Testing of that in openQA looks promising: with that build plus bind-dyndb-ldap-11.10-4.fc38 and the rest of Rawhide as it currently is, openQA tests passed even with dnssec enabled...
I guess I will make builds of this for rawhide and fedora37. Even if upstream makes more changes to this, this basic approach seems to be the only working solution available, excluding of course implementing a real, working PKCS11 provider for openssl and starting to use it. This change of mine makes bind use the same API as OpenSSL 1.1 builds, while still linking to OpenSSL 3.0. It is kind of a hack, but I do not think we have any other ready-to-use solution available.
FEDORA-2022-0fea8abd6e has been submitted as an update to Fedora 38. https://bodhi.fedoraproject.org/updates/FEDORA-2022-0fea8abd6e
FEDORA-2022-0fea8abd6e has been pushed to the Fedora 38 stable repository. If problem still persists, please make note of it in this bug report.
FEDORA-2022-cbcb55d5c7 has been submitted as an update to Fedora 37. https://bodhi.fedoraproject.org/updates/FEDORA-2022-cbcb55d5c7
FEDORA-2022-cbcb55d5c7 has been pushed to the Fedora 37 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2022-cbcb55d5c7` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-cbcb55d5c7 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2022-cbcb55d5c7 has been pushed to the Fedora 37 stable repository. If problem still persists, please make note of it in this bug report.