We have an integration test between pkcs11-provider and bind. It worked fine with bind 9.19.21 but started failing once 9.12.22 was released to Fedora. In the test we are using dnssec-keyfromlabel to gather the keys from the software token. This command start crashing as follows: # dnssec-keyfromlabel -a RSASHA256 -l 'pkcs11:model=SoftHSM%20v2;manufacturer=SoftHSM%20project;serial=a54dd083bab6e1fa;token=softhsm;object=localhost-zsk;pin-value=123456' localhost free(): invalid pointer 2098 Aborted (core dumped) # coredumpctl info PID: 2104 (dnssec-keyfroml) UID: 0 (root) GID: 0 (root) Signal: 6 (ABRT) Timestamp: Tue 2024-04-30 10:22:53 EDT (3min 51s ago) Command Line: dnssec-keyfromlabel -a RSASHA256 -l $'pkcs11:model=SoftHSM%20v2;manufacturer=SoftHSM%20project;serial=a54dd083bab6e1fa;token=softhsm;object=lo> Executable: /usr/bin/dnssec-keyfromlabel Control Group: /user.slice/user-0.slice/session-4.scope Unit: session-4.scope Slice: user-0.slice Session: 4 Owner UID: 0 (root) Boot ID: d01369d0db3b49ee9d0d8a8917f42f0a Machine ID: 65931729233e46f3b29865737b77c427 Hostname: vm-10-0-186-123.hosted.upshift.rdu2.redhat.com Storage: /var/lib/systemd/coredump/core.dnssec-keyfroml.0.d01369d0db3b49ee9d0d8a8917f42f0a.2104.1714486973000000.zst (present) Size on Disk: 290.4K Package: bind9-next/9.19.22-1.fc41 build-id: 1261b2883e741e59e129b4ff6eeb6668cd7490ff Message: Process 2104 (dnssec-keyfroml) of user 0 dumped core. Module libffi.so.8 from rpm libffi-3.4.6-1.fc41.x86_64 Module p11-kit-client.so from rpm p11-kit-0.25.3-4.fc40.x86_64 Module pkcs11.so from rpm pkcs11-provider-0.3-2.fc40.x86_64 Module libpcre2-8.so.0 from rpm pcre2-10.43-1.fc41.x86_64 Module libselinux.so.1 from rpm libselinux-3.6-4.fc40.x86_64 Module libkeyutils.so.1 from rpm keyutils-1.6.3-3.fc40.x86_64 Module libkrb5support.so.0 from rpm krb5-1.21.2-5.fc40.x86_64 Module liblzma.so.5 from rpm xz-5.4.6-3.fc41.x86_64 Module liblmdb.so.0.0.0 from rpm lmdb-0.9.32-1.fc40.x86_64 Module libprotobuf-c.so.1 from rpm protobuf-c-1.5.0-3.fc40.x86_64 Module libfstrm.so.0 from rpm fstrm-0.6.1-10.fc40.x86_64 Module libmaxminddb.so.0 from rpm libmaxminddb-1.9.1-2.fc40.x86_64 Module libcom_err.so.2 from rpm e2fsprogs-1.47.0-5.fc40.x86_64 Module libk5crypto.so.3 from rpm krb5-1.21.2-5.fc40.x86_64 Module libkrb5.so.3 from rpm krb5-1.21.2-5.fc40.x86_64 Module libgssapi_krb5.so.2 from rpm krb5-1.21.2-5.fc40.x86_64 Module liburcu-cds.so.8 from rpm userspace-rcu-0.14.0-4.fc40.x86_64 Module liburcu-common.so.8 from rpm userspace-rcu-0.14.0-4.fc40.x86_64 Module liburcu.so.8 from rpm userspace-rcu-0.14.0-4.fc40.x86_64 Module libxml2.so.2 from rpm libxml2-2.12.6-1.fc41.x86_64 Module libnghttp2.so.14 from rpm nghttp2-1.61.0-1.fc41.x86_64 Module libjson-c.so.5 from rpm json-c-0.17-3.fc40.x86_64 Module libz.so.1 from rpm zlib-ng-2.1.6-2.fc40.x86_64 Module libcrypto.so.3 from rpm openssl-3.2.1-6.fc41.x86_64 Module libssl.so.3 from rpm openssl-3.2.1-6.fc41.x86_64 Module libuv.so.1 from rpm libuv-1.48.0-1.fc40.x86_64 Module libjemalloc.so.2 from rpm jemalloc-5.3.0-6.fc40.x86_64 Module libdns-9.19.22.so from rpm bind9-next-9.19.22-1.fc41.x86_64 Module libisc-9.19.22.so from rpm bind9-next-9.19.22-1.fc41.x86_64 Module dnssec-keyfromlabel from rpm bind9-next-9.19.22-1.fc41.x86_64 Stack trace of thread 2104: #0 0x00007f1ab04aa184 __pthread_kill_implementation (libc.so.6 + 0x98184) #1 0x00007f1ab045269e raise (libc.so.6 + 0x4069e) #2 0x00007f1ab043a942 abort (libc.so.6 + 0x28942) #3 0x00007f1ab043b7a7 __libc_message_impl.cold (libc.so.6 + 0x297a7) #4 0x00007f1ab04b41b5 malloc_printerr (libc.so.6 + 0xa21b5) #5 0x00007f1ab04b654c _int_free (libc.so.6 + 0xa454c) #6 0x00007f1ab04b8e0e free (libc.so.6 + 0xa6e0e) #7 0x00007f1aaf29f5fa get_interface_inlock (p11-kit-client.so + 0x75fa) #8 0x00007f1aaf2aaf14 C_GetInterface (p11-kit-client.so + 0x12f14) #9 0x00007f1aaf3cbe79 p11prov_module_init (pkcs11.so + 0x13e79) #10 0x00007f1aaf3d6b71 p11prov_ctx_status (pkcs11.so + 0x1eb71) #11 0x00007f1aaf3dd590 OSSL_provider_init (pkcs11.so + 0x25590) #12 0x00007f1aaff81a62 provider_activate (libcrypto.so.3 + 0x181a62) #13 0x00007f1aaff81f8f ossl_provider_activate (libcrypto.so.3 + 0x181f8f) #14 0x00007f1aaff832cf provider_conf_activate (libcrypto.so.3 + 0x1832cf) #15 0x00007f1aaff83651 provider_conf_load (libcrypto.so.3 + 0x183651) #16 0x00007f1aaff8386b provider_conf_init.lto_priv.0 (libcrypto.so.3 + 0x18386b) #17 0x00007f1aafebbe65 CONF_modules_load (libcrypto.so.3 + 0xbbe65) #18 0x00007f1aafebc408 CONF_modules_load_file_ex (libcrypto.so.3 + 0xbc408) #19 0x00007f1aaff72398 ossl_init_config_ossl_ (libcrypto.so.3 + 0x172398) #20 0x00007f1ab04ad55b __pthread_once_slow.isra.0 (libc.so.6 + 0x9b55b) #21 0x00007f1ab04ad5c9 pthread_once.5 (libc.so.6 + 0x9b5c9) #22 0x00007f1aaff7edad CRYPTO_THREAD_run_once (libcrypto.so.3 + 0x17edad) #23 0x00007f1aaff72e6b OPENSSL_init_crypto (libcrypto.so.3 + 0x172e6b) Reproducible: Always Steps to Reproduce: 1. Clone pkcs11-provider github, it is only needed to get the test. # git clone https://github.com/latchset/pkcs11-provider.git 2. Change directory to pkcs11-provider/tests/integration # cd pkcs11-provider/tests/integration/ 3. Execute bind integration test: # bash -x bind.sh Test does the following: * install all the dependencies * setup softhsm token * setup p11kit server as a proxy to the sofhsm token * get the latest pkcs11-provider, build it and install it * create testing openssl configuration * setup bind * attempt to get the keys from the token and use them to sign the zone To create the test we followed https://bind9.readthedocs.io/en/latest/chapter5.html. Actual Results: Unable to get the keys from the token (crash). Expected Results: (this is how it works with bind9-next-9.19.21-1.fc41) # # Test 1: Extract KSK and ZSK keys from PKCS11 URIs # OPENSSL_CONF=openssl.cnf dnssec-keyfromlabel -a RSASHA256 -l 'pkcs11:model=SoftHSM%20v2;manufacturer=SoftHSM%20project;serial=782800c9320b4da1;token=softhsm;object=localhost-zsk;pin-value=123456' localhost Klocalhost.+008+59760 # OPENSSL_CONF=openssl.cnf dnssec-keyfromlabel -a RSASHA256 -l 'pkcs11:model=SoftHSM%20v2;manufacturer=SoftHSM%20project;serial=782800c9320b4da1;token=softhsm;object=localhost-ksk;pin-value=123456' -f KSK localhost # # Test 2: Sign zone # OPENSSL_CONF=openssl.cnf dnssec-signzone -o localhost localhost Verifying the zone using the following algorithms: - RSASHA256 Zone fully signed: Algorithm: RSASHA256: KSKs: 1 active, 0 stand-by, 0 revoked ZSKs: 1 active, 0 stand-by, 0 revoked localhost.signed
It seems still present with 9.19.24 (gdb) bt #0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44 #1 0x00007f931a8aa1f3 in __pthread_kill_internal (threadid=<optimized out>, signo=6) at pthread_kill.c:78 #2 0x00007f931a85269e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26 #3 0x00007f931a83a942 in __GI_abort () at abort.c:79 #4 0x00007f931a83b7a7 in __libc_message_impl (fmt=fmt@entry=0x7f931a9c22e8 "%s\n") at ../sysdeps/posix/libc_fatal.c:132 #5 0x00007f931a8b41b5 in malloc_printerr (str=str@entry=0x7f931a9c0062 "free(): invalid pointer") at malloc.c:5772 #6 0x00007f931a8b654c in _int_free (av=<optimized out>, p=p@entry=0x7f9319a0d810, have_lock=have_lock@entry=0) at malloc.c:4507 #7 0x00007f931a8b8e0e in __GI___libc_free (mem=0x7f9319a0d820) at malloc.c:3398 #8 0x00007f93194e75fa in get_interface_inlock (interface=interface@entry=0x7fffd9421998, version=version@entry=0x7fffd94219a6, flags=flags@entry=0) at ../p11-kit/client.c:183 #9 0x00007f93194f2f14 in C_GetInterface (pInterfaceName=<optimized out>, pVersion=<optimized out>, ppInterface=0x7fffd9421998, flags=0) at ../p11-kit/client.c:274 #10 0x00007f9319f37644 in p11prov_interface_init (mctx=0x7f9319a09240) at ../src/interface.c:178 #11 0x00007f9319f37b00 in p11prov_module_init (mctx=0x7f9319a09240) at ../src/interface.c:302 #12 0x00007f9319f442fa in p11prov_ctx_status (ctx=0x7f9319a66540) at ../src/provider.c:474 #13 0x00007f9319f4a58f in OSSL_provider_init (handle=0x7f9319a66380, in=0x7f931a69f300 <core_dispatch_>, out=0x7fffd9421da8, provctx=0x7fffd9421db0) at ../src/provider.c:1613 #14 0x00007f931a381aa2 in provider_init (prov=0x7f9319a66380) at crypto/provider_core.c:964 #15 provider_activate (prov=prov@entry=0x7f9319a66380, lock=0, lock@entry=1, upcalls=upcalls@entry=1) at crypto/provider_core.c:1173 #16 0x00007f931a381fcf in ossl_provider_activate (prov=prov@entry=0x7f9319a66380, upcalls=upcalls@entry=1, aschild=aschild@entry=0) at crypto/provider_core.c:1310 #17 0x00007f931a3832bf in provider_conf_activate (libctx=libctx@entry=0x7f931a6a8c60 <default_context_int.lto_priv>, name=name@entry=0x7f9319a20038 "pkcs11", value=value@entry=0x7f9319a52ba0 "pkcs11_sect", path=path@entry=0x7f9319a1d1b0 "/usr/lib64/ossl-modules/pkcs11.so", soft=soft@entry=0, cnf=cnf@entry=0x7f9319a1d180) at crypto/provider_conf.c:245 #18 0x00007f931a3836a1 in provider_conf_load (libctx=0x7f931a6a8c60 <default_context_int.lto_priv>, name=<optimized out>, name@entry=0x7f9319a20038 "pkcs11", value=value@entry=0x7f9319a52ba0 "pkcs11_sect", cnf=cnf@entry=0x7f9319a1d180) at crypto/provider_conf.c:325 #19 0x00007f931a383d6b in provider_conf_init (md=<optimized out>, cnf=0x7f9319a1d180) at crypto/provider_conf.c:381 #20 0x00007f931a2bbe45 in module_init (pmod=0x7f9319a1d4e0, name=0x7f9319a08110 "providers", value=0x7f9319a52940 "provider_sect", cnf=<optimized out>) at crypto/conf/conf_mod.c:425 #21 module_run (cnf=<optimized out>, name=0x7f9319a08110 "providers", value=<optimized out>, flags=<optimized out>) at crypto/conf/conf_mod.c:272 #22 CONF_modules_load (cnf=cnf@entry=0x7f9319a1d180, appname=appname@entry=0x0, flags=32, flags@entry=50) at crypto/conf/conf_mod.c:165 #23 0x00007f931a2bc3e8 in CONF_modules_load_file_ex (libctx=<optimized out>, filename=0x0, appname=0x0, flags=50) at crypto/conf/conf_mod.c:214 #24 0x00007f931a3723d8 in ossl_config_int (settings=0x0) at crypto/conf/conf_sap.c:68 #25 ossl_config_int (settings=0x0) at crypto/conf/conf_sap.c:44 #26 ossl_init_config () at crypto/init.c:256 #27 ossl_init_config_ossl_ () at crypto/init.c:254 #28 0x00007f931a8ad55b in __pthread_once_slow (once_control=0x7f931a6a950c <config>, init_routine=0x7f931a3723b0 <ossl_init_config_ossl_>) at pthread_once.c:116 #29 0x00007f931a8ad5c9 in ___pthread_once (once_control=<optimized out>, init_routine=<optimized out>) at pthread_once.c:143 #30 0x00007f931a37eded in CRYPTO_THREAD_run_once (once=<optimized out>, init=<optimized out>) at crypto/threads_pthread.c:152 #31 0x00007f931a372eab in OPENSSL_init_crypto (opts=opts@entry=554572, settings=settings@entry=0x0) at crypto/init.c:599 #32 0x00007f931b0e790d in OPENSSL_init_ssl (opts=554572, opts@entry=554560, settings=settings@entry=0x0) at ssl/ssl_init.c:115 #33 0x00007f931b275294 in isc__tls_initialize () at ../../../lib/isc/tls.c:177 #34 0x00007f931b22f79c in isc__initialize () at ../../../lib/isc/lib.c:50 #35 0x00007f931b2b6427 in call_init (l=<optimized out>, argc=4, argv=0x7fffd9422238, env=0x7fffd9422260) at dl-init.c:74 #36 call_init (l=<optimized out>, argc=4, argv=0x7fffd9422238, env=0x7fffd9422260) at dl-init.c:26 #37 0x00007f931b2b651d in _dl_init (main_map=0x7f931b2e82e0, argc=4, argv=0x7fffd9422238, env=0x7fffd9422260) at dl-init.c:121 #38 0x00007f931b2ce420 in _dl_start_user () from /lib64/ld-linux-x86-64.so.2 #39 0x0000000000000004 in ?? () #40 0x00007fffd942338c in ?? () #41 0x00007fffd942339c in ?? () #42 0x00007fffd942339f in ?? () #43 0x00007fffd94233a9 in ?? () #44 0x0000000000000000 in ?? ()
Looks like p11-kit is crashing here. Can you try with a pkcs11-provider coniguration that explicitly load softhsm directly and see if that mitigates the issue? You can do that by setting this env variable before running bind: PKCS11_PROVIDER_MODULE="/usr/lib64/pkcs11/libsofthsm2.so"
This is super strange. It does not happen anywhere in bind code, but even long before it hits main function. That is why ?? is on bottom of stack. Local exec file: `/usr/bin/dnssec-keyfromlabel', file type elf64-x86-64. (gdb) show args Argument list to give program being debugged when it is started is "-h". (gdb) run (gdb) bt #0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44 #1 0x00007ffff76aa1f3 in __pthread_kill_internal (threadid=<optimized out>, signo=6) at pthread_kill.c:78 #2 0x00007ffff765269e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26 #3 0x00007ffff763a942 in __GI_abort () at abort.c:79 #4 0x00007ffff763b7a7 in __libc_message_impl (fmt=fmt@entry=0x7ffff77c22e8 "%s\n") at ../sysdeps/posix/libc_fatal.c:132 #5 0x00007ffff76b41b5 in malloc_printerr (str=str@entry=0x7ffff77c0062 "free(): invalid pointer") at malloc.c:5772 #6 0x00007ffff76b654c in _int_free (av=<optimized out>, p=p@entry=0x7ffff680d810, have_lock=have_lock@entry=0) at malloc.c:4507 #7 0x00007ffff76b8e0e in __GI___libc_free (mem=0x7ffff680d820) at malloc.c:3398 #8 0x00007ffff62e75fa in get_interface_inlock (interface=interface@entry=0x7fffffffd7f8, version=version@entry=0x7fffffffd806, flags=flags@entry=0) at ../p11-kit/client.c:183 #9 0x00007ffff62f2f14 in C_GetInterface (pInterfaceName=<optimized out>, pVersion=<optimized out>, ppInterface=0x7fffffffd7f8, flags=0) at ../p11-kit/client.c:274 #10 0x00007ffff6c48644 in p11prov_interface_init (mctx=0x7ffff6809240) at ../src/interface.c:178 #11 0x00007ffff6c48b00 in p11prov_module_init (mctx=0x7ffff6809240) at ../src/interface.c:302 #12 0x00007ffff6c552fa in p11prov_ctx_status (ctx=0x7ffff6866540) at ../src/provider.c:474 #13 0x00007ffff6c5b58f in OSSL_provider_init (handle=0x7ffff6866380, in=0x7ffff749f300 <core_dispatch_>, out=0x7fffffffdc08, provctx=0x7fffffffdc10) at ../src/provider.c:1613 #14 0x00007ffff7181aa2 in provider_init (prov=0x7ffff6866380) at crypto/provider_core.c:964 #15 provider_activate (prov=prov@entry=0x7ffff6866380, lock=0, lock@entry=1, upcalls=upcalls@entry=1) at crypto/provider_core.c:1173 #16 0x00007ffff7181fcf in ossl_provider_activate (prov=prov@entry=0x7ffff6866380, upcalls=upcalls@entry=1, aschild=aschild@entry=0) at crypto/provider_core.c:1310 #17 0x00007ffff71832bf in provider_conf_activate (libctx=libctx@entry=0x7ffff74a8c60 <default_context_int.lto_priv>, name=name@entry=0x7ffff6820038 "pkcs11", value=value@entry=0x7ffff6852ba0 "pkcs11_sect", path=path@entry=0x7ffff681d1b0 "/usr/lib64/ossl-modules/pkcs11.so", soft=soft@entry=0, cnf=cnf@entry=0x7ffff681d180) at crypto/provider_conf.c:245 #18 0x00007ffff71836a1 in provider_conf_load (libctx=0x7ffff74a8c60 <default_context_int.lto_priv>, name=<optimized out>, name@entry=0x7ffff6820038 "pkcs11", value=value@entry=0x7ffff6852ba0 "pkcs11_sect", cnf=cnf@entry=0x7ffff681d180) at crypto/provider_conf.c:325 #19 0x00007ffff7183d6b in provider_conf_init (md=<optimized out>, cnf=0x7ffff681d180) at crypto/provider_conf.c:381 #20 0x00007ffff70bbe45 in module_init (pmod=0x7ffff681d4e0, name=0x7ffff6808110 "providers", value=0x7ffff6852940 "provider_sect", cnf=<optimized out>) at crypto/conf/conf_mod.c:425 #21 module_run (cnf=<optimized out>, name=0x7ffff6808110 "providers", value=<optimized out>, flags=<optimized out>) at crypto/conf/conf_mod.c:272 #22 CONF_modules_load (cnf=cnf@entry=0x7ffff681d180, appname=appname@entry=0x0, flags=32, flags@entry=50) at crypto/conf/conf_mod.c:165 #23 0x00007ffff70bc3e8 in CONF_modules_load_file_ex (libctx=<optimized out>, filename=0x0, appname=0x0, flags=50) at crypto/conf/conf_mod.c:214 #24 0x00007ffff71723d8 in ossl_config_int (settings=0x0) at crypto/conf/conf_sap.c:68 #25 ossl_config_int (settings=0x0) at crypto/conf/conf_sap.c:44 #26 ossl_init_config () at crypto/init.c:256 #27 ossl_init_config_ossl_ () at crypto/init.c:254 #28 0x00007ffff76ad55b in __pthread_once_slow (once_control=0x7ffff74a950c <config>, init_routine=0x7ffff71723b0 <ossl_init_config_ossl_>) at pthread_once.c:116 #29 0x00007ffff76ad5c9 in ___pthread_once (once_control=<optimized out>, init_routine=<optimized out>) at pthread_once.c:143 #30 0x00007ffff717eded in CRYPTO_THREAD_run_once (once=<optimized out>, init=<optimized out>) at crypto/threads_pthread.c:152 #31 0x00007ffff7172eab in OPENSSL_init_crypto (opts=opts@entry=554572, settings=settings@entry=0x0) at crypto/init.c:599 #32 0x00007ffff7b5190d in OPENSSL_init_ssl (opts=554572, opts@entry=554560, settings=settings@entry=0x0) at ssl/ssl_init.c:115 #33 0x00007ffff7f8b294 in isc__tls_initialize () at ../../../lib/isc/tls.c:177 #34 0x00007ffff7f4579c in isc__initialize () at ../../../lib/isc/lib.c:50 #35 0x00007ffff7fcc427 in call_init (l=<optimized out>, argc=2, argv=0x7fffffffe098, env=0x7fffffffe0b0) at dl-init.c:74 #36 call_init (l=<optimized out>, argc=2, argv=0x7fffffffe098, env=0x7fffffffe0b0) at dl-init.c:26 #37 0x00007ffff7fcc51d in _dl_init (main_map=0x7ffff7ffe2e0, argc=2, argv=0x7fffffffe098, env=0x7fffffffe0b0) at dl-init.c:121 #38 0x00007ffff7fe4420 in _dl_start_user () from /lib64/ld-linux-x86-64.so.2 #39 0x0000000000000002 in ?? () #40 0x00007fffffffe383 in ?? () #41 0x00007fffffffe3a0 in ?? () #42 0x0000000000000000 in ?? ()
It crashes even with softhsm directly. PKCS11_PROVIDER_MODULE=/usr/lib64/pkcs11/libsofthsm2.so gdb --args dnssec-keyfromlabel -h (gdb) bt #0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44 #1 0x00007ffff76aa1f3 in __pthread_kill_internal (threadid=<optimized out>, signo=6) at pthread_kill.c:78 #2 0x00007ffff765269e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26 #3 0x00007ffff763a942 in __GI_abort () at abort.c:79 #4 0x00007ffff763b7a7 in __libc_message_impl (fmt=fmt@entry=0x7ffff77c22e8 "%s\n") at ../sysdeps/posix/libc_fatal.c:132 #5 0x00007ffff76b41b5 in malloc_printerr (str=str@entry=0x7ffff77c0062 "free(): invalid pointer") at malloc.c:5772 #6 0x00007ffff76b654c in _int_free (av=<optimized out>, p=p@entry=0x7ffff686f710, have_lock=have_lock@entry=0) at malloc.c:4507 #7 0x00007ffff76b8e0e in __GI___libc_free (mem=0x7ffff686f720) at malloc.c:3398 #8 0x00007ffff638ca22 in SimpleConfigLoader::loadConfiguration (this=0x555555572610) at SimpleConfigLoader.cpp:88 #9 0x00007ffff638b5a3 in Configuration::reload (this=this@entry=0x555555572560) at Configuration.cpp:161 #10 0x00007ffff638b600 in Configuration::reload (this=this@entry=0x555555572560, inConfigLoader=<optimized out>) at Configuration.cpp:176 #11 0x00007ffff637452c in SoftHSM::C_Initialize (this=0x555555572330, pInitArgs=pInitArgs@entry=0x7fffffffd850) at SoftHSM.cpp:530 #12 0x00007ffff633e9ac in C_Initialize (pInitArgs=0x7fffffffd850) at main.cpp:137 #13 0x00007ffff6c41fe3 in p11prov_Initialize (ctx=0x7ffff6866540, pInitArgs=0x7fffffffd850) at ../src/interface.gen.c:18 #14 0x00007ffff6c48b39 in p11prov_module_init (mctx=0x7ffff6809240) at ../src/interface.c:309 #15 0x00007ffff6c552fa in p11prov_ctx_status (ctx=0x7ffff6866540) at ../src/provider.c:474 #16 0x00007ffff6c5b58f in OSSL_provider_init (handle=0x7ffff6866380, in=0x7ffff749f300 <core_dispatch_>, out=0x7fffffffdc08, provctx=0x7fffffffdc10) at ../src/provider.c:1613 #17 0x00007ffff7181aa2 in provider_init (prov=0x7ffff6866380) at crypto/provider_core.c:964 #18 provider_activate (prov=prov@entry=0x7ffff6866380, lock=0, lock@entry=1, upcalls=upcalls@entry=1) at crypto/provider_core.c:1173 #19 0x00007ffff7181fcf in ossl_provider_activate (prov=prov@entry=0x7ffff6866380, upcalls=upcalls@entry=1, aschild=aschild@entry=0) at crypto/provider_core.c:1310 #20 0x00007ffff71832bf in provider_conf_activate (libctx=libctx@entry=0x7ffff74a8c60 <default_context_int.lto_priv>, name=name@entry=0x7ffff6820038 "pkcs11", value=value@entry=0x7ffff6852ba0 "pkcs11_sect", path=path@entry=0x7ffff681d1b0 "/usr/lib64/ossl-modules/pkcs11.so", soft=soft@entry=0, cnf=cnf@entry=0x7ffff681d180) at crypto/provider_conf.c:245 #21 0x00007ffff71836a1 in provider_conf_load (libctx=0x7ffff74a8c60 <default_context_int.lto_priv>, name=<optimized out>, name@entry=0x7ffff6820038 "pkcs11", value=value@entry=0x7ffff6852ba0 "pkcs11_sect", cnf=cnf@entry=0x7ffff681d180) at crypto/provider_conf.c:325 #22 0x00007ffff7183d6b in provider_conf_init (md=<optimized out>, cnf=0x7ffff681d180) at crypto/provider_conf.c:381 #23 0x00007ffff70bbe45 in module_init (pmod=0x7ffff681d4e0, name=0x7ffff6808110 "providers", value=0x7ffff6852940 "provider_sect", cnf=<optimized out>) at crypto/conf/conf_mod.c:425 #24 module_run (cnf=<optimized out>, name=0x7ffff6808110 "providers", value=<optimized out>, flags=<optimized out>) at crypto/conf/conf_mod.c:272 #25 CONF_modules_load (cnf=cnf@entry=0x7ffff681d180, appname=appname@entry=0x0, flags=32, flags@entry=50) at crypto/conf/conf_mod.c:165 #26 0x00007ffff70bc3e8 in CONF_modules_load_file_ex (libctx=<optimized out>, filename=0x0, appname=0x0, flags=50) at crypto/conf/conf_mod.c:214 #27 0x00007ffff71723d8 in ossl_config_int (settings=0x0) at crypto/conf/conf_sap.c:68 #28 ossl_config_int (settings=0x0) at crypto/conf/conf_sap.c:44 #29 ossl_init_config () at crypto/init.c:256 #30 ossl_init_config_ossl_ () at crypto/init.c:254 #31 0x00007ffff76ad55b in __pthread_once_slow (once_control=0x7ffff74a950c <config>, init_routine=0x7ffff71723b0 <ossl_init_config_ossl_>) at pthread_once.c:116 #32 0x00007ffff76ad5c9 in ___pthread_once (once_control=<optimized out>, init_routine=<optimized out>) at pthread_once.c:143 #33 0x00007ffff717eded in CRYPTO_THREAD_run_once (once=<optimized out>, init=<optimized out>) at crypto/threads_pthread.c:152 #34 0x00007ffff7172eab in OPENSSL_init_crypto (opts=opts@entry=554572, settings=settings@entry=0x0) at crypto/init.c:599 #35 0x00007ffff7b5190d in OPENSSL_init_ssl (opts=554572, opts@entry=554560, settings=settings@entry=0x0) at ssl/ssl_init.c:115 #36 0x00007ffff7f8b294 in isc__tls_initialize () at ../../../lib/isc/tls.c:177 #37 0x00007ffff7f4579c in isc__initialize () at ../../../lib/isc/lib.c:50 #38 0x00007ffff7fcc427 in call_init (l=<optimized out>, argc=2, argv=0x7fffffffe098, env=0x7fffffffe0b0) at dl-init.c:74 #39 call_init (l=<optimized out>, argc=2, argv=0x7fffffffe098, env=0x7fffffffe0b0) at dl-init.c:26 #40 0x00007ffff7fcc51d in _dl_init (main_map=0x7ffff7ffe2e0, argc=2, argv=0x7fffffffe098, env=0x7fffffffe0b0) at dl-init.c:121 #41 0x00007ffff7fe4420 in _dl_start_user () from /lib64/ld-linux-x86-64.so.2 #42 0x0000000000000002 in ?? () #43 0x00007fffffffe386 in ?? () #44 0x00007fffffffe3a3 in ?? () #45 0x0000000000000000 in ?? ()
Could it be caused by jemalloc linked in? ISC code is somehow referenced, but not from normal main function. #36 0x00007ffff7f8b294 in isc__tls_initialize () at ../../../lib/isc/tls.c:177 #37 0x00007ffff7f4579c in isc__initialize () at ../../../lib/isc/lib.c:50 PKCS11_PROVIDER_MODULE=/dev/null Not sure what hacks are responsible for initialize called before main is even hit. I can just assume this might be linked to -flto flag. Breakpoint 1, isc__initialize () at ../../../lib/isc/lib.c:46 46 isc__initialize(void) { (gdb) bt #0 isc__initialize () at ../../../lib/isc/lib.c:46 #1 0x00007ffff7fcc427 in call_init (l=<optimized out>, argc=2, argv=0x7fffffffe0a8, env=0x7fffffffe0c0) at dl-init.c:74 #2 call_init (l=<optimized out>, argc=2, argv=0x7fffffffe0a8, env=0x7fffffffe0c0) at dl-init.c:26 #3 0x00007ffff7fcc51d in _dl_init (main_map=0x7ffff7ffe2e0, argc=2, argv=0x7fffffffe0a8, env=0x7fffffffe0c0) at dl-init.c:121 #4 0x00007ffff7fe4420 in _dl_start_user () from /lib64/ld-linux-x86-64.so.2 #5 0x0000000000000002 in ?? () #6 0x00007fffffffe39d in ?? () #7 0x00007fffffffe3ba in ?? () #8 0x0000000000000000 in ?? () (gdb) cont Continuing. Breakpoint 2, main (argc=2, argv=0x7fffffffe0a8) at ../../../bin/dnssec/dnssec-keyfromlabel.c:109 109 main(int argc, char **argv) { (gdb) bt #0 main (argc=2, argv=0x7fffffffe0a8) at ../../../bin/dnssec/dnssec-keyfromlabel.c:109 (gdb) cont
Is any of the ISC code using ELF constructors/destructors like __attribute__((constructor)) or .init/.fini ? Those would run before main() is invoked (by design). However it is not immediately clear to me why that would cause a fault unless, once again, softhsm being based on .cpp has issues because its code depends on cpp constructors to be run before it is initialized and such constructors have not run yet because they have been pre-empted by an ISC library constructor. If ISC code uses constructors I would definitely see if that feature can be disabled, or if we can ensure it does not try to initialize the whole tls stack. That stack should be explicitly initialized in the main function, or implicitly initialized later at first time of use.
lib/isc/include/isc/util.h: #if HAVE_FUNC_ATTRIBUTE_CONSTRUCTOR && HAVE_FUNC_ATTRIBUTE_DESTRUCTOR #define ISC_CONSTRUCTOR __attribute__((constructor)) #define ISC_DESTRUCTOR __attribute__((destructor)) #else #define ISC_CONSTRUCTOR #define ISC_DESTRUCTOR #endif lib/isc/lib.c: void isc__initialize(void) ISC_CONSTRUCTOR; void isc__shutdown(void) ISC_DESTRUCTOR; This attribute is probably responsible for early calls. Somehow I doubt it were meant for such complex initialization, including loading of multiple plugins in it. I just assume some libraries rely on similar initialization, but this is not ordered explicitly. It looks fragile and no wonder series of plugins fail to cope with it properly. isc__initialize(void); is never called explicitly, but relies on constructor of library loader. But that is there quite long. But so far I have failed to identify any change, which would look related. It does not seem supported to handle it different way later on demand. https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/8576 has made changes to tls parameters and crashes happen in tls initialization. But at least from looking at them, they should not be related.
What library needs this? Why not just call isc__initialize(void) from the daemon's main() process?
or have a global lazy variable with a lock and initialize on demand on first use, library constructors are very tricky and generally a bad idea for this kind of initialization, good libraries have a library_init() (and a library_fini()) function applications can call explicitly.
something like that used to be present still in 9.16. There were always strange issues with various openssl modules changing this and that initialization or openssl destruction. Very similar crash happens also with bind-dnssec-utils-9.18.27-1.fc39.x86_64 on normal bind package.
I have tried compilation of main branch, v9.19.21 and v9.19.19 tag. They all fail with configured environment by test the same way. Are you sure the change causing crashes were working in 9.19.21 version? Could have something else changed in addition? But I admit the same test on f41 rawhide passes with bind9-next-dnssec-utils-9.19.21-1.fc41.x86_64 and bind9-next-dnssec-utils-9.19.19-3.fc40.x86_64. On my fedora39, with the same p11-kit, I have prepared for bisecting bind code. But even version which works from package still fails on my machine, compiled from source. With basically identical failure. Could it be change in compiler or linker perhaps, triggering this? I admit debugger backtrace is not much helping, as the freed address is optimized out since it were passed to state->rpc = p11_rpc_transport_new (&state->virt, address, "client"); at ./p11-kit/client.c:141. It returns again to the same line again, but address is optimized out at that time, until it should be freed. Which makes it bit more probable there is something weird with memory functions in binary. It I am not sure. It seems memory allocators are clashing in unexpected ways.
Oh crap. At least my machine issues happen, because malloc is used from jemalloc library, but free is used from normal libc. And then is no wonder it does not recognize the pointer. Breakpoint 2.1, __GI___strdup (s=0x7fffffffcc60 "/home/pemensik/.config/softhsm2/softhsm2.conf") at strdup.c:40 40 { (gdb) info symbol __GI___strdup strdup in section .text of /lib64/libc.so.6 (gdb) bt #0 malloc (size=size@entry=46) at src/jemalloc.c:2745 #1 0x00007ffff6ec360f in __GI___strdup (s=0x7fffffffcc60 "/home/pemensik/.config/softhsm2/softhsm2.conf") at strdup.c:42 #2 0x00007ffff6c7b60d in get_user_path () at SimpleConfigLoader.cpp:244 #3 0x00007ffff6c7c0a5 in SimpleConfigLoader::getConfigPath (this=<optimized out>) at SimpleConfigLoader.cpp:304 #4 SimpleConfigLoader::loadConfiguration (this=0x41d610) at SimpleConfigLoader.cpp:78 #5 0x00007ffff6c7a5a3 in Configuration::reload (this=this@entry=0x41d560) at Configuration.cpp:161 #6 0x00007ffff6c7a600 in Configuration::reload (this=this@entry=0x41d560, inConfigLoader=<optimized out>) at Configuration.cpp:176 #7 0x00007ffff6c6352c in SoftHSM::C_Initialize (this=0x41d330, pInitArgs=pInitArgs@entry=0x7fffffffd610) at SoftHSM.cpp:530 #8 0x00007ffff6c2d9ac in C_Initialize (pInitArgs=0x7fffffffd610) at main.cpp:137 #9 0x00007ffff6d09fe5 in p11prov_Initialize (ctx=0x7ffff685b540, pInitArgs=0x7fffffffd610) at ../src/interface.gen.c:18 (gdb) info symbol malloc malloc in section .text of /lib64/libjemalloc.so.2 Breakpoint 3.3, __GI___libc_free (mem=0x7ffff684c7e0) at malloc.c:3352 3352 { (gdb) bt #0 __GI___libc_free (mem=0x7ffff684c7e0) at malloc.c:3352 #1 0x00007ffff6c7ba22 in SimpleConfigLoader::loadConfiguration (this=0x41d610) at SimpleConfigLoader.cpp:88 #2 0x00007ffff6c7a5a3 in Configuration::reload (this=this@entry=0x41d560) at Configuration.cpp:161 #3 0x00007ffff6c7a600 in Configuration::reload (this=this@entry=0x41d560, inConfigLoader=<optimized out>) at Configuration.cpp:176 #4 0x00007ffff6c6352c in SoftHSM::C_Initialize (this=0x41d330, pInitArgs=pInitArgs@entry=0x7fffffffd610) at SoftHSM.cpp:530 #5 0x00007ffff6c2d9ac in C_Initialize (pInitArgs=0x7fffffffd610) at main.cpp:137 #6 0x00007ffff6d09fe5 in p11prov_Initialize (ctx=0x7ffff685b540, pInitArgs=0x7fffffffd610) at ../src/interface.gen.c:18 When debugging the working version in package, it were calling libc malloc from strdup and on free as well. Breakpoint 3.2, __GI___libc_malloc (bytes=bytes@entry=19) at malloc.c:3294 3294 { (gdb) info symbol __GI___libc_malloc malloc in section .text of /lib64/libc.so.6 Breakpoint 2.4, __GI___libc_free (mem=0x555555565140) at malloc.c:3359 (gdb) info symbol __GI___libc_free free in section .text of /lib64/libc.so.6 So there is clear mess and I am not sure how it could be solved. Seems like problem somewhere in dynamic loader or linker.
versions installed on my machine: jemalloc-devel-5.3.0-4.fc39.x86_64 p11-kit-0.25.3-1.fc39.x86_64 openssl-devel-3.1.1-4.fc39.x86_64 gcc-13.3.1-1.fc39.x86_64 binutils-2.40-14.fc39.x86_64 working version bind9-next-9.19.21-1.fc41 were build with: DEBUG util.py:463: binutils x86_64 2.41-32.fc40 build 6.2 M DEBUG util.py:463: gcc x86_64 14.0.1-0.6.fc40 build 37 M DEBUG util.py:463: gnupg2 x86_64 2.4.4-1.fc40 build 2.7 M DEBUG util.py:463: jemalloc-devel x86_64 5.3.0-6.fc40 build 81 k DEBUG util.py:463: openssl-devel x86_64 1:3.2.1-2.fc40 build 2.7 M DEBUG util.py:463: p11-kit x86_64 0.25.3-4.fc40 build 476 k DEBUG util.py:463: p11-kit-trust x86_64 0.25.3-4.fc40 build 132 k DEBUG util.py:463: libstdc++-14.0.1-0.6.fc40.x86_64 Broken version bind9-next-9.19.22-1.fc41 build with: DEBUG util.py:463: binutils x86_64 2.42.50-6.fc41 build 27.2 MiB DEBUG util.py:463: binutils-gold x86_64 2.42.50-6.fc41 build 2.0 MiB DEBUG util.py:463: openssl-devel x86_64 1:3.2.1-6.fc41 build 4.3 MiB DEBUG util.py:463: libgcc x86_64 14.0.1-0.15.fc41 build 270.6 KiB DEBUG util.py:463: openssl-libs x86_64 1:3.2.1-6.fc41 build 7.8 MiB DEBUG util.py:463: p11-kit x86_64 0.25.3-4.fc40 build 2.2 MiB DEBUG util.py:463: p11-kit-trust x86_64 0.25.3-4.fc40 build 391.4 KiB DEBUG util.py:463: jemalloc-devel x86_64 5.3.0-6.fc40 build 221.1 KiB The only version difference seems binutils. https://koji.fedoraproject.org/koji/taskinfo?taskID=118717255 My new scratch build at .24 still at: binutils-2.42.50-14.fc41.x86_64 libgcc-14.1.1-4.fc41.x86_64 elfutils-0.191-7.fc41.x86_64 openssl-libs-3.2.2-1.fc41.x86_64 p11-kit-0.25.3-4.fc40.x86_64 p11-kit-trust-0.25.3-4.fc40.x86_64
at least objdump suggests it should be different: (debug-test) objdump -T /usr/lib64/pkcs11/libsofthsm2.so | grep -wE 'malloc|realloc|free' 0000000000000000 DF *UND* 0000000000000000 (GLIBC_2.2.5) free 0000000000000000 DF *UND* 0000000000000000 (GLIBC_2.2.5) malloc (debug-test) objdump -T /usr/lib64/libisc-9.19.21.so | grep -wE 'malloc|realloc|free' 0000000000000000 DF *UND* 0000000000000000 Base free 0000000000000000 DF *UND* 0000000000000000 Base malloc (debug-test) objdump -T /usr/lib64/libisc-9.19.24.so | grep -wE 'malloc|realloc|free' 0000000000000000 DF *UND* 0000000000000000 Base free 0000000000000000 DF *UND* 0000000000000000 Base malloc I guess it would depend on dlopen flags also. Breakpoint 1, ___dlopen (file=0x7ffff681d600 "/usr/lib64/ossl-modules/pkcs11.so", mode=2) at dlopen.c:77 Breakpoint 1, ___dlopen (file=0x7ffff681d7e0 "/usr/lib64/pkcs11/libsofthsm2.so", mode=10) at dlopen.c:77
Petr, can you just disable jemalloc use? Seem like the best way to go.
There seems to be documented in upstream issue: https://gitlab.isc.org/isc-projects/bind9/-/issues/4404 I tried to scan code, but this changed mostly only link flags of resulting issues with linker. But the overall result might be what we see in debugger. Especially because pkcs11 provider does not use RTLD_DEEPBIND flag, which should ensure symbols linked directly to pkcs11.so are preferred, including memory management functions. But both seem to use RTLD_NOW, which should ensure whatever memory allocation function are used, they should come from the same memory allocator. Having allocator from jemalloc and free from glibc is strange hybrid, which should not happen. There seems to be also issue with bug #2278016 on samba component, which reports issues with recent bind issues.
configure.ac contains: echo "Features disabled or unavailable on this platform:" if test "no" = "$with_jemalloc"; then echo " Memory allocator: system" echo echo " +------------------------------------------+" echo " | ==== WARNING ==== |" echo " | |" echo " | This is NOT a recommended configuration. |" echo " | Using the system memory allocator causes |" echo " | reduced performance and increased memory |" echo " | fragmentation. Installing the jemalloc |" echo " | memory allocator (version >= 4.0.0) is |" echo " | strongly recommended. |" echo " +------------------------------------------+" echo fi I have not done much actual testing of alternative allocator, but I trust they did not choose it by random changes. ISC is doing great work, so I would like to use recommended solution if possible. But I admit, we are hitting strange issues. Without clear evidence this change is actually broken by bind code indeed. But commits https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/8609/commits happened from good to broken version. It just changes flags on tests however. But we see again partial use of jemalloc, but in different cases. Fixes referenced in https://gitlab.isc.org/isc-projects/bind9/-/issues/4404 may not be complete.
Okay, I have tried to disable jemalloc in bind builds by: https://src.fedoraproject.org/rpms/bind9-next/pull-request/7 But there is some reproducible issue visible on all Fedora releases, where it just fails to create documentation using cfg_test tool. - Passed on test build on epel9: https://koji.fedoraproject.org/koji/taskinfo?taskID=119510454 - Failed on rawhide scratch: https://koji.fedoraproject.org/koji/taskinfo?taskID=119510454 - Failed on f40 scratch: https://koji.fedoraproject.org/koji/taskinfo?taskID=119513219 - Failed on f39 scratch: https://koji.fedoraproject.org/koji/taskinfo?taskID=119513743 I have reproduced it multiple times in mockbuild. Even have backtrace for it: *** buffer overflow detected ***: terminated Program received signal SIGABRT, Aborted. __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44 44 return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0; (gdb) bt #0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44 #1 0x00007ffff7d89613 in __pthread_kill_internal (threadid=<optimized out>, signo=6) at pthread_kill.c:78 #2 0x00007ffff7d30d0e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26 #3 0x00007ffff7d18942 in __GI_abort () at abort.c:79 #4 0x00007ffff7d197a7 in __libc_message_impl (fmt=fmt@entry=0x7ffff7ea4163 "*** %s ***: terminated\n") at ../sysdeps/posix/libc_fatal.c:132 #5 0x00007ffff7e178f9 in __GI___fortify_fail (msg=msg@entry=0x7ffff7ea414a "buffer overflow detected") at fortify_fail.c:24 #6 0x00007ffff7e17294 in __GI___chk_fail () at chk_fail.c:28 #7 0x00007ffff7f6e018 in memset (__dest=<optimized out>, __ch=<optimized out>, __len=<optimized out>, __dest=<optimized out>, __ch=<optimized out>, __len=<optimized out>) at /usr/include/bits/string_fortified.h:59 #8 mallocx (size=93824992546016, flags=<optimized out>) at ../../../lib/isc/jemalloc_shim.h:71 #9 mem_get (ctx=ctx@entry=0x55555559ea10, size=size@entry=48, flags=flags@entry=64) at ../../../lib/isc/mem.c:307 #10 0x00007ffff7f6fb33 in isc__mem_get (ctx=0x55555559ea10, size=48, flags=flags@entry=64) at ../../../lib/isc/mem.c:692 #11 0x00007ffff7f70377 in isc__mem_reget (ctx=<optimized out>, old_ptr=<optimized out>, old_size=<optimized out>, new_size=<optimized out>, flags=flags@entry=64) at ../../../lib/isc/mem.c:813 #12 0x00007ffff7f706f3 in sync_channellist (lcfg=0x5555555a0b50) at ../../../lib/isc/log.c:970 #13 0x00007ffff7f7201b in isc_log_create (mctx=<optimized out>, lctxp=lctxp@entry=0x7fffffffde00, lcfgp=lcfgp@entry=0x7fffffffde08) at ../../../lib/isc/log.c:273 #14 0x00005555555553f9 in main (argc=1, argv=0x7fffffffdfa8) at ../../../doc/misc/cfg_test.c:76 There seems to be happening something quite weird in mallocx stub replacement. When it enters the mallocx function after hitting isc__mem_reget, it has still good size. #0 checked_request2size (req=<optimized out>) at malloc.c:3306 #1 __GI___libc_malloc (bytes=bytes@entry=48) at malloc.c:3305 #2 0x00007ffff7f6dfc9 in mallocx (size=48, flags=64) at ../../../lib/isc/jemalloc_shim.h:67 #3 mem_get (ctx=ctx@entry=0x55555559ea10, size=size@entry=48, flags=flags@entry=64) at ../../../lib/isc/mem.c:307 #4 0x00007ffff7f6fb33 in isc__mem_get (ctx=0x55555559ea10, size=48, flags=flags@entry=64) at ../../../lib/isc/mem.c:692 #5 0x00007ffff7f70377 in isc__mem_reget (ctx=<optimized out>, old_ptr=<optimized out>, old_size=<optimized out>, new_size=<optimized out>, flags=flags@entry=64) at ../../../lib/isc/mem.c:813 #6 0x00007ffff7f706f3 in sync_channellist (lcfg=0x5555555a0b50) at ../../../lib/isc/log.c:970 #7 0x00007ffff7f7201b in isc_log_create (mctx=<optimized out>, lctxp=lctxp@entry=0x7fffffffde00, lcfgp=lcfgp@entry=0x7fffffffde08) at ../../../lib/isc/log.c:273 #8 0x00005555555553f9 in main (argc=1, argv=0x7fffffffdfa8) at ../../../doc/misc/cfg_test.c:76 But then something strange happens in malloc and size changes in backtrace. #0 __malloc_usable_size (m=m@entry=0x5555555a0ce0) at malloc.c:5249 #1 0x00007ffff7f6e008 in sallocx (ptr=0x5555555a0ce0, flags=<optimized out>) at ../../../lib/isc/jemalloc_shim.h:60 #2 mallocx (size=93824992546016, flags=<optimized out>) at ../../../lib/isc/jemalloc_shim.h:71 #3 mem_get (ctx=ctx@entry=0x55555559ea10, size=size@entry=48, flags=flags@entry=64) at ../../../lib/isc/mem.c:307 #4 0x00007ffff7f6fb33 in isc__mem_get (ctx=0x55555559ea10, size=48, flags=flags@entry=64) at ../../../lib/isc/mem.c:692 #5 0x00007ffff7f70377 in isc__mem_reget (ctx=<optimized out>, old_ptr=<optimized out>, old_size=<optimized out>, new_size=<optimized out>, flags=flags@entry=64) at ../../../lib/isc/mem.c:813 #6 0x00007ffff7f706f3 in sync_channellist (lcfg=0x5555555a0b50) at ../../../lib/isc/log.c:970 #7 0x00007ffff7f7201b in isc_log_create (mctx=<optimized out>, lctxp=lctxp@entry=0x7fffffffde00, lcfgp=lcfgp@entry=0x7fffffffde08) at ../../../lib/isc/log.c:273 #8 0x00005555555553f9 in main (argc=1, argv=0x7fffffffdfa8) at ../../../doc/misc/cfg_test.c:76 It seems like memory corruption or linker issues of some sort. But I haven't seen the same issues on my dev machines, when used just ``./configure --enable-developer --without-jemalloc``. Might be again something linked to ``-flto=auto`` cflag used for package builds?
Looks like this: make[3]: Leaving directory '/home/pemensik/fedora/bind9-next/bind-9.19.24/build/doc/misc' make[3]: Entering directory '/home/pemensik/fedora/bind9-next/bind-9.19.24/build/doc/misc' /bin/sh ../../libtool --tag=CC --mode=link gcc -Wall -Wextra -Wwrite-strings -Wpointer-arith -Wno-missing-field-initializers -Wformat -Wshadow -Werror=implicit-function-declaration -Werror=missing-prototypes -Werror=format-security -Werror=parentheses -Werror=implicit -Werror=strict-prototypes -Werror=vla -fno-strict-aliasing -fno-delete-null-pointer-checks -fdiagnostics-show-option -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wno-complain-wrong-lang -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -pthread -Wl,--export-dynamic -Wl,-z,relro -Wl,--as-needed -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -o cfg_test cfg_test-cfg_test.o ../../lib/isc/libisc.la ../../lib/dns/libdns.la ../../lib/isccfg/libisccfg.la -lpthread -lm -lurcu-cds -lurcu-common -lurcu libtool: link: gcc -Wall -Wextra -Wwrite-strings -Wpointer-arith -Wno-missing-field-initializers -Wformat -Wshadow -Werror=implicit-function-declaration -Werror=missing-prototypes -Werror=format-security -Werror=parentheses -Werror=implicit -Werror=strict-prototypes -Werror=vla -fno-strict-aliasing -fno-delete-null-pointer-checks -fdiagnostics-show-option -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wno-complain-wrong-lang -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -Wl,--export-dynamic -Wl,-z -Wl,relro -Wl,--as-needed -Wl,-z -Wl,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -o .libs/cfg_test cfg_test-cfg_test.o ../../lib/isc/.libs/libisc.so ../../lib/dns/.libs/libdns.so ../../lib/isccfg/.libs/libisccfg.so /home/pemensik/fedora/bind9-next/bind-9.19.24/build/lib/ns/.libs/libns.so /home/pemensik/fedora/bind9-next/bind-9.19.24/build/lib/dns/.libs/libdns.so -lgssapi_krb5 -lkrb5 -lk5crypto -lcom_err -lmaxminddb -lprotobuf-c -lfstrm -llmdb /home/pemensik/fedora/bind9-next/bind-9.19.24/build/lib/isc/.libs/libisc.so -lz -ljson-c -lnghttp2 -lxml2 -luv -ldl -lrt -lcrypto -lssl -lpthread -lm -lurcu-cds -lurcu-common -lurcu -pthread make[3]: Leaving directory '/home/pemensik/fedora/bind9-next/bind-9.19.24/build/doc/misc' make[3]: Entering directory '/home/pemensik/fedora/bind9-next/bind-9.19.24/build/doc/misc' ./cfg_test --zonegrammar forward > forward.zoneopt *** buffer overflow detected ***: terminated make[3]: *** [Makefile:903: forward.zoneopt] Error 134 make[3]: Leaving directory '/home/pemensik/fedora/bind9-next/bind-9.19.24/build/doc/misc' make[3]: *** Waiting for unfinished jobs.... make[3]: Entering directory '/home/pemensik/fedora/bind9-next/bind-9.19.24/build/doc/misc' ./cfg_test --zonegrammar primary > primary.zoneopt *** buffer overflow detected ***: terminated make[3]: *** [Makefile:894: primary.zoneopt] Error 134 make[3]: Leaving directory '/home/pemensik/fedora/bind9-next/bind-9.19.24/build/doc/misc' make[3]: Entering directory '/home/pemensik/fedora/bind9-next/bind-9.19.24/build/doc/misc' ./cfg_test --zonegrammar secondary > secondary.zoneopt *** buffer overflow detected ***: terminated make[3]: *** [Makefile:897: secondary.zoneopt] Error 134 make[3]: Leaving directory '/home/pemensik/fedora/bind9-next/bind-9.19.24/build/doc/misc' make[3]: Entering directory '/home/pemensik/fedora/bind9-next/bind-9.19.24/build/doc/misc' ./cfg_test --zonegrammar mirror > mirror.zoneopt *** buffer overflow detected ***: terminated make[3]: *** [Makefile:900: mirror.zoneopt] Error 134 make[3]: Leaving directory '/home/pemensik/fedora/bind9-next/bind-9.19.24/build/doc/misc' make[3]: Entering directory '/home/pemensik/fedora/bind9-next/bind-9.19.24/build/doc/misc' ./cfg_test --rndc --grammar | /usr/bin/perl ../../../doc/misc/sort-options.pl > rndc.grammar *** buffer overflow detected ***: terminated make[3]: Leaving directory '/home/pemensik/fedora/bind9-next/bind-9.19.24/build/doc/misc' make[3]: Entering directory '/home/pemensik/fedora/bind9-next/bind-9.19.24/build/doc/misc' ./cfg_test --zonegrammar hint > hint.zoneopt *** buffer overflow detected ***: terminated make[3]: *** [Makefile:906: hint.zoneopt] Error 134 make[3]: Leaving directory '/home/pemensik/fedora/bind9-next/bind-9.19.24/build/doc/misc' make[3]: Entering directory '/home/pemensik/fedora/bind9-next/bind-9.19.24/build/doc/misc' ./cfg_test --named --grammar | /usr/bin/perl ../../../doc/misc/sort-options.pl > options *** buffer overflow detected ***: terminated make[3]: Leaving directory '/home/pemensik/fedora/bind9-next/bind-9.19.24/build/doc/misc' make[3]: Entering directory '/home/pemensik/fedora/bind9-next/bind-9.19.24/build/doc/misc' ./cfg_test --zonegrammar stub > stub.zoneopt *** buffer overflow detected ***: terminated make[3]: *** [Makefile:909: stub.zoneopt] Error 134 make[3]: Leaving directory '/home/pemensik/fedora/bind9-next/bind-9.19.24/build/doc/misc' make[2]: *** [Makefile:461: all-recursive] Error 1 make[1]: *** [Makefile:582: all-recursive] Error 1 make: *** [Makefile:509: all] Error 2
For the record, this issue is very likely caused by softhsm token. With a different token [*] there is no crash: # dnssec-keyfromlabel -a RSASHA256 -l 'pkcs11:id=%00%00' localhost Enter PIN for PKCS#11 Token (Slot 0 - Kryoptic Slot): Klocalhost.+008+38544 # dnssec-keyfromlabel -a RSASHA256 -l 'pkcs11:id=%00%01' -f KSK localhost Enter PIN for PKCS#11 Token (Slot 0 - Kryoptic Slot): Klocalhost.+008+01334 # dnssec-signzone -o localhost localhost Enter PIN for PKCS#11 Token (Slot 0 - Kryoptic Slot): Verifying the zone using the following algorithms: - RSASHA256 Zone fully signed: Algorithm: RSASHA256: KSKs: 1 active, 0 stand-by, 0 revoked ZSKs: 1 active, 0 stand-by, 0 revoked localhost.signed # rpm -q bind9-next bind9-next-9.19.24-4.fc41.x86_64 Petr, please feel free to drop it if you feel the issue is no longer needed. [*] https://github.com/latchset/kryoptic
Could we get LD_DEBUG=all from the failing process? The local search scope for the plugin should have jemalloc for libc, but that doesn't seem to happen for some reason.
@Florian FWIW the crashes seem to go away if we use kryoptic as the token instead of softhsm2 ... so it may be yet another C++ cause issue,
This bug appears to have been reported against 'rawhide' during the Fedora Linux 42 development cycle. Changing version to 42.