Created attachment 1915194 [details] named.run Description of problem: This is reopening of BZ https://bugzilla.redhat.com/show_bug.cgi?id=1935152 , packages updated Sep 28 03:07:04 ipa-1 named-pkcs11[7587]: ../../../lib/dns-pkcs11/rdataset.c:263: REQUIRE(rdataset->methods != ((void *)0)) failed, back trace Sep 28 03:07:04 ipa-1 named-pkcs11[7587]: #0 0x5630812cf120 in ?? Sep 28 03:07:04 ipa-1 named-pkcs11[7587]: #1 0x7fbf1a50246a in ?? Sep 28 03:07:04 ipa-1 named-pkcs11[7587]: #2 0x7fbf1a8897c8 in ?? Sep 28 03:07:04 ipa-1 named-pkcs11[7587]: #3 0x7fbf1a89b8e0 in ?? Sep 28 03:07:04 ipa-1 named-pkcs11[7587]: #4 0x7fbf1a89df79 in ?? Sep 28 03:07:04 ipa-1 named-pkcs11[7587]: #5 0x7fbf1a8a113c in ?? Sep 28 03:07:04 ipa-1 named-pkcs11[7587]: #6 0x7fbf1a526f6b in ?? Sep 28 03:07:04 ipa-1 named-pkcs11[7587]: #7 0x7fbf18601ea5 in ?? Sep 28 03:07:04 ipa-1 named-pkcs11[7587]: #8 0x7fbf17674b0d in ?? Sep 28 03:07:04 ipa-1 named-pkcs11[7587]: exiting (due to assertion failure) Sep 28 03:07:04 ipa-1 abrt-hook-ccpp[7808]: Process 7587 (named-pkcs11) of user 25 killed by SIGABRT - dumping core Sep 28 03:07:06 ipa-1 systemd[1]: named-pkcs11.service: main process exited, code=killed, status=6/ABRT Sep 28 03:07:06 ipa-1 systemd[1]: Unit named-pkcs11.service entered failed state. Sep 28 03:07:06 ipa-1 systemd[1]: named-pkcs11.service failed. Version-Release number of selected component (if applicable): Red Hat Enterprise Linux Server release 7.9 (Maipo) bind-pkcs11-9.11.4-26.P2.el7_9.9.x86_64 Wed Sep 21 18:36:44 2022 How reproducible: - Expected results: named not failing Additional info: Core file can be found in attached case
I don't see any real concurrency in attached core dump. It terminates with REQUIRE(rdataset->methods != ((void *)0)) failed, on line lib/dns-pkcs11/rdataset.c:263, which suggests rdataset in iteration were disassociated. But I don't see any real concurrency. Thread 1 is doing fctx_try and Thread 5 processing ldap change. Other threads seems to be idling.
It seems the crash logged at beginning of named.run happens during managed-keys priming, the failing query was ./DNSKEY. 28-Sep-2022 03:07:02.458 managed-keys-zone: journal file is out of date: removing journal file 28-Sep-2022 03:07:02.458 managed-keys-zone: loaded serial 54749 28-Sep-2022 03:07:02.458 zone 0.in-addr.arpa/IN: loaded serial 0 28-Sep-2022 03:07:02.459 zone localhost/IN: loaded serial 0 28-Sep-2022 03:07:02.459 zone 1.0.0.127.in-addr.arpa/IN: loaded serial 0 28-Sep-2022 03:07:02.459 zone 1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa/IN: loaded serial 0 28-Sep-2022 03:07:02.459 zone localhost.localdomain/IN: loaded serial 0 28-Sep-2022 03:07:02.459 all zones loaded 28-Sep-2022 03:07:02.460 running 28-Sep-2022 03:07:02.480 LDAP configuration for instance 'ipa' synchronized 28-Sep-2022 03:07:02.498 LDAP data for instance 'ipa' are being synchronized, please ignore message 'all zones loaded' 28-Sep-2022 03:07:03.261 network unreachable resolving './NS/IN': 2001:7fd::1#53 28-Sep-2022 03:07:03.261 network unreachable resolving './NS/IN': 2001:500:2f::f#53 28-Sep-2022 03:07:03.261 network unreachable resolving './NS/IN': 2001:500:1::53#53 28-Sep-2022 03:07:03.261 network unreachable resolving './NS/IN': 2001:500:200::b#53 28-Sep-2022 03:07:03.261 network unreachable resolving './NS/IN': 2001:503:c27::2:30#53 28-Sep-2022 03:07:03.261 network unreachable resolving './NS/IN': 2001:dc3::35#53 28-Sep-2022 03:07:03.261 network unreachable resolving './NS/IN': 2001:503:ba3e::2:30#53 28-Sep-2022 03:07:03.261 network unreachable resolving './NS/IN': 2001:7fe::53#53 28-Sep-2022 03:07:03.261 network unreachable resolving './NS/IN': 2001:500:9f::42#53 28-Sep-2022 03:07:03.261 network unreachable resolving './NS/IN': 2001:500:2d::d#53 28-Sep-2022 03:07:03.261 network unreachable resolving './NS/IN': 2001:500:a8::e#53 28-Sep-2022 03:07:03.261 network unreachable resolving './NS/IN': 2001:500:2::c#53 28-Sep-2022 03:07:03.261 network unreachable resolving './NS/IN': 2001:500:12::d0d#53 28-Sep-2022 03:07:04.862 ../../../lib/dns-pkcs11/rdataset.c:263: REQUIRE(rdataset->methods != ((void *)0)) failed, back trace
This issue might be related to bind-dyndb-ldap, but seems primary in bind component, named-pkcs11 service. I would recommend the customer using OPTIONS="-4" in /etc/sysconfig/named. When IPv6 is not available on the host, it would save named trying from accessing it. It would make all outgoing queries IPv4 only, because it seems all IPv6 queries are failing anyway on this host. Probably would not avoid hitting this issue, but would save some CPU cycles for important stuff. And log less warnings.
I would suggest workaround recommended in https://bugzilla.redhat.com/show_bug.cgi?id=1935152#c36 It would not prevent those crashes, but would make them less noticeable.
There has been bug #1891410 filled for this issue, which has been closed as a duplicate of bug #1891410, which is still open.
Bug #1854473 is the correct one still open for bind-dyndb-ldap.