Bug 2048235
Summary: | freeipa: bind coredumps with systemctl restart named | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Florence Blanc-Renaud <frenaud> |
Component: | bind | Assignee: | Petr Menšík <pemensik> |
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | 36 | CC: | abokovoy, aegorenk, anon.amish, dns-sig, mruprich, msehnout, pemensik, vonsch, zdohnal |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | bind-9.16.27-1.fc35 bind-9.16.27-1.fc36 bind-9.16.27-1.fc34 | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2022-03-22 03:42:24 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 2057493 |
Description
Florence Blanc-Renaud
2022-01-30 12:35:21 UTC
Were able to reproduce it on rawhide. It even crashed gdb partially. #0 0x00007feed4d94796 in isc_hp_protect (hp=0x560ad4ccc1f0, ihp=<optimized out>, atom=0x560ad4cd2680) at ../../../lib/isc/hp.c:158 __atomic_store_ptr = <optimized out> __atomic_store_tmp = 94604519898192 n = <optimized out> ret = 94604519898192 #1 0x00007feed4dc022a in isc_queue_enqueue (item=<optimized out>, queue=<optimized out>) at ../../../lib/isc/queue.c:142 lt = 0x0 idx = <optimized out> n = 0 #2 isc_queue_enqueue (queue=0x560ad4cd2600, item=140663467457264) at ../../../lib/isc/queue.c:134 lt = <optimized out> idx = <optimized out> n = <optimized out> lnext = <optimized out> __atomic_load_ptr = <optimized out> __atomic_load_tmp = <optimized out> __atomic_load_ptr = <optimized out> __atomic_load_tmp = <optimized out> newnode = <optimized out> __atomic_compare_exchange_ptr = <optimized out> __atomic_compare_exchange_tmp = <optimized out> #3 0x00007feed4dab8d5 in isc__nm_enqueue_ievent (worker=0x560ad4ccc710, event=<optimized out>) at netmgr/../../../../lib/isc/netmgr/netmgr.c:1116 No locals. #4 0x00007feed4dc8972 in task_ready (task=0x560ad4cd5490) at ../../../lib/isc/task.c:344 manager = 0x560ad4cd3890 manager = <optimized out> __v = <optimized out> #5 isc_task_sendto (task=task@entry=0x560ad4cd5490, eventp=eventp@entry=0x7feed11b5118, c=c@entry=-1) at ../../../lib/isc/task.c:493 was_idle = <optimized out> #6 0x00007feed4dc8a6e in isc_task_send (task=task@entry=0x560ad4cd5490, eventp=eventp@entry=0x7feed11b5118) at ../../../lib/isc/task.c:449 No locals. #7 0x00007feed139b988 in sync_event_send (synchronous=true, ev=0x7feed11b5118, task=0x560ad4cd5490, sctx=0x7feecc416b10) at /usr/src/debug/bind-dyndb-ldap-11.9-13.fc36.x86_64/src/syncrepl.c:614 result = <optimized out> abs_timeout = {seconds = 3426842816, nanoseconds = 32750} seqid = 3 cleanup = <optimized out> result = <optimized out> abs_timeout = <optimized out> seqid = <optimized out> cleanup = <optimized out> #8 syncrepl_update (inst=inst@entry=0x7feecc416a00, entryp=entryp@entry=0x7feed11b51c8, chgtype=chgtype@entry=1) at /usr/src/debug/bind-dyndb-ldap-11.9-13.fc36.x86_64/src/ldap_helper.c:4231 --Type <RET> for more, q to quit, c to continue without paging-- result = <optimized out> pevent = 0x0 entry = 0x7feec4009160 zone_name = <optimized out> zone_ptr = 0x0 action = <optimized out> task = 0x560ad4cd5490 synchronous = true cleanup = <optimized out> __func__ = <optimized out> #9 0x00007feed139c695 in ldap_sync_search_entry (ls=<optimized out>, msg=<optimized out>, entryUUID=0x7feed11b5330, phase=LDAP_SYNC_CAPI_ADD) at /usr/src/debug/bind-dyndb-ldap-11.9-13.fc36.x86_64/src/ldap_helper.c:4409 inst = <optimized out> old_entry = 0x0 new_entry = 0x7feec4009160 result = 0 node = 0x0 mldap_open = false modrdn = <optimized out> cleanup = <optimized out> __func__ = "ldap_sync_search_entry" #10 0x00007feed135e361 in ldap_sync_search_entry (ls=ls@entry=0x7feec40019a0, res=res@entry=0x7feec4009090) at /usr/src/debug/openldap-2.4.59-6.fc36.x86_64/openldap-2.4.59/libraries/libldap_r/ldap_sync.c:225 ctrls = 0x7feec4004ec0 rc = 80 i = <optimized out> ber = 0x7feec4009100 entryUUID = {bv_len = 16, bv_val = 0x7feec400ba67 "\363\222݊\202\221\021\354\224|\274\233\264i\003\231"} cookie = {bv_len = 0, bv_val = 0x0} state = 1 len = 0 phase = <optimized out> __PRETTY_FUNCTION__ = "ldap_sync_search_entry" #11 0x00007feed135f0bb in ldap_sync_init (ls=ls@entry=0x7feec40019a0, mode=mode@entry=3) at /usr/src/debug/openldap-2.4.59-6.fc36.x86_64/openldap-2.4.59/libraries/libldap_r/ldap_sync.c:790 refreshDone = 0 msg = 0x7feec4009090 ctrl = {ldctl_oid = 0x7feed1365946 "1.3.6.1.4.1.4203.1.9.1.1", ldctl_value = {bv_len = 8, bv_val = 0x7feec4009a80 "0\006\n\001\003\001\001"}, ldctl_iscritical = 1 '\001'} ctrls = {0x7feed11b53b0, 0x0} ber = 0x7feec4002210 rc = <optimized out> tv = {tv_sec = 0, tv_usec = 100000} tvp = <optimized out> res = 0x7feec4009090 --Type <RET> for more, q to quit, c to continue without paging-- __PRETTY_FUNCTION__ = "ldap_sync_init" #12 0x00007feed139d45d in ldap_sync_doit (inst=inst@entry=0x7feecc416a00, conn=conn@entry=0x7feecc418430, filter_objcs=filter_objcs@entry=0x7feed13aa778 "(|(objectClass=idnsZone) (objectClass=idnsForwardZone) (objectClass=idnsRecord))", mode=mode@entry=3) at /usr/src/debug/bind-dyndb-ldap-11.9-13.fc36.x86_64/src/ldap_helper.c:4693 result = 0 ret = <optimized out> s_len = <optimized out> ldap_sync = 0x7feec40019a0 err_hint = 0x7feed13a5fb8 "" filter = "(| (objectClass=idnsConfigObject) (&(objectClass=idnsServerConfigObject) (idnsServerId=ci-vm-10-0-137-13.hosted.upshift.rdu2.redhat.com))(|(objectClass=idnsZone) (objectClass=idnsForwardZone) "... config_template = "(| (objectClass=idnsConfigObject) %s%s%s%s)" server_id = 0x7feecc419c28 "ci-vm-10-0-137-13.hosted.upshift.rdu2.redhat.com" __func__ = <optimized out> #13 0x00007feed139d8e6 in ldap_syncrepl_watcher (arg=0x7feecc416a00) at /usr/src/debug/bind-dyndb-ldap-11.9-13.fc36.x86_64/src/ldap_helper.c:4785 inst = 0x7feecc416a00 conn = 0x7feecc418430 ret = <optimized out> result = <optimized out> sigset = {__val = {512, 0 <repeats 15 times>}} reconnect_interval = 0 state = sync_datainit retry = <optimized out> __func__ = <optimized out> #14 0x00007feed4dc7ef9 in isc__trampoline_run (arg=0x7feecc469430) at ../../../lib/isc/trampoline.c:198 trampoline = 0x7feecc469430 result = <optimized out> #15 0x00007feed45ad862 in start_thread (arg=<optimized out>) at pthread_create.c:442 ret = <optimized out> pd = <optimized out> out = <optimized out> unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140663723370656, 1579087791515376470, 140663687177792, 0, 140663741666704, 0, -1569599249955546282, -1569606384089687210}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = { prev = 0x0, cleanup = 0x0, canceltype = 0}}} not_first_call = <optimized out> #16 0x00007feed4632670 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 I cannot say it is obvious, what is wrong here. (gdb) disassemble Dump of assembler code for function isc_hp_protect: 0x00007feed4d94750 <+0>: endbr64 0x00007feed4d94754 <+4>: push %r13 0x00007feed4d94756 <+6>: movslq %esi,%rsi 0x00007feed4d94759 <+9>: mov %rdi,%r13 0x00007feed4d9475c <+12>: xor %eax,%eax 0x00007feed4d9475e <+14>: push %r12 0x00007feed4d94760 <+16>: mov %rdx,%r12 0x00007feed4d94763 <+19>: push %rbp 0x00007feed4d94764 <+20>: lea 0x0(,%rsi,8),%rbp 0x00007feed4d9476c <+28>: push %rbx 0x00007feed4d9476d <+29>: sub $0x8,%rsp 0x00007feed4d94771 <+33>: jmp 0x7feed4d9479d <isc_hp_protect+77> 0x00007feed4d94773 <+35>: nopl 0x0(%rax,%rax,1) 0x00007feed4d94778 <+40>: data16 lea 0x6b7b8(%rip),%rdi # 0x7feed4dfff38 0x00007feed4d94780 <+48>: data16 data16 rex.W call 0x7feed4d8f160 <__tls_get_addr@plt> 0x00007feed4d94788 <+56>: mov %rbx,%rcx 0x00007feed4d9478b <+59>: movslq (%rax),%rdx 0x00007feed4d9478e <+62>: mov 0x10(%r13),%rax 0x00007feed4d94792 <+66>: mov (%rax,%rdx,8),%rax => 0x00007feed4d94796 <+70>: xchg %rcx,(%rax,%rbp,1) 0x00007feed4d9479a <+74>: mov %rbx,%rax 0x00007feed4d9479d <+77>: mov (%r12),%rbx 0x00007feed4d947a1 <+81>: cmp %rax,%rbx 0x00007feed4d947a4 <+84>: jne 0x7feed4d94778 <isc_hp_protect+40> 0x00007feed4d947a6 <+86>: add $0x8,%rsp 0x00007feed4d947aa <+90>: pop %rbx 0x00007feed4d947ab <+91>: pop %rbp 0x00007feed4d947ac <+92>: pop %r12 0x00007feed4d947ae <+94>: pop %r13 0x00007feed4d947b0 <+96>: ret End of assembler dump. (gdb) info registers rax 0xbe 190 rbx 0x560ad4cd7c50 94604519898192 rcx 0x560ad4cd7c50 94604519898192 rdx 0x4 4 rsi 0x0 0 rdi 0x7feed4dfff38 140663750393656 rbp 0x0 0x0 rsp 0x7feed11b5020 0x7feed11b5020 r8 0x0 0 r9 0x0 0 r10 0x7feec402be40 140663467458112 r11 0x7feec4000030 140663467278384 r12 0x560ad4cd2680 94604519876224 r13 0x560ad4ccc1f0 94604519850480 r14 0x7feec4009160 140663467315552 r15 0x7feecc416b10 140663605783312 rip 0x7feed4d94796 0x7feed4d94796 <isc_hp_protect+70> eflags 0x10202 [ IF RF ] cs 0x33 51 ss 0x2b 43 ds 0x0 0 es 0x0 0 fs 0x0 0 gs 0x0 0 k0 0x0 0 k1 0xf 15 k2 0xff 255 k3 0x20 32 k4 0x0 0 k5 0x0 0 k6 0x0 0 k7 0x0 0 It seems some variable is uninitialized 0xbe seems uninitialized constant, rbp is 0 Could be some hp structure uninitialized properly in ldap worker thread? Might be caused by recent modification of taskmgr [1]. It is well handled by named, but might not be handled accordingly by bind-dyndb-ldap plugin. It is possible change is required. Mentioned also in later commit 5be356760 [2]. 1. https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/4918 2. https://gitlab.isc.org/isc-projects/bind9/commit/5be356760dd6855944234980f2fc0f13130267fb I tried copr build [1] on f35 release. It crashes the same way. That means code changes are responsible, not gcc and libraries changes as I thought originally. Comment #3 lists primary candidates. It seems bind-dyndb-ldap needs to be fixed to work with modified code. This happens when another thread is loading a zone. Maybe exclusive task semantics has changed? It seems some locks are not at proper places. (gdb) info threads Id Target Id Frame 1 Thread 0x7ff863bfe640 (LWP 10357) 0x00007ff86b33a886 in isc_hp_protect (hp=0x55f85dec4c60, ihp=<optimized out>, atom=0x55f85dec3180) at ../../../lib/isc/hp.c:158 2 Thread 0x7ff86a803dc0 (LWP 10353) 0x00007ff86ac3743a in __GI___sigtimedwait (set=set@entry=0x7ffcf7859e00, info=info@entry=0x7ffcf7859d30, timeout=timeout@entry=0x0) at ../sysdeps/unix/sysv/linux/sigtimedwait.c:61 3 Thread 0x7ff869327640 (LWP 10355) __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7ff869326b00, op=393, expected=0, futex_word=0x55f85debdf98) at futex-internal.c:57 4 Thread 0x7ff868b26640 (LWP 10356) 0x00007ff86ad05c1e in epoll_wait (epfd=16, events=events@entry=0x55f85df38750, maxevents=maxevents@entry=2048, timeout=timeout@entry=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30 * 5 Thread 0x7ff869b28640 (LWP 10354) isc_lex_isfile (lex=lex@entry=0x7ff85c015a30) at ../../../lib/isc/lex.c:1104 (gdb) bt full #0 isc_lex_isfile (lex=lex@entry=0x7ff85c015a30) at ../../../lib/isc/lex.c:1104 source = <optimized out> #1 0x00007ff86b4f79bf in fromtext_warneof (callbacks=0x7ff864072ac8, lexer=0x7ff85c015a30) at ../../../lib/dns/rdata.c:2080 No locals. #2 dns_rdata_fromtext (rdata=0x0, rdclass=<optimized out>, type=<optimized out>, lexer=0x7ff85c015a30, origin=<optimized out>, options=<optimized out>, mctx=0x55f85df46080, target=0x7ff869b22480, callbacks=0x7ff864072ac8) at ../../../lib/dns/rdata.c:1051 result = 0 region = {base = 0x302d7265444e536e <error: Cannot access memory at address 0x302d7265444e536e>, length = 1677734048} st = <optimized out> token = {type = isc_tokentype_eof, value = {as_char = 48 '0', as_ulong = 140704806808112, as_region = { base = 0x7ff864072a30 "", length = 1}, as_textregion = {base = 0x7ff864072a30 "", length = 1}, as_pointer = 0x7ff864072a30}} lexoptions = 291 name = 0x7ff864220978 "buffer-0x7ff869b224c0" line = <optimized out> callback = 0x7ff86b44a610 <isclog_error_callback> tresult = 0 length = <optimized out> unknown = <optimized out> #3 0x00007ff86b53066f in dns_sdb_putrr (lookup=0x7ff864072a80, type=<optimized out>, ttl=0, data=0x7ff864220808 "@") at ../../../lib/dns/sdb.c:374 datalen = <optimized out> typeval = 2 r = {base = 0x55f85d025c65 "ns", length = 2} lex = 0x7ff85c015a30 result = <optimized out> p = 0x7ff8640030a0 "\003\061\061\063\003\061\060\060\aIN-ADDR\004ARPA" size = 1024 mctx = 0x55f85df46080 imp = <optimized out> origin = 0x7ff8642208a8 b = {magic = 1114990113, base = 0x7ff864220808, length = 1, used = 1, current = 1, active = 0, link = { prev = 0xffffffffffffffff, next = 0xffffffffffffffff}, mctx = 0x0, autore = false} rb = {magic = 1114990113, base = 0x7ff8640030a0, length = 1024, used = 22, current = 0, active = 0, link = { prev = 0xffffffffffffffff, next = 0xffffffffffffffff}, mctx = 0x0, autore = false} failure = <optimized out> #4 0x000055f85cfda93d in builtin_authority (zone=<optimized out>, dbdata=<optimized out>, lookup=0x7ff864072a80) at ../../../bin/named/builtin.c:461 result = <optimized out> contact = <optimized out> server = 0x7ff864220808 "@" b = <optimized out> #5 0x00007ff86b52c239 in findnodeext (db=0x7ff864220890, name=<optimized out>, create=<optimized out>, methods=<optimized out>, clientinfo=<optimized out>, nodep=0x7ff869b22ac8) at ../../../lib/dns/sdb.c:829 sdb = 0x7ff864220890 --Type <RET> for more, q to quit, c to continue without paging-- node = 0x7ff864072a80 result = 0 b = {magic = 1114990113, base = 0x7ff869b22630, length = 1024, used = 2, current = 0, active = 0, link = { prev = 0xffffffffffffffff, next = 0xffffffffffffffff}, mctx = 0x0, autore = false} namestr = "@\000\000\000\000\000\000\000a\305Rk\370\177", '\000' <repeats 11 times>, "\363nI}g\301\300\260&\262i\370\177\000\000\210\376\377\377\377\377\377\377\000\000\000\000\000\000\000\000\000\363nI}g\301\300p\b\"d\370\177\000\000\210\376\377\377\377\377\377\377\000\000\000\000\000\000\000\000\250\000\000\000\000\000\000\000\200,\ad\370\177\000\000\215\353^k\370\177\000\000\243\002\000\000\000\000\000\000\325\033\311j\370\177\000\000\200`\364]\370U\000\000}\337\064k\370\177\000\000\000\000\000\000\000\000\000\000\230`\364]\370U\000\000\200`\364]\370U\000\000}\337\064k\370\177\000\000\352L8k\370\177\000\000\230'\262i\370\177\000\000\060'\262i\370\177\000\000"... isorigin = true imp = 0x55f85dec0e40 relname = {magic = 1145983854, ndata = 0x7ff864479ad0 "\003\061\061\063\003\061\060\060\aIN-ADDR\004ARPA", length = 0, labels = 0, attributes = 0, offsets = 0x0, buffer = 0x0, link = {prev = 0xffffffffffffffff, next = 0xffffffffffffffff}, list = {head = 0x0, tail = 0x0}} labels = <optimized out> #6 0x00007ff86b5622a3 in zone_get_from_db (zone=zone@entry=0x7ff864478cc0, db=db@entry=0x7ff864220890, nscount=nscount@entry=0x7ff869b22c30, soacount=soacount@entry=0x7ff869b22c34, soattl=soattl@entry=0x7ff869b22c14, serial=serial@entry=0x7ff869b22c28, refresh=0x7ff869b22c24, retry=0x7ff869b22c20, expire=0x7ff869b22c1c, minimum=0x7ff869b22c18, errors=0x7ff869b22c2c) at ../../../lib/dns/zone.c:5632 result = <optimized out> answer = 0 version = 0x7ff86b63a774 <dummy> node = 0x0 #7 0x00007ff86b562891 in zone_postload (zone=0x7ff864478cc0, db=0x7ff864220890, loadtime=..., result=<optimized out>) at ../../../lib/dns/zone.c:4891 soacount = 0 nscount = 0 errors = 0 serial = 0 oldserial = 1773285968 refresh = 0 retry = 0 expire = 0 minimum = 0 soattl = 0 now = {seconds = 1643652943, nanoseconds = 618605591} needdump = false fixjournal = false hasinclude = false nomaster = <optimized out> had_db = false inc = <optimized out> is_dynamic = false done = <optimized out> --Type <RET> for more, q to quit, c to continue without paging-- #8 0x00007ff86b55bdd9 in zone_load (zone=0x7ff864478cc0, flags=0, locked=locked@entry=true) at ../../../lib/dns/zone.c:2333 result = <optimized out> now = {seconds = 1643652943, nanoseconds = 618605591} loadtime = {seconds = 1643652943, nanoseconds = 618605591} db = 0x7ff864220890 rbt = <optimized out> hasraw = <optimized out> is_dynamic = <optimized out> #9 0x00007ff86b55c2f5 in zone_asyncload (task=0x55f85df45ea0, event=<optimized out>) at ../../../lib/dns/zone.c:2366 asl = 0x7ff8640605b0 zone = 0x7ff864478cc0 result = <optimized out> #10 0x00007ff86b37071d in task_run (task=0x55f85df45ea0) at ../../../lib/isc/task.c:850 dispatch_count = 0 finished = false event = 0x7ff8640605e0 result = 0 dispatch_count = <optimized out> finished = <optimized out> event = <optimized out> result = <optimized out> done = <optimized out> __atomic_load_ptr = <optimized out> __atomic_load_tmp = <optimized out> __atomic_load_ptr = <optimized out> __atomic_load_tmp = <optimized out> __atomic_load_ptr = <optimized out> __atomic_load_tmp = <optimized out> __atomic_load_ptr = <optimized out> __atomic_load_tmp = <optimized out> __v = <optimized out> #11 isc_task_run (task=0x55f85df45ea0) at ../../../lib/isc/task.c:943 No locals. #12 0x00007ff86b35bc09 in isc__nm_async_task (worker=0x55f85debd100, ev0=0x7ff864063de0) at netmgr/../../../../lib/isc/netmgr/netmgr.c:855 ievent = 0x7ff864063de0 result = <optimized out> #13 process_netievent (worker=worker@entry=0x55f85debd100, ievent=0x7ff864063de0) at netmgr/../../../../lib/isc/netmgr/netmgr.c:940 No locals. #14 0x00007ff86b35bd85 in process_queue (worker=worker@entry=0x55f85debd100, type=type@entry=NETIEVENT_TASK) at netmgr/../../../../lib/isc/netmgr/netmgr.c:1009 stop = <optimized out> waiting = 17 ievent = <optimized out> #15 0x00007ff86b35c577 in process_all_queues (worker=0x55f85debd100) at netmgr/../../../../lib/isc/netmgr/netmgr.c:780 --Type <RET> for more, q to quit, c to continue without paging-- result = <optimized out> type = 2 reschedule = false reschedule = <optimized out> type = <optimized out> result = <optimized out> #16 async_cb (handle=0x55f85debd460) at netmgr/../../../../lib/isc/netmgr/netmgr.c:809 worker = 0x55f85debd100 #17 0x00007ff86b0f6b7d in uv__async_io (loop=0x55f85debd110, w=<optimized out>, events=<optimized out>) at src/unix/async.c:163 buf = "\001", '\000' <repeats 967 times>... r = <optimized out> queue = {0x7ff869b245f0, 0x7ff869b245f0} q = 0x55f85debd4c8 h = 0x55f85debd460 __PRETTY_FUNCTION__ = <optimized out> #18 0x00007ff86b112a7e in uv__io_poll (loop=0x55f85debd110, timeout=<optimized out>) at src/unix/epoll.c:374 no_epoll_pwait = <optimized out> no_epoll_wait = <optimized out> events = {{events = 1, data = {ptr = 0xd, fd = 13, u32 = 13, u64 = 13}}, {events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}} <repeats 1018 times>, {events = 0, data = {ptr = 0x55f85debd110, fd = 1575735568, u32 = 1575735568, u64 = 94525215985936}}, {events = 0, data = {ptr = 0x5debd2b000000000, fd = 0, u32 = 0, u64 = 6767734518410379264}}, { events = 22008, data = {ptr = 0x55f85debd290, fd = 1575735952, u32 = 1575735952, u64 = 94525215986320}}, {events = 1, data = {ptr = 0x600000000, fd = 0, u32 = 0, u64 = 25769803776}}, {events = 0, data = { ptr = 0x7ff86accc27d <__GI___clock_gettime+29>, fd = 1791804029, u32 = 1791804029, u64 = 140704920420989}}} pe = 0x7ff869b24b20 e = {events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}} real_timeout = <optimized out> q = <optimized out> w = 0x55f85debd2d8 sigset = {__val = {0 <repeats 16 times>}} sigmask = <optimized out> base = <optimized out> have_signals = 0 nevents = 0 count = <optimized out> nfds = <optimized out> fd = <optimized out> op = <optimized out> i = 0 user_timeout = <optimized out> reset_timeout = <optimized out> max_safe_timeout = <optimized out> no_epoll_pwait_cached = <optimized out> no_epoll_wait_cached = <optimized out> __PRETTY_FUNCTION__ = <optimized out> --Type <RET> for more, q to quit, c to continue without paging-- #19 0x00007ff86b0fc618 in uv__io_poll (timeout=<optimized out>, loop=0x55f85debd110) at src/unix/udp.c:122 events = <optimized out> real_timeout = <optimized out> q = <optimized out> sigmask = <optimized out> no_epoll_pwait = <optimized out> w = <optimized out> nevents = <optimized out> fd = <optimized out> op = <optimized out> pe = <optimized out> user_timeout = <optimized out> no_epoll_wait = <optimized out> sigset = <optimized out> i = <optimized out> e = <optimized out> base = <optimized out> have_signals = <optimized out> count = <optimized out> nfds = <optimized out> reset_timeout = <optimized out> update_timeout = <optimized out> max_safe_timeout = <optimized out> no_epoll_pwait_cached = <optimized out> no_epoll_wait_cached = <optimized out> no_epoll_pwait = <optimized out> no_epoll_wait = <optimized out> events = <optimized out> pe = <optimized out> e = <optimized out> real_timeout = <optimized out> q = <optimized out> w = <optimized out> sigset = <optimized out> sigmask = <optimized out> base = <optimized out> have_signals = <optimized out> nevents = <optimized out> count = <optimized out> nfds = <optimized out> fd = <optimized out> op = <optimized out> i = <optimized out> user_timeout = <optimized out> reset_timeout = <optimized out> --Type <RET> for more, q to quit, c to continue without paging-- __PRETTY_FUNCTION__ = <optimized out> update_timeout = <optimized out> _saved_errno = <optimized out> x = <optimized out> #20 uv_run (loop=loop@entry=0x55f85debd110, mode=mode@entry=UV_RUN_DEFAULT) at src/unix/core.c:389 timeout = <optimized out> r = <optimized out> ran_pending = <optimized out> #21 0x00007ff86b35be3b in nm_thread (worker0=0x55f85debd100) at netmgr/../../../../lib/isc/netmgr/netmgr.c:715 r = 1 __atomic_load_ptr = <optimized out> __atomic_load_tmp = <optimized out> worker = 0x55f85debd100 mgr = 0x55f85debcfb0 #22 0x00007ff86b36e5da in isc__trampoline_run (arg=0x55f85deb0060) at ../../../lib/isc/trampoline.c:198 trampoline = 0x55f85deb0060 result = <optimized out> #23 0x00007ff86ac81a87 in start_thread (arg=<optimized out>) at pthread_create.c:435 ret = <optimized out> pd = <optimized out> out = <optimized out> unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140704901924416, -3602452523273357970, 140704901924416, 1, 140704920115184, 0, 3598754200031655278, 3598746725591348590}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = { prev = 0x0, cleanup = 0x0, canceltype = 0}}} not_first_call = <optimized out> #24 0x00007ff86ad06640 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 No locals. What I don't understand is how those commits could be responsible if v9_16 branch does not have them -- they only appeared in v9_17 and later. (In reply to Alexander Bokovoy from comment #6) > What I don't understand is how those commits could be responsible if v9_16 > branch does not have them -- they only appeared in v9_17 and later. Oh, correct. Only commit that references it is present, not the refactoring itself. There are only two commits touching tasks subsystem in 9.16.25: https://gitlab.isc.org/isc-projects/bind9/commit/9ec7d78d164991b668cd50371ffe8f9b7b4b5ac3 https://gitlab.isc.org/isc-projects/bind9/commit/5be356760dd6855944234980f2fc0f13130267fb Which were merged in: https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/5698 Another candidate might be memory pools changes, which also removed lock support. https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/5637 Prepared candidate MR on upstream [1], which would allow additional threads to be used with hazard pointers implementation. It would initialize just first half of possible threads, previous maximum of initialized threads. bind-dyndb-ldap would have to call isc_nm_thread_check() on used thread to have isc_hp_* functions ready. It is intended to be run from thread initializer just single time, instead of checking this every time atomic should be used. 1. https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/5795 Core of detected issue is with unbound tid() identifier in isc_hp_init(4 * workers); It initializes only 4 threads, meaning ids 0-3. syncrepl thread is not spawned by task manager itself, so this is different thread. it has isc_tid_v == 4, which reaches behind initialized pointers. It is a bit surprising there does not exist any REQUIRE() check ensuring it would fail in controlled manner. So we crash by using uninitialized memory as pointer. This bug appears to have been reported against 'rawhide' during the Fedora 36 development cycle. Changing version to 36. I have built testing build in copr [1]. It requires modification of /etc/sysconfig/named: OPTIONS=" -E pkcs11 -H 200" Then systemctl restart ipa passed. It seems to be possible way, but requires also modification of freeipa package to autodetect -H parameter support. Then add new parameter to named options. Unfortunately this cannot be done just in bind-dyndb-ldap, which usually is the only dependency build together with bind. Not yet accepted or refused by upstream. 1. https://copr.fedorainfracloud.org/coprs/pemensik/bind-9.16/build/3440794/ FEDORA-2022-14e36aac0c has been submitted as an update to Fedora 36. https://bodhi.fedoraproject.org/updates/FEDORA-2022-14e36aac0c FEDORA-2022-427cfc50f8 has been submitted as an update to Fedora 35. https://bodhi.fedoraproject.org/updates/FEDORA-2022-427cfc50f8 FEDORA-2022-042d9c6146 has been submitted as an update to Fedora 34. https://bodhi.fedoraproject.org/updates/FEDORA-2022-042d9c6146 FEDORA-2022-14e36aac0c has been pushed to the Fedora 36 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2022-14e36aac0c` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-14e36aac0c See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates. FEDORA-2022-042d9c6146 has been pushed to the Fedora 34 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2022-042d9c6146` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-042d9c6146 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates. FEDORA-2022-427cfc50f8 has been pushed to the Fedora 35 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2022-427cfc50f8` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-427cfc50f8 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates. FEDORA-2022-427cfc50f8 has been pushed to the Fedora 35 stable repository. If problem still persists, please make note of it in this bug report. FEDORA-2022-14e36aac0c has been pushed to the Fedora 36 stable repository. If problem still persists, please make note of it in this bug report. FEDORA-2022-042d9c6146 has been pushed to the Fedora 34 stable repository. If problem still persists, please make note of it in this bug report. |