Bug 2182342
Summary: | dnsmasq SEGV on dnssec validation [rhel9] | |
---|---|---|---
Product: | Red Hat Enterprise Linux 9 | Reporter: | Pascal Dupuis <cdemills>
Component: | dnsmasq | Assignee: | Petr Menšík <pemensik>
Status: | CLOSED MIGRATED | QA Contact: | Petr Sklenar <psklenar>
Severity: | high | Priority: | unspecified
Version: | CentOS Stream | CC: | bstinson, jwboyer, psklenar
Target Milestone: | rc | Keywords: | MigratedToJIRA, Regression, TestCaseProvided, Triaged
Hardware: | x86_64 | OS: | Linux
Last Closed: | 2023-09-21 18:52:22 UTC | Type: | Bug
Bug Depends On: | 2047510 | Bug Blocks: | 2214257
Description
Pascal Dupuis
2023-03-28 09:43:33 UTC
Sorted it out a bit: I took the dnsmasq.conf from the package, and it worked. Applying a kind of bisection, I found that the issue is that I was mixing generic name servers, which honor the DNSSEC protocol, with local name servers which do not support DNSSEC.

Even if you mix DNSSEC-aware forwarders with unaware ones, it should not lead to a daemon crash. But I expect such a case is not tested under the normal process. Not by me at least. Can you please try dnf debuginfo-install dnsmasq and then make the backtrace again? Without debugging symbols it is missing all parameters, so I can only guess what happens. It should be visible in coredumpctl output.

Also, dnsmasq has DNSSEC validation disabled in the default configuration. Support of DNSSEC by forwarders should not matter in the default configuration. Have you enabled DNSSEC validation? Can you provide the complete dnsmasq configuration, only with comments removed?

    $ grep -v '^\s*\(#.*\)\?$' /etc/dnsmasq.conf

If files in /etc/dnsmasq.d are present, is there anything related to DNS in them?

Can you help me find a minimal reproducer? A coredump attachment might help too. Shall I make the bug private, so you can attach the core dump directly?

Hello, here you are:

    ##
    domain-needed
    bogus-priv
    conf-file=/usr/share/dnsmasq/trust-anchors.conf
    dnssec
    strict-order
    # for the network where my computer is working
    server=/univ-tlse3.fr/130.120.124.102
    server=84.200.69.80
    server=84.200.70.40
    user=dnsmasq
    group=dnsmasq
    interface=lo
    except-interface=virbr0
    cache-size=1000
    no-negcache
    local-ttl=300
    conf-dir=/etc/dnsmasq.d,.bak
    ##

I also checked that dig @130.120.124.102 www.univ-tlse3.fr returns a correct answer.

Regards,
Pascal

And the gdb session:

    Type "apropos word" to search for commands related to "word"...
    Reading symbols from dnsmasq...
    Reading symbols from /usr/lib/debug/usr/sbin/dnsmasq-2.85-6.el9.x86_64.debug...
    (gdb) set args -d
    (gdb) r
    Starting program: /usr/sbin/dnsmasq -d
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib64/libthread_db.so.1".
    dnsmasq: started, version 2.85 cachesize 1000
    dnsmasq: compile time options: IPv6 GNU-getopt DBus no-UBus no-i18n IDN2 DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth cryptohash DNSSEC loop-detect inotify dumpfile
    dnsmasq: DNSSEC validation enabled
    dnsmasq: configured with trust anchor for <root> keytag 20326
    dnsmasq: using nameserver 84.200.70.40#53
    dnsmasq: using nameserver 84.200.69.80#53
    dnsmasq: using nameserver 195.220.43.67#53 for domain univ-toulouse.fr (no DNSSEC)
    dnsmasq: using nameserver 130.120.124.102#53 for domain univ-tlse3.fr (no DNSSEC)
    dnsmasq: reading /etc/resolv.conf
    dnsmasq: using nameserver 84.200.70.40#53
    dnsmasq: using nameserver 84.200.69.80#53
    dnsmasq: using nameserver 195.220.43.67#53 for domain univ-toulouse.fr (no DNSSEC)
    dnsmasq: using nameserver 130.120.124.102#53 for domain univ-tlse3.fr (no DNSSEC)
    dnsmasq: ignoring nameserver 127.0.0.1 - local interface
    dnsmasq: using nameserver 195.220.43.67#53
    dnsmasq: read /etc/hosts - 4 addresses
    Program received signal SIGSEGV, Segmentation fault.
    hostname_isequal (b=0x5555555c1670 "univ-toulouse.fr", a=0x18 <error: Cannot access memory at address 0x18>) at /usr/src/debug/dnsmasq-2.85-6.el9.x86_64/src/util.c:355
    355 c1 = (unsigned char) *a++;

With debuginfo enabled:

    where
    #0 hostname_isequal (b=0x5555555c1670 "univ-toulouse.fr", a=0xc <error: Cannot access memory at address 0xc>) at /usr/src/debug/dnsmasq-2.85-6.el9.x86_64/src/util.c:355
    #1 server_domain_find_domain (domain=0xb <error: Cannot access memory at address 0xb>) at /usr/src/debug/dnsmasq-2.85-6.el9.x86_64/src/network.c:1660
    #2 search_servers (now=1680511161, addrpp=0x0, qtype=<optimized out>, qdomain=0x5555555bfaa0 "com", type=<optimized out>, domain=0x7fffffffda58, norebind=0x0, serv_domain=0x7fffffffda50) at /usr/src/debug/dnsmasq-2.85-6.el9.x86_64/src/forward.c:253
    #3 0x000055555557ad3b in reply_query (fd=<optimized out>, now=1680511161) at /usr/src/debug/dnsmasq-2.85-6.el9.x86_64/src/forward.c:1109
    #4 0x000055555557cf61 in check_dns_listeners (now=1680511161) at /usr/src/debug/dnsmasq-2.85-6.el9.x86_64/src/dnsmasq.c:1770
    #5 0x0000555555560e1a in main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/dnsmasq-2.85-6.el9.x86_64/src/dnsmasq.c:1218

It seems that "hostname_isequal" performs a string comparison, but without caring about string length.

Regards,
Pascal

No, the problem is in comment #5: it tries to compare the invalid pointer a with the valid string b. a looks like some offset created from a null pointer at some place. I am not yet sure how those invalid pointers come to be present. Because it is not a null pointer, it is allowed to be dereferenced, which is where it fails.

It might be related to a regression I have found in bug #2188712. Would you be willing to test a candidate build fixing the issue? It fixes possible issues with invalid pointers. The main issue it fixes is using --server=/example.net/#, but it may also fix the issue you are seeing.

Hello,

Bug #2188712 is locked for me. Anyway, I would be glad to test a fix.
Regards,
Pascal

Would you be able to test MR: https://gitlab.com/redhat/centos-stream/rpms/dnsmasq/-/merge_requests/13

Test build created from this MR: https://centos.softwarefactory-project.io/logs/13/13/0b70f0077368ba61a13e7c58d78bc11e1ef79a1c/check/mock-build/6c61c65/repo/
It should be time limited. The original cause should be a bit different, but crashing at a similar point in the backtrace.

Opened bug #2186481, which is the original bug of bug #2188712.

I was able to reproduce the issue. No, even the last fix does not fix this one. It is still unfixed.

Minimal reproducer seems to be:

    conf-file=/usr/share/dnsmasq/trust-anchors.conf
    dnssec
    server=/example.net/199.43.133.53

Then send a query for an example.com address. It crashes reliably.

The problem is an improperly initialized domain variable when fetching DNSKEY or DS records.
Proposed fix: https://gitlab.com/redhat/centos-stream/rpms/dnsmasq/-/merge_requests/14

This is another issue caused by the change for bug #2047510. That means this happens on RHEL9 since version dnsmasq-2.85-3.el9; earlier builds should be okay.

Hello Petr. Rebuilt dnsmasq from https://gitlab.com/redhat/centos-stream/rpms/dnsmasq and installed it on a system where dnsmasq 2.85-6.el9.x86_64 was failing. Tested and ... approved. No more segv. Thank you for the fix.
Pascal

Pushed automated test [1] to the Fedora tests repository, even though it fails only on RHEL9. Strangely, it seems it should also fail on the RHEL8 version, which has similar problems, but this test passes there just fine. Fedora releases should be unaffected, because the issue was caused by my downstream change (in an attempt to prevent upstream regressions!). Requires DNSSEC-ready infrastructure.

[1] https://src.fedoraproject.org/tests/dnsmasq/c/446fbcf46c27d2756f1a0bf541c27d8f4699eae6

Strange. I was finally able to reproduce this also on RHEL8.
For some reason it crashes reliably when run under valgrind, but without it, it responds and works.

Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there. Due to differences in account names between systems, some fields were not replicated. Be sure to add yourself to the Jira issue's "Watchers" field to continue receiving updates and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer. You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.
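As a side note on the minimal reproducer given in the thread (trust anchors, dnssec, and one server=/domain/address line), the following Python sketch checks whether a configuration text has that same shape. It is only an illustration under stated assumptions: the function names active_lines, domain_specific_servers and matches_reproducer are hypothetical and not part of dnsmasq, the check looks only at the text passed to it and does not follow conf-file= or conf-dir= includes, and a match says nothing about whether an installed build is actually affected.

```python
import re

# Same filter as the grep used in the thread: a line is dropped when it is
# blank or contains only a comment.
_BLANK_OR_COMMENT = re.compile(r"^\s*(#.*)?$")


def active_lines(conf_text):
    """Return configuration lines that are neither blank nor comment-only."""
    return [line.strip() for line in conf_text.splitlines()
            if not _BLANK_OR_COMMENT.match(line)]


def domain_specific_servers(lines):
    """Return (domain, address) pairs from server=/domain/.../address lines."""
    pairs = []
    for line in lines:
        if not line.startswith("server=/"):
            continue  # plain server=address lines are not domain-specific
        # "/example.net/199.43.133.53" -> ['', 'example.net', '199.43.133.53']
        parts = line[len("server="):].split("/")
        address = parts[-1]
        for domain in parts[1:-1]:
            pairs.append((domain, address))
    return pairs


def matches_reproducer(conf_text):
    """True when dnssec is enabled and a domain-specific server is set.

    Assumption: this mirrors only the shape of the minimal reproducer
    from the thread; it is a heuristic, not a test for the bug itself.
    """
    lines = active_lines(conf_text)
    dnssec_enabled = "dnssec" in lines
    return dnssec_enabled and bool(domain_specific_servers(lines))


# The minimal reproducer quoted in the thread.
MINIMAL_REPRODUCER = """\
conf-file=/usr/share/dnsmasq/trust-anchors.conf
dnssec
server=/example.net/199.43.133.53
"""
```

Calling matches_reproducer(MINIMAL_REPRODUCER) flags the reproducer configuration, and the reporter's configuration is flagged for the same reason: it has dnssec together with server=/univ-tlse3.fr/130.120.124.102.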