Description of problem:
We tested dnsmasq-2.85-10.el9.x86_64 and it causes a regression here:

May 21 16:20:53 s2 systemd[1]: dnsmasq.service: start operation timed out. Terminating.
May 21 16:20:53 s2 systemd[1]: dnsmasq.service: Control process exited, code=exited status=5
May 21 16:20:53 s2 systemd[1]: dnsmasq.service: Failed with result 'timeout'.
May 21 16:20:58 s2 systemd[1]: dnsmasq.service: Service RestartSec=5s expired, scheduling restart.
May 21 16:20:58 s2 systemd[1]: dnsmasq.service: Scheduled restart job, restart counter is at 1.
May 21 16:22:31 s2 systemd[1]: dnsmasq.service: start operation timed out. Terminating.
May 21 16:22:31 s2 systemd[1]: dnsmasq.service: Control process exited, code=exited status=5
May 21 16:22:31 s2 systemd[1]: dnsmasq.service: Failed with result 'timeout'.
May 21 16:22:36 s2 systemd[1]: dnsmasq.service: Service RestartSec=5s expired, scheduling restart.
May 21 16:22:36 s2 systemd[1]: dnsmasq.service: Scheduled restart job, restart counter is at 2.
May 21 16:23:43 s2 systemd[1]: dnsmasq.service: Control process exited, code=exited status=5
May 21 16:23:43 s2 systemd[1]: dnsmasq.service: Failed with result 'exit-code'.

Dnsmasq does not start and the CPUs are at 100%, then systemd times out (the restarts come from a local drop-in config).
Downgrading to dnsmasq-2.85-7.el9.x86_64 resolves the start problems.

Version-Release number of selected component (if applicable):
dnsmasq-2.85-10.el9.x86_64

How reproducible:
Update to dnsmasq-2.85-10.el9.x86_64 from dnsmasq-2.85-6.el9.x86_64 or dnsmasq-2.85-7.el9.x86_64

Actual results:
does not start

Expected results:
it starts normally

Additional info:
The cause is surely that we have a big list (>100000 entries) here:

# tail /etc/dnsmasq.d/dns-hosts-void.conf
...
address=/foo.bar/0.0.0.0
address=/bob.alice/0.0.0.0

but dnsmasq-2.85-6.el9.x86_64 and dnsmasq-2.85-7.el9.x86_64 have no problems reading it and starting the daemon.

Is this regression coming from #2188712 in release 2.85-8?
Would you mind attaching your /etc/dnsmasq.d/dns-hosts-void.conf compressed by gzip? I can try to create my own long list of similar addresses, but I cannot guarantee to reproduce it.
Thanks for taking a look. Please try the following process to create the list:

# Download raw list
curl -s "https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts" | \
  grep -v newrelic.com | grep -v ^"#" | grep -v "27\-\-" | grep ^0.0.0.0 | \
  sort > "${CURRENTDAY}-dns-void-StevenBlack.conf"

# translate to dnsmasq config
cat "${CURRENTDAY}-dns-void-StevenBlack.conf" | \
  sed "s/0\\.0\\.0\\.0\\ www\\./0\\.0\\.0\\.0\\ /" | \
  awk '{FS=" "}{print $2}' |sed s/^/address=\\// | sed s/$/\\/0.0.0.0/ | \
  sort | uniq > "local-${CURRENTDAY}-dns-void-StevenBlack.conf"
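For anyone who wants to reproduce this without the network, the translation step can be condensed into a single awk pass. This is only a sketch, not the reporter's script: the function name hosts_to_dnsmasq is made up for illustration, and it leaves out the newrelic.com and "27--" exclusions from the pipeline above.

```shell
# Hypothetical helper (not part of the original script): read hosts-format
# lines ("0.0.0.0 example.com") on stdin and print dnsmasq address= lines.
hosts_to_dnsmasq() {
    awk '$1 == "0.0.0.0" && NF >= 2 {
        sub(/^www\./, "", $2)              # drop a leading "www.", like the sed step
        print "address=/" $2 "/0.0.0.0"    # same output shape as the original pipeline
    }' | sort -u                           # sort and de-duplicate, like sort | uniq
}

# Example: comment lines and non-0.0.0.0 entries are ignored.
printf '# comment\n127.0.0.1 localhost\n0.0.0.0 www.foo.bar\n0.0.0.0 bob.alice\n' |
    hosts_to_dnsmasq
```

The output can be redirected into a file under /etc/dnsmasq.d/ in the same way as the original script does.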
Oh, I confirm that there is a problem with that. dnsmasq now searches for previously used domain entries, and it does so with a simple linear walk without any optimization. That does not scale well when the number of used domains is high. It slows down even the simple --test mode enough to be visible.

# for I in {1..10000}; do printf "address=/block.%x.%x/0.0.0.0\n" $RANDOM $RANDOM >> block-10k.conf; done
# time dnsmasq --test --conf-file=block-10k.conf
dnsmasq: syntax check OK.

real    0m0.968s
user    0m0.958s
sys     0m0.005s

# for I in {1..50000}; do printf "address=/block.%x.%x/0.0.0.0\n" $RANDOM $RANDOM >> block-50k.conf; done
# time dnsmasq --test --conf-file=block-50k.conf
dnsmasq: syntax check OK.

real    0m21.197s
user    0m21.060s
sys     0m0.030s

# for I in {1..100000}; do printf "address=/block.%x.%x/0.0.0.0\n" $RANDOM $RANDOM >> block-100k.conf; done
# time dnsmasq --test --conf-file=block-100k.conf
dnsmasq: syntax check OK.

real    1m33.076s
user    1m32.534s
sys     0m0.060s
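As a rough cross-check of how the load time grows, the scaling exponent can be estimated from the three "real" timings above. This is just arithmetic on those measurements; the helper name growth_exponent is made up for illustration.

```shell
# growth_exponent N1 T1 N2 T2  (hypothetical helper)
# Prints k such that T2/T1 = (N2/N1)^k, the empirical scaling exponent.
growth_exponent() {
    LC_ALL=C awk -v n1="$1" -v t1="$2" -v n2="$3" -v t2="$4" \
        'BEGIN { printf "%.2f\n", log(t2 / t1) / log(n2 / n1) }'
}

growth_exponent 10000 0.968 50000 21.197     # 10k -> 50k entries
growth_exponent 50000 21.197 100000 93.076   # 50k -> 100k entries
```

Both values come out close to 2, which is what a linear walk repeated once per config line would be expected to produce.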
This is exactly the problem that the larger rewrite in version 2.86 solved, but that rewrite caused a lot of issues later, which is why I chose not to just rebase to a newer version. I tried to use a simpler method to achieve a similar result without changing so much. But I am not sure this can be solved in my downstream changes. I have avoided introducing a sorted array, but I need to save last_server per domain somewhere. That requires searching whether the domain already has a record, which leads to quadratic complexity overall.

I would use unbound myself to block that many domains. Fixing this in dnsmasq would not be simple. I guess I could use a trick and not search for existing domains for record types like --local=/blocked/ or --address=/blocked/#. Those do not need records stored in struct server_domain; such records are relevant only for forwarding to servers that have an IP. Similar logic exists in the upstream version too.
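To illustrate the idea, here is a toy model of the lookup cost. It is not dnsmasq code: the function count_comparisons and its two modes are made up, and treating every address= line (including address=/x/0.0.0.0) as not needing a record is an assumption of the model, not a statement about the actual patch.

```shell
# Toy model (hypothetical, not dnsmasq code): count the record comparisons
# made while loading a config on stdin when each domain is looked up by a
# linear walk over the records stored so far.
#   mode "all":          every line triggers a lookup and stores a record
#   mode "servers-only": only server=/domain/ip lines do (assumed
#                        simplification); address= and local= are skipped
count_comparisons() {
    awk -F/ -v mode="$1" '
        {
            needs_record = (mode == "all") || ($1 == "server=" && $3 != "")
            if (!needs_record) next
            found = 0
            for (i = 1; i <= n; i++) {     # linear walk over stored records
                cmp++
                if (rec[i] == $2) { found = 1; break }
            }
            if (!found) rec[++n] = $2      # remember the new domain
        }
        END { print cmp + 0 }'
}

# 100 blocked domains plus one forwarded domain.
gen() {
    awk 'BEGIN {
        for (i = 1; i <= 100; i++) printf "address=/block.%d.test/0.0.0.0\n", i
        print "server=/corp.example/192.0.2.1"
    }'
}
gen | count_comparisons all            # grows as n*(n-1)/2 with n entries
gen | count_comparisons servers-only   # only the server= line is looked up
```

In this model, 100000 unique address= lines in "all" mode cost 100000*99999/2 = 4999950000 comparisons, while "servers-only" mode does not walk the list for them at all.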
Prepared a fix candidate. It walks the existing records just for normal servers, not --local or --address=/x/#. Not yet properly tested. https://gitlab.com/redhat/centos-stream/rpms/dnsmasq/-/merge_requests/19
The candidate needed an additional fix for default forwarders; it seems okay after basic checks.
Created a simple regression test: https://src.fedoraproject.org/tests/dnsmasq/c/e54f9dd42fc33e8a7e63d4dc278f631bda34ed49