Bug 1084747
Summary: | dnscache gets 100% CPU usage with version 1.05.9 | | |
---|---|---|---|
Product: | [Fedora] Fedora EPEL | Reporter: | Francisco Miguel Biete <fbiete> |
Component: | ndjbdns | Assignee: | pjp <pj.pandit> |
Status: | CLOSED WONTFIX | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | | |
Version: | el6 | CC: | amb, mkent, negativo17, pj.pandit, tstewart, weeks |
Target Milestone: | --- | | |
Target Release: | --- | | |
Hardware: | x86_64 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2020-06-30 09:31:07 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
Francisco Miguel Biete
2014-04-06 08:17:22 UTC
Hello Francisco,

Thank you for reporting this issue, though it is not easily reproducible on my machine. Is there any specific setting or condition that needs to occur for it to hit the 99% CPU usage level? 'gettimeofday' calls are made when processing new requests or when a request times out. I'll try to reproduce it to debug more.

Thank you.

I'm able to reproduce it now. It seems to happen over a slow or saturated (~40% packet loss) connection, with MERGEQUERIES enabled (i.e. non-null).

Hi,

I don't see any errors on the interface, but it's a virtualized server on XenServer, and my clients are on GPRS and 3G networks, so they could be slow.

    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:16719340 errors:0 dropped:0 overruns:0 frame:0
    TX packets:1087759 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:5936484279 (5.5 GiB) TX bytes:142185683 (135.5 MiB)
    Interrupt:247

This is my dnscache.conf, and yes, MERGEQUERIES is enabled:

    DATALIMIT=8000000
    CACHESIZE=5000000
    IP=10.2.19.81
    IPSEND=0.0.0.0
    UID=2
    GID=2
    ROOT=/etc/ndjbdns
    HIDETTL=
    FORWARDONLY=
    MERGEQUERIES=1
    DEBUG_LEVEL=1

I'm going to test disabling MERGEQUERIES...

Regards,

Setting MERGEQUERIES=0 doesn't help. After a few minutes dnscache CPU is at 100% again. Removing the MERGEQUERIES line helps; I'm not seeing high CPU usage. So it seems to be a change between 1.05.8 and 1.05.9 related to MERGEQUERIES.

(In reply to Francisco Miguel Biete from comment #4)
> Setting MERGEQUERIES=0 doesn't help. After a few minutes dnscache CPU is
> 100% again.

Setting it to 0 would not disable it, but not setting any value at all would. I.e. it needs to be NULL to disable it:

    MERGEQUERIES=

> Removing the line MERGEQUERIES helps, I'm not seeing high CPU usage.

Right. It needs to be NULL to disable it.

> So, it seems to be a change between 1.05.8 and 1.05.9 related with
> MERGEQUERIES.

I think the issue is with the MERGEQUERIES function. I'm debugging it further to see how best to fix it. Thank you so much for the update and for confirming the issue.

Thank you.

Seeing this issue here as well with ndjbdns 1.06: after a brief period of time dnscache starts using 100% of a CPU and won't come down. Initially I thought it was from cache eviction, but DEBUG=3 confirmed we aren't using all of the cache. Disabling MERGEQUERIES seems to have fixed it.

I have submitted a patch to the ndjbdns project on GitHub that may resolve this issue. It can be found at https://github.com/pjps/ndjbdns/pull/29.

Hello Tim,

(In reply to Tim Stewart from comment #7)
> I have submitted a patch to the ndjbdns project on GitHub that may resolve
> this issue. It can be found at https://github.com/pjps/ndjbdns/pull/29.

Great! Thank you so much for the patch. I'll pull and merge it soon.

Thank you.

The proposed patch (above) from Tim has been merged upstream -> https://github.com/pjps/ndjbdns/commit/73527fbe28c6e18229b28f9d437be0ab5960c21f

The package has been retired; the project has been unmaintained since 2017.
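
For readers puzzled by why MERGEQUERIES=0 did not disable merging in this thread: the behavior reported above is what results when a setting is tested for being non-empty rather than parsed as a number. The following is only a minimal sketch in C under that assumption; `setting_enabled` is a hypothetical helper written for illustration and is not taken from the ndjbdns source.

```c
#include <stddef.h>

/*
 * Hypothetical helper, NOT ndjbdns code: treats a setting as enabled
 * whenever it carries any non-empty value. Under this kind of check the
 * string "0" still counts as enabled; only an empty value ("MERGEQUERIES=")
 * or a missing line (NULL) counts as disabled.
 */
static int setting_enabled(const char *value)
{
    return value != NULL && value[0] != '\0';
}
```

With a check of this shape, `setting_enabled("0")` and `setting_enabled("1")` are both true, while `setting_enabled("")` and `setting_enabled(NULL)` are false, which matches the reports that only removing the line or leaving the value empty stopped the high CPU usage.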