Bug 2030239 - named consumed too much memory and failed to reload.
Summary: named consumed too much memory and failed to reload.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: bind
Version: 8.3
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Petr Menšík
QA Contact: Petr Sklenar
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-12-08 09:33 UTC by Xueying Nie
Modified: 2022-05-10 16:53 UTC (History)

Fixed In Version: bind-9.11.36-3.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-05-10 15:29:44 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
gen-views.sh (307 bytes, text/plain), 2022-02-10 14:10 UTC, Petr Menšík
candidate patch (1.96 KB, patch), 2022-02-10 18:40 UTC, Petr Menšík


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Internet Systems Consortium (ISC) | isc-projects bind9 issues 1693 | 0 | None | None | None | 2022-02-10 14:55:37 UTC
Internet Systems Consortium (ISC) | isc-projects bind9 issues 446 | 0 | None | None | None | 2022-02-10 11:39:12 UTC
Internet Systems Consortium (ISC) | isc-projects bind9 merge_requests 3067 | 0 | None | None | None | 2022-02-10 14:46:58 UTC
Red Hat Issue Tracker | RHELPLAN-105131 | 0 | None | None | None | 2021-12-08 09:35:25 UTC
Red Hat Product Errata | RHSA-2022:2092 | 0 | None | None | None | 2022-05-10 15:29:57 UTC

Description Xueying Nie 2021-12-08 09:33:32 UTC
Description of problem:

- named consumed much more memory and failed to reload after updating from RHEL6.5 to RHEL8.3 with the same configuration.

- As the number of view definitions increases, the memory used tends to increase linearly.

Version-Release number of selected component (if applicable):

bind-9.11.26-6

How reproducible:

Steps to Reproduce:

1. Configure more than 10000 views in "named.conf".
2. Run `systemctl reload named-chroot`.
3. Run `systemctl reload named-chroot` again.

Actual results:

named failed to reload at step 3.
named consumed about 15G memory.

Expected results:

Reduce the memory consumption of named.

Additional info:

- It seems that the issue is due to the option "--with-tuning=large" added in bind 9.10.

- A similar issue on RHEL 7 was reported in Bugzilla bug 1578051, but it was closed as "CLOSED WONTFIX".

Comment 2 Petr Menšík 2022-02-10 11:39:13 UTC
We won't switch back to small tuning (that is, drop --with-tuning=large) if that is what is being requested.

Some parameters can be changed on the command line. For example, OPTIONS+="-S 4096" in /etc/sysconfig/named would use fewer buffers, but I don't think this parameter significantly changes the resources used.

However, createview in lib/dns/client.c uses the RESOLVER_NTASKS parameter, which is higher with the --with-tuning=large configure option. I expect that is what is responsible for the higher memory use. This parameter has no command-line alternative.

In bind 9.11, this value does not scale with the number of CPUs. bind 9.16+ multiplies a similar constant by the number of CPUs used. If the customer limited the number of CPUs used (-n 4), that would also limit the amount of memory used. We are preparing a new bind9.16 package with a newer version of bind for RHEL 8.6. Could that work for the customer?

Comment 3 Petr Menšík 2022-02-10 14:10:07 UTC
Created attachment 1860358 [details]
gen-views.sh

A simple generator for a large number of small views. It is used to generate a file included from named.conf, which creates enough separate views. The views use just the predefined zones for simplicity; which zones are used does not matter much.
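The attached gen-views.sh is not reproduced in this report. As a rough illustration of what such a generator might look like, here is a minimal Python sketch. Everything in it is an assumption rather than the attachment's actual content: the function name generate_views is hypothetical, the views are named after loopback addresses only because the statistics in this bug show a view called 127.0.0.2, and /etc/named.rfc1912.zones is assumed to be the file holding the predefined zones.

```python
import ipaddress


def generate_views(count, zones_file="/etc/named.rfc1912.zones"):
    """Return named.conf text that defines `count` minimal views.

    Hypothetical stand-in for gen-views.sh: each view is named after a
    loopback address, matches queries sent to that address, and pulls in
    the same predefined zones (assumed location: `zones_file`).
    """
    if count < 1:
        raise ValueError("count must be at least 1")

    first = ipaddress.IPv4Address("127.0.0.2")
    blocks = []
    for i in range(count):
        addr = first + i  # 127.0.0.2, 127.0.0.3, ... rolls over into 127.0.1.x
        blocks.append(
            f'view "{addr}" {{\n'
            f"\tmatch-destinations {{ {addr}; }};\n"
            f'\tinclude "{zones_file}";\n'
            f"}};\n"
        )
    return "\n".join(blocks)
```

Writing the result of generate_views(20) to a file and including that file from named.conf would give a 20-view configuration comparable in shape to the one tested in this bug; larger counts only change the argument.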

Comment 4 Petr Menšík 2022-02-10 14:35:25 UTC
The script above shows significant differences with only 20 views. bind-9.11.36-2.el8.x86_64 reports Memory: 518.0M with just 20 views. That rises to Memory: 998.6M in systemctl status named after rndc reload.

Tested on a VM with 1 CPU.

The statistics written by rndc stats report only a small memory consumption.

++ Cache Statistics ++
[View: 127.0.0.2 (Cache: 127.0.0.2)]
                   0 cache hits
                  78 cache misses
                   0 cache hits (from query)
                   0 cache misses (from query)
                   0 cache records deleted due to memory exhaustion
                   0 cache records deleted due to TTL expiration
                   0 cache database nodes
                  64 cache database hash buckets
              279512 cache tree memory total
               21624 cache tree memory in use
               21680 cache tree highest memory in use
              262144 cache heap memory total
                1024 cache heap memory in use
                1024 cache heap highest memory in use

bind-9.16.23-1.el9.x86_64 has much lower consumption. It reports Memory: 160.6M after restart and Memory: 201.4M after rndc reload, even though its cache statistics report much higher memory usage:
++ Cache Statistics ++
[View: 127.0.0.2 (Cache: 127.0.0.2)]
                   0 cache hits
                  26 cache misses
                   0 cache hits (from query)
                   0 cache misses (from query)
                   0 cache records deleted due to memory exhaustion
                   0 cache records deleted due to TTL expiration
                   0 cache database nodes
              524288 cache database hash buckets
             4478184 cache tree memory total
             4219136 cache tree memory in use
             4219264 cache tree highest memory in use
              262144 cache heap memory total
                1088 cache heap memory in use
                1088 cache heap highest memory in use

Fedora Rawhide has optimized memory usage further: build bind-9.16.25-2.fc36.x86_64 uses only Memory: 140.0M after restart and Memory: 180.6M after reload. However, that version introduced issues with bind-dyndb-ldap, so a rebase is not possible right now.

Memory use rises approximately linearly with the number of views. With just 30 views, 9.11 reports 769.8M and 1.4G of used memory without a single external query. On RHEL 9, 389.1M and 487.1M are used with 50 views.
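The linear trend can be checked with simple arithmetic on the figures quoted in this comment. The sketch below divides each reported Memory value by the number of views. It assumes that the first figure of each pair is the after-restart value and that the 160.6M measurement on bind 9.16 used the same 20-view configuration; neither is stated explicitly. The helper name per_view is hypothetical. The result includes the fixed overhead of the process, so it is an upper estimate of the real per-view cost.

```python
def per_view(memory_m, views):
    """Average memory per view, in the same 'M' units systemctl reports."""
    if views < 1:
        raise ValueError("views must be at least 1")
    return memory_m / views


# (reported Memory after restart, number of views), taken from this comment.
samples = {
    "bind 9.11, 20 views": (518.0, 20),
    "bind 9.11, 30 views": (769.8, 30),
    "bind 9.16, 20 views": (160.6, 20),  # assumed to be the 20-view config
    "bind 9.16, 50 views": (389.1, 50),
}

for label, (memory_m, views) in samples.items():
    print(f"{label}: {per_view(memory_m, views):.1f}M per view")
```

Both 9.11 samples come out at roughly 26M per view and both 9.16 samples at roughly 8M per view, which is consistent with the approximately linear growth described above.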

Comment 5 Petr Menšík 2022-02-10 14:46:59 UTC
The change appeared in MR 3067 [1], commit 0d80266f. I guess we can backport that change to the 9.11 branch as well. It should help on machines with few CPUs; however, it might raise consumption on machines with a high CPU count. The already prepared bind9.16 package would help without additional changes.

1. https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/3067

Comment 6 Petr Menšík 2022-02-10 14:55:38 UTC
Also adding a link to the upstream issue for the refused runtime tuning change, from bug #1578051.

Comment 7 Petr Menšík 2022-02-10 18:40:31 UTC
Created attachment 1860395 [details]
candidate patch

A modified version of the upstream change. It uses a per-CPU count of tasks but sets an upper limit on the number of tasks used. It starts with a lower number of tasks, but ensures that machines with 16 or more CPUs use at most the original amount of memory.
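The scaling rule described above can be sketched as follows. This is not the patch itself, which is C code attached to this bug. The constant ORIGINAL_NTASKS is a hypothetical placeholder for the fixed task count used before the change, and deriving the per-CPU factor by dividing that count by 16 is an assumption chosen only so that a 16-CPU machine lands exactly on the original value.

```python
ORIGINAL_NTASKS = 512  # hypothetical placeholder for the old fixed task count
CPU_THRESHOLD = 16     # from the comment: 16+ CPUs keep the original amount


def resolver_ntasks(ncpus, original=ORIGINAL_NTASKS, threshold=CPU_THRESHOLD):
    """Scale the task count per CPU, capped at the original fixed value."""
    if ncpus < 1:
        raise ValueError("ncpus must be at least 1")
    per_cpu = original // threshold          # assumed per-CPU share
    return min(ncpus * per_cpu, original)    # never exceed the old count


for ncpus in (1, 4, 16, 64):
    print(ncpus, resolver_ntasks(ncpus))
```

With these placeholder values a single-CPU machine gets one sixteenth of the original task count, while machines with 16 or more CPUs stay at the original count, so their memory use should not grow beyond what it was before.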

Comment 18 errata-xmlrpc 2022-05-10 15:29:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: bind security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:2092

