Bug 1578051 - named eating memory up to OOM crash
Summary: named eating memory up to OOM crash
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: bind
Version: 7.5
Hardware: x86_64
OS: Linux
Priority: high
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Tomas Korbar
QA Contact: qe-baseos-daemons
URL:
Whiteboard:
Duplicates: 1692940
Depends On:
Blocks: 1709724 1780662
 
Reported: 2018-05-14 17:46 UTC by saturninlepoulet
Modified: 2024-03-25 15:04 UTC
CC List: 9 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-11-11 21:54:59 UTC
Target Upstream Version:
Embargoed:


Attachments
Results related to comment 13 (148.21 KB, text/plain)
2019-07-29 12:55 UTC, saturninlepoulet
no flags


Links
System ID Private Priority Status Summary Last Updated
Gitlab 1693 0 None None None 2020-03-20 08:37:24 UTC

Description saturninlepoulet 2018-05-14 17:46:58 UTC
Description of problem:
CentOS was upgraded from 7.4.1708 with bind-9.9.4-51.el7_4.2.x86_64 to 7.5.1804 with bind-9.9.4-61.el7.x86_64.
Since the upgrade, when named-chroot starts, the process eats up to 50% of physical memory and quickly increases to 80%.
The upgrade was performed Thursday night (10th May) and the process finally crashed Saturday night with an out-of-memory kill: "Out of memory: Kill process 1140 (named) score 919 or sacrifice child".
Swap was 100% used before the crash:
Free swap  = 0kB
Total swap = 2097148kB


Version-Release number of selected component (if applicable):
bind-9.9.4-61.el7.x86_64 on CentOS 7.5.1804



Additional info:
The load on bind is not that heavy: an average of 500 queries per minute.
The same happens on both master and slave instances.
BIND views are in use.
Prior to the upgrade, bind used ~300MB of memory.

Comment 2 saturninlepoulet 2018-05-14 18:08:27 UTC
Update:
- With views disabled in /etc/named.conf, named memory usage is ~3% (of 2GB) at startup. With views enabled, memory usage is ~50%...
Did something change regarding memory allocation for views? Does memory scale with the view count?

- Adding more memory to the server showed the same behavior:
> 2GB in VM = ~50% named memory usage
> 4GB in VM = ~23% named memory usage

Comment 3 saturninlepoulet 2018-05-14 19:01:59 UTC
Update:
- From a fresh new 7.4 installation from the minimal ISO:
> yum update -y
> yum install bind-chroot -y
> copy the bind config files from the impacted server
> systemctl start named-chroot
Result: memory usage for named is ~50% of physical memory

- SELinux is enabled and set to enforcing. Disabling it does not change anything.

- named memory usage seems to scale with the view count in /etc/named.conf.
On a 2GB server, named memory usage when the process starts:
> no view: ~3%
> 1 view: ~5%
> 2 views: ~8%

We have 17 views in total, which matches a rule of ~60MB per view:
17 views: ~1020MB (~50% of physical memory)
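A quick sanity check of that estimate (figures taken from the numbers above; the ~60MB-per-view rule is the reporter's own approximation):
~~~
# 17 views at ~60MB each, against 2048MB of RAM
awk 'BEGIN { printf "%d MB = %.0f%% of 2048 MB\n", 17*60, 17*60/2048*100 }'
# -> 1020 MB = 50% of 2048 MB
~~~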

Has view memory allocation changed in this BIND version or with CentOS 7.5?

Comment 4 saturninlepoulet 2018-05-16 17:22:25 UTC
Even after adding more memory to both DNS servers (4GB total), named continues to eat memory...

Both still have 17 views :

More than 75% of 4GB :
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 1156 named     20   0 3497252   3.1g   1736 S   0.7 83.6   8:24.87 named

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 1104 named     20   0 4092276   3.1g   1728 S   2.3 83.3  16:43.02 named

Comment 5 Petr Menšík 2018-05-18 14:19:26 UTC
Hi, can you share more about your configuration? The options would be helpful; the output of named-checkconf -p would be great.

Have you tried limiting max-cache-size? I think the default in this version is unlimited, which might be a problem in some setups. Could you try limiting max-cache-size to 50% or less?

When it eats a lot of memory, can you run rndc stats and provide the part of /var/named/data/named_stats.txt related to memory?
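For reference, a minimal sketch of the checks suggested above (the 1024m value and the grep pattern are illustrative assumptions; use the statistics-file path from your own named.conf):
~~~
# In /etc/named.conf, inside options { ... }:
#   max-cache-size 1024m;    # e.g. roughly 50% of a 2GB host
rndc reconfig                # apply the configuration change
rndc stats                   # append current statistics to the statistics-file
grep -i -A 20 'memory' /var/named/named.stats
~~~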

Comment 6 saturninlepoulet 2018-05-22 17:16:58 UTC
Here are the options:
options {
        directory "/var/named";
        dump-file "/var/named/named.dump";
        interface-interval 0;
        statistics-file "/var/named/named.stats";
        version none;
        allow-recursion {
                "any";
        };
        transfer-format many-answers;
        allow-transfer {
                "xfer";
        };
        max-transfer-time-in 60;
};

max-cache-size 10m; has been tested but does not change anything.
memstatistics-file with "-m record" has been tested to dump memory to a file on exit, but it does not provide useful information...
stacksize 100m; has been tested with no difference.
There is nothing related to memory in the stats (statistics-file).

Something I noticed after some tests: using "rndc reload" increases memory usage immediately...

# systemctl start named-chroot
//named consumed ~32% of 4GB :
PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
21714 named     20   0 1527464   1.2g   2984 S   0.0 32.2   0:02.17 named

# rndc reload
//named now consumed ~62% of 4GB
 PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
21714 named     20   0 2647556   2.3g   3060 S   0.0 62.3   0:06.00 named

Comment 7 saturninlepoulet 2018-08-10 16:36:35 UTC
The latest BIND versions from the ISC packages have been tested:
- 9.12.2-P1
- 9.9.13-P1

Both were compiled from source and installed in order to use the same views.

Once named is started, memory usage is never higher than 4%...
After several rndc reload commands, usage does not increase.

The latest RH version (9.9.4-RedHat-9.9.4-61.el7) still consumes a lot of memory (more than 50%).

Comment 8 Petr Menšík 2018-08-13 19:28:39 UTC
How many zones are in your configuration? It seems to be quite memory-hungry as you describe it; this way, it would crash the machine after 3-4 reloads. No one else has reported a similar issue, and this would be a quite visible difference. Have you tried the upstream version with the same configuration and data directory?

Our version is patched and has some features enabled, but nothing that should make such a difference. Red Hat's bind is compiled with --with-tuning=large; I guess that option was not used in your upstream build. But I doubt the difference should be so huge.

When you stop bind, does it crash? Bind has memory allocation checking inside; it would crash if memory was lost. If the shutdown is clean, the memory is allocated in a way that bind can free again. Do the views usually share common zone files? Could you share more details on how zones are assigned to views?

The current version cannot share zone data between views. If the same file is used in each view, its data is allocated separately for each view. More recent BIND has an option to share a loaded zone between views using the in-view clause; I guess that was not used here.

Have you tried comparing memory usage between the named and named-chroot services? Do they both consume so much memory?

Comment 9 saturninlepoulet 2018-08-14 17:46:12 UTC
I will first answer your questions :

- How many zones are in your configuration?
36

- Have you tried upstream version with the same configuration and data directory?
yes, and same result with high memory usage

- When you stop bind, will it crash?
no, the process just ends and log entries show a clean exit

- Do the views share common zone files usually? Could you share more details, how are assigned zones into views?
Each view has its own zone files, but a view contains multiple zones.
Some zone files are shared, like named.blackhole.

- Have you tried comparing memory usage with named and named-chroot service? Do they both consume so much memory?
yes, both consume same memory


Now, thanks to your explanation about the RH package being built with --with-tuning=large, I was able to identify the root cause. This compilation flag is the culprit here.

I compiled the ISC 9.12.2-P1 package WITHOUT any specific flag, just keeping the defaults. I used our config files/data including views, and memory usage never exceeded 4%, even after multiple rndc reload commands.
Then I compiled ISC 9.12.2-P1 WITH --with-tuning=large and did the same test with the same config. When named starts, it eats ~24% of memory. After the first rndc reload 3 seconds later, memory usage increases to 46%.

So, the only difference is this compilation flag, which is used by default on the RH package.
Please correct me if I'm wrong, but from what I understood there is no way to disable it; there is no named option...

As explained in the ISC README, this flag "can improve performance on big servers but will consume more memory and may degrade performance on smaller systems".

Should you remove this flag from the RH BIND RPM?
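For reference, a minimal sketch of the two builds being compared here (the tarball name matches the version tested; the install prefixes and the implied build dependencies are assumptions):
~~~
# Build the same ISC source twice, once with defaults and once Red Hat-style
tar xf bind-9.12.2-P1.tar.gz && cd bind-9.12.2-P1
./configure --prefix=/opt/bind-default                     # no tuning flag
make -j"$(nproc)" && make install
make distclean
./configure --prefix=/opt/bind-large --with-tuning=large   # as in the RH package
make -j"$(nproc)" && make install
~~~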

Comment 10 saturninlepoulet 2018-08-14 18:09:58 UTC
Sorry, regarding the first question (How many zones are in your configuration?):
The answer is 3580, not 36.

Comment 11 saturninlepoulet 2018-08-30 18:38:38 UTC
Hi,

So what do you think of removing --with-tuning=large from the RH build?
Or of adding an option to turn this setting on/off?

Thanks

Comment 12 saturninlepoulet 2018-12-10 14:01:15 UTC
So guys, no update since August? Could we consider adding the ability to disable this tuning option, which is an issue in some cases?

Comment 13 Martin Osvald 🛹 2019-07-18 12:47:26 UTC
Hello,

my name is Martin Osvald and I am helping my colleague with bind cases which are old or have not been touched for a very long time.

(In reply to saturninlepoulet from comment #9)
...
> Now, thanks to your explanation on RH package built with
> --with-tuning=large, I was able to identify root cause. This compilation
> flag is the guilty here.

Yes, the '--with-tuning=large' option increases the size of internal buffers and other limits, mainly the number of preallocated task structures, to speed up processing and increase the number of clients that can be served.

This change was introduced by bind99-rh1464850.patch for the Bugzilla/RFE below:

Bug 1464850 - [RFE] backport whole --with-tuning=large option from bind 9.10.x and use it during compilation

bind99-rh1464850.patch
~~~
+       Certain compiled-in constants and default settings can be
+       increased to values better suited to large servers with abundant
+       memory resources (e.g, 64-bit servers with 12G or more of memory)
+       by specifying "--with-tuning=large" on the configure command
+       line. This can improve performance on big servers, but will
+       consume more memory and may degrade performance on smaller
+       systems.
...
+#ifdef TUNE_LARGE
+#define RESOLVER_NTASKS 523
+#define UDPBUFFERS 32768
+#define EXCLBUFFERS 32768
+#else
+#define RESOLVER_NTASKS 31
+#define UDPBUFFERS 1000
+#define EXCLBUFFERS 4096
+#endif /* TUNE_LARGE */
~~~

Originally, the values under the #else branch above were used in version 9.9.4-51.el7_4.2. If you compare them with the values (RESOLVER_NTASKS) that are used now, you can deduce how much more memory is expected to be consumed with >=9.9.4-61: roughly 17 times more.

For every view, there are now 523 task structures created instead of 31. With the high number of views (17), it is expected that it consumes that amount of memory.
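A quick check of the ratio mentioned above (both values come from the patch hunk):
~~~
# RESOLVER_NTASKS with --with-tuning=large (523) vs. without it (31)
awk 'BEGIN { printf "%.1f times more resolver tasks per view\n", 523/31 }'   # -> 16.9
~~~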

> 
> I compiled ISC 9.12.2-P1 package WITHOUT any specific flag, just keep
> default. I used our files/data config including views and memory usage never
> eat more than 4% even after multiple rndc reload.
> Now, I compiled ISC 9.12.2-P1 WITH --with-tuning=large and did same test
> with same config. When named starts, it eats ~24% memory. After first rndc
> reload 3 seconds later, memory usage increases to 46%

This is also expected due to the internal behavior during reload. I ran some tests and was able to reproduce ~1.4G of residual memory right after the named process started and ~2G after a few 'rndc reload's; after another thousand 'rndc reload' commands it stayed at that level and didn't increase.

If it kept increasing over time for you, that would be a bug, but according to my testing it doesn't happen. Of course, I cannot fully simulate your environment, so if it keeps increasing for you, please let me know. The top command outputs you have provided so far don't show that. I would need to see a top output directly after the start, and then preferably every five minutes for 24 hours, to have enough evidence to say that this looks like a memory leak. For example, you can use the command below to gather such information (it appends the top output to the file top-named.txt every 5 minutes for 24 hours):

~~~
# timeout -s INT 24h /bin/bash -c 'while true; do top -p `pidof named` -b -n 1 | tee -a top-named.txt; sleep 300; done'
~~~

Otherwise, the higher memory consumption in your setup scenario (17 views) is expected.

> 
> So, the only difference is this compilation flag you have by default used on
> RH package.
> Please correct me if I'm wrong, from what I understood, there is no way to
> disable it, there is no named option...

No, there is no way to tune this at runtime; the values are hardcoded.

(In reply to saturninlepoulet from comment #11)
> Hi,
> 
> So what do you of removing --with-tuning=large from RH build ?
> Or adding an option to turn on/off this setting ?
> 
> Thanks

No, there is no plan to do so, since we introduced it on purpose to fix bug 1464850.

On the other hand, it sounds like something that could be made runtime-configurable. However, this is definitely not a candidate for RHEL 7, but rather RHEL 8: the 7.8 release is supposed to be in Maintenance Phase 1, and the subsystem team does not plan to include any new features or rebases in that release.

Anyway, for such a change to get into RHEL 8, it would need to land upstream and in Fedora first. So I would advise you to open a new BZ against Fedora to introduce runtime configuration directive(s) to influence the settings introduced by bind99-rh1464850.patch.

Anyway, if you don't provide evidence of a memory leak within one month, I will (have to) close this BZ as NOTABUG.

Comment 16 George 2019-07-26 08:10:37 UTC
Hello,

I have a similar issue but without using a single view.

centos 7

 rndc status
version: 9.9.4-RedHat-9.9.4-74.el7_6.1 (BIND) <id:8f9657aa>
CPUs found: 36
worker threads: 36
UDP listeners per interface: 36
number of zones: 2764
debug level: 0
xfers running: 0
xfers deferred: 0
soa queries in progress: 0
query logging is OFF
recursive clients: 0/0/1000
tcp clients: 7/150
server is up and running



  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+   SWAP COMMAND
240615 named     20   0 5768.3m   1.6g   3.4m S   1.8  1.7 445:28.22   2.3g named


 named-checkconf -p 
options {
        bindkeys-file "/etc/named.iscdlv.key";
        session-keyfile "/run/named/session.key";
        directory "/var/named";
        dump-file "/var/named/data/cache_dump.db";
        listen-on port 53 {
                "any";
        };
        listen-on-v6 port 53 {
                "none";
        };
        managed-keys-directory "/var/named/dynamic";
        memstatistics-file "/var/named/data/named_mem_stats.txt";
        pid-file "/run/named/named.pid";
        statistics-file "/var/named/data/named_stats.txt";
        version "BIND";
        dnssec-enable yes;
        dnssec-validation yes;
        recursion yes;
        also-notify {
                XXX.XXXX.XXX.XXX ;
        };
        notify yes;
};
logging {
        channel "default_debug" {
                file "data/named.run";
                severity warning;
        };
        category "default" {
                "default_debug";
        };
};
zone "." IN {
        type hint;
        file "named.ca";
};
zone "localhost.localdomain" IN {
        type master;
        file "named.localhost";
        allow-update {
                "none";


Restarting named took more than a minute; from the log:
grep named /var/log/messages
Jul 26 02:50:22 s15 systemd: named.service stop-sigterm timed out. Killing.
Jul 26 02:50:23 s15 systemd: named.service: main process exited, code=killed, status=9/KILL
Jul 26 02:50:23 s15 systemd: Unit named.service entered failed state.
Jul 26 02:50:23 s15 systemd: named.service failed.
Jul 26 02:50:29 s15 named[102298]: starting BIND 9.9.4-RedHat-9.9.4-74.el7_6.1 -u named -c /etc/named.conf -4

top output after restart:

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+   SWAP COMMAND
102298 named     20   0 3002.0m 268.9m   3.6m S   2.7  0.3   0:10.82   0.0m named



====================================
centos 6

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  SWAP COMMAND
 6939 named     20   0 2289m 159m 2864 S  0.0  0.2 179:43.40  18m named

rndc status
version: 9.8.2rc1-RedHat-9.8.2-0.68.rc1.el6_10.3 (BIND)
CPUs found: 28
worker threads: 28
number of zones: 2305
debug level: 0
xfers running: 0
xfers deferred: 0
soa queries in progress: 0
query logging is OFF
recursive clients: 2/0/1000
tcp clients: 6/100
server is up and running

named-checkconf -p
options {
        bindkeys-file "/etc/named.iscdlv.key";
        directory "/var/named";
        dump-file "/var/named/data/cache_dump.db";
        listen-on port 53 {
                "any";
        };
        listen-on-v6 port 53 {
                "none";
        };
        managed-keys-directory "/var/named/dynamic";
        memstatistics-file "/var/named/data/named_mem_stats.txt";
        statistics-file "/var/named/data/named_stats.txt";
        version "BIND";
        dnssec-enable yes;
        dnssec-validation yes;
        recursion yes;
        allow-transfer {
                XXX.XXX.XXX.XXX/32;
        };
        also-notify {
                XXX.XXX.XXX.XXX;
        };
        notify yes;
};
logging {
        channel "default_debug" {
                file "data/named.run";
                severity warning;
        };
        category "default" {
                "default_debug";
        };
};
zone "." IN {
        type hint;
        file "named.ca";
};


This is not normal at all.

Comment 17 saturninlepoulet 2019-07-29 12:54:27 UTC
Hello Martin,

Thank you for your answer and explanation.
Please find the results of the run you asked for.
During those 24 hours, named did not crash, even after the reload that follows log rotation. It does not crash every time, but it still crashes frequently, and sometimes after a manual reload.

I'll attach the results to this bug because of the 65,000-character limit.

Thanks!

Comment 18 saturninlepoulet 2019-07-29 12:55:31 UTC
Created attachment 1594261 [details]
Results related to comment 13

Comment 19 saturninlepoulet 2019-09-06 15:35:07 UTC
Hello Martin,

Any news about that?

Thanks

Comment 20 Martin Osvald 🛹 2019-09-24 09:17:52 UTC
Hello,

I am sorry for the delay in my reply.

Short version:

Please, could you try to:

1. set the MALLOC_ARENA_MAX environment variable to 1
2. on the same terminal, restart bind so it inherits this variable (see the sketch after step 4 if named runs as a systemd unit)
3. then do the usual steps to reproduce the high memory consumption (i.e. let it run for some time or run the rndc reload command multiple times):

~~~
# export MALLOC_ARENA_MAX=1
# service named restart
# rndc reload
# rndc reload
...
~~~

4. and then check whether the residual memory still keeps growing significantly.
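If named runs as a systemd unit (as named-chroot does on RHEL 7), an exported shell variable may not reach the daemon through systemctl, so here is a minimal sketch of setting it via a drop-in instead (the drop-in path is an assumption; adjust the unit name for named.service vs. named-chroot.service):
~~~
# Make MALLOC_ARENA_MAX=1 part of the unit's own environment
mkdir -p /etc/systemd/system/named-chroot.service.d
cat > /etc/systemd/system/named-chroot.service.d/malloc.conf <<'EOF'
[Service]
Environment=MALLOC_ARENA_MAX=1
EOF
systemctl daemon-reload
systemctl restart named-chroot
~~~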

If it still keeps growing, there is a memory leak and we will have to investigate further; but if it stays at the same level over a longer period of time, it would be good to tune the MALLOC_ARENA_MAX variable to limit the number of created arenas and prevent the OOM killer from being invoked.

Normally, malloc() tries to create a new arena for every newly created thread until it hits the limit calculated by the glibc code below; this speeds up allocation when many threads run concurrently by minimizing thread contention. In short, the limit is "number of cores * X", where X is 2 or 8 depending on whether the machine is 32-bit or 64-bit, respectively:

~~~
glibc-2.17-c758a686/malloc/malloc.c:
1806 # define NARENAS_FROM_NCORES(n) ((n) * (sizeof(long) == 4 ? 2 : 8))

glibc-2.17-c758a686/malloc/arena.c:
 932 arena_get2(mstate a_tsd, size_t size, mstate avoid_arena)
 933 {
...
 948               int n  = __get_nprocs ();
 949 
 950               if (n >= 1)
 951                 narenas_limit = NARENAS_FROM_NCORES (n);
 952               else
 953                 /* We have no information about the system.  Assume two
 954                    cores.  */
 955                 narenas_limit = NARENAS_FROM_NCORES (2);
~~~
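As an illustration of the formula above, the default arena limit on a 64-bit host can be estimated like this (assuming no MALLOC_ARENA_MAX override is set):
~~~
# NARENAS_FROM_NCORES(n) = n * 8 on 64-bit, e.g. 36 cores -> up to 288 arenas
echo "up to $(( $(nproc) * 8 )) malloc arenas on this host"
~~~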

For the rndc reload command, the named process remains running (it is not terminated) and only the threads get re-spawned, which means the previously created arenas stay allocated and are reused; that is why the memory footprint is so big.

The size of RES memory is also influenced by how often madvise(MADV_DONTNEED) is called by the malloc code and whether the kernel acts accordingly. madvise(MADV_DONTNEED) is used to inform the kernel that parts of memory are no longer needed and that it can reclaim them as free. From previous experience working on Bug 1583218, there might be a problem on the kernel side that reveals itself only under very intensive workloads, where freeing the memory doesn't lower RES, but there is no proof of that so far in this BZ.


Longer version:

I investigated this further and it seems the memory is consumed by malloc arenas.

I restarted bind and with each rndc reload I obtained a pmap output, to compare them step by step:

~~~
   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
fresh start:
118139 named     20   0 1440356   1.0g   2848 S   0.0 27.6   0:39.83 named
1st restart:
118139 named     20   0 1843712   1.4g   2920 S   0.0 38.8   0:44.18 named
2nd restart:
118139 named     20   0 2166712   1.7g   2920 S   0.0 47.3   0:48.08 named
3rd restart:
118139 named     20   0 2625464   2.1g   2920 S   0.0 58.4   0:56.43 named
~~~

1st restart:

~~~
# diff -y --suppress-common-lines pmap-x-118139.1.txt pmap-x-118139.2.txt
...
                                                              > 00007fff8c000000   65524   65512   65512 rw---   [ anon ]
                                                              > 00007fff8fffd000      12       0       0 -----   [ anon ]
                                                              > 00007fff90000000   65528   65516   65516 rw---   [ anon ]
                                                              > 00007fff93ffe000       8       0       0 -----   [ anon ]
                                                              > 00007fff94000000   65528   65516   65516 rw---   [ anon ]
                                                              > 00007fff97ffe000       8       0       0 -----   [ anon ]
                                                              > 00007fff9b3d0000   12480   12292   12292 rw---   [ anon ]
                                                              > 00007fff9c000000   65528   65520   65520 rw---   [ anon ]
                                                              > 00007fff9fffe000       8       0       0 -----   [ anon ]
                                                              > 00007fffa0008000  131040  130984  130984 rw---   [ anon ]
                                                              > 00007fffa8000000   65528   65520   65520 rw---   [ anon ]
                                                              > 00007fffabffe000       8       0       0 -----   [ anon ]
                                                              > 00007fffac004000   65520   65512   65512 rw---   [ anon ]
...
#
~~~

2nd restart:

~~~
# diff -y --suppress-common-lines pmap-x-118139.2.txt pmap-x-118139.3.txt
...

                                                              > 00007fff74000000   65512   65500   65500 rw---   [ anon ]
                                                              > 00007fff77ffa000      24       0       0 -----   [ anon ]
                                                              > 00007fff78000000   65524   65516   65516 rw---   [ anon ]
                                                              > 00007fff7bffd000      12       0       0 -----   [ anon ]
                                                              > 00007fff7c000000   65512   65500   65500 rw---   [ anon ]
                                                              > 00007fff7fffa000      24       0       0 -----   [ anon ]
                                                              > 00007fff80000000   65516   65504   65504 rw---   [ anon ]
                                                              > 00007fff83ffb000      20       0       0 -----   [ anon ]
                                                              > 00007fff84000000   65528   16728   16728 rw---   [ anon ]
...
~~~

3rd restart:

~~~
# diff -y --suppress-common-lines pmap-x-118139.3.txt pmap-x-118139.4.txt
...
                                                              > 00007fff67ffa000      24       0       0 -----   [ anon ]
                                                              > 00007fff68000000   65516   65504   65504 rw---   [ anon ]
                                                              > 00007fff6bffb000      20       0       0 -----   [ anon ]
                                                              > 00007fff6c000000   65512   65500   65500 rw---   [ anon ]
                                                              > 00007fff6fffa000      24       0       0 -----   [ anon ]
                                                              > 00007fff70000000   65532   65520   65520 rw---   [ anon ]
                                                              > 00007fff73fff000       4       0       0 -----   [ anon ]
                                                              > 00007fff74000000   65508   65500   65500 rw---   [ anon ]
                                                              > 00007fff77ff9000      28       0       0 -----   [ anon ]
                                                              > 00007fff78000000   65516   65504   65504 rw---   [ anon ]
                                                              > 00007fff7bffb000      20       0       0 -----   [ anon ]
...
~~~

From the above we can see that the RES memory growth is due to the newly created malloc arenas.
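For anyone who wants to repeat this comparison, a minimal sketch of the procedure described above (the file naming mirrors the outputs shown; only standard pmap/diff/rndc are used):
~~~
PID=$(pidof named)
pmap -x "$PID" > "pmap-x-$PID.1.txt"    # baseline after a fresh start
rndc reload
pmap -x "$PID" > "pmap-x-$PID.2.txt"    # after the first reload
diff -y --suppress-common-lines "pmap-x-$PID.1.txt" "pmap-x-$PID.2.txt"
~~~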

Comment 21 Martin Osvald 🛹 2019-09-24 09:31:26 UTC
Forgot to set needinfo to get a reply to comment 20.

Comment 22 saturninlepoulet 2019-09-26 18:08:08 UTC
Hi Martin,

I followed your steps:
# export MALLOC_ARENA_MAX=1
# service named restart
# rndc reload
# rndc reload

named quickly consumed a huge amount of memory.

I left named running for several days, and it performed rndc reload by itself overnight.
Right now, here is the status:
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 6707 named     20   0 5067120   3.2g   1640 S   1.3 85.7  25:08.15 named

As you can see, ~85% of memory is still consumed


I just restarted named and executed a few rndc reload commands. As you can see, RES increased very quickly after each reload:
# systemctl restart named-chroot
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
26382 named     20   0 1297492   1.0g   3588 S   2.6 27.0   0:01.25 named

# rndc reload
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
26382 named     20   0 2195236   1.9g   3616 S   0.0 51.3   0:02.74 named

# rndc reload
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
26382 named     20   0 3167096   2.8g   3624 S   2.6 76.4   0:04.42 named

# rndc reload
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
26382 named     20   0 3822456   3.4g   1920 S   0.0 91.6   0:06.93 named

Comment 23 Martin Osvald 🛹 2019-09-27 07:56:47 UTC
Hello,

Thank you for the results!

The reason I asked you to set MALLOC_ARENA_MAX=1 is that if the memory in that single arena gets exhausted, every follow-up allocation is done through mmap() with a size of one page, which can lead (and in your situation does lead) to such a quick escalation of the memory footprint; this makes memory leaks easier to spot.

I am currently investigating it with a stap script I developed for these situations, and will see if I can spot the allocation routines on my test machine to find out what is probably not being deallocated during the termination routines. If I am not successful, I will provide you with the steps and the script to gather further investigation data.

I will try to get back to you soon.

Comment 26 Petr Menšík 2020-01-14 11:49:16 UTC
*** Bug 1692940 has been marked as a duplicate of this bug. ***

Comment 27 Petr Menšík 2020-01-14 12:05:49 UTC
A similar issue was reported upstream[1], still without any resolution. It includes useful hints on debugging memory allocations. I have tried the pmap tool, which gives some good numbers; it is probably just a frontend for the kernel pages mentioned there. But that issue is not related to the --with-tuning=large build option.

1. https://gitlab.isc.org/isc-projects/bind9/issues/446

Comment 30 saturninlepoulet 2020-02-03 17:17:10 UTC
Hi guys,

I tested with the latest bind-chroot package available for CentOS 8.1: same issue, after a few rndc reload commands almost all memory is consumed...

[root@localhost ~]# cat /etc/redhat-release
CentOS Linux release 8.1.1911 (Core)
[root@localhost ~]# 
[root@localhost ~]# rpm -qa | grep bind
bind-export-libs-9.11.4-26.P2.el8.x86_64
bind-license-9.11.4-26.P2.el8.noarch
bind-chroot-9.11.4-26.P2.el8.x86_64
bind-libs-lite-9.11.4-26.P2.el8.x86_64
bind-9.11.4-26.P2.el8.x86_64
bind-libs-9.11.4-26.P2.el8.x86_64
[root@localhost ~]# 
[root@localhost ~]# rndc status
version: BIND 9.11.4-P2-RedHat-9.11.4-26.P2.el8 (Extended Support Version) <id:7107deb> (version.bind/txt/ch disabled)
running on localhost.localdomain: Linux x86_64 4.18.0-147.3.1.el8_1.x86_64 #1 SMP Fri Jan 3 23:55:26 UTC 2020
boot time: Mon, 03 Feb 2020 17:13:59 GMT
last configured: Mon, 03 Feb 2020 17:14:00 GMT
configuration file: /etc/named.conf (/var/named/chroot/etc/named.conf)
CPUs found: 2
worker threads: 2
UDP listeners per interface: 1
number of zones: 4255 (4158 automatic)
debug level: 0
xfers running: 0
xfers deferred: 0
soa queries in progress: 0
query logging is ON
recursive clients: 0/900/1000
tcp clients: 3/150
server is up and running

Comment 34 saturninlepoulet 2020-02-11 14:09:56 UTC
Hi guys,

Since the issue started with the --with-tuning=large flag in the compiled package, could you please remove that flag and/or provide a configuration setting that allows us to enable/disable that feature? It would solve the performance issue immediately...

Thanks

Comment 36 Tomáš Hozza 2020-02-11 14:54:07 UTC
(In reply to saturninlepoulet from comment #34)
> Hi guys,
> 
> Since issue started following --with-tuning=large flag in compiled package,
> could you please remove that flag and/or provide a configuration settings to
> allow us to enable/disable that feature ? It would solve performance issue
> immediately...
> 
> Thanks

Hello.

Removing the --with-tuning=large flag is not really a solution, because it was added to solve a different issue reported in the past. Upstream currently does not provide a way to configure this at runtime. The team is working on optimizing the memory handling, as the issue seems to be somehow related to the specific behavior of glibc malloc. We are actively working on a solution, although it may not seem so from the outside.

Comment 44 Vitalijs Volodenkovs 2020-05-18 13:25:06 UTC
hi guys,

Any news on this issue?
Or is the temporary workaround to compile without the "--with-tuning=large" option?

Thanks

Comment 45 Vitalijs Volodenkovs 2020-05-19 07:13:45 UTC
Hey,

So, for us the source of the problem probably wasn't the "--with-tuning=large" option.
We are using Zabbix to get status and metrics from bind via:
statistics-channels {
       inet 127.0.0.1 port 8653 allow { 127.0.0.1; };
};

Every time Zabbix tries to gather metrics from bind, memory usage increases.
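A minimal sketch of reproducing this polling pattern by hand, assuming the statistics channel above is reachable on 127.0.0.1:8653 (the polling interval and the RSS read-out are illustrative only):
~~~
while true; do
  curl -s http://127.0.0.1:8653/ > /dev/null   # hit the statistics channel like the monitoring check does
  ps -o rss= -p "$(pidof named)"               # print named RES (in kB) after each poll
  sleep 10
done
~~~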

So, check your configuration; maybe you have a similar issue.

Regards,

Comment 46 Tomas Korbar 2020-05-19 07:30:17 UTC
(In reply to Vitalijs Volodenkovs from comment #44)
> hi guys,
> 
> Any new for this issue? 
> Or temp workaround is to compile without "--with-tuning=large" option?
> 
> Thanks

Hi,
We are currently trying to resolve this issue with upstream in https://gitlab.isc.org/isc-projects/bind9/issues/1693.
Compiling without "--with-tuning=large" is not really an option because it would cause issues elsewhere. We will be able to fix this once upstream states their opinion and has time to do a proper analysis.

Comment 52 Chris Williams 2020-11-11 21:54:59 UTC
Red Hat Enterprise Linux 7 shipped its final minor release on September 29th, 2020. 7.9 was the last minor release scheduled for RHEL 7.
From initial triage it does not appear that the remaining Bugzillas meet the inclusion criteria for Maintenance Phase 2, so they will now be closed.

From the RHEL life cycle page:
https://access.redhat.com/support/policy/updates/errata#Maintenance_Support_2_Phase
"During Maintenance Support 2 Phase for Red Hat Enterprise Linux version 7,Red Hat defined Critical and Important impact Security Advisories (RHSAs) and selected (at Red Hat discretion) Urgent Priority Bug Fix Advisories (RHBAs) may be released as they become available."

If this BZ was closed in error and meets the above criteria, please re-open it, flag it for 7.9.z, provide suitable business and technical justifications, and follow the process for Accelerated Fixes:
https://source.redhat.com/groups/public/pnt-cxno/pnt_customer_experience_and_operations_wiki/support_delivery_accelerated_fix_release_handbook  

Feature Requests can be re-opened and moved to RHEL 8 if the desired functionality is not already present in the product.

Please reach out to the applicable Product Experience Engineer[0] if you have any questions or concerns.  

[0] https://bugzilla.redhat.com/page.cgi?id=agile_component_mapping.html&product=Red+Hat+Enterprise+Linux+7

