Bug 1852032

Summary: CACHESIZE value is probably too high and not customizable through hieradata
Product: Red Hat OpenStack Reporter: David Vallee Delisle <dvd>
Component: instack-undercloud Assignee: James Slagle <jslagle>
Status: CLOSED WONTFIX QA Contact: Arik Chernetsky <achernet>
Severity: high Docs Contact:
Priority: high    
Version: 10.0 (Newton) CC: aschultz, mburns
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2020-06-29 19:28:24 UTC Type: Bug

Description David Vallee Delisle 2020-06-29 15:44:51 UTC
Description of problem:

On the undercloud:
- [1] On OSP10, CACHESIZE in /etc/sysconfig/memcached is set to about 95% of total RAM.

- [2] On OSP13, it's set to about half that amount (roughly 50% of total RAM).

- [3] In our customer's environment, it's set to CACHESIZE="367338" while the host has 128773 MB of RAM. That is almost 3x the total RAM.

We use memcached only to store tokens. At some point it ballooned to ~15G of RAM and processes started getting killed by the OOM killer.

So 3 questions here:
- How is the CACHESIZE calculated?
- Can we override the value in hieradata?
- Considering we're only using memcached for tokens, what would be a good limit for CACHESIZE on larger environments (300+ nodes)? We thought 1G should be enough.

Version-Release number of selected component (if applicable):
Customer env:
- puppet-memcached-2.8.1-1.bfa64e0git.el7ost.noarch
- instack-5.1.0-1.el7ost.noarch
- instack-undercloud-5.3.7-6.el7ost.noarch
- facter-2.4.6-3.el7sat.x86_64
OSP10 lab:
- puppet-memcached-2.8.1-1.bfa64e0git.el7ost.noarch
- instack-5.1.0-1.el7ost.noarch
- instack-undercloud-5.3.7-1.el7ost.noarch
- facter-2.4.4-4.el7.x86_64
OSP13 lab:
- puppet-memcached-3.2.0-2.el7ost.noarch
- instack-8.1.1-0.20180313084440.0d768a3.el7ost.noarch
- instack-undercloud-8.4.9-4.el7ost.noarch
- facter-3.9.3-7.el7ost.x86_64
- ruby-facter-3.9.3-7.el7ost.x86_64



Additional info:

[1]
~~~
[stack@undercloud-0 share]$ grep CACHESIZE /etc/sysconfig/memcached
CACHESIZE="26576"
[stack@undercloud-0 share]$ free -m
              total        used        free      shared  buff/cache   available
Mem:          27979        4875        9408           0       13695       22712
Swap:             0           0           0
~~~

[2]
~~~
[stack@undercloud-0 share]$ grep CACHESIZE /etc/sysconfig/memcached 
CACHESIZE="13987"
[stack@undercloud-0 share]$ free -m
              total        used        free      shared  buff/cache   available
Mem:          27979        9547         228           0       18203       18033
Swap:             0           0           0
~~~

[3]
~~~
$ cat ./0010-sosreport-xxx.02684800-20200622153730.tar.xz/sosreport-sreilly.02684800-20200622153730/sos_commands/memory/free_-m
              total        used        free      shared  buff/cache   available
Mem:         128773      109581        8355           3       10835       18216
Swap:          4095        4092           3
~~~

Comment 3 Alex Schultz 2020-06-29 19:27:36 UTC
We've always used puppet-memcached to configure memcached. It's always been configurable via hieradata as 'memcached::max_memory: <value>'. If <value> ends in a '%' it's interpreted as a percentage; otherwise it's assumed to be a value in MB. Where hieradata overrides are set has changed across releases; they are configurable via undercloud.conf.

In OSP10 it uses 95% if max_memory is not configured (it's not by default). We addressed this because of the OOM killer issue as well, via https://review.opendev.org/#/c/431042/. So in OSP13 the default is 50%.
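
For illustration, the interpretation described above (a trailing '%' means a percentage of RAM, anything else is a value in MB, with 95% as the OSP10 default) can be sketched roughly as follows. This is a minimal sketch and not the actual puppet-memcached code: the function name cachesize_mb is hypothetical, and it assumes the "total" column of free -m as the memory figure, whereas the module reads memory from a fact that may report a slightly different number.

```python
def cachesize_mb(max_memory, total_mb, default="95%"):
    """Return a CACHESIZE value in MB (hypothetical helper, not module code).

    max_memory: None (use the default), a percentage string such as "50%",
                or a plain number of MB such as "1024".
    total_mb:   total RAM in MB; assumed here to be the free -m "total" column.
    """
    value = str(max_memory if max_memory is not None else default).strip()
    if value.endswith("%"):
        # Percentage of total RAM, rounded down to a whole MB.
        percent = float(value[:-1])
        return int(total_mb * percent / 100)
    # No '%' suffix: the value is already in MB.
    return int(value)


# Lab hosts from the description, both with 27979 MB total RAM.
print(cachesize_mb(None, 27979))    # OSP10 default (95%)
print(cachesize_mb("50%", 27979))   # OSP13 default (50%)
print(cachesize_mb("1024", 27979))  # explicit 1G cap
```

With the free -m totals this gives 26580 and 13989, close to the observed 26576 and 13987; the small gap is consistent with the memory source differing slightly from free -m. Applied to the customer host (128773 MB), 95% would be 122334, so the reported CACHESIZE="367338" is not explained by this rule using the free -m total.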

Recommended values depend on the usage on the system and the size of the cloud being deployed. You should be able to get away with 1G for the undercloud if OOM is a problem. That being said, configuring swap (even a small amount) is a good way to prevent OOM kills even if memcached grows.

Comment 4 Alex Schultz 2020-06-29 19:28:24 UTC
Since OSP13 has the hieradata_override configuration option in undercloud.conf and OSP10 is nearing EOL, I'm closing this as WONTFIX since it can be configured.
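
As a rough illustration of the override mentioned above, the sketch below renders the two pieces of text involved: a one-key hieradata file that caps memcached at 1G, and the undercloud.conf line pointing at it. The key memcached::max_memory and the hieradata_override option come from the comments above; the helper names and the file path /home/stack/hieradata-override.yaml are hypothetical, and the placement of the option in the [DEFAULT] section is an assumption to verify against your release.

```python
def render_hieradata_override(max_memory_mb):
    """Render a one-key hieradata file capping memcached at max_memory_mb MB."""
    if int(max_memory_mb) <= 0:
        raise ValueError("max_memory_mb must be a positive number of MB")
    # No '%' suffix, so the value is interpreted as MB.
    return f"memcached::max_memory: {int(max_memory_mb)}\n"


def render_undercloud_conf_line(override_path):
    """Render the undercloud.conf line that points at the override file."""
    return f"hieradata_override = {override_path}\n"


# Hypothetical path; any location readable during the undercloud install works.
override_path = "/home/stack/hieradata-override.yaml"

print(render_hieradata_override(1024), end="")
print(render_undercloud_conf_line(override_path), end="")
```

The first printed line is the content of the override file and the second is the line to add to undercloud.conf (assumed to go under [DEFAULT]).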