Bug 1297502
Summary: | [RFE] Add support for modifying the TCMalloc thread cache | ||
---|---|---|---|
Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Kyle Bader <kbader> |
Component: | Build | Assignee: | Samuel Just <sjust> |
Status: | CLOSED ERRATA | QA Contact: | ceph-qe-bugs <ceph-qe-bugs> |
Severity: | medium | Docs Contact: | Bara Ancincova <bancinco> |
Priority: | unspecified | ||
Version: | 1.3.2 | CC: | bhubbard, ceph-eng-bugs, dzafman, flucifre, gmeno, hnallurv, kbader, kchai, kdreyer, mnelson, racpatel, sjust, vumrao |
Target Milestone: | rc | Keywords: | FutureFeature |
Target Release: | 1.3.2 | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | RHEL: ceph-0.94.5-4.el7cp Ubuntu: ceph_0.94.5-3redhat1trusty | Doc Type: | Enhancement |
Doc Text: |
.TCMalloc thread cache is now configurable
With Red Hat Ceph Storage 1.3.2, support for modifying the size of the `TCMalloc` thread cache has been added. Increasing the thread cache size significantly improves Ceph cluster performance.
To set the thread cache size, edit the value of the `TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES` parameter in the Ceph system configuration
file, that is `/etc/sysconfig/ceph` for Red Hat Enterprise Linux and `/etc/default/ceph` for Ubuntu.
In addition, the default value of `TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES` has been changed from 32 MB to 128 MB.
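For example, the setting as shipped (use `/etc/default/ceph` on Ubuntu):

```
# /etc/sysconfig/ceph (Red Hat Enterprise Linux)
TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=128M
```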
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2016-02-29 14:44:42 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1299303 |
Description
Kyle Bader
2016-01-11 17:09:39 UTC
Gregory, it would be great if we could have this in 1.3.2 — can this be done? Maybe backport https://github.com/ceph/ceph/pull/6732 , which would give us the ability to set this in /etc/sysconfig/ceph (RHEL). For Ubuntu, that PR doesn't touch the upstart files in src/upstart, so we'd need to add something like

[ -f /etc/default/ceph ] && . /etc/default/ceph

to each upstart script, and possibly export TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES as well.

Re-targeting to 1.3.2; let's try to get this into the RHEL packaging if we can.

Are you sure TCMalloc defaults to 32MB when the user specifies nothing? http://gperftools.googlecode.com/svn/trunk/doc/tcmalloc.html seems to indicate it's 16MB. Should we default to any value in /etc/sysconfig/ceph, or leave a line commented out there for users to un-comment?

Mark (or anyone), how can I empirically verify that TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES is taking effect?

(In reply to Ken Dreyer (Red Hat) from comment #5)
> Are you sure TCMalloc defaults to 32MB when the user specifies nothing?
> http://gperftools.googlecode.com/svn/trunk/doc/tcmalloc.html seems to
> indicate it's 16MB.

I see: "The default cache size is 32M, the tcmalloc documentation is outdated" — https://www.mail-archive.com/ceph-devel@vger.kernel.org/msg23575.html

James Page @ Ubuntu has cherry-picked the patch that makes TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES effective. This is in google-perftools 2.1-2ubuntu1.1. So in theory we can implement a solution for both RHEL 7 and Ubuntu Trusty. Still need to know the following:

1) Do we want to choose a default value (greater than 32MB), or let the user decide?
2) How can I empirically verify that TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES is taking effect?
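The upstart change sketched above would amount to something like the following stanza in each job under src/upstart (a sketch only — the job name, exec line, and variables shown are illustrative; the merged PR may differ):

```
# e.g. src/upstart/ceph-osd.conf (hypothetical sketch)
script
    [ -f /etc/default/ceph ] && . /etc/default/ceph
    export TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES
    exec /usr/bin/ceph-osd --cluster "${cluster:-ceph}" -i "$id" -f
end script
```

Sourcing the file inside the `script` stanza and exporting the variable ensures it reaches the daemon's environment, which is what the `ps e` checks later in this bug look for.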
For 2, it doesn't look like we can use the existing memory profiling code to determine the total thread cache size: http://docs.ceph.com/docs/master/rados/troubleshooting/memory-profiling/ We should probably add an admin socket command to inspect the tcmalloc thread cache size, along the lines of:

MallocExtension::instance()->GetNumericProperty("tcmalloc.current_total_thread_cache_bytes", &value);

https://gperftools.googlecode.com/svn/trunk/doc/tcmalloc.html#Sizing_Thread_Cache_Free_Lists

Do we want to report tcmalloc.current_total_thread_cache_bytes or tcmalloc.max_total_thread_cache_bytes? Or both? Who can add that functionality to the admin socket?

Yeah, you're right. We want tcmalloc.max_total_thread_cache_bytes.

I've verified that you can inspect the thread cache size, and interestingly, you can also set it at runtime. This means that we could potentially have the daemon set its own value, based on something in ceph.conf. Example: https://gist.github.com/mmgaggle/a5818d4e8528d3681534 We need a patch to Ceph upstream for this. (Mark, if you're not the best assignee, please re-assign as appropriate.)

Proposed init systems change: https://github.com/ceph/ceph/pull/7304

After discussions with Kyle, Brent, Mark, Neil, and many others, we all agree that the default thread cache should be 128MB. Please change the default setting. I will take care of the release notes and doc bugs associated.

Hi Federico, I have a few questions:

1) It's expected that this fix will improve performance by 4-5x. Is there a need in 1.3.2 to support this by running performance tests? If yes, then we may have to coordinate with Ben Turner and Mark Nelson.
2) As per comment 15, the default thread cache would be 128MB. Do we allow users to change it? If yes, please share the steps to do so on both RHEL and Ubuntu.
3) How do we make sure that whatever value (or the default value) we have set for the thread cache has taken effect on both RHEL and Ubuntu clusters? Need steps/instructions for this.
4) We would be running automated tests on RHEL and some manual tests on Ubuntu with this fix in place. Is there anything else that needs to be tested apart from these (Ken, can you please confirm)?

I feel the scope of testing this fix for now would be to test 2) and/or 3) above [with 4) being regression tests]. Please let me know your opinion. Thanks, Harish

(In reply to Harish NV Rao from comment #16)
> 2) As per comment 15, the default thread cache would be 128MB by default. Do
> we allow users to change it? If yes, please share the steps to do so on both
> RHEL and Ubuntu.

Yes, we will add a "TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=128M" setting in /etc/sysconfig/ceph (RHEL) and /etc/default/ceph (Ubuntu). Users will be allowed to edit this setting to "64MB", for example, if they wish.

> 3) How to make sure that whatever the value(or default value) we have set
> for thread cache has taken into effect on both RHEL and Ubuntu clusters?
> Need steps/instructions for this.

On your OSDs, check the output of "ps e -p <ceph-osd-pid>". For example, this checks all the OSD pids on a system:

ps e -p $(pgrep ceph-osd) | grep --color=auto TCMALLOC

It may be a big wall of text that is hard to read, so "--color" helps there. If "TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=128M" is in the output, you will know that it is in effect.

> 4) We would be running automated tests on the RHEL and some manual tests on
> Ubuntu with this fix in place. Is there anything else that need to be tested
> apart from these (Ken, can you please confirm here?) ?

Not that I can think of. If it's not already in the regression tests for gperftools, we might want to use this test to ensure the allocator is honoring the environment variable: https://launchpadlibrarian.net/202635014/gperftest.c

To be clear to QE, things to check with this bug:

1.
After installing the ceph-osd packages, verify that /etc/default/ceph (Ubuntu) or /etc/sysconfig/ceph (RHEL) contains a TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES setting of 128M out of the box.

2. After starting the OSD service, verify that TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES is part of the OSD pid's environment. Run "ps e -p <ceph-osd-pid>". For example, this checks all the OSD pids on a system:

ps e -p $(pgrep ceph-osd) | grep --color=auto TCMALLOC

It may be a big wall of text that is hard to read, so "--color" helps there. If "TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=128M" is in the output, you will know that it is in effect.

3. Change the value to something else (e.g. 64M), restart the daemons, and check again with "ps" that the environment variable reflects the new "64M" value.

Verified as described in comment 21 on a RHEL machine: the default value is 128MB, and it was changed to 64MB, 32MB, and back to 128MB. Working as expected, hence moving to VERIFIED. Version: ceph-osd-0.94.5-4.el7cp.x86_64

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2016:0313

Upstream change to 128MB by default: https://github.com/ceph/ceph/pull/7934
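The QE checks above can be sketched as one small script. This is a sketch under assumptions: the CONF path and the `ps e`/`pgrep` commands come from the comments in this bug, while the script structure and its messages are mine; restarting the daemons for step 3 is left to the local init system.

```shell
#!/bin/sh
# Sketch of the QE verification steps from this bug.
# On Ubuntu, run with CONF=/etc/default/ceph.
CONF="${CONF:-/etc/sysconfig/ceph}"
VAR=TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES

# Step 1: the packaged default should be 128M out of the box.
grep "^${VAR}=128M" "$CONF" || echo "default ${VAR}=128M not found in $CONF"

# Step 2: every running OSD should carry the variable in its environment.
for pid in $(pgrep ceph-osd); do
    ps e -p "$pid" | grep -o "${VAR}=[^ ]*"
done

# Step 3: edit $CONF (e.g. to 64M), restart the OSD daemons with your
# init system, then re-run step 2 and expect the new value.
```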