Bug 1259942
| Summary: | coreutils "sort -M" memory leak | |||
|---|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Sam Elstob <sam.elstob> | |
| Component: | coreutils | Assignee: | Kamil Dudka <kdudka> | |
| Status: | CLOSED NEXTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | |
| Severity: | medium | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 21 | CC: | admiller, burhan.ali, kdudka, kzak, ooprala, ovasik, pbrady, p, twaugh | |
| Target Milestone: | --- | |||
| Target Release: | --- | |||
| Hardware: | x86_64 | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | 8.23-11.fc22 | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1540059 (view as bug list) | Environment: | ||
| Last Closed: | 2015-09-21 10:47:59 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1540059 | |||
|
Description
Sam Elstob
2015-09-03 21:25:34 UTC
> Additional info:
>
> This issue was found when a set of machines were receiving intermittent OOM
> killer triggers. We tracked it down to a "sort -M <large file>" command being
> run as root which was using all the available memory.
To clarify, the <large file> that led to finding this issue was around 1GB of syslog text data.
I believe the test case I have included in the previous comment demonstrates that the memory leak exists regardless of the input size.
I've not looked into it, but it does seem restricted to the i18n patch.
There is an amazing number of allocs to begin with,
never mind that the number of frees doesn't match.
$ valgrind sort -M 10000.txt > /dev/null
HEAP SUMMARY:
in use at exit: 92,797,172 bytes in 482,074 blocks
total heap usage: 723,145 allocs, 241,071 frees, 186,052,277 bytes allocated
LEAK SUMMARY:
definitely lost: 92,794,835 bytes in 482,054 blocks
indirectly lost: 1,088 bytes in 2 blocks
possibly lost: 1,001 bytes in 4 blocks
still reachable: 248 bytes in 14 blocks
$ valgrind sort-upstream -M 10000.txt > /dev/null
HEAP SUMMARY:
in use at exit: 272 bytes in 15 blocks
total heap usage: 53 allocs, 38 frees, 74,696,851 bytes allocated
LEAK SUMMARY:
definitely lost: 24 bytes in 1 blocks
indirectly lost: 0 bytes in 0 blocks
possibly lost: 0 bytes in 0 blocks
still reachable: 248 bytes in 14 blocks
$ export LC_ALL=C
$ valgrind sort -M 10000.txt > /dev/null
HEAP SUMMARY:
in use at exit: 1,344 bytes in 6 blocks
total heap usage: 12 allocs, 6 frees, 74,693,156 bytes allocated
LEAK SUMMARY:
definitely lost: 56 bytes in 2 blocks
indirectly lost: 1,088 bytes in 2 blocks
possibly lost: 0 bytes in 0 blocks
still reachable: 200 bytes in 2 blocks
suppressed: 0 bytes in 0 blocks
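The contents of 10000.txt are not shown in this report. As a minimal sketch for anyone wanting a comparable input, the helper below prints syslog-like lines that start with a month abbreviation. The function name `make_month_input` and the line layout are assumptions for illustration only; this is not the reporter's test case.

```shell
# make_month_input N: print N syslog-like lines, each starting with a month
# abbreviation so that "sort -M" has month keys to compare.
# Hypothetical helper; the line format is an assumption, not the original input.
make_month_input() {
    awk -v n="$1" 'BEGIN {
        split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", m, " ")
        for (i = 0; i < n; i++)
            printf "%s %2d 12:00:00 host test: line %d\n", m[(i * 7) % 12 + 1], i % 28 + 1, i
    }'
}

# Show a few sample lines.
make_month_input 3
```

Redirecting `make_month_input 10000` into a file gives an input that can be passed to `valgrind sort -M` with and without `LC_ALL=C` to compare the heap summaries.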
Yes, likely an issue with the i18n patch, as Pádraig wrote. This patch is a nightmare and we recommend using the C locale when running scripts, as this patch heavily affects performance, memory consumption (because of the leaks) and reliability. We don't want to invest too much time in i18n patch improvements, as there is an effort to do an upstreamable rewrite. Anyway, Ondrej Oprala made several improvements to the sort part of this patch, so I am keeping this open for investigation.

I've confirmed that "LC_ALL=C" is a workaround in our case. Could you please confirm which cases are affected by the memory leak, as we need to apply this workaround to a number of production systems?
1. sort
2. sort -M
3. Any other "sort" parameters?
If the bug only affects "sort -M" then we can make a more focussed change.
Many thanks

The issue should only be with sort -M, I think. I had a quick look and this should fix it: https://github.com/pixelb/coreutils/commit/fbbe8c06

Thanks Pádraig, the patch makes sense to me. Should I inform Bernie or do you plan to do that? As to "what's affected": I think LC_ALL=C is generally a better idea for sorting in a production environment, unless you rely on some locale-specific collation rules. It is definitely safer and faster, as the i18n patch is downstream and has a history of various issues. This particular bug is limited to sort -M only, but we can't rule out that there are other buggy parameters in the multibyte path.

I'll inform Bernie (SUSE). Yep, LC_ALL=C is definitely recommended anyway. I updated with a test at https://github.com/pixelb/coreutils/commit/4526a88d I also found a crash in this function with some inputs! That's fixed in: https://github.com/pixelb/coreutils/commit/0ca5ebdb

Thank you for the patches, Pádraig! I will get them to Fedora...

fixed in coreutils-8.24-4.fc24

coreutils-8.24-4.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2015-16076

coreutils-8.24-4.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
 su -c 'yum --enablerepo=updates-testing update coreutils'.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2015-16076

coreutils-8.23-11.fc22 has been pushed to the Fedora 22 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
 su -c 'yum --enablerepo=updates-testing update coreutils'.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2015-16075

coreutils-8.24-4.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.

coreutils-8.23-11.fc22 has been pushed to the Fedora 22 stable repository. If problems still persist, please make note of it in this bug report.
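For systems that cannot take the update yet, the LC_ALL=C workaround discussed in this report can be scoped to the sort invocation alone rather than exported for a whole script. The following is only a sketch: `sort_by_month` is a hypothetical wrapper name, not part of coreutils, and it assumes the script does not depend on locale-specific month names or collation.

```shell
# Run "sort -M" under the C locale for this one command only, leaving the
# locale of the surrounding script untouched.
# sort_by_month is a hypothetical wrapper name, not a coreutils command.
sort_by_month() {
    LC_ALL=C sort -M -- "$@"
}

# With no file arguments the wrapper reads standard input.
printf 'Mar\nJan\nFeb\n' | sort_by_month
```

Prefixing the assignment to the command keeps the change local, which may be easier to audit on production systems than a global `export LC_ALL=C`.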