Bug 1540059 - coreutils "sort -M" memory leak
Summary: coreutils "sort -M" memory leak
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: coreutils
Version: 7.5
Hardware: All
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Kamil Dudka
QA Contact: Radka Brychtova
URL:
Whiteboard:
Depends On: 1259942
Blocks: 1546552 1549617 1549689
Reported: 2018-01-30 07:51 UTC by Kamil Dudka
Modified: 2018-10-30 11:10 UTC (History)

Fixed In Version: coreutils-8.22-23.el7
Doc Type: No Doc Update
Doc Text:
Clone Of: 1259942
Environment:
Last Closed: 2018-10-30 11:10:22 UTC
Target Upstream Version:
Embargoed:


Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHBA-2018:3203  0        None      None    None     2018-10-30 11:10:42 UTC

Description Kamil Dudka 2018-01-30 07:51:09 UTC
Also reported on the CentOS-devel mailing list:

https://lists.centos.org/pipermail/centos-devel/2018-January/016439.html


+++ This bug was initially created as a clone of Bug #1259942 +++

Description of problem:

I believe I have encountered a major memory leak in coreutils sort when sorting by month ("-M").

Version-Release number of selected component (if applicable):

[sam@deben coreutils]$ rpm -q coreutils
coreutils-8.22-22.fc21.x86_64
[sam@deben coreutils]$ sort --version
sort (GNU coreutils) 8.22
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Mike Haertel and Paul Eggert.


How reproducible: Every time


Steps to Reproduce:
1. Create a test file

base64 /dev/urandom | head -n 10000 > 10000.txt

2. Run under valgrind (defaults)

valgrind sort 10000.txt > /dev/null

3. Run under valgrind (-M)

valgrind sort -M 10000.txt > /dev/null

Actual results:

[sam@deben coreutils]$ valgrind sort 10000.txt > /dev/null
==8382== Memcheck, a memory error detector
==8382== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==8382== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==8382== Command: sort 10000.txt
==8382== 
==8382== 
==8382== HEAP SUMMARY:
==8382==     in use at exit: 192 bytes in 14 blocks
==8382==   total heap usage: 60 allocs, 46 frees, 74,697,309 bytes allocated
==8382== 
==8382== LEAK SUMMARY:
==8382==    definitely lost: 0 bytes in 0 blocks
==8382==    indirectly lost: 0 bytes in 0 blocks
==8382==      possibly lost: 0 bytes in 0 blocks
==8382==    still reachable: 192 bytes in 14 blocks
==8382==         suppressed: 0 bytes in 0 blocks
==8382== Rerun with --leak-check=full to see details of leaked memory
==8382== 
==8382== For counts of detected and suppressed errors, rerun with: -v
==8382== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)


[sam@deben coreutils]$ valgrind sort -M 10000.txt > /dev/null
==8312== Memcheck, a memory error detector
==8312== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==8312== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==8312== Command: sort -M 10000.txt
==8312== 

==8312== 
==8312== HEAP SUMMARY:
==8312==     in use at exit: 92,753,702 bytes in 481,851 blocks
==8312==   total heap usage: 722,815 allocs, 240,964 frees, 186,001,505 bytes allocated
==8312== 
==8312== LEAK SUMMARY:
==8312==    definitely lost: 92,731,870 bytes in 481,751 blocks
==8312==    indirectly lost: 0 bytes in 0 blocks
==8312==      possibly lost: 21,021 bytes in 78 blocks
==8312==    still reachable: 811 bytes in 22 blocks
==8312==         suppressed: 0 bytes in 0 blocks
==8312== Rerun with --leak-check=full to see details of leaked memory
==8312== 
==8312== For counts of detected and suppressed errors, rerun with: -v
==8312== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)


Expected results:

No "definitely lost" blocks when using -M
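
The expected result above can be checked mechanically. The following is a minimal sketch, assuming the Memcheck report format shown in the transcripts above; the function name definitely_lost_bytes is a hypothetical helper and not part of coreutils or valgrind.

```shell
#!/bin/sh
# Hypothetical helper (not part of coreutils or valgrind): read a valgrind
# Memcheck report on stdin and print the "definitely lost" byte count.
definitely_lost_bytes() {
    # Match the LEAK SUMMARY line, e.g.
    #   ==8312==    definitely lost: 92,731,870 bytes in 481,751 blocks
    # and strip the thousands separators from the byte count.
    sed -n 's/.*definitely lost: \([0-9,]*\) bytes.*/\1/p' | tr -d ','
}

# Usage sketch (requires valgrind, which writes its report to stderr):
#   lost=$(valgrind sort -M 10000.txt 2>&1 >/dev/null | definitely_lost_bytes)
#   echo "definitely lost: ${lost:-no leak summary line found} bytes"
```

The pattern requires the colon after "definitely lost", so it only matches the summary line and not the per-record lines printed with --leak-check=full. If valgrind reports no leak summary at all, the helper prints nothing, so callers should treat empty output as a separate case.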

Additional info:

This issue was found when a set of machines were receiving intermittent OOM killer triggers.  We tracked it down to a "sort -M <large file>" command being run as root which was using all the available memory.

--- Additional comment from Sam Elstob on 2015-09-03 23:31:22 CEST ---

> Additional info:
> 
> This issue was found when a set of machines were receiving intermittent OOM
> killer triggers.  We tracked it down to a "sort -M <large file>" command being
> run as root which was using all the available memory.

To clarify, the <large file> that led to finding this issue was around 1GB of syslog text data.

I believe the test case I have included in the previous comment demonstrates the memory leak exists regardless of the input size.

--- Additional comment from Pádraig Brady on 2015-09-04 01:09:23 CEST ---

I've not looked into it, but it does seem restricted to the i18n patch.
There is an amazing number of allocs to begin with,
never mind that the number of frees doesn't match.

$ valgrind sort -M 10000.txt > /dev/null
  HEAP SUMMARY:
     in use at exit: 92,797,172 bytes in 482,074 blocks
   total heap usage: 723,145 allocs, 241,071 frees, 186,052,277 bytes allocated
  LEAK SUMMARY:
    definitely lost: 92,794,835 bytes in 482,054 blocks
    indirectly lost: 1,088 bytes in 2 blocks 
      possibly lost: 1,001 bytes in 4 blocks 
    still reachable: 248 bytes in 14 blocks

$ valgrind sort-upstream -M 10000.txt > /dev/null
  HEAP SUMMARY:
     in use at exit: 272 bytes in 15 blocks
   total heap usage: 53 allocs, 38 frees, 74,696,851 bytes allocated
  LEAK SUMMARY:
    definitely lost: 24 bytes in 1 blocks 
    indirectly lost: 0 bytes in 0 blocks 
      possibly lost: 0 bytes in 0 blocks 
    still reachable: 248 bytes in 14 blocks

$ export LC_ALL=C
$ valgrind sort -M 10000.txt > /dev/null
  HEAP SUMMARY:
     in use at exit: 1,344 bytes in 6 blocks 
   total heap usage: 12 allocs, 6 frees, 74,693,156 bytes allocated
  LEAK SUMMARY:
    definitely lost: 56 bytes in 2 blocks 
    indirectly lost: 1,088 bytes in 2 blocks 
      possibly lost: 0 bytes in 0 blocks 
    still reachable: 200 bytes in 2 blocks 
         suppressed: 0 bytes in 0 blocks

--- Additional comment from Ondrej Vasik on 2015-09-04 09:06:21 CEST ---

Yes, likely an issue with the i18n patch, as Pádraig wrote. This patch is a nightmare and we recommend using the C locale when running scripts, as this patch heavily affects performance, memory consumption (because of the leaks) and reliability.

We don't want to invest too much time into i18n patch improvements, as there is an effort to do an upstreamable rewrite. Anyway, Ondrej Oprala made several improvements in the sort part of this patch, so I'm keeping this open for investigation.

--- Additional comment from Sam Elstob on 2015-09-08 12:02:21 CEST ---

I've confirmed that "LC_ALL=C" is a workaround in our case.

Could you please confirm which cases are affected by the memory leak, as we need to apply this workaround to a number of production systems?

1. sort
2. sort -M
3. Any other "sort" parameters?

If the bug only affects "sort -M" then we can make a more focussed change.

Many thanks

--- Additional comment from Pádraig Brady on 2015-09-08 14:08:41 CEST ---

The issue should only be with sort -M I think.
I had a quick look and this should fix it:
https://github.com/pixelb/coreutils/commit/fbbe8c06

--- Additional comment from Ondrej Vasik on 2015-09-10 10:07:16 CEST ---

Thanks Pádraig, the patch makes sense to me. Should I inform Bernie or do you plan to do that?

As to "what's affected": I think LC_ALL=C is generally a better idea for sorting in a production environment, unless you rely on some locale-specific collation rules. It is definitely safer and faster, as the i18n patch is downstream and has a history of various issues.
This particular bug is limited to sort -M only, but we can't rule out that there are other buggy parameters in the multibyte path.
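
One way to apply the LC_ALL=C workaround narrowly is to set the variable for the sort invocation only, rather than exporting it for a whole script. This is a minimal sketch; the wrapper name sort_c is a hypothetical choice and does not appear in this report.

```shell
#!/bin/sh
# Hypothetical wrapper (the name sort_c is an assumption): run sort with the
# C locale for this one invocation only, leaving the caller's locale alone.
sort_c() {
    LC_ALL=C sort "$@"
}

# Usage sketch: month sort of English month abbreviations in the C locale.
#   printf 'Mar\nJan\nFeb\n' | sort_c -M
```

The trade-off is the one noted above: in the C locale, sort compares bytes and -M only recognizes the English month abbreviations, so input that depends on locale-specific collation or month names will order differently.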

--- Additional comment from Pádraig Brady on 2015-09-10 12:07:42 CEST ---

I'll inform Bernie (Suse).

Yep LC_ALL=C is definitely recommended anyway

--- Additional comment from Pádraig Brady on 2015-09-11 03:31:06 CEST ---

I updated with a test at https://github.com/pixelb/coreutils/commit/4526a88d

I also found a crash in this function with some inputs!
That's fixed in: https://github.com/pixelb/coreutils/commit/0ca5ebdb

Comment 5 errata-xmlrpc 2018-10-30 11:10:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3203

