Bug 729052 - grep consumes lots of memory when grepping a large *sparse* file
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: grep
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Assignee: Jaroslav Škarvada
QA Contact: Fedora Extras Quality Assurance
Reported: 2011-08-08 15:38 UTC by Richard W.M. Jones
Modified: 2011-10-20 14:25 UTC (History)

Doc Type: Bug Fix
Last Closed: 2011-10-20 14:25:44 UTC


Description Richard W.M. Jones 2011-08-08 15:38:47 UTC
Description of problem:

If you run grep on a large sparse file (e.g. multi-gigabyte),
grep consumes more and more memory.  The system as a whole
becomes progressively less usable.  Eventually grep dies:

grep: memory exhausted

Version-Release number of selected component (if applicable):

grep-2.9-3.fc16.x86_64

How reproducible:

100%

Steps to Reproduce:
1. truncate -s 32G sparse
2. grep foo sparse
Actual results:

System becomes unusable until grep finally dies with

grep: memory exhausted

Expected results:

grep should work a bit better in this case.  After all, this
file is completely empty.

Additional info:

Possibly http://savannah.gnu.org/bugs/?31928  ?

Comment 1 Jaroslav Škarvada 2011-08-16 08:24:07 UTC
> Possibly http://savannah.gnu.org/bugs/?31928  ?
No, it doesn't seem to be related, and the exit code has already been fixed in Fedora.

Upstream ticket: http://savannah.gnu.org/bugs/index.php?34020

Comment 2 Paolo Bonzini 2011-08-29 11:23:23 UTC
> After all, this file is completely empty.

No, it's not: it has a single line consisting of 34 billion NUL characters.  But yes, it should work a bit better.
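
The point above can be checked on a small scale. The following is a minimal sketch, assuming GNU coreutils (truncate, wc, du); the 1 MiB size and the temporary file are illustrative choices, not part of the original report. It shows that a freshly truncated file is all NUL bytes with no newline, so line-oriented tools see one long line, and it works out the byte count of the 32G file.

```shell
# Illustrative only: a 1 MiB sparse file stands in for the 32G one.
tmp=$(mktemp)
truncate -s 1M "$tmp"

size=$(wc -c < "$tmp")     # apparent size in bytes (all NULs)
lines=$(wc -l < "$tmp")    # number of newline characters in the file
du -k "$tmp"               # blocks actually allocated (filesystem dependent)

# 32 GiB expressed in bytes, i.e. the "34 billion" NUL characters.
bytes_32g=$((32 * 1024 * 1024 * 1024))

echo "size=$size lines=$lines bytes_32g=$bytes_32g"
rm -f "$tmp"
```

Because the file contains no newline at all, a tool that buffers a whole line before acting on it has to hold the entire file contents in memory.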

Comment 3 Richard W.M. Jones 2011-08-29 15:45:00 UTC
I should say that I don't really care about grepping large
sparse files.

The problem arises because libtool runs grep (and sed) on all of
its parameters.

In libguestfs this causes a problem, because we often run
commands which come down to:

  libtool --mode=execute guestfish -a test1.img

where 'test1.img' is a huge sparse file.

Therefore either libtool or grep needs to be fixed to deal
with this case.

Comment 4 Richard W.M. Jones 2011-10-20 14:18:02 UTC
FYI there is a similar bug concerning 'sed', 'libtool' and huge
sparse files:
https://bugzilla.redhat.com/show_bug.cgi?id=636045

Comment 5 Paolo Bonzini 2011-10-20 14:25:44 UTC
I think the bug is in libtool, though the observation you made about non-sparse files is interesting.

As mentioned in comment 2, to grep and sed a large sparse file is one huge line consisting entirely of zero bytes, and both tools need to slurp it all into memory (sed in order to edit it; grep in case it has to print it).

Since a bug for the libtool problem is already open, I'm closing this one.
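
To illustrate why the line length is what matters, here is a minimal sketch, assuming GNU coreutils and grep; it is not something proposed in this bug, and the file and pattern are illustrative. Translating NUL bytes to newlines before grep sees the data turns the single huge line into many empty ones, so no single line has to be buffered in full. This changes line boundaries, so it is only suitable when the pattern cannot span NUL bytes.

```shell
# Illustrative only: a small sparse file with one real line appended.
tmp=$(mktemp)
truncate -s 1M "$tmp"
printf 'foo\n' >> "$tmp"

# tr streams its input in fixed-size chunks; grep then sees short lines.
matches=$(tr '\0' '\n' < "$tmp" | grep -c foo)

echo "matching lines: $matches"
rm -f "$tmp"
```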
