Bug 729052

Summary: grep consumes lots of memory when grepping a large *sparse* file
Product: [Fedora] Fedora
Reporter: Richard W.M. Jones <rjones>
Component: grep
Assignee: Jaroslav Škarvada <jskarvad>
Status: CLOSED NOTABUG
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: rawhide
CC: jskarvad, lkundrak, pbonzini
Last Closed: 2011-10-20 10:25:44 EDT

Description Richard W.M. Jones 2011-08-08 11:38:47 EDT
Description of problem:

If you run grep on a large sparse file (e.g. multi-gigabyte),
grep consumes more and more memory.  The system as a whole
becomes increasingly unusable.  Eventually grep dies:

grep: memory exhausted

Version-Release number of selected component (if applicable):

grep-2.9-3.fc16.x86_64

How reproducible:

100%

Steps to Reproduce:
1. truncate -s 32G sparse
2. grep foo sparse
  
Actual results:

System becomes unusable until grep finally dies with

grep: memory exhausted

Expected results:

grep should work a bit better in this case.  After all, this
file is completely empty.

Additional info:

Possibly http://savannah.gnu.org/bugs/?31928  ?

Comment 1 Jaroslav Škarvada 2011-08-16 04:24:07 EDT
> Possibly http://savannah.gnu.org/bugs/?31928  ?
No, it doesn't seem to be related, and the exit code has already been fixed in Fedora.

Upstream ticket: http://savannah.gnu.org/bugs/index.php?34020

Comment 2 Paolo Bonzini 2011-08-29 07:23:23 EDT
> After all, this file is completely empty.

No, it's not: it has a single line consisting of 34 billion NUL characters.  But yes, it should work a bit better.
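
To make the "not empty" point concrete, here is a minimal Python sketch (not part of the original report). It creates a much smaller sparse file than the 32G one above, then reports its apparent size, the space actually allocated on disk, and what a reader sees when reading it back. The function name `describe_sparse_file` and the 64 MiB demo size are illustrative assumptions, and `st_blocks` is assumed to be available, as it is on Linux.

```python
import os
import tempfile


def describe_sparse_file(size, chunk_size=1 << 20):
    """Create a sparse file of `size` bytes and report what a reader sees.

    Hypothetical helper for illustration only; not taken from grep or libtool.
    """
    fd, path = tempfile.mkstemp()
    try:
        # Extend the file without writing data, like `truncate -s`.
        os.ftruncate(fd, size)
        st = os.fstat(fd)

        only_nul = True
        newlines = 0
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            newlines += chunk.count(b"\n")
            if chunk.count(b"\0") != len(chunk):
                only_nul = False

        # st_blocks is counted in 512-byte units where it is available.
        blocks = getattr(st, "st_blocks", None)
        return {
            "apparent_bytes": st.st_size,
            "allocated_bytes": None if blocks is None else blocks * 512,
            "only_nul": only_nul,
            "newlines": newlines,
        }
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    # 64 MiB keeps the demo quick; the report above used 32G.
    print(describe_sparse_file(64 << 20))
```

On a filesystem that supports holes, the allocated size should be far smaller than the apparent size, yet every byte read back is NUL and there is no newline anywhere, so a line-oriented tool sees one very long line.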

Comment 3 Richard W.M. Jones 2011-08-29 11:45:00 EDT
I should say that I don't really care about grepping large
sparse files.

The problem arises because libtool runs grep (and sed) on all of
its parameters.

In libguestfs this causes a problem, because we often run
commands which come down to:

  libtool --mode=execute guestfish -a test1.img

where 'test1.img' is a huge sparse file.

Therefore either libtool or grep needs to be fixed to deal
with this case.

Comment 4 Richard W.M. Jones 2011-10-20 10:18:02 EDT
FYI there is a similar bug concerning 'sed', 'libtool' and huge
sparse files:
https://bugzilla.redhat.com/show_bug.cgi?id=636045

Comment 5 Paolo Bonzini 2011-10-20 10:25:44 EDT
I think the bug is in libtool, though the observation you made about non-sparse files is interesting.

As mentioned in comment 2: to grep and sed, a large sparse file has a huge line consisting of all zeros, and grep/sed need to slurp it all into memory (sed in order to edit it; grep in case it has to print it).
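
For contrast, a caller that only needs a yes/no answer, and never has to print the matching line, can scan in bounded memory. The Python sketch below is an illustration of that idea only. It is not how grep or libtool work, and `file_contains` is a hypothetical helper that searches for a fixed byte string rather than a regular expression.

```python
import os
import tempfile


def file_contains(path, needle, chunk_size=1 << 20):
    """Return True if the byte string `needle` occurs anywhere in the file.

    Memory use is bounded by roughly chunk_size + len(needle), no matter
    how long the file's "lines" are. Hypothetical helper for illustration.
    """
    if not needle:
        return True
    keep = len(needle) - 1  # bytes carried over to catch boundary matches
    tail = b""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return False
            window = tail + chunk
            if needle in window:
                return True
            tail = window[-keep:] if keep else b""


if __name__ == "__main__":
    # Demo: a 16 MiB sparse file with "foo" written at the very end.
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, 16 << 20)
        os.lseek(fd, 0, os.SEEK_END)
        os.write(fd, b"foo")
        os.close(fd)
        print(file_contains(path, b"foo"))
        print(file_contains(path, b"bar"))
    finally:
        os.remove(path)
```

The trade-off is the one described above: bounded memory is only possible because nothing line-sized is ever kept, so this approach cannot print or edit the matching line.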

Since a bug for the libtool problem is already open, I'm closing this one.