Bug 495521 - grep segfaults when grepping files of certain sizes
Status: CLOSED DUPLICATE of bug 483073
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: grep
Hardware: x86_64 Linux
Priority: low Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Stepan Kasal
Reported: 2009-04-13 12:25 EDT by Mattias Slabanja
Modified: 2009-09-24 06:26 EDT
Doc Type: Bug Fix
Last Closed: 2009-09-22 08:31:34 EDT

Description Mattias Slabanja 2009-04-13 12:25:45 EDT
User-Agent:       Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv: Gecko/2009032608 Firefox/3.0.8

When using grep on multiple files (e.g. "grep something file1 file2 ..."), grep segfaults when the files (file1 file2 ...) have certain combinations of sizes.

Reproducible: Always

Steps to Reproduce:
1. dd if=/dev/zero of=file1 bs=1 count=0 seek=31715200
2. dd if=/dev/zero of=file2 bs=1 count=0 seek=289401200
3. grep something file1 file2
Actual Results:  

$ grep something file1 file2
Segmentation fault

Expected Results:  

$ grep something file1 file2

The files do not need to be sparse (as in the provided example); the bug reproduces equally well with ordinary files.

The bug was first encountered when searching through files that were not identically zero.

If file1 is 31195133 bytes or less, no segfault occurs.
If file1 is 31195134 bytes, the bug is triggered.

The bug is also reproducible on CentOS systems, on RHEL4 systems, and on i386 systems.
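
The reproduction steps above can be wrapped in a small script. This is only a sketch: make_sparse and run_repro are hypothetical helper names introduced here, and the script assumes a POSIX shell with dd, mktemp and grep available. It prints grep's exit status instead of relying on the "Segmentation fault" message.

```shell
#!/bin/sh
# Sketch of the reproduction steps from this report.
# make_sparse and run_repro are hypothetical helpers, not part of grep.

# Create a sparse file named $1 that is exactly $2 bytes long
# (the same dd invocation as in the steps above).
make_sparse() {
    dd if=/dev/zero of="$1" bs=1 count=0 seek="$2" 2>/dev/null
}

# Build both files in a scratch directory, run grep on them, and print
# grep's exit status: 0 = match, 1 = no match, 2 = error.
# A shell reports 139 (128 + SIGSEGV) when the command segfaults.
run_repro() {
    dir=$(mktemp -d) || return 2
    make_sparse "$dir/file1" "$1"
    make_sparse "$dir/file2" "$2"
    grep something "$dir/file1" "$dir/file2" >/dev/null 2>&1
    status=$?
    rm -rf "$dir"
    echo "$status"
}
```

To try the sizes from this report, call run_repro 31715200 289401200. On an affected grep the printed status would be 139; on a grep without the bug it should be 1, since the files contain no match.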

From gdb:
(gdb) set args something file1 file2
(gdb) run
Starting program: /bin/grep something file1 file2

Program received signal SIGSEGV, Segmentation fault.
grepfile (file=0x7fff35a93b92 "second", stats=0x6145a0) at grep.c:790
790	      oldc = beg[-1];
Comment 1 Vesa-Matti Kari 2009-08-28 11:03:53 EDT
It is simply UNBELIEVABLY PATHETIC that YEARS have passed, Red Hat still hasn't fixed this!! Please see Bug #237518.

Fedora 11 does ship a newer version of grep, and more importantly, they have removed the TOTALLY ABSURD patch that RHEL 5.3 still (proudly???) ships!!

How on earth can there be patches like that? That worthless piece of "fix" just truncates the requested amount of memory and then returns "OKAY, I have allocated the amount you requested. Good luck." 

It is no wonder these greps are totally BROKEN when they apply patches like this and won't remove them when people tell them they are BROKEN!!

How is one supposed to do any serious sysadmin job on RHEL when, for example, one cannot use grep to scan big log files?

This is like breaking a hammer or a screwdriver in someone's toolbox. These basic utilities are FUNDAMENTAL, they're supposed to ALWAYS WORK!!!

Shame, shame, shame!!!
Comment 2 Stepan Kasal 2009-09-22 07:35:08 EDT
(In reply to comment #1)
> basic utilities are FUNDAMENTAL, they're supposed to ALWAYS WORK!!!

grep is a text processing utility, thus it is supposed to work on text files.

If a file contains excessively long segments with no newline character, grep, as a line-oriented tool, cannot work reliably.
(A reliable way to process such a file might be a pipeline of the strings and grep commands.)
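
A minimal sketch of that pipeline follows. scan_binary is a hypothetical helper name; it assumes the strings utility (usually shipped with binutils) and falls back to a rough tr-based approximation when strings is not installed.

```shell
#!/bin/sh
# Search a binary file for a pattern by extracting printable text first,
# so that grep only ever sees short, newline-terminated lines.
# scan_binary is a hypothetical helper: scan_binary PATTERN FILE
scan_binary() {
    pattern=$1
    file=$2
    if command -v strings >/dev/null 2>&1; then
        # strings prints each run of printable characters on its own line.
        strings "$file" | grep -e "$pattern"
    else
        # Rough fallback (an assumption, not equivalent to strings):
        # turn every non-printable byte into a newline.
        LC_ALL=C tr -c '[:print:]' '\n' < "$file" | grep -e "$pattern"
    fi
}
```

For example, scan_binary something file1 searches only the printable runs of file1, at the cost of losing matches that span non-printable bytes.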
Comment 3 Stepan Kasal 2009-09-22 08:31:34 EDT
(In reply to comment #0)
> Actual Results:  
> $ grep something file1 file2
> Segmentation fault
> $
> Expected Results:  
> $ grep something file1 file2
> $

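First, please note that grep is a text processing tool; with these inputs it has to process a single "line" of roughly 32 MB or 289 MB (dd with bs=1 creates files of exactly 31715200 and 289401200 bytes, none of which is a newline).  This is far beyond what you would call a "text file".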

Anyway, a segfault is not acceptable behavior.

If we followed the GNU mantra "no arbitrary limits", then grep would happily process such a huge line, loading it whole into memory, and after a substantial amount of time, the correct answer would come.
If the virtual memory (swap space) is not big enough, the computation would end with an error message that the memory was exhausted.
This would be the natural behavior for a text processing tool when it is fed with binary data.
And this is also the approach used in recent Fedora releases.  (Recent Fedora builds of grep do not contain the grep-mem-exhausted.patch, as mentioned in comment #1.)

But this behavior, though correct in theory, has caused problems previously; see RHEL4 bug #198167.  Consequently, a limit was imposed on the memory consumed.  Unfortunately, earlier versions of the patch could cause segmentation faults.  The updated version of the patch (grep-2.5.1-55.el5, released as part of RHEL 5.4, see bug #483073) eliminates this issue.

With this latest update, the result is:

$ grep something file1 file2; echo $?
grep: line too long

And this answer is printed promptly; grep does not spend ages allocating excessive amounts of memory.  This is the optimal solution for RHEL 4 and 5 with respect to backward compatibility.
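
Since the patched grep rejects over-long lines rather than crashing, it can be useful to check in advance whether a file has any line breaks at all. The following is a sketch under stated assumptions: count_newlines is a hypothetical helper, and the actual length limit applied by the patch is not given in this report, so none is assumed here.

```shell
#!/bin/sh
# Print the number of newline bytes in FILE.
# A large file with a count of 0 is a single "line" as far as grep is
# concerned, which is the situation described in this bug.
# count_newlines is a hypothetical helper: count_newlines FILE
count_newlines() {
    # Delete every byte except newline, then count what remains.
    # The trailing tr strips the padding some wc implementations add.
    LC_ALL=C tr -cd '\n' < "$1" | wc -c | tr -d ' '
}
```

Running count_newlines file1 on the all-zero files from comment #0 prints 0, which signals that line-oriented tools will treat the whole file as one line.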

*** This bug has been marked as a duplicate of bug 483073 ***
Comment 4 Vesa-Matti Kari 2009-09-24 06:26:19 EDT

I have already sent a private apology to Stepan, but I think it is necessary to repeat it here. My moronic ranting was a very bad bug report indeed. There is no excuse for such behavior. I should have taken the time to cool down and then written a neutral bug report.

Sorry about the inappropriate outburst, and many thanks to the Red Hat crew for replying and fixing grep.

