Bug 198165
Summary: | grep should not take all memory | | |
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Russell Coker <russell> |
Component: | grep | Assignee: | Tim Waugh <twaugh> |
Status: | CLOSED NOTABUG | | |
Severity: | medium | | |
Priority: | medium | | |
Version: | rawhide | CC: | kasal, staubach |
Hardware: | All | | |
OS: | Linux | | |
Fixed In Version: | 2.5.1-54.1.2.fc6 | Doc Type: | Bug Fix |
Last Closed: | 2006-12-12 16:22:30 UTC | | |
Bug Blocks: | 198167, 207681 | | |
Description
Russell Coker
2006-07-10 12:39:45 UTC
Fixed in update: grep-2.5.1-54.1.2.fc6.

First, the grep-mem-exhausted.patch used in Fedora from comment #2 until now was not a correct implementation of the idea presented in comment #0. See bug #481765 for details.

Second, the idea of accepting the risk that a possible match on the huge "line" is missed is IMHO not optimal:

Grep is a line-oriented tool for processing text files. When processing general binary data (even in the so-called binary mode), grep searches for occurrences of the newline character and processes the "lines" delimited by those occurrences.

Grep is not meant to process binary files. Indeed, this bug shows that its implementation is not ready to process them. In particular, grep's internal matchers work only with lines that are fully loaded into memory. Unless that assumption is relaxed, grep cannot correctly process files containing "lines" whose size is close to or bigger than the amount of available virtual memory (and it is slow to process lines longer than the amount of available RAM). But relaxing the assumption would require a substantial redesign of the matchers.

It is acceptable for grep to exit with an error message and exit code 2 in that situation, "giving up". It is less acceptable for grep to print an incorrect result (even if only in rare situations) without any indication that a problem occurred.

Consequently, grep could either error out as soon as the buffer size reaches the limit, or simply allocate as much memory as the OS allows. For Fedora rawhide, the latter seems better aligned with the GNU credo "no arbitrary limits". (Even though 500 MB seems reasonable today, it may become ridiculous over time. Traditional UNIX defines text files as having lines with a maximum length of 1024. And 640K must be enough for everybody.)

IOW, as of grep-2.5.3-3, I'm removing grep-mem-exhausted.patch.
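
To make the trade-off concrete, here is a minimal Python sketch of a line-oriented search over a binary stream. It is not grep's actual code or the Fedora patch. The names `search_lines`, `run`, `LineTooLong` and the `max_line_bytes` parameter are hypothetical, introduced only for illustration. The sketch assumes the same model described in the comment: a "line" is whatever lies between two newline bytes, and a line must be held in memory in full before it can be matched. With `max_line_bytes=None` the buffer grows as far as the OS allows; with a limit set, the search gives up with exit status 2 instead of silently skipping the oversized line.

```python
import re
import sys


class LineTooLong(Exception):
    """Raised when an unfinished "line" outgrows the illustrative limit."""


def search_lines(stream, pattern, max_line_bytes=None, chunk_size=65536):
    """Yield each newline-delimited line of a binary stream matching pattern.

    Assumption for this sketch: a line can only be matched once it is
    complete, so the unfinished tail is kept in the `pending` buffer.
    max_line_bytes=None means "no arbitrary limit" on that buffer.
    """
    regex = re.compile(pattern)
    pending = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        pending += chunk
        # Everything before the last newline is a set of complete lines;
        # the remainder is the start of a line that is not finished yet.
        *complete, pending = pending.split(b"\n")
        for line in complete:
            if regex.search(line):
                yield line
        # The limit is checked against the unfinished line only.
        if max_line_bytes is not None and len(pending) > max_line_bytes:
            raise LineTooLong(
                f"line exceeds {max_line_bytes} bytes; giving up"
            )
    # A final line without a trailing newline is still searched.
    if pending and regex.search(pending):
        yield pending


def run(stream, pattern, out, max_line_bytes=None):
    """Return a grep-style exit status: 0 = match, 1 = no match, 2 = error."""
    found = False
    try:
        for line in search_lines(stream, pattern, max_line_bytes):
            out.write(line + b"\n")
            found = True
    except LineTooLong as exc:
        # Report the failure instead of returning a possibly wrong result.
        print(f"error: {exc}", file=sys.stderr)
        return 2
    return 0 if found else 1
```

Calling `run(sys.stdin.buffer, rb"needle", sys.stdout.buffer)` corresponds to the "no limit" choice, while passing a value for `max_line_bytes` corresponds to erroring out once the buffer reaches the limit. In both cases a missed match is never reported as a clean "no match" result.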