Red Hat Bugzilla – Bug 210876
'grep' takes minutes to complete the operation if using invert-match and "or"
Last modified: 2007-11-16 20:14:54 EST
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.0.3705; .NET CLR 1.1.4322; .NET CLR 2.0.50727)
Description of problem:
To be clear, this appears to be a RHEL4 Intel issue. It has been reproduced on IA64 servers and an EM64T workstation. It does *not* occur on an AMD Opteron workstation.
If a grep command combines an extended-regex "or" (alternation) with invert-match so that a large number of lines are omitted, the operation takes minutes to complete even though it should take less than a second.
Below are some examples showing the individual operations taking a few seconds each, while the same search with the "or" added takes minutes. This was done against a 5 MB text file.
On the AMD Opteron, using the same command and file, it takes 0.3 seconds!
[root@can16 ~]# time egrep -iv '100' /boot/System.map|wc
61 183 2601
[root@can16 ~]# time egrep -iv '200' /boot/System.map|wc
21183 63549 837563
[root@can16 ~]# time egrep -iv '100|200' /boot/System.map|wc
60 180 2553
We need a fix for this from Red Hat.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Create a large text file (e.g. a 5 MB text file) to search.
2. Run 'egrep' on the file with an "or" pattern and invert-match so that a large number of lines are omitted.
3. Time the operation for completion. See the examples in the bug Description.
The grep operation with -E (egrep) and "or" and "invert-match" took minutes to complete.
The operation should have completed in under a second.
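For anyone without a suitable file to hand, the steps above can be sketched roughly as follows. This is only an illustration: the generated file layout, the 200000-line size (roughly 5 MB) and the variable names are assumptions, not the reporter's actual data, and 'grep -E' is used as the equivalent of 'egrep'.

```shell
# Sketch only: the file layout and line counts are illustrative assumptions.
testfile=$(mktemp)

# 200000 lines; all but every 1000th contain "100", so the inverted
# match has to discard almost the whole file.
awk 'BEGIN {
    for (i = 0; i < 200000; i++) {
        if (i % 1000 == 0) {
            print "keep this line"
        } else {
            printf "c0100%05x T sym_%d\n", i, i
        }
    }
}' > "$testfile"

# Count the lines that survive the inverted alternation and
# measure the elapsed wall-clock time in whole seconds.
start=$(date +%s)
kept=$(grep -E -i -v -c '100|200' "$testfile")
end=$(date +%s)
echo "lines kept: $kept, elapsed: $((end - start))s"

rm -f "$testfile"
```

On an affected system the elapsed time would be expected to run to minutes; on an unaffected one it should be close to zero.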
Please try the package from the latest update (Update 4), which contains several
changes to address performance problems.
Alternatively, if your search pattern contains no '.' or non-ASCII characters,
you can speed the operation up significantly by getting grep to process the
input as ASCII instead of the default UTF-8. Just set 'LC_CTYPE=C' in the
environment ('export LC_CTYPE=C' in bash).
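A minimal sketch of that workaround, applied to a single command rather than exported for the whole session. The sample file and variable names here are made up for illustration. Note that if LC_ALL is set in the environment it takes precedence over LC_CTYPE, so the override would have no effect in that case.

```shell
# Sketch only: the sample data and variable names are illustrative assumptions.
sample=$(mktemp)

# 50000 lines; every 500th line lacks "100", the rest contain it.
awk 'BEGIN {
    for (i = 0; i < 50000; i++) {
        if (i % 500 == 0) {
            print "keep this line"
        } else {
            printf "c0100%05x T sym_%d\n", i, i
        }
    }
}' > "$sample"

# Same search under the current locale and with LC_CTYPE=C set for
# this one command only; both should report the same line count.
default_count=$(grep -E -i -v -c '100|200' "$sample")
c_count=$(LC_CTYPE=C grep -E -i -v -c '100|200' "$sample")
echo "current locale: $default_count, LC_CTYPE=C: $c_count"

rm -f "$sample"
```

Setting the variable on the command line keeps the change local to that grep invocation, which may be preferable when other tools in the session rely on UTF-8 handling.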
Thank you for the solution. We've tried the latest update version of
grep 'grep-2.5.1-32.2.ia64.rpm' and the slow operation problem has been