Bug 111800

Summary: grep probs w/ regex and utf-8
Product: [Fedora] Fedora
Reporter: Ralf Corsepius <corsepiu>
Component: grep
Assignee: Tim Waugh <twaugh>
Status: CLOSED ERRATA
QA Contact: Mike McLean <mikem>
Severity: high
Priority: medium    
Version: 1   
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2003-12-11 10:08:38 UTC
Attachments:
input data to reproduce the bug mentioned in the PR (flags: none)

Description Ralf Corsepius 2003-12-10 09:58:01 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.1) Gecko/20031114

Description of problem:
grep '^[bs][ir][nc]' <file>

occasionally reports incorrect results.


Version-Release number of selected component (if applicable):
grep versions later than 2.5.1-17

How reproducible:
Always

Steps to Reproduce:
1. Download the file foo from the attachment.
2. Run
grep '^[bs][ir][nc]' foo > out

Each line in file foo is supposed to match the regex, so
wc -l foo
and
wc -l out
should report the same number of lines.


Actual Results:
# grep '^[bs][ir][nc]' foo > out && wc -l foo out
    445 foo
    397 out
    842 total

In a LANG=en_US.UTF-8 environment, the line counts differ:
apparently the regex match fails for some lines.

In the C locale, this issue does not occur:
# LANG=C grep '^[bs][ir][nc]' foo > out && wc -l foo out
    445 foo
    445 out
    890 total
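
The locale comparison above can be scripted so that it does not depend on the attachment. What follows is only a minimal sketch, not part of the original report: the helper name count_matches and the generated sample lines are assumptions made for illustration, and LC_ALL is used instead of LANG because it overrides any other locale variable that may be set. The en_US.UTF-8 locale is assumed to be installed.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): count the lines of a file
# that match a pattern under a given locale.
#   $1 = locale name, $2 = pattern, $3 = file
count_matches() {
    LC_ALL="$1" grep -c -- "$2" "$3" 2>/dev/null
}

# Illustrative stand-in for the attachment: every line matches ^[bs][ir][nc].
sample=$(mktemp)
printf '%s\n' bin binary src srcdir sinus bind > "$sample"
total=$(wc -l < "$sample" | tr -d '[:space:]')

# Run the same pattern under both locales and print the counts side by side.
for loc in C en_US.UTF-8; do
    matched=$(count_matches "$loc" '^[bs][ir][nc]' "$sample")
    printf '%s: %s of %s lines matched\n' "$loc" "$matched" "$total"
done

rm -f "$sample"
```

To check the real data, pass foo in place of the generated sample. A grep without this bug should report the same count for both locales.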

Expected Results:  All 445 lines should match. The input file is plain ASCII and does
not contain any non-ASCII (multibyte UTF-8) characters.
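
The claim that the input is plain ASCII can be checked without involving grep at all. The following is a sketch under assumptions: the helper name count_non_ascii is hypothetical, and the generated sample merely stands in for the attachment. It deletes every 7-bit ASCII byte with tr and counts whatever remains.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): print the number of bytes
# in a file that fall outside 7-bit ASCII.
#   $1 = file
count_non_ascii() {
    # Delete all bytes in the octal range 000-177 (0x00-0x7F), then count
    # the bytes that are left. The C locale makes tr work byte by byte.
    LC_ALL=C tr -d '\000-\177' < "$1" | wc -c | tr -d '[:space:]'
}

# Illustrative stand-in for the attachment.
sample=$(mktemp)
printf 'bin\nsrc\n' > "$sample"
printf 'non-ASCII bytes: %s\n' "$(count_non_ascii "$sample")"
rm -f "$sample"
```

A result of 0 for foo would confirm that no multibyte sequences are present, so the UTF-8 locale should not change which lines match.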


Additional info:
grep-2.5.1-17 was not affected.

In FC1, grep-2.5.1-17.2 was the first version of grep to expose this
issue.
All versions in rawhide / FC1/development up to and including 2.5.1-22
suffer from this issue.

Comment 1 Ralf Corsepius 2003-12-10 10:00:19 UTC
Created attachment 96442 [details]
input data to reproduce the bug mentioned in the PR

Comment 2 Tim Waugh 2003-12-10 16:37:02 UTC
Please try this package:

ftp://people.redhat.com/twaugh/tmp/grep-2.5.1-17.4.i386.rpm

and let me know whether it fixes the problem for you.

Comment 3 Ralf Corsepius 2003-12-11 06:30:39 UTC
Yes, it seems to fix my problem. At least my test cases no longer fail.

Comment 4 Jay Turner 2004-09-02 02:13:27 UTC
An erratum has been issued that should resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, 
please follow the link below. You may reopen this bug report 
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2004-079.html