Bug 179698

Summary: grep -w broken in multibyte locales
Product: [Fedora] Fedora
Reporter: Tim Robbins <tim>
Component: grep
Assignee: Tim Waugh <twaugh>
Status: CLOSED RAWHIDE
QA Contact: Mike McLean <mikem>
Severity: medium
Priority: medium    
Version: 4   
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: 2.5.1-53
Doc Type: Bug Fix
Last Closed: 2006-02-20 14:51:45 UTC
Bug Blocks: 574350
Attachments:
Description Flags
Potential fix for grep -w problem none

Description Tim Robbins 2006-02-02 04:53:43 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/417.9 (KHTML, like Gecko) Safari/417.8

Description of problem:
grep -w appears not to work for any non-UTF-8 multibyte locale; matches are often not reported when they should be.

Version-Release number of selected component (if applicable):
grep-2.5.1-48.2

How reproducible:
Always

Steps to Reproduce:
echo za a | LANG=ja_JP.eucjp grep -w a

Actual Results:  No output.

Expected Results:  Output: "za a".

Additional info:

The command gives the expected output on FC3.

Comment 1 Tim Waugh 2006-02-03 17:26:02 UTC
Thanks for the report.

Corrections:

* this is for non-UTF-8 multibyte locales.  UTF-8 input works correctly.
* FC-3 gives the same output for me (grep-2.5.1-31.4).

Comment 2 Tim Robbins 2006-02-19 22:52:40 UTC
Created attachment 124869 [details]
Potential fix for grep -w problem

This is a potential fix for the grep -w problem in non-UTF-8 multibyte locales.
It attempts to correctly locate the multibyte character preceding the matched
string by (slowly) scanning forward from the start of the buffer until it
reaches the match position. Using "last_char" instead of performing this search
seems to be incorrect.
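
For readers without the attachment, the forward scan described above can be
sketched roughly as follows. This is not the attached patch: the function name
prev_char_offset and its return convention are hypothetical, chosen only for
illustration. The sketch assumes the standard mbrlen() call and treats invalid
or incomplete sequences as single bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (not taken from the attached patch).
 * Walk the buffer from its start, one character at a time, and return
 * the byte offset of the character that ends exactly at match_off.
 * Returns (size_t)-1 if the match starts at offset 0, or if match_off
 * falls in the middle of a multibyte character.
 */
static size_t prev_char_offset(const char *buf, size_t len, size_t match_off)
{
    mbstate_t st;
    size_t pos = 0;
    size_t prev = (size_t)-1;

    assert(match_off <= len);
    memset(&st, 0, sizeof st);

    while (pos < match_off) {
        /* Length in bytes of the character starting at buf + pos. */
        size_t n = mbrlen(buf + pos, len - pos, &st);

        if (n == (size_t)-1 || n == (size_t)-2 || n == 0) {
            /* Invalid, incomplete, or NUL: step over one byte and
               reset the conversion state (an assumption of this sketch). */
            n = 1;
            memset(&st, 0, sizeof st);
        }
        prev = pos;
        pos += n;
    }

    /* Only report a predecessor if a character boundary was hit exactly. */
    return pos == match_off ? prev : (size_t)-1;
}
```

For the reproducer "za a" with the match at byte offset 3, the scan reports the
space at offset 2 as the preceding character. In a multibyte locale (after
calling setlocale(LC_ALL, "")) the same loop steps over whole characters, so a
trailing byte of a multibyte character is never mistaken for the preceding
character. The cost is linear in the match offset, which is the "slow" part
mentioned above.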

Comment 3 Tim Waugh 2006-02-20 14:51:45 UTC
Thanks!  Fixed in grep-2.5.1-53.