Bug 549651

Summary: grep works incorrectly with Unicode strings
Product: [Fedora] Fedora
Reporter: Pavel Zhukov <pavel>
Component: grep
Assignee: Jaroslav Škarvada <jskarvad>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: low
Version: 21
CC: andy.evil, cwyse, jskarvad, kasal, lkundrak, pzhukov, russ+redhat
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-12-01 21:32:20 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:

Description Pavel Zhukov 2009-12-22 04:16:03 EST
Description of problem:

grep works incorrectly with Russian text: the word-boundary anchors \< and \> fail to match around Cyrillic words.

Version-Release number of selected component (if applicable):

Version     : 2.5.3                           
Release     : 4.fc11  

How reproducible:

Run the following commands:

$ echo "Это просто текст" | grep '\<просто\>'
(no result)

$ echo "This is a text" | grep '\<is\>'
This is a text

  
Actual results:
sed and other utilities work properly, for example:

$ echo  "Это просто простой текст" | sed s/'\<просто\>'/'не очень'/
Это не очень простой текст

$ locale
LANG=ru_RU.UTF-8

On other *nix systems it works properly (Debian, RHEL, and FreeBSD were tested).
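
For comparison, the expected word-boundary behavior can be sketched outside grep. The snippet below uses Python's `re` module, whose `\b` is Unicode-aware for `str` patterns, as a stand-in for the `\<` and `\>` anchors. It only illustrates the intended semantics; it says nothing about how grep or sed implement them, and the helper names `matches` and `replace_word` are made up for this example.

```python
import re

# \b stands in for grep's \< and \> here (an assumption made for illustration).
WORD = re.compile(r"\bпросто\b")


def matches(line: str) -> bool:
    """Return True if the line contains 'просто' as a whole word."""
    return WORD.search(line) is not None


def replace_word(line: str, replacement: str) -> str:
    """Replace the first whole-word occurrence, like sed's s/// without the g flag."""
    return WORD.sub(replacement, line, count=1)


if __name__ == "__main__":
    print(matches("Это просто текст"))
    print(replace_word("Это просто простой текст", "не очень"))
```

With Unicode-aware boundaries, the first call reports a match and the second leaves "простой" untouched, which is the result the sed transcript above shows.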
Comment 1 Jaroslav Škarvada 2010-04-13 03:45:49 EDT
The latest grep-2.6.3 is also affected; forwarded to the upstream bug tracker.
Comment 5 Bug Zapper 2010-04-28 07:36:40 EDT
This message is a reminder that Fedora 11 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 11.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '11'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 11's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 11 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 6 Bug Zapper 2010-07-30 06:48:58 EDT
This bug appears to have been reported against 'rawhide' during the Fedora 14 development cycle.
Changing version to '14'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 7 Jaroslav Škarvada 2010-09-21 10:15:39 EDT
Upstream bugreport: https://savannah.gnu.org/bugs/?29537
Comment 8 Russ Hammer 2010-10-20 19:31:00 EDT
I think this needs to be a higher severity than just medium! This is totally broken and will cause data corruption and incorrect results in scripts.

$ printf "%s\n" {a..z} {A..Z} | grep '[^a-z]'
Z
$ grep --version
GNU grep 2.6.3

Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
Comment 9 Russ Hammer 2010-10-20 19:36:44 EDT
This is more than just case insensitivity. The severity needs to be upped.

$ printf "%s\n" {a..z} {A..Z} | grep '[^A-Z]'
a

$ rpm -q grep
grep-2.6.3-1.fc13.x86_64
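
One commonly cited explanation for output like this is that bracket ranges in some UTF-8 locales were evaluated in collation order rather than code-point order. The Python sketch below simulates that under an assumed ordering `aAbB...zZ`. The ordering and the helpers `in_range` and `negated_range_matches` are illustrative assumptions, not taken from grep or glibc sources.

```python
import string

# Assumed collation order: lowercase and uppercase interleaved (aAbB...zZ).
COLLATION = "".join(
    lo + up for lo, up in zip(string.ascii_lowercase, string.ascii_uppercase)
)


def in_range(ch: str, start: str, end: str) -> bool:
    """Range membership decided by position in the assumed collation order."""
    pos = COLLATION.index
    return pos(start) <= pos(ch) <= pos(end)


def negated_range_matches(start: str, end: str) -> list:
    """Letters that a negated range [^start-end] would match under that order."""
    letters = string.ascii_lowercase + string.ascii_uppercase
    return [ch for ch in letters if not in_range(ch, start, end)]


if __name__ == "__main__":
    print(negated_range_matches("a", "z"))  # ['Z']
    print(negated_range_matches("A", "Z"))  # ['a']
```

Under this assumed ordering, `a` through `z` spans every letter except `Z`, and `A` through `Z` spans every letter except `a`, which lines up with the two outputs reported above.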
Comment 10 Jaroslav Škarvada 2010-10-21 03:38:54 EDT
(In reply to comment #9)
> This is more than just case insensitivity. The severity needs to be upped.
> 
> $ printf "%s\n" {a..z} {A..Z} | grep '[^A-Z]'
> a
> 
> $ rpm -q grep
> grep-2.6.3-1.fc13.x86_64

This seems to be unrelated to this bug report; it was fixed in grep-2.7, which is currently available in F14.
Comment 11 Jaroslav Škarvada 2010-12-03 08:53:02 EST
Reopened; it still does not work in the RU locale.
Comment 12 Fedora End Of Life 2013-04-03 16:06:26 EDT
This bug appears to have been reported against 'rawhide' during the Fedora 19 development cycle.
Changing version to '19'.

(As we did not run this process for some time, it could also affect bugs from pre-Fedora 19 development
cycles. We are very sorry. It will help us with cleanup during Fedora 19 End Of Life. Thank you.)

More information and reason for this action is here:
https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora19
Comment 13 Andrey Zaslavskiy 2014-02-20 02:14:32 EST
Same thing with --word-regexp

$ grep --version
grep (GNU grep) 2.16

$ cat rus-test
Не стой столбом за правду жизни.
стойлол!
столбик
конец файла

$ grep -w стой rus-test
Не стой столбом за правду жизни.
стойлол!


But there should be only one match!
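
The expected single match can be checked with a small Python sketch. It approximates `-w` by requiring that the word is neither preceded nor followed by a word character, using Python's Unicode-aware `\w`. The function name `grep_w` is hypothetical, and the snippet models only the expected result, not grep's implementation.

```python
import re

# The contents of rus-test as shown in the comment above.
LINES = [
    "Не стой столбом за правду жизни.",
    "стойлол!",
    "столбик",
    "конец файла",
]


def grep_w(word: str, lines: list) -> list:
    """Return the lines that contain `word` with no word character on either side."""
    pattern = re.compile(r"(?<!\w)" + re.escape(word) + r"(?!\w)")
    return [line for line in lines if pattern.search(line)]


if __name__ == "__main__":
    for line in grep_w("стой", LINES):
        print(line)
```

Only the first line is printed, because in "стойлол!" the word is followed by another letter.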
Comment 14 Fedora End Of Life 2015-01-09 16:41:37 EST
This message is a notice that Fedora 19 is now at end of life. Fedora 
has stopped maintaining and issuing updates for Fedora 19. It is 
Fedora's policy to close all bug reports from releases that are no 
longer maintained. Approximately 4 (four) weeks from now this bug will
be closed as EOL if it remains open with a Fedora 'version' of '19'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 19 reached end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.
Comment 15 Fedora End Of Life 2015-11-04 10:45:01 EST
This message is a reminder that Fedora 21 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 21. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '21'.

Comment 16 Fedora End Of Life 2015-12-01 21:32:26 EST
Fedora 21 changed to end-of-life (EOL) status on 2015-12-01. Fedora 21 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.
Comment 17 Pavel Zhukov 2015-12-02 04:36:17 EST
Works fine in F23 btw