Bug 1161832 - grep: invalid UTF-8 byte sequence in input (grep -P file)
Summary: grep: invalid UTF-8 byte sequence in input (grep -P file)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: grep
Version: 21
Hardware: All
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Jaroslav Škarvada
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-11-08 09:52 UTC by tomi ollila
Modified: 2014-11-16 14:45 UTC

Fixed In Version: grep-2.20-6.fc21
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1253296
Environment:
Last Closed: 2014-11-16 14:45:07 UTC
Type: Bug
Embargoed:


Attachments
Backported fix (4.57 KB, patch)
2014-11-11 15:27 UTC, Jaroslav Škarvada

Description tomi ollila 2014-11-08 09:52:02 UTC
Description of problem:

In a UTF-8 locale, running grep with the -P option on a file that contains non-UTF-8 data (e.g. a Latin-1-encoded email) makes grep exit with code 2, printing:

  grep: invalid UTF-8 byte sequence in input

Version-Release number of selected component (if applicable):

$ rpm -qi grep
Name        : grep
Version     : 2.20
Release     : 3.fc21
Architecture: x86_64
Group       : Applications/Text
Size        : 1194652
License     : GPLv3+
Signature   : RSA/SHA256, Sun 13 Jul 2014 17:44:13 EEST, Key ID 89ad4e8795a43f54
Source RPM  : grep-2.20-3.fc21.src.rpm
Build Date  : Sat 12 Jul 2014 18:40:14 EEST
Build Host  : buildhw-03.phx2.fedoraproject.org
Relocations : (not relocatable)
Packager    : Fedora Project
Vendor      : Fedora Project
URL         : http://www.gnu.org/software/grep/
Summary     : Pattern matching utilities


How reproducible:

On command line: 

( LC_ALL=C perl -le 'print "\351"' ) | grep -P none
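
For readers unfamiliar with the octal escape, the following is a small sketch of what byte the reproducer emits and what the same character looks like in UTF-8. The helper name hex_bytes is hypothetical, and the sketch assumes od and iconv are installed.

```shell
#!/bin/sh
# Print the bytes read from stdin as one compact hex string.
# hex_bytes is an illustrative helper, not part of grep or its test suite.
hex_bytes() {
    od -An -tx1 | tr -d ' \n'
}

# "\351" (octal) is the single byte 0xE9, which is "é" in Latin-1.
latin1=$(printf '\351' | hex_bytes)

# The same character encoded as UTF-8 takes two bytes.
utf8=$(printf '\351' | iconv -f ISO-8859-1 -t UTF-8 | hex_bytes)

echo "Latin-1 byte: $latin1"
echo "UTF-8 bytes:  $utf8"
```

A lone 0xE9 followed by a newline is not a valid UTF-8 sequence, because 0xE9 announces a multi-byte character whose continuation bytes never arrive. That is the kind of input the error message refers to.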


Steps to Reproduce:
1. open terminal
2. make sure a UTF-8 locale is in use
3. execute ( LC_ALL=C perl -le 'print "\351"' ) | grep -P none


Actual results:

grep: invalid UTF-8 byte sequence in input
zsh: done       ( LC_ALL=C perl -le 'print "\351"'; ) | 
zsh: exit 2     grep --color=auto -P .

Expected results:

zsh: done       ( LC_ALL=C perl -le 'print "\351"'; ) | 
zsh: exit 1     grep --color=auto -P none
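
The steps and the expected result above can be combined into a small check script. This is only a sketch: the locale name en_US.UTF-8 is an assumption (substitute any installed UTF-8 locale), classify_status is a hypothetical helper, and it presumes perl and a grep built with -P support.

```shell
#!/bin/sh
# Map grep's exit status to the outcomes discussed in this report.
# classify_status is an illustrative helper, not part of grep.
classify_status() {
    case "$1" in
        0) echo "match found" ;;
        1) echo "no match (expected result)" ;;
        2) echo "error (the reported behaviour if stderr mentions invalid UTF-8)" ;;
        *) echo "unexpected status $1" ;;
    esac
}

# Emit the Latin-1 byte 0xE9 plus a newline, then search it with -P
# under a UTF-8 locale. In sh, $? after a pipeline is the status of
# the last command, i.e. grep.
( LC_ALL=C perl -le 'print "\351"' ) | LC_ALL=en_US.UTF-8 grep -P none
status=$?

classify_status "$status"
```

On an affected build the script should report the error case; on a fixed build it should report that nothing matched.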


Additional info:

Comment 1 Jaroslav Škarvada 2014-11-11 15:27:57 UTC
Created attachment 956321
Backported fix

Comment 2 Fedora Update System 2014-11-11 15:56:29 UTC
grep-2.20-5.fc21 has been submitted as an update for Fedora 21.
https://admin.fedoraproject.org/updates/grep-2.20-5.fc21

Comment 3 Fedora Update System 2014-11-13 18:06:49 UTC
Package grep-2.20-5.fc21:
* should fix your issue,
* was pushed to the Fedora 21 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing grep-2.20-5.fc21'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-14808/grep-2.20-5.fc21
then log in and leave karma (feedback).

Comment 4 Dan Horák 2014-11-14 13:29:22 UTC
the internal tests don't pass on s390(x)

...
PASS: pcre
PASS: pcre-abort
pcre-infloop: failed test: libpcre's match function appears to infloop
FAIL: pcre-infloop

and the next test (pcre-invalid-utf8-input) seems to enter an endless loop


[sharkcz@devel3 tests]$ cat pcre-infloop.log 
++ initial_cwd_=/home/sharkcz/grep/grep-2.20/tests
++ fail=0
+++ testdir_prefix_
+++ printf gt
++ pfx_=gt
+++ mktempd_ /home/sharkcz/grep/grep-2.20/tests gt-pcre-infloop.XXXX
+++ case $# in
+++ destdir_=/home/sharkcz/grep/grep-2.20/tests
+++ template_=gt-pcre-infloop.XXXX
+++ MAX_TRIES_=4
+++ case $destdir_ in
+++ case $template_ in
++++ unset TMPDIR
+++ d=/home/sharkcz/grep/grep-2.20/tests/gt-pcre-infloop.HTz5
+++ case $d in
+++ test -d /home/sharkcz/grep/grep-2.20/tests/gt-pcre-infloop.HTz5
++++ ls -dgo /home/sharkcz/grep/grep-2.20/tests/gt-pcre-infloop.HTz5
++++ tr S -
+++ perms='drwx------. 2 4096 Nov 14 13:21 /home/sharkcz/grep/grep-2.20/tests/gt-pcre-infloop.HTz5'
+++ case $perms in
+++ test 0 = 0
+++ echo /home/sharkcz/grep/grep-2.20/tests/gt-pcre-infloop.HTz5
+++ return
++ test_dir_=/home/sharkcz/grep/grep-2.20/tests/gt-pcre-infloop.HTz5
++ cd /home/sharkcz/grep/grep-2.20/tests/gt-pcre-infloop.HTz5
++ gl_init_sh_nl_='
'
++ IFS=' 	
'
++ for sig_ in 1 2 3 13 15
+++ expr 1 + 128
++ eval 'trap '\''Exit 129'\'' 1'
+++ trap 'Exit 129' 1
++ for sig_ in 1 2 3 13 15
+++ expr 2 + 128
++ eval 'trap '\''Exit 130'\'' 2'
+++ trap 'Exit 130' 2
++ for sig_ in 1 2 3 13 15
+++ expr 3 + 128
++ eval 'trap '\''Exit 131'\'' 3'
+++ trap 'Exit 131' 3
++ for sig_ in 1 2 3 13 15
+++ expr 13 + 128
++ eval 'trap '\''Exit 141'\'' 13'
+++ trap 'Exit 141' 13
++ for sig_ in 1 2 3 13 15
+++ expr 15 + 128
++ eval 'trap '\''Exit 143'\'' 15'
+++ trap 'Exit 143' 15
++ trap remove_tmp_ 0
+ path_prepend_ ../src
+ test 1 '!=' 0
+ path_dir_=../src
+ case $path_dir_ in
+ abs_path_dir_=/home/sharkcz/grep/grep-2.20/tests/../src
+ case $abs_path_dir_ in
+ PATH=/home/sharkcz/grep/grep-2.20/tests/../src:/home/sharkcz/grep/grep-2.20/src:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/sharkcz/.local/bin:/home/sharkcz/bin
+ create_exe_shims_ /home/sharkcz/grep/grep-2.20/tests/../src
+ case $EXEEXT in
+ return 0
+ shift
+ test 0 '!=' 0
+ export PATH
+ require_pcre_
+ echo .
+ grep -P .
.
+ compare /dev/null err
+ compare_dev_null_ /dev/null err
+ test 2 = 2
+ test x/dev/null = x/dev/null
+ test -s err
+ return 0
+ return 0
+ require_timeout_
+ timeout 10s false
+ test 1 = 1
+ require_en_utf8_locale_
+ path_prepend_ .
+ test 1 '!=' 0
+ path_dir_=.
+ case $path_dir_ in
+ abs_path_dir_=/home/sharkcz/grep/grep-2.20/tests/.
+ case $abs_path_dir_ in
+ PATH=/home/sharkcz/grep/grep-2.20/tests/.:/home/sharkcz/grep/grep-2.20/tests/../src:/home/sharkcz/grep/grep-2.20/src:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/sharkcz/.local/bin:/home/sharkcz/bin
+ create_exe_shims_ /home/sharkcz/grep/grep-2.20/tests/.
+ case $EXEEXT in
+ return 0
+ shift
+ test 0 '!=' 0
+ export PATH
+ case $(get-mb-cur-max en_US.UTF-8) in
++ get-mb-cur-max en_US.UTF-8
+ require_compiled_in_MB_support
+ require_en_utf8_locale_
+ path_prepend_ .
+ test 1 '!=' 0
+ path_dir_=.
+ case $path_dir_ in
+ abs_path_dir_=/home/sharkcz/grep/grep-2.20/tests/.
+ case $abs_path_dir_ in
+ PATH=/home/sharkcz/grep/grep-2.20/tests/.:/home/sharkcz/grep/grep-2.20/tests/.:/home/sharkcz/grep/grep-2.20/tests/../src:/home/sharkcz/grep/grep-2.20/src:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/sharkcz/.local/bin:/home/sharkcz/bin
+ create_exe_shims_ /home/sharkcz/grep/grep-2.20/tests/.
+ case $EXEEXT in
+ return 0
+ shift
+ test 0 '!=' 0
+ export PATH
+ case $(get-mb-cur-max en_US.UTF-8) in
++ get-mb-cur-max en_US.UTF-8
+ printf $'\303\251'
+ LC_ALL=en_US.UTF-8
+ grep '[[:lower:]]'
é
+ printf 'a\201b\r'
+ fail=0
+ LC_ALL=en_US.UTF-8
+ timeout 3 grep -P 'a.?..b' in
+ test 124 = 1
+ fail_ 'libpcre'\''s match function appears to infloop'
+ warn_ 'pcre-infloop: failed test: libpcre'\''s match function appears to infloop'
+ case $IFS in
+ printf '%s\n' 'pcre-infloop: failed test: libpcre'\''s match function appears to infloop'
pcre-infloop: failed test: libpcre's match function appears to infloop
+ test 9 = 2
+ printf '%s\n' 'pcre-infloop: failed test: libpcre'\''s match function appears to infloop'
+ sed 1q
+ Exit 1
+ set +e
+ exit 1
+ exit 1
+ remove_tmp_
+ __st=1
+ cleanup_
+ :
+ cd /home/sharkcz/grep/grep-2.20/tests
+ chmod -R u+rwx /home/sharkcz/grep/grep-2.20/tests/gt-pcre-infloop.HTz5
+ rm -rf /home/sharkcz/grep/grep-2.20/tests/gt-pcre-infloop.HTz5
+ exit 1
FAIL pcre-infloop (exit status: 1)
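
Stripped of the test-harness setup, the failing check in the log above reduces to a few lines. The following is a sketch reconstructed from the trace, not the actual test file: the redirection into the file named in is inferred (xtrace does not show redirections), and it assumes GNU coreutils timeout, which exits with status 124 when the time limit is hit.

```shell
#!/bin/sh
# Work in a scratch directory so nothing is left behind.
tmpdir=$(mktemp -d) || exit 1
trap 'rm -rf "$tmpdir"' EXIT

# Input from the trace: 'a', the invalid byte 0x81 (octal 201), 'b',
# and a carriage return, with no trailing newline.
printf 'a\201b\r' > "$tmpdir/in"

# Run the PCRE match with a 3 second limit, as the trace does.
LC_ALL=en_US.UTF-8 timeout 3 grep -P 'a.?..b' "$tmpdir/in"
status=$?

# The trace shows "test 124 = 1", i.e. the run timed out where the
# test wanted exit status 1 (no match).
case "$status" in
    1)   result="pass: grep terminated without a match" ;;
    124) result="fail: timed out, the match function appears to loop" ;;
    *)   result="inconclusive: exit status $status" ;;
esac
echo "$result"
```

Any other status, for example 2 from a grep built without -P support, is reported as inconclusive rather than as a pass or a failure.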

Comment 5 Dan Horák 2014-11-14 13:30:28 UTC
Interestingly, the build on ppc64/ppc64le passes - http://ppc.koji.fedoraproject.org/koji/buildinfo?buildID=276813

Comment 6 Jaroslav Škarvada 2014-11-14 16:31:12 UTC
(In reply to Dan Horák from comment #4)
> the internal tests don't pass on s390(x)

Thanks for the report, it is a bit more complicated than it seemed :)

Comment 7 Fedora Update System 2014-11-14 16:47:09 UTC
grep-2.20-6.fc21 has been submitted as an update for Fedora 21.
https://admin.fedoraproject.org/updates/grep-2.20-6.fc21

Comment 8 tomi ollila 2014-11-16 12:25:22 UTC
Name        : grep
Version     : 2.20
Release     : 6.fc21
Architecture: x86_64

Works OK for me.

Comment 9 Fedora Update System 2014-11-16 14:45:07 UTC
grep-2.20-6.fc21 has been pushed to the Fedora 21 stable repository.  If problems still persist, please make note of it in this bug report.

