Bug 1253296 - grep: invalid UTF-8 byte sequence in input (grep -P file)
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: grep
Version: 7.1
Hardware: All Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Jaroslav Škarvada
QA Contact: BaseOS QE - Apps
Depends On:
Blocks:
Reported: 2015-08-13 08:27 EDT by Akemi Yagi
Modified: 2015-08-13 09:17 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1161832
Environment:
Last Closed: 2015-08-13 09:17:38 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


External Trackers
Tracker  ID    Priority  Status  Summary  Last Updated
CentOS   9182  None      None    None     Never

Description Akemi Yagi 2015-08-13 08:27:34 EDT
+++ This bug was initially created as a clone of Bug #1161832 +++

Description of problem:

In a UTF-8 locale, grepping a file that contains non-UTF-8 data (e.g. a Latin-1-encoded email) with the -P option makes grep exit with code 2, printing:

  grep: invalid UTF-8 byte sequence in input
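
For context, the offending byte can be checked independently of grep. This is a minimal sketch, assuming a POSIX shell and an iconv that rejects malformed input (as glibc's does); is_valid_utf8 is a hypothetical helper name, not part of grep or iconv. It shows that a lone 0xE9 byte (octal 351, "é" in Latin-1) is not well-formed UTF-8, while the two-byte sequence 0xC3 0xA9 is.

```shell
# Hypothetical helper: succeed only if the file is well-formed UTF-8.
# Assumes iconv exits non-zero on an illegal input sequence.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

tmp=$(mktemp)

# A lone 0xE9 byte: "é" in Latin-1, but an incomplete sequence in UTF-8.
printf '\351\n' > "$tmp"
if is_valid_utf8 "$tmp"; then latin1_result=valid; else latin1_result=invalid; fi

# 0xC3 0xA9: the UTF-8 encoding of "é".
printf '\303\251\n' > "$tmp"
if is_valid_utf8 "$tmp"; then utf8_result=valid; else utf8_result=invalid; fi

rm -f "$tmp"
echo "latin1 byte: $latin1_result, utf-8 pair: $utf8_result"
```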

Version-Release number of selected component (if applicable):

$ rpm -qi grep
Name        : grep
Version     : 2.20
Release     : 3.fc21
Architecture: x86_64
Group       : Applications/Text
Size        : 1194652
License     : GPLv3+
Signature   : RSA/SHA256, Sun 13 Jul 2014 17:44:13 EEST, Key ID 89ad4e8795a43f54
Source RPM  : grep-2.20-3.fc21.src.rpm
Build Date  : Sat 12 Jul 2014 18:40:14 EEST
Build Host  : buildhw-03.phx2.fedoraproject.org
Relocations : (not relocatable)
Packager    : Fedora Project
Vendor      : Fedora Project
URL         : http://www.gnu.org/software/grep/
Summary     : Pattern matching utilities


How reproducible:

On command line: 

( LC_ALL=C perl -le 'print "\351"' ) | grep -P none


Steps to Reproduce:
1. Open a terminal.
2. Make sure a UTF-8 locale is in use.
3. Execute ( LC_ALL=C perl -le 'print "\351"' ) | grep -P none
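
The steps above can also be sketched without perl and without relying on the caller's locale. This is only an illustration under assumptions: the locale name en_US.UTF-8 is assumed to be installed, and grep is assumed to be built with -P support. The exit status it prints depends on the grep build: 2 on an affected build, 1 where the problem is fixed.

```shell
# The same input the perl one-liner emits: byte 0xE9 (octal 351) plus a newline.
bytes=$(printf '\351\n' | od -An -tx1 | tr -d ' \n')
echo "input bytes: $bytes"

# LC_ALL is set explicitly so the result does not depend on the caller's
# environment; en_US.UTF-8 is an assumption about the installed locales.
status=0
printf '\351\n' | LC_ALL=en_US.UTF-8 grep -P none >/dev/null 2>&1 || status=$?
echo "grep -P exit status: $status"
```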


Actual results:

grep: invalid UTF-8 byte sequence in input
zsh: done       ( LC_ALL=C perl -le 'print "\351"'; ) | 
zsh: exit 2     grep --color=auto -P .

Expected results:

zsh: done       ( LC_ALL=C perl -le 'print "\351"'; ) | 
zsh: exit 1     grep --color=auto -P none
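
The difference between the actual and expected results comes down to grep's exit status: GNU grep documents 0 when a line is selected, 1 when none is, and 2 when an error occurred. A minimal sketch of telling these apart in a script follows; classify_grep_status is a hypothetical helper name, and plain grep (without -P) is used so the outcome does not depend on the bug discussed here.

```shell
# Hypothetical helper: map a grep exit status to a label.
classify_grep_status() {
    case $1 in
        0) echo "match" ;;
        1) echo "no match" ;;
        *) echo "error" ;;
    esac
}

# Pattern "none" does not occur in the input, so grep should report status 1.
status=0
printf 'abc\n' | grep -q none || status=$?
no_match=$(classify_grep_status "$status")

# Pattern "abc" does occur, so grep should report status 0.
status=0
printf 'abc\n' | grep -q abc || status=$?
match=$(classify_grep_status "$status")

echo "none: $no_match, abc: $match"
```

A caller that treats every non-zero status as "no match" would silently hide the error reported in this bug, which is why the sketch keeps 1 and 2 separate.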


Additional info:

--- Additional comment from Jaroslav Škarvada on 2014-11-11 10:27:57 EST ---



--- Additional comment from Fedora Update System on 2014-11-11 10:56:29 EST ---

grep-2.20-5.fc21 has been submitted as an update for Fedora 21.
https://admin.fedoraproject.org/updates/grep-2.20-5.fc21

--- Additional comment from Fedora Update System on 2014-11-13 13:06:49 EST ---

Package grep-2.20-5.fc21:
* should fix your issue,
* was pushed to the Fedora 21 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing grep-2.20-5.fc21'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-14808/grep-2.20-5.fc21
then log in and leave karma (feedback).

--- Additional comment from Dan Horák on 2014-11-14 08:29:22 EST ---

the internal tests don't pass on s390(x)

...
PASS: pcre
PASS: pcre-abort
pcre-infloop: failed test: libpcre's match function appears to infloop
FAIL: pcre-infloop

and the next test (pcre-invalid-utf8-input) seems to enter an endless loop


[sharkcz@devel3 tests]$ cat pcre-infloop.log 
++ initial_cwd_=/home/sharkcz/grep/grep-2.20/tests
++ fail=0
+++ testdir_prefix_
+++ printf gt
++ pfx_=gt
+++ mktempd_ /home/sharkcz/grep/grep-2.20/tests gt-pcre-infloop.XXXX
+++ case $# in
+++ destdir_=/home/sharkcz/grep/grep-2.20/tests
+++ template_=gt-pcre-infloop.XXXX
+++ MAX_TRIES_=4
+++ case $destdir_ in
+++ case $template_ in
++++ unset TMPDIR
+++ d=/home/sharkcz/grep/grep-2.20/tests/gt-pcre-infloop.HTz5
+++ case $d in
+++ test -d /home/sharkcz/grep/grep-2.20/tests/gt-pcre-infloop.HTz5
++++ ls -dgo /home/sharkcz/grep/grep-2.20/tests/gt-pcre-infloop.HTz5
++++ tr S -
+++ perms='drwx------. 2 4096 Nov 14 13:21 /home/sharkcz/grep/grep-2.20/tests/gt-pcre-infloop.HTz5'
+++ case $perms in
+++ test 0 = 0
+++ echo /home/sharkcz/grep/grep-2.20/tests/gt-pcre-infloop.HTz5
+++ return
++ test_dir_=/home/sharkcz/grep/grep-2.20/tests/gt-pcre-infloop.HTz5
++ cd /home/sharkcz/grep/grep-2.20/tests/gt-pcre-infloop.HTz5
++ gl_init_sh_nl_='
'
++ IFS=' 	
'
++ for sig_ in 1 2 3 13 15
+++ expr 1 + 128
++ eval 'trap '\''Exit 129'\'' 1'
+++ trap 'Exit 129' 1
++ for sig_ in 1 2 3 13 15
+++ expr 2 + 128
++ eval 'trap '\''Exit 130'\'' 2'
+++ trap 'Exit 130' 2
++ for sig_ in 1 2 3 13 15
+++ expr 3 + 128
++ eval 'trap '\''Exit 131'\'' 3'
+++ trap 'Exit 131' 3
++ for sig_ in 1 2 3 13 15
+++ expr 13 + 128
++ eval 'trap '\''Exit 141'\'' 13'
+++ trap 'Exit 141' 13
++ for sig_ in 1 2 3 13 15
+++ expr 15 + 128
++ eval 'trap '\''Exit 143'\'' 15'
+++ trap 'Exit 143' 15
++ trap remove_tmp_ 0
+ path_prepend_ ../src
+ test 1 '!=' 0
+ path_dir_=../src
+ case $path_dir_ in
+ abs_path_dir_=/home/sharkcz/grep/grep-2.20/tests/../src
+ case $abs_path_dir_ in
+ PATH=/home/sharkcz/grep/grep-2.20/tests/../src:/home/sharkcz/grep/grep-2.20/src:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/sharkcz/.local/bin:/home/sharkcz/bin
+ create_exe_shims_ /home/sharkcz/grep/grep-2.20/tests/../src
+ case $EXEEXT in
+ return 0
+ shift
+ test 0 '!=' 0
+ export PATH
+ require_pcre_
+ echo .
+ grep -P .
.
+ compare /dev/null err
+ compare_dev_null_ /dev/null err
+ test 2 = 2
+ test x/dev/null = x/dev/null
+ test -s err
+ return 0
+ return 0
+ require_timeout_
+ timeout 10s false
+ test 1 = 1
+ require_en_utf8_locale_
+ path_prepend_ .
+ test 1 '!=' 0
+ path_dir_=.
+ case $path_dir_ in
+ abs_path_dir_=/home/sharkcz/grep/grep-2.20/tests/.
+ case $abs_path_dir_ in
+ PATH=/home/sharkcz/grep/grep-2.20/tests/.:/home/sharkcz/grep/grep-2.20/tests/../src:/home/sharkcz/grep/grep-2.20/src:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/sharkcz/.local/bin:/home/sharkcz/bin
+ create_exe_shims_ /home/sharkcz/grep/grep-2.20/tests/.
+ case $EXEEXT in
+ return 0
+ shift
+ test 0 '!=' 0
+ export PATH
+ case $(get-mb-cur-max en_US.UTF-8) in
++ get-mb-cur-max en_US.UTF-8
+ require_compiled_in_MB_support
+ require_en_utf8_locale_
+ path_prepend_ .
+ test 1 '!=' 0
+ path_dir_=.
+ case $path_dir_ in
+ abs_path_dir_=/home/sharkcz/grep/grep-2.20/tests/.
+ case $abs_path_dir_ in
+ PATH=/home/sharkcz/grep/grep-2.20/tests/.:/home/sharkcz/grep/grep-2.20/tests/.:/home/sharkcz/grep/grep-2.20/tests/../src:/home/sharkcz/grep/grep-2.20/src:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/sharkcz/.local/bin:/home/sharkcz/bin
+ create_exe_shims_ /home/sharkcz/grep/grep-2.20/tests/.
+ case $EXEEXT in
+ return 0
+ shift
+ test 0 '!=' 0
+ export PATH
+ case $(get-mb-cur-max en_US.UTF-8) in
++ get-mb-cur-max en_US.UTF-8
+ printf $'\303\251'
+ LC_ALL=en_US.UTF-8
+ grep '[[:lower:]]'
é
+ printf 'a\201b\r'
+ fail=0
+ LC_ALL=en_US.UTF-8
+ timeout 3 grep -P 'a.?..b' in
+ test 124 = 1
+ fail_ 'libpcre'\''s match function appears to infloop'
+ warn_ 'pcre-infloop: failed test: libpcre'\''s match function appears to infloop'
+ case $IFS in
+ printf '%s\n' 'pcre-infloop: failed test: libpcre'\''s match function appears to infloop'
pcre-infloop: failed test: libpcre's match function appears to infloop
+ test 9 = 2
+ printf '%s\n' 'pcre-infloop: failed test: libpcre'\''s match function appears to infloop'
+ sed 1q
+ Exit 1
+ set +e
+ exit 1
+ exit 1
+ remove_tmp_
+ __st=1
+ cleanup_
+ :
+ cd /home/sharkcz/grep/grep-2.20/tests
+ chmod -R u+rwx /home/sharkcz/grep/grep-2.20/tests/gt-pcre-infloop.HTz5
+ rm -rf /home/sharkcz/grep/grep-2.20/tests/gt-pcre-infloop.HTz5
+ exit 1
FAIL pcre-infloop (exit status: 1)
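
In the trace above, `timeout 3 grep -P 'a.?..b' in` is followed by `test 124 = 1`: the harness expected grep's no-match status 1 but saw 124, the status GNU timeout returns when the command had to be killed. A minimal sketch of that check, assuming GNU coreutils timeout and using sleep as a stand-in for the hanging grep:

```shell
# sleep 5 stands in for a command that does not finish within the limit.
status=0
timeout 1 sleep 5 || status=$?

# GNU timeout reports 124 when the time limit expired.
case $status in
    124) verdict="timed out" ;;
    0)   verdict="finished in time" ;;
    *)   verdict="failed with status $status" ;;
esac
echo "$verdict"
```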

--- Additional comment from Dan Horák on 2014-11-14 08:30:28 EST ---

Interestingly, the build on ppc64/ppc64le passes - http://ppc.koji.fedoraproject.org/koji/buildinfo?buildID=276813

--- Additional comment from Jaroslav Škarvada on 2014-11-14 11:31:12 EST ---

(In reply to Dan Horák from comment #4)
> the internal tests don't pass on s390(x)

Thanks for the report, it is a bit more complicated than it seemed :)

--- Additional comment from Fedora Update System on 2014-11-14 11:47:09 EST ---

grep-2.20-6.fc21 has been submitted as an update for Fedora 21.
https://admin.fedoraproject.org/updates/grep-2.20-6.fc21

--- Additional comment from tomi ollila on 2014-11-16 07:25:22 EST ---

Name        : grep
Version     : 2.20
Release     : 6.fc21
Architecture: x86_64

Works OK for me.

--- Additional comment from Fedora Update System on 2014-11-16 09:45:07 EST ---

grep-2.20-6.fc21 has been pushed to the Fedora 21 stable repository.  If problems still persist, please make note of it in this bug report.
Comment 2 Jaroslav Škarvada 2015-08-13 09:17:38 EDT
It is already fixed in grep-2.20-2.el7; please wait for the update.
