Description of problem:
\w and [[:alnum:]] seem to match different sets of characters:
$ echo 'á' | grep '\w'
$ echo 'á' | grep '[[:alnum:]]'
Their negations are inconsistent as well:
$ echo 'á' | grep '[^[:alnum:]]'
$ echo 'á' | grep '\W'
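The four checks above can be run in one pass. The following is a minimal sketch, not part of the original report; `probe` is a hypothetical helper name. The GNU grep documentation describes \w as a synonym for [_[:alnum:]], so for a letter such as 'á' the first two patterns should give the same answer, and so should the two negations.

```shell
#!/bin/sh
# probe STRING PATTERN
# Prints "match" if grep finds PATTERN in STRING, otherwise "no match".
# (Hypothetical helper for illustration; it only wraps grep -q.)
probe() {
    if printf '%s\n' "$1" | grep -q -e "$2"; then
        echo "match"
    else
        echo "no match"
    fi
}

# Run the four patterns from the report against the same character.
for pattern in '\w' '[[:alnum:]]' '\W' '[^[:alnum:]]'; do
    printf '%-16s %s\n' "$pattern" "$(probe 'á' "$pattern")"
done
```

On a grep that behaves consistently, the lines for '\w' and '[[:alnum:]]' should agree, as should the lines for '\W' and '[^[:alnum:]]'. The actual results depend on the grep version and the active locale.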
This doesn't seem to be a locale problem (I tried it with en_US.UTF-8 and cs_CZ.UTF-8; both gave the same results).
The affected accented characters are Á Č Ď É Ě Í Ň Ó Ř Š Ť Ú Ů Ý Ž (I have tested only these).
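To repeat the comparison across locales and characters, something like the following sketch could be used. `disagreements` is a hypothetical helper name, and the script assumes both locales are installed; if a locale is missing, grep will typically fall back to the C locale without warning, which can hide the problem.

```shell
#!/bin/sh
# disagreements LOCALE CHAR...
# For each CHAR, test \w and [[:alnum:]] under LOCALE and print a line
# only when the two patterns give different answers.
# (Hypothetical helper for illustration.)
disagreements() {
    locale_name=$1
    shift
    for ch in "$@"; do
        w=no
        alnum=no
        printf '%s\n' "$ch" | LC_ALL="$locale_name" grep -q '\w' && w=yes
        printf '%s\n' "$ch" | LC_ALL="$locale_name" grep -q '[[:alnum:]]' && alnum=yes
        [ "$w" = "$alnum" ] || printf '%s: \\w=%s [[:alnum:]]=%s\n' "$ch" "$w" "$alnum"
    done
}

# The locales and characters named in this report.
for loc in en_US.UTF-8 cs_CZ.UTF-8; do
    echo "== $loc"
    disagreements "$loc" Á Č Ď É Ě Í Ň Ó Ř Š Ť Ú Ů Ý Ž
done
```

A locale header with no lines under it means the two patterns agreed for every character tested in that locale.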
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. echo 'á' | grep '\w'
Also reproducible with the latest grep-2.11.
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unable to address this
request at this time.
Red Hat invites you to ask your support representative to
propose this request, if appropriate, in the next release of
Red Hat Enterprise Linux.
Created attachment 952258
This can also be resolved by rebasing to grep > 2.20.
Created attachment 985631
(In reply to Jaroslav Škarvada from comment #6)
> This can also be resolved by rebasing to grep > 2.20.
This is now the preferred way; a patch for grep-2.20 is attached.
RHEL-7 is also affected by this, so I am cloning this bug to RHEL-7 to avoid a regression there.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.