Created attachment 1441096
Reproducer

Description of problem:
E.g. the '[[=a=]]' regex doesn't match 'á' in the Czech locale as it should. It seems to be a regression, because it worked in glibc-2.26 and older. All locales seem to be affected, not only Czech.

Version-Release number of selected component (if applicable):
glibc-2.27-14.fc28.x86_64

How reproducible:
Always

Steps to Reproduce:
1. gcc -o regex regex.c
2. ./regex
3.

Actual results:
locale: cs_CZ.UTF-8
regcomp: 0
regexec: 1

Expected results:
locale: cs_CZ.UTF-8
regcomp: 0
regexec: 0

Additional info:
It's blocking the grep rebuild.
*** Bug 1582224 has been marked as a duplicate of this bug. ***
It's not only about 'á' in cs_CZ.UTF-8 or en_US.UTF-8. There are more matches that worked and don't work now, e.g.:

$ echo 'é' | LC_ALL=fr_FR.UTF-8 grep '[[=e=]]'
$ echo 'è' | LC_ALL=fr_FR.UTF-8 grep '[[=e=]]'
$ echo 'ê' | LC_ALL=fr_FR.UTF-8 grep '[[=e=]]'
...
This appears to be a deliberate change in character equivalences. As part of the updates for https://sourceware.org/bugzilla/show_bug.cgi?id=14095, most accented and non-accented characters are no longer considered equivalent. I do not know if this is the intent of the current Unicode version.
This may be an algorithmic issue after all, not a data problem.
glibc-2.27-30.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-85c0ff9183
glibc-2.27-30.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-85c0ff9183
glibc-2.27-30.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.