Bug 1706742
| Summary: | Compilation failed: escape sequence is invalid in character class since 10.33 | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Remi Collet <fedora> |
| Component: | pcre2 | Assignee: | Petr Pisar <ppisar> |
| Status: | CLOSED NOTABUG | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | unspecified | CC: | equistango, ppisar |
| Priority: | unspecified | Type: | Bug |
| Version: | 29 | Last Closed: | 2019-05-06 09:50:30 UTC |
Upstream bug tracker: https://github.com/FriendsOfPHP/PHP-CS-Fixer/issues/4405
pcre2 has reported this compile error since its very first release. This is not a regression. Maybe PHP modifies the bundled PCRE2.
$ printf '%s\n%s\n' '/(\R[^\R]*)$/' 'foo' | ./pcre2test
PCRE2 version 10.10 2015-03-06
/(\R[^\R]*)$/
Failed: error 107 at offset 6: invalid escape sequence in character class
foo
The reproducer can be minimized to /[\R]/:
$ pcre2test
PCRE2 version 10.33 2019-04-16
re> /[\R]/
Failed: error 107 at offset 2: escape sequence is invalid in character class
So it seems PCRE2 thinks a character class cannot contain an end-of-line escape sequence. That's probably because \R is a sequence, not a character. E.g. on DOS-like systems \R matches the two-character sequence \r\n. There is no way to fit a two-character sequence into the one-character-long primitive that a character class represents.
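The one-character limit of a class can be illustrated outside PCRE2. The following is a minimal sketch in Python, whose `re` module has no \R at all, so the newline sequence is spelled out by hand; the variable names are made up for illustration only.

```python
import re

# A character class consumes exactly one character per match.
single_char_class = re.compile(r"[\r\n]")
# An alternation can consume the two-character CRLF pair as one unit.
newline_sequence = re.compile(r"(?:\r\n|[\r\n])")

crlf = "\r\n"
# The class cannot span both characters, so a full match fails.
class_match = single_char_class.fullmatch(crlf)
# The alternation takes CRLF as a whole.
sequence_match = newline_sequence.fullmatch(crlf)
```

Here `class_match` is `None`, while `sequence_match` covers the whole CRLF pair.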
However, pcre2pattern(3) provides this explanation:
Escape sequences in character classes
All the sequences that define a single character value can be used both inside
and outside character classes. In addition, inside a character class, \b is
interpreted as the backspace character (hex 08).
When not followed by an opening brace, \N is not allowed in a character class.
\B, \R, and \X are not special inside a character class. Like other unrecognized
alphabetic escape sequences, they cause an error. Outside a character class,
these sequences have different meanings.
So reporting an error on \R inside a character class is documented behavior.
Before replacing \R with [\r\n] or similar classes, please read the "Newline sequences" section in the same manual, which documents which other characters \R matches (there are many more than \r and \n).
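As a rough illustration of what a fuller replacement has to cover, here is a minimal sketch in Python. It assumes the default Unicode-mode meaning of \R from the "Newline sequences" section (CRLF, LF, VT, FF, CR, NEL, U+2028, U+2029). Python's `re` has no \R, so the set is written out explicitly, and the names `NOT_NEWLINE`, `ANY_NEWLINE` and `trailing_line` are hypothetical. In PCRE2 syntax the last two characters would be written \x{2028} and \x{2029} and need UTF mode.

```python
import re

# Stand-in for the invalid [^\R]: any single character that is not a newline character.
NOT_NEWLINE = r"[^\n\x0b\x0c\r\x85\u2028\u2029]"
# Stand-in for \R: try the CRLF pair first, then any single newline character.
ANY_NEWLINE = r"(?:\r\n|[\n\x0b\x0c\r\x85\u2028\u2029])"

# Same shape as the pattern from the report: a newline followed by the last line.
last_line = re.compile("(" + ANY_NEWLINE + NOT_NEWLINE + "*)$")


def trailing_line(text):
    """Return the final newline sequence plus the text after it, or None."""
    match = last_line.search(text)
    return match.group(1) if match else None
```

With this sketch, `trailing_line("foo")` finds nothing, and a string such as `"a\u2028b"` is split at the U+2028 separator, which a plain [\r\n] class would miss.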
Notice: as [^\R] doesn't seem to work as expected in pcre2 < 10.33, I have proposed an upstream fix: https://github.com/FriendsOfPHP/PHP-CS-Fixer/pull/4406
BTW, your review is welcome on this issue.
Thanks for your explanation.
> pcre2 reports the compile error since its very first release. This is not a regression. Maybe PHP modifies the bundled PCRE2.
PHP is built against the system libpcre2 ;)
The strange thing is that pcre2 < 10.33 doesn't report the invalid escape sequence (from PHP); only 10.33 starts to report it.
*** Bug 1715472 has been marked as a duplicate of this bug. ***
Working on a PHP test suite, I noticed that the following code was OK with the previous version 10.32 (which is the one bundled in PHP) and no longer works with 10.33.
With 10.32:
# php -r "var_dump(preg_match('/(\R[^\R]*)$/', 'foo'));"
int(0)
With 10.33:
# php -r "var_dump(preg_match('/(\R[^\R]*)$/', 'foo'));"
PHP Warning: preg_match(): Compilation failed: escape sequence is invalid in character class at offset 6 in Command line code on line 1
bool(false)
I don't see any immediate change in the ChangeLog which may explain this.
P.S. This only affects PHP 7.3 (older versions still use pcre).