Bug 1706742 - Compilation failed: escape sequence is invalid in character class since 10.33
Summary: Compilation failed: escape sequence is invalid in character class since 10.33
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: pcre2
Version: 29
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Petr Pisar
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 1715472
Depends On:
Blocks:
Reported: 2019-05-06 08:33 UTC by Remi Collet
Modified: 2019-05-30 13:10 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2019-05-06 09:50:30 UTC
Type: Bug
Embargoed:


Description Remi Collet 2019-05-06 08:33:07 UTC
While working on a PHP test suite, I noticed that the following code worked with the previous version, 10.32 (the one bundled with PHP), but no longer works with 10.33.

With 10.32

# php -r "var_dump(preg_match('/(\R[^\R]*)$/', 'foo'));"
int(0)

With 10.33

# php -r "var_dump(preg_match('/(\R[^\R]*)$/', 'foo'));"
PHP Warning:  preg_match(): Compilation failed: escape sequence is invalid in character class at offset 6 in Command line code on line 1
bool(false)


I don't see any immediate change in the ChangeLog that might explain this.



P.S. This only affects PHP 7.3 (older versions still use pcre).

Comment 1 Remi Collet 2019-05-06 08:38:03 UTC
Upstream bug tracker https://github.com/FriendsOfPHP/PHP-CS-Fixer/issues/4405

Comment 2 Petr Pisar 2019-05-06 09:50:30 UTC
pcre2 reports the compile error since its very first release. This is not a regression. Maybe PHP modifies the bundled PCRE2.

$ printf '%s\n%s\n' '/(\R[^\R]*)$/' 'foo' | ./pcre2test 
PCRE2 version 10.10 2015-03-06
/(\R[^\R]*)$/
Failed: error 107 at offset 6: invalid escape sequence in character class
foo

The reproducer can be minimized to /[\R]/:

$ pcre2test 
PCRE2 version 10.33 2019-04-16
  re> /[\R]/
Failed: error 107 at offset 2: escape sequence is invalid in character class

So it seems PCRE2 thinks a character class cannot contain an end-of-line escape sequence. That's probably because \R is a sequence, not a character. E.g. on DOS-like systems \R matches \r\n. There is no way to fit a two-character sequence into the one-character-long primitive that a character class represents.
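
To illustrate the one-character limit of a class, here is a minimal sketch. It uses Python's re module purely as a stand-in (an assumption for the demo; it is not PCRE2 and does not support \R at all), but a character class behaves the same way there: it consumes exactly one character per match.

```python
import re

crlf = "\r\n"

# A character class consumes exactly one character per match, so it needs
# two separate matches to cover the two-character sequence CR LF.
class_hits = re.findall(r"[\r\n]", crlf)

# An alternation can take CR LF as a single unit, which a class cannot do.
seq_hits = re.findall(r"\r\n|[\r\n]", crlf)

print(class_hits)  # ['\r', '\n']
print(seq_hits)    # ['\r\n']
```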

However, pcre2pattern(3) provides this explanation:

   Escape sequences in character classes

       All  the  sequences that define a single character value can be used both inside
       and outside character classes. In addition, inside  a  character  class,  \b  is
       interpreted as the backspace character (hex 08).

       When  not  followed by an opening brace, \N is not allowed in a character class.
       \B, \R, and \X are not special inside a character class. Like other unrecognized
       alphabetic  escape  sequences,  they  cause an error. Outside a character class,
       these sequences have different meanings.

So reporting an error on \R inside a character class is a documented behavior.

Before replacing \R with [\r\n] or similar classes, please read the "Newline sequences" section in the same manual, which documents all the other characters \R matches (there are many more than \r and \n).
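
As a rough sketch of what such a replacement has to cover: in UTF mode the "Newline sequences" section describes \R as matching CR LF as a unit, or any one of LF, VT, FF, CR, NEL (U+0085), LS (U+2028) and PS (U+2029). The snippet below spells that out by hand in Python's re module, used here only as a stand-in since it has no \R of its own. The names NEWLINE_CHARS, ANY_NEWLINE and NOT_NEWLINE are made up for this example, and the escape syntax differs from PCRE2 (which writes \x{2028} rather than \u2028).

```python
import re

# Single characters that \R matches in UTF mode, besides the CR LF pair:
# LF, VT, FF, CR, NEL, LS, PS.
NEWLINE_CHARS = r"\n\x0b\x0c\r\x85\u2028\u2029"

# Hand-written stand-ins (hypothetical names, not part of any library).
ANY_NEWLINE = r"(?:\r\n|[" + NEWLINE_CHARS + "])"   # plays the role of \R
NOT_NEWLINE = "[^" + NEWLINE_CHARS + "]"            # what [^\R] was meant to be

# Same shape as the pattern from the report: /(\R[^\R]*)$/
pattern = re.compile("(" + ANY_NEWLINE + NOT_NEWLINE + "*)$")

# The naive rewrite that only knows about CR and LF.
naive = re.compile(r"([\r\n][^\r\n]*)$")

subject = "foo\u2028bar"  # line separator between the two words
print(pattern.search(subject) is not None)  # True
print(naive.search(subject) is not None)    # False
```

The naive [\r\n] version silently misses the U+2028 line break, which is the kind of difference the manual section warns about.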

Comment 3 Remi Collet 2019-05-06 09:52:49 UTC
Note: as [^\R] doesn't seem to work as expected in pcre2 < 10.33, I have proposed an upstream fix: https://github.com/FriendsOfPHP/PHP-CS-Fixer/pull/4406

BTW, your opinion on this issue is welcome.

Comment 4 Remi Collet 2019-05-06 09:56:39 UTC
Thanks for your explanation.

Comment 5 Remi Collet 2019-05-06 10:28:36 UTC
> pcre2 reports the compile error since its very first release. This is not a regression. Maybe PHP modifies the bundled PCRE2.

PHP is built against the system libpcre2 ;)

The strange thing is that pcre2 < 10.33 doesn't report the invalid escape sequence (from PHP); only 10.33 starts reporting it.

Comment 6 Petr Pisar 2019-05-30 13:10:44 UTC
*** Bug 1715472 has been marked as a duplicate of this bug. ***

