Red Hat Bugzilla – Bug 494773
Assertion rx->sublen >= (s - rx->subbeg) + i failed: file "regcomp.c"
Last modified: 2009-10-06 06:07:08 EDT
Created attachment 338649 [details]
Perl script to reproduce with
Description of problem:
When Perl is compiled with "-g", the assertion listed in the summary triggers on the programs at the URL given, specifically:
(The character is UTF-8.)
Version-Release number of selected component (if applicable):
Execute the Perl program in the RT URL or Bugzilla attachment.
Steps to Reproduce:
1. Install perl and perl-XML-Twig
2. Copy the ticket program into a file.
3. Execute the Perl script.
Assertion rx->sublen >= (s - rx->subbeg) + i failed: file "regcomp.c", line 5098 at /usr/lib/perl5/vendor_perl/5.10.0/XML/Twig.pm line 7806.
at bugtest.pl line 12
Removing the gcc option -g from Configure's -Doptimize setting produces a non-failing Perl binary, so this may be more of a gcc problem.
George: -g does not cause the problem; it merely enables the assert. The error condition is therefore most likely also present when -g is not in use; it is just not checked for.
You're right; there's a trick to it: Perl's Configure script detects the -g and enables -DDEBUGGING itself, which is what Perl's assert() depends on. That would explain why my "gcc -E" with and without -g both had the assert in it. Fedora adds the -g so -DDEBUGGING is on, which is itself a useful thing other than this particular error aborting my script.
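To illustrate the point above, here is a minimal sketch of a check that only exists when DEBUGGING is defined. SKETCH_ASSERT and check_sublen are hypothetical names made up for this example; this is not Perl's actual assert() definition, only the general shape of a debug-only check.

```c
#include <stdio.h>

/* Hypothetical stand-in for a debug-only check (not Perl's real macro).
   Evaluates to 1 if the check ran and failed, 0 otherwise. */
#ifdef DEBUGGING
#  define SKETCH_ASSERT(cond) \
     ((cond) ? 0 : (fprintf(stderr, \
         "Assertion %s failed: file \"%s\", line %d\n", \
         #cond, __FILE__, __LINE__), 1))
#else
   /* Without DEBUGGING the condition is never evaluated, so a violated
      invariant passes silently. sizeof keeps the operands "used". */
#  define SKETCH_ASSERT(cond) ((void)sizeof(cond), 0)
#endif

/* Mirrors the shape of the failing condition from the summary:
   sublen >= offset + i, where offset plays the role of (s - subbeg). */
int check_sublen(long sublen, long offset, long i)
{
    return SKETCH_ASSERT(sublen >= offset + i);
}
```

Built with -DDEBUGGING, a call such as check_sublen(1, 5, 3) reports the failure; built without it, the same call returns 0 and nothing is reported, even though the condition is just as false.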
Created attachment 345115 [details]
Test patch to assist Perl developers in locating problem.
Note: Patch is versus blead, not branch for perl-5.10.1 release.
Highly recommend this be reviewed by the Perl developers first, but it does fix the test cases and passes the Perl test suite.
Thanks for the patch. I'd rather see a review from someone on p5p who is familiar with the utf8 and regexp code. This could be an intrusive change.
The patch looks like a revert of change fae667d5a60f37538a5761795f7af2165c7d4fb0,
which was meant to fix a similar problem. The comment from git:
"Regular expression changes to fix failing tests in XML::Twig and Mail::SpamAssassin. The breakages occured in changes #28785 and #29279."
(In reply to comment #4)
> Thanks for the patch. I'd rather see a review from someone on p5p who is
> familiar with the utf8 and regexp code. This could be an intrusive change.
> The patch looks like a revert of change fae667d5a60f37538a5761795f7af2165c7d4fb0,
> which was meant to fix a similar problem. The comment from git:
> "Regular expression changes to fix failing tests in XML::Twig and
> Mail::SpamAssassin. The breakages occured in changes #28785 and #29279."
The git revision changes when swap_match_buff() is called, but my patch deletes the ->swap logic entirely. The problem is that when Perl re-enters the regex engine to handle utf8::SWASHNEW, ->swap is not saved and restored, so any match made by the utf8 (Perl) code modifies the regex match that caused the utf8 swash to get built. Since ->swap isn't used much, it was easier to whack the whole concept and keep the pointer on the stack in the match function rather than play save/restore games. The assert() in the subject line doesn't catch all of the cases where this happens, since not all strings are shorter than the utf8 code's match offsets, though such cases are likely rare.
Nobody on p5p commented, unfortunately, so I'll revise my patch to cover a few things I missed removing that were in the git revision you mentioned, and resend it.
Thanks for the reminder. The patch will be applied in rawhide. It may also be added to previous versions of Fedora in the next update.
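The re-entrancy problem described above can be sketched in miniature. Everything in this example is hypothetical (struct sketch_rx, nested_helper, match_shared, match_local); it is not the Perl regex engine's code, only an illustration of why state kept in a shared structure is clobbered by a nested call while a value held on the caller's stack survives.

```c
/* Hypothetical miniature of a matcher with one shared result slot. */
struct sketch_rx {
    int match_start;   /* offset recorded by the most recent match */
};

/* Stand-in for helper code that re-enters the matcher (as building a
   utf8 swash does in the report) and records its own result in the
   same shared slot. */
static void nested_helper(struct sketch_rx *rx)
{
    rx->match_start = 99;   /* overwrites whatever the outer match stored */
}

/* Problematic shape: the outer result lives in shared state across the
   nested call and is not saved or restored. */
int match_shared(struct sketch_rx *rx, int start)
{
    rx->match_start = start;
    nested_helper(rx);
    return rx->match_start;   /* now holds the nested call's value */
}

/* Safer shape: the outer result is held in a local variable on the
   stack and written back after the nested call returns. */
int match_local(struct sketch_rx *rx, int start)
{
    int mine = start;
    nested_helper(rx);
    rx->match_start = mine;
    return rx->match_start;
}
```

With a zero-initialized struct sketch_rx, match_shared(&rx, 3) returns the nested call's 99, while match_local(&rx, 3) returns 3. Keeping the value on the stack avoids having to remember a save/restore step around every re-entrant call.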
perl-5.10.0-82.fc11 has been submitted as an update for Fedora 11.
perl-5.10.0-82.fc11 has been pushed to the Fedora 11 stable repository. If problems still persist, please make note of it in this bug report.