Bug 86107 - Subexpression and newline problem in regex (glibc)
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: glibc
Version: 8.0
Hardware: i386
OS: Linux
Target Milestone: ---
Assignee: Jakub Jelinek
QA Contact: Brian Brock
Depends On:
Reported: 2003-03-14 06:47 UTC by Ben Kao
Modified: 2016-11-24 15:07 UTC

Fixed In Version: 9.0
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2003-04-23 18:16:05 UTC

Attachments
Std C code illustrating a bug in regex of RH8.0 (2.22 KB, text/plain)
2003-03-14 06:51 UTC, Ben Kao

Description Ben Kao 2003-03-14 06:47:42 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)

Description of problem:
Regex behavior changed between RH7.3 (libc-2.2.5) and RH8.0 (libc-2.2.93); in 
RH8.0 it is broken.

The bug is related to newlines in the searched text combined with subexpression 
quantifiers (*, {2}, etc.).

When I use the character class [:space:] to match spaces and newlines, a 
newline in the searched text causes matching to stop immediately after the 
newline: in the text "100"\ncellpadding, my regex matches only up to "100"\n.  
When I substitute a space for the newline ("100" cellpadding), matching 
continues as it should.

To complicate matters further, if the quantifier specifies the exact number of 
subexpression repetitions present in the searched text, matching continues past 
the newline and gives the expected result.

The attached program compiles two patterns.  Pattern1 shows that the regex is 
broken because the '*' quantifier does not work as expected.  Pattern2 shows a 
curious condition that, surprisingly, produces the proper output.
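
For readers without access to the attachment, here is a minimal sketch of 
the kind of test described above.  It is not the attached program: the sample 
text, both patterns, and the helper names (sample_text, pattern_star, 
pattern_exact, match_span, report) are assumptions made up for illustration.  
It uses only the POSIX regcomp/regexec/regfree calls.  On a working regex 
implementation both patterns should report the same span; on an affected one 
the '*' pattern would be expected to stop at the newline.

```c
#include <regex.h>
#include <stdio.h>

/* Hypothetical sample input (not taken from the attachment): an HTML tag
   whose attributes are separated first by a space, then by a newline. */
const char *sample_text = "<table width=\"100\"\ncellpadding=\"0\">";

/* Hypothetical stand-ins for the report's Pattern1 and Pattern2: the same
   subexpression, quantified with '*' and with an exact count {2}. */
const char *pattern_star  = "<table([[:space:]]+[a-z]+=\"[^\"]*\")*";
const char *pattern_exact = "<table([[:space:]]+[a-z]+=\"[^\"]*\"){2}";

/* Compile `pattern` as a POSIX extended regex and search `text`.
   On a match, store the offsets of the overall match and return 0.
   Return -1 if compilation fails or nothing matches. */
int match_span(const char *pattern, const char *text, long *start, long *end)
{
    regex_t re;
    regmatch_t m[1];
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    rc = regexec(&re, text, 1, m, 0);
    regfree(&re);
    if (rc != 0)
        return -1;

    *start = (long)m[0].rm_so;
    *end = (long)m[0].rm_eo;
    return 0;
}

/* Print the matched span in a "(start, end): text" form. */
void report(const char *label, const char *pattern, const char *text)
{
    long s, e;

    printf("Testing %s:\n", label);
    if (match_span(pattern, text, &s, &e) == 0)
        printf("(%ld, %ld): %.*s\n", s, e, (int)(e - s), text + s);
    else
        printf("no match\n");
}
```

To try it, call report("pattern1", pattern_star, sample_text) and 
report("pattern2", pattern_exact, sample_text) from main and compare the two 
spans.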

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
Compile and run the attached program.  Its output shows (1) the broken output 
and (2) a curious condition that produces the proper output (though it is not a 

Additional info:

Comment 1 Ben Kao 2003-03-14 06:51:16 UTC
Created attachment 90596
Std C code illustrating a bug in regex of RH8.0

Comment 2 Ulrich Drepper 2003-04-22 08:14:33 UTC
With RHL9 I get this output:

Testing pattern1:
(0, 35): <table width="100"
Testing pattern2:
(0, 35): <table width="100"

The same output and it looks OK.  Can you install RHL9 and try it?

Comment 3 Ben Kao 2003-04-23 18:16:05 UTC
Confirmed: the problem appears to be fixed in Red Hat 9.0.
