Bug 86107 - Subexpression and newline problem in regex (glibc)
Product: Red Hat Linux
Classification: Retired
Component: glibc
Hardware: i386
OS: Linux
Priority: medium
Severity: high
Assigned To: Jakub Jelinek
QA Contact: Brian Brock
Reported: 2003-03-14 01:47 EST by Ben Kao
Modified: 2016-11-24 10:07 EST (History)
Fixed In Version: 9.0
Doc Type: Bug Fix
Last Closed: 2003-04-23 14:16:05 EDT

Attachments
Std C code illustrating a bug in regex of RH8.0 (2.22 KB, text/plain)
2003-03-14 01:51 EST, Ben Kao

Description Ben Kao 2003-03-14 01:47:42 EST

Description of problem:
Regex behavior changed between RH7.3 (libc-2.2.5) and RH8.0 (libc-2.2.93); in
RH8.0 it is broken.

The bug is related to newlines in the searched text combined with quantifiers
applied to subexpressions (*, {2}, etc.).

When I use the character class [:space:] to match spaces and newlines, a
newline in the searched text causes matching to stop immediately after the
newline: in the text "100"\ncellpadding, my regex matches only up to "100"\n.
When I substitute a space for the newline ("100" cellpadding), matching
continues as it should.

To further complicate matters, if the quantifier specifies the exact number of
subexpression repetitions present in the searched text (for example {2} instead
of *), matching continues past the newline and gives the expected result.

The attached program compiles two patterns.  Pattern1 shows that the regex is
broken, because the '*' quantifier does not work as expected.  Pattern2 shows a
curious condition that, surprisingly, produces the proper output.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
Compile and run the attached program.  Its output shows (1) the broken output and
(2) a curious condition that produces the proper output (though it is not a

Additional info:
Comment 1 Ben Kao 2003-03-14 01:51:16 EST
Created attachment 90596
Std C code illustrating a bug in regex of RH8.0
Comment 2 Ulrich Drepper 2003-04-22 04:14:33 EDT
With RHL9 I get this output:

Testing pattern1:
(0, 35): <table width="100"
Testing pattern2:
(0, 35): <table width="100"

Both patterns give the same output, and it looks OK.  Can you install RHL9 and try it?
Comment 3 Ben Kao 2003-04-23 14:16:05 EDT
Confirmed: the problem appears to be fixed in Red Hat 9.0.