Bug 86107
| Field | Value |
|---|---|
| Summary | Subexpression and newline problem in regex (glibc) |
| Product | [Retired] Red Hat Linux |
| Component | glibc |
| Version | 8.0 |
| Hardware | i386 |
| OS | Linux |
| Status | CLOSED CURRENTRELEASE |
| Severity | high |
| Priority | medium |
| Reporter | Ben Kao <bkao5> |
| Assignee | Jakub Jelinek <jakub> |
| QA Contact | Brian Brock <bbrock> |
| CC | bkao5, fweimer |
| Fixed In Version | 9.0 |
| Doc Type | Bug Fix |
| Last Closed | 2003-04-23 18:16:05 UTC |
Created attachment 90596: Std C code illustrating a bug in regex of RH8.0
With RHL9 I get this output:

    Testing pattern1: (0, 35): <table width="100" cellpadding="2">
    Testing pattern2: (0, 35): <table width="100" cellpadding="2">

Both patterns give the same output, and it looks OK. Can you install RHL9 and try it?

Confirmed, the problem appears to be fixed in Red Hat 9.0.
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)

Description of problem:
Regex behavior changed between RH7.3 (libc-2.2.5) and RH8.0 (libc-2.2.93), where it is broken. The bug is related to newlines in the searched text combined with subexpression quantifiers (*, {2}, etc.).

When I use the character class [:space:] to match spaces and newlines, a newline in the search text causes matching to stop immediately after the newline: in the text "100"\ncellpadding, my regex finds only "100"\n. When I substitute a space for the newline ("100" cellpadding), matching continues as it should. To complicate matters further, if you specify the exact number of subexpressions that exist in the search text, matching continues and produces the expected result.

The attached program compiles two patterns. Pattern1 shows that the regex is broken, because the '*' quantifier does not work as expected. Pattern2 shows a curious condition that, surprisingly, produces the proper output.

Version-Release number of selected component (if applicable):
libc-2.2.93

How reproducible:
Always

Steps to Reproduce:
Compile and run the attached program. Its output shows (1) the broken output and (2) a curious condition that produces the proper output (though it is not a workaround).