Bug 718711

Summary: C++ <regex> does not match as expected on caret
Product: [Fedora] Fedora
Component: gcc
Version: 15
Hardware: Unspecified
OS: Unspecified
Status: CLOSED UPSTREAM
Severity: unspecified
Priority: unspecified
Reporter: Tim Niemueller <tim>
Assignee: Jakub Jelinek <jakub>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: bkoz, jakub
Doc Type: Bug Fix
Last Closed: 2011-07-28 07:18:08 UTC
Attachments: Program demonstrating the problem (flags: none)

Description Tim Niemueller 2011-07-04 11:15:23 UTC
Created attachment 511172
Program demonstrating the problem

Description of problem:
When a test program is compiled against the new C++ <regex> header, regex_match and regex_search do not (never?) match as expected if the regex contains a caret. A comparison with C's <regex.h> shows that the behavior differs for extended expressions, although it should not.

Version-Release number of selected component (if applicable):
libstdc++-4.6.0-9.fc15.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Compile attached example program with "gcc -o regex-cpp -std=c++0x -lstdc++ regex.cpp"
2. Run with "./regex-cpp '^foo.*' 'foobar'"
  
Actual results:
C++ regex does not match, while C does.


Expected results:
Both should match.


Additional info:
When run with regex 'foo.*' it matches as expected. Using regex_search yields the same result.
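
The attached program is not reproduced in this report. As a rough sketch of the comparison described above, assuming the attachment does something similar, the helpers below apply the same extended regex through POSIX <regex.h> and through C++ <regex>. The function names posix_search, std_search and std_match are hypothetical and chosen only for illustration. Note that regexec has search semantics (a match anywhere in the string), whereas std::regex_match must match the whole string, so std::regex_search is the closer counterpart to the C call.

```cpp
#include <regex.h>   // POSIX regcomp/regexec/regfree
#include <regex>     // C++11 std::regex
#include <string>

// Hypothetical helper: true if the POSIX extended regex `pattern`
// matches anywhere in `text` (search semantics).
bool posix_search(const char* pattern, const char* text)
{
  regex_t re;
  if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return false;  // the pattern failed to compile
  const bool found = (regexec(&re, text, 0, nullptr, 0) == 0);
  regfree(&re);
  return found;
}

// Hypothetical helper: same question asked of std::regex_search.
// The std::regex constructor throws std::regex_error on a bad pattern.
bool std_search(const std::string& pattern, const std::string& text)
{
  const std::regex re(pattern, std::regex_constants::extended);
  return std::regex_search(text, re);
}

// Hypothetical helper: std::regex_match requires the whole of `text`
// to match, not just a part of it.
bool std_match(const std::string& pattern, const std::string& text)
{
  const std::regex re(pattern, std::regex_constants::extended);
  return std::regex_match(text, re);
}
```

With a conforming <regex> implementation, all three helpers should return true for the pattern '^foo.*' and the string 'foobar'. With the libstdc++ version listed above, the two std:: variants are the ones reported to fail.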

Comment 1 Tim Niemueller 2011-07-18 17:26:38 UTC
Can someone at least confirm this, or point out if there is an error in how I used it?

Comment 2 Jakub Jelinek 2011-07-18 17:37:44 UTC
Benjamin, could you please look at this?  Thanks.

Comment 3 Benjamin Kosnik 2011-07-27 16:13:58 UTC
Confirmed. I don't see an error in usage; this is a bug in libstdc++'s <regex>.
I've opened up an upstream bug report here:

 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49870

As a temporary workaround, you can use Boost's regex library, which has the same interface but gives the correct result. Here's your example with that component:

#include <boost/regex.hpp>
#include <regex>
#include <string>
#include <cstdio>

// compile like
// g++ -std=gnu++0x -g -O2 71871-regex.cpp -lboost_regex -Wfatal-errors

// execute like
// ./a.out "foo.*" "foobar"
// ./a.out "^foo.*" "foobar"
int
main(int argc, char **argv)
{
  // Both the regex and the test string are required.
  if (argc < 3)
    {
      fprintf(stderr, "Usage: %s <regex> <string>\n", argv[0]);
      return 1;
    }

  // 1 std: libstdc++'s <regex>, the implementation under test
  {
    std::regex expr(argv[1], std::regex_constants::extended);
    std::string test_string = argv[2];
    printf("Applying regex '%s' to string '%s'\n", argv[1], argv[2]);

    if (std::regex_match(test_string, expr))
      printf("C++: Match\n");
    else
      printf("C++: NO match\n");
  }

  // 2 boost: same interface, used here as the reference result
  {
    boost::regex expr(argv[1], boost::regex_constants::extended);
    std::string test_string = argv[2];
    printf("Applying regex '%s' to string '%s'\n", argv[1], argv[2]);

    if (boost::regex_match(test_string, expr))
      printf("Boost: Match\n");
    else
      printf("Boost: NO match\n");
  }

  return 0;
}

Comment 4 Jakub Jelinek 2011-07-28 07:18:08 UTC
Tracked upstream.