Bug 182621
| Summary: | Allow larger regex buffer to enable long substring filters | ||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | [Retired] 389 | Reporter: | Nathan Kinder <nkinder> | ||||||||||||||||
| Component: | Database - Indexes/Searches | Assignee: | Noriko Hosoi <nhosoi> | ||||||||||||||||
| Status: | CLOSED ERRATA | QA Contact: | Chandrasekar Kannan <ckannan> | ||||||||||||||||
| Severity: | medium | Docs Contact: | |||||||||||||||||
| Priority: | high | ||||||||||||||||||
| Version: | 1.0 | CC: | benl, pauloviolada, ulf.weltman | ||||||||||||||||
| Target Milestone: | --- | ||||||||||||||||||
| Target Release: | --- | ||||||||||||||||||
| Hardware: | All | ||||||||||||||||||
| OS: | Linux | ||||||||||||||||||
| Whiteboard: | |||||||||||||||||||
| Fixed In Version: | RHSA-2008-0268 | Doc Type: | Bug Fix | ||||||||||||||||
| Doc Text: | Story Points: | --- | |||||||||||||||||
| Clone Of: | Environment: | ||||||||||||||||||
| Last Closed: | 2008-05-09 17:23:38 UTC | Type: | --- | ||||||||||||||||
| Regression: | --- | Mount Type: | --- | ||||||||||||||||
| Documentation: | --- | CRM: | |||||||||||||||||
| Verified Versions: | Category: | --- | |||||||||||||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||||||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||||||||||||
| Embargoed: | |||||||||||||||||||
| Bug Depends On: | |||||||||||||||||||
| Bug Blocks: | 249650, 444712 | ||||||||||||||||||
| Attachments: |
|
||||||||||||||||||
Created attachment 125147 [details]
regex.c
Updated fix to regex.c, use ptrdiff_t to store the offsets to be restored after
the realloc, and use a constant for the value of "how much the NFA buffer can
grow in one iteration on the pattern".
Created attachment 125148 [details]
string.c
Updated fix to string.c, used wrong pointer (pat instead of p) in a debug
message, and performed an unneeded strcat of ".*"
*** Bug 443955 has been marked as a duplicate of this bug. *** Steps to reproduce the problem 1. script to create a test entry; add the entry to the server ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ echo dn: uid=tuser1, dc=example,dc=com echo objectClass: top echo objectClass: person echo objectClass: organizationalPerson echo objectClass: inetOrgPerson echo cn: test user1 echo sn: user0123456789012345678901234567890123456789012345678901234567890123456789 i=0 while [ $i -lt 100 ] do echo " 01234567890123456789012345678901234567890123456789012345678901234567890123456789" i=`expr $i + 1` done echo uid: tuser1 echo givenName: test echo description: test user 0 echo userPassword: tuser1tuser1 echo mail: tuser1 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 2. scripts to generate search commands: # 2-1 output: srch10.sh t="0123456789" T=$t$t$t$t$t$t$t$t$t$t FILTER="$T$T$T$T$T$T*" echo ldapsearch -T -p \$\1 -D \"cn=Directory Manager\" -w \$2 -b \"dc=example,dc=com\" \"\(sn=*$t*\)\" ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ # 2-2 output: srch100.sh t="0123456789" T=$t$t$t$t$t$t$t$t$t$t FILTER="$T$T$T$T$T$T$T$T$T$T*" echo ldapsearch -T -p \$\1 -D \"cn=Directory Manager\" -w \$2 -b \"dc=example,dc=com\" \"\(sn=*$t*\)\" ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ # 2-3 output: srchcomplex.sh t="0123456789" T=$t$t$t$t$t$t$t$t$t$t FILTER="$T$T$T$T$T$T$T$T$T$T*" echo ldapsearch -T -p \$\1 -D \"cn=Directory Manager\" -w \$2 -b \"dc=example,dc=com\" \"\(\&\(\|\(objectClass=inetorgperson\)\(objectClass=person\)\)\(\|\(sn=*$FILTER*\)\(mail=*bogus*\)\)\)\" 3. run the srch scripts. Attaching the stacktraces of the server next. Created attachment 304042 [details]
stacktraces of the server when the srch scripts are executed
Applied Ulf's patches found in comment #1 and #2, the test cases in the comment #6 do not crash the server and the server returns the matched entry. Created attachment 304058 [details]
cvs commit messages
Checked in the Ulf's patches into HEAD as well as Directory_Server_8_0_Branch.
Created attachment 304153 [details]
valgrind output when the above testcases are run
No memory corruption nor leak from regex are observed.
Created attachment 304274 [details]
cvs commit message (Directory71RtmBranch)
The patches have been checked in with the patch inf files.
Paulo, would you be able to build from source and try to reproduce your crash with the proposed patch? Created attachment 304665 [details]
Patch result
I don`t know if I`m doing something wrong, but the regex.c patch result was: -------------------------- |Index: string.c |=================================================================== |RCS file: /repository/source/ldap2/ldapserver/ldap/servers/plugins/syntaxes/string.c,v |retrieving revision 1.1.1.1 |diff -U8 -r1.1.1.1 string.c |--- string.c 2005/07/08 23:25:37 1.1.1.1 |+++ string.c 2006/02/23 23:47:17 -------------------------- File to patch: /rpm/sources/fedora-ds-base-1.1.0/ldap/servers/plugins/syntaxes/string.c patching file /rpm/sources/fedora-ds-base-1.1.0/ldap/servers/plugins/syntaxes/string.c Hunk #1 succeeded at 184 (offset 6 lines). -------------------------- |Index: regex.c |=================================================================== |RCS file: /repository/source/ldap2/ldapserver/ldap/servers/slapd/regex.c,v |retrieving revision 1.1.1.1 |diff -U8 -r1.1.1.1 regex.c |--- regex.c 2005/07/08 23:25:38 1.1.1.1 |+++ regex.c 2006/02/08 21:39:11 -------------------------- File to patch: /rpm/sources/fedora-ds-base-1.1.0/ldap/servers/slapd/regex.c patching file /rpm/sources/fedora-ds-base-1.1.0/ldap/servers/slapd/regex.c Hunk #1 FAILED at 33. Hunk #2 succeeded at 386 (offset 12 lines). Hunk #4 succeeded at 475 (offset 12 lines). 1 out of 5 hunks FAILED -- saving rejects to file /rpm/sources/fedora-ds-base-1.1.0/ldap/servers/slapd/regex.c.rej An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2008-0268.html *** Bug 448900 has been marked as a duplicate of this bug. *** |
If you search for ("sn=abcdefg...500 characters here...*"), DS returns no entries and the result was LDAP_SUCCESS (0) even when there are entries with sn that have a long value that matches the substring. An error is logged in the errors log. Analysis: The substring filter code uses a BUFSIZ char array to hold the pattern it constructs from the filter value. For example, searching for ("sn=foo*oo*bar*") will set the pattern buffer to "^foo.*ooo.*bar.*". This buffer is then passed to the old regular expression compilation function slapd_re_comp, which compiles it into a finite automaton that it sticks in a static buffer. Then it calls slapd_re_exec with values found in the entries, for example, slapd_re_exec("fooABCooDEFbarXYZ"). If this matches, the filter was true and the entry is returned to the client. The reason the pattern buffer is of fixed size is because the static buffer in the old regex code is of fixed size. Simply expanding our pattern buffer will cause the regex buffer to clobber the stack. Enhancing the old regex code to use a dynamic buffer isn't trivial, it's a little tricky to figure out how much space we will need. For example, a pattern like "foo[a-z]bar" uses something like 22 bytes: 2 for each regular character, 20 for the character class. Instead of using POSIX regex, below are updates to the old regex to make the old global buffer a global malloced buffer. As I iterate over the pattern I check whether we're approaching the end and double its size when we're 20 bytes away (which is how much the NFA might grow in one iteration). I moved the slapd_re_init function to the end so the globals I need for initializing the buffer will be declared. They were cozy all around the file and I didn't want to move them up top. Maybe they should be moved to regex.h, or maybe this stuff should just be scrapped for a more standard implementation when we're bored sometime... Index: slapd/regex.c =================================================================== RCS file: /repository/source/ldap2/ldapserver/ldap/servers/slapd/regex.c,v retrieving revision 1.1.1.1 diff -U8 -r1.1.1.1 regex.c --- regex.c 2005/07/08 23:25:38 1.1.1.1 +++ regex.c 2006/02/07 02:43:03 @@ -33,41 +33,16 @@ * * Copyright (C) 2001 Sun Microsystems, Inc. Used by permission. * Copyright (C) 2005 Red Hat, Inc. * All rights reserved. * END COPYRIGHT BLOCK **/ #include "slap.h" /* must come before regex.h */ #include "portable.h" -static PRLock *regex_mutex = NULL; - -int -slapd_re_init( void ) -{ - if ( NULL == regex_mutex ) { - regex_mutex = PR_NewLock(); - } - return( NULL == regex_mutex ? -1 : 0 ); -} - -void -slapd_re_lock( void ) -{ - PR_ASSERT( NULL != regex_mutex ); - PR_Lock( regex_mutex ); -} - -int -slapd_re_unlock( void ) -{ - PR_ASSERT( NULL != regex_mutex ); - return( PR_Unlock( regex_mutex ) ); -} - #if defined( MACOS ) || defined( DOS ) || defined( _WIN32 ) || defined( NEED_BSDREGEX ) #include "regex.h" /* * regex - Regular expression pattern matching and replacement * * By: Ozan S. Yigit (oz) * Dept. of Computer Science @@ -437,17 +412,18 @@ #define ASCIIB 0177 typedef unsigned char UCHAR; /* char, on the other hand, may be signed or unsigned; * it's platform-dependent. A hard fact of life, in C. */ static int tagstk[MAXTAG]; /* subpat tag stack..*/ -static UCHAR nfa[MAXNFA]; /* automaton.. */ +static UCHAR *nfa = NULL; /* automaton.. */ +static int nfasize = MAXNFA; /* tracks size of nfa buffer */ static int sta = NOP; /* status of lastpat */ static UCHAR bittab[BITBLK]; /* bit table for CCL */ /* pre-set bits... */ static UCHAR bitarr[] = {1,2,4,8,16,32,64,128}; #ifdef DEBUG static void nfadump( UCHAR *ap); @@ -481,16 +457,31 @@ if (sta) return 0; else return badpat("No previous regular expression"); } sta = NOP; for (p = (UCHAR*)pat; *p; p++) { + /* Check if we are approaching end of nfa buffer. 20 bytes is + the max we might add to the nfa per loop. */ + if (mp - (UCHAR*)nfa + 20 >= nfasize) { + /* Save offsets */ + long mppos = mp - nfa; + long sppos = sp - nfa; + + /* Double the nfa buffer size */ + nfasize *= 2; + nfa = (UCHAR*)slapi_ch_realloc((char*)nfa, nfasize); + + /* Restore pointers into realloced space */ + mp = nfa + mppos; + sp = nfa + sppos; + } lp = mp; switch(*p) { case '.': /* match any char.. */ store(ANY); break; case '^': /* match beginning.. */ @@ -1082,8 +1073,40 @@ default: printf("bad nfa. opcode %o\n", ap[-1]); exit(1); break; } } #endif #endif /* MACOS or DOS or NEED_BSDREGEX */ + +static PRLock *regex_mutex = NULL; + +int +slapd_re_init( void ) +{ + if ( NULL == regex_mutex ) { + regex_mutex = PR_NewLock(); + } + + if ( NULL == nfa ) { + nfa = (UCHAR*)slapi_ch_malloc( MAXNFA ); + } + + return( NULL == regex_mutex ? -1 : 0 ); +} + +void +slapd_re_lock( void ) +{ + PR_ASSERT( NULL != regex_mutex ); + PR_Lock( regex_mutex ); +} + +int +slapd_re_unlock( void ) +{ + PR_ASSERT( NULL != regex_mutex ); + return( PR_Unlock( regex_mutex ) ); +} + + Index: plugins/syntaxes/string.c =================================================================== RCS file: /repository/source/ldap2/ldapserver/ldap/servers/plugins/syntaxes/string.c,v retrieving revision 1.1.1.1 diff -U8 -r1.1.1.1 string.c --- string.c 2005/07/08 23:25:37 1.1.1.1 +++ string.c 2006/02/07 02:26:58 @@ -178,87 +178,96 @@ return( rc ); } int string_filter_sub( Slapi_PBlock *pb, char *initial, char **any, char *final, Slapi_Value **bvals, int syntax ) { - int i, j, rc; - char *p, *end, *realval, *tmpbuf; + int i, j, rc, size=0; + char *p, *end, *realval, *tmpbuf, *bigpat = NULL; size_t tmpbufsize; char pat[BUFSIZ]; char buf[BUFSIZ]; char ebuf[BUFSIZ]; LDAPDebug( LDAP_DEBUG_FILTER, "=> string_filter_sub\n", 0, 0, 0 ); - /* * construct a regular expression corresponding to the * filter and let regex do the work for each value * XXX should do this once and save it somewhere XXX */ pat[0] = '\0'; p = pat; end = pat + sizeof(pat) - 2; /* leave room for null */ + if ( initial != NULL ) { - value_normalize( initial, syntax, 1 /* trim leading blanks */ ); - strcpy( p, "^" ); - p = strchr( p, '\0' ); - /* 2 * in case every char is special */ - if ( p + 2 * strlen( initial ) > end ) { - LDAPDebug( LDAP_DEBUG_ANY, "not enough pattern space\n", - 0, 0, 0 ); - return( -1 ); + size = strlen( initial ) + 1; /* add 1 for "^" */ + } + + if ( any != NULL ) { + i = 0; + while( any[i]) { + size += strlen(any[i++]) + 2; /* add 2 for ".*" */ } + } + + if ( final != NULL ) { + size += strlen( final ) + 3; /* add 3 for ".*" and "$" */ + } + + size *= 2; /* doubled in case all filter chars need escaping */ + size++; /* add 1 for null */ + + if ( p + size > end ) { + bigpat = slapi_ch_malloc( size ); + p = bigpat; + } + + if ( initial != NULL ) { + value_normalize( initial, syntax, 1 /* trim leading blanks */ ); + *p++ = '^'; filter_strcpy_special( p, initial ); p = strchr( p, '\0' ); } if ( any != NULL ) { for ( i = 0; any[i] != NULL; i++ ) { value_normalize( any[i], syntax, 0 /* DO NOT trim leading blanks */ ); /* ".*" + value */ - if ( p + 2 * strlen( any[i] ) + 2 > end ) { - LDAPDebug( LDAP_DEBUG_ANY, - "not enough pattern space\n", 0, 0, 0 ); - return( -1 ); - } strcpy( p, ".*" ); - p = strchr( p, '\0' ); + *p++ = '.'; + *p++ = '*'; filter_strcpy_special( p, any[i] ); p = strchr( p, '\0' ); } } if ( final != NULL ) { value_normalize( final, syntax, 0 /* DO NOT trim leading blanks */ ); - /* ".*" + value */ - if ( p + 2 * strlen( final ) + 2 > end ) { - LDAPDebug( LDAP_DEBUG_ANY, "not enough pattern space\n", - 0, 0, 0 ); - return( -1 ); - } - strcpy( p, ".*" ); - p = strchr( p, '\0' ); + *p++ = '.'; + *p++ = '*'; filter_strcpy_special( p, final ); - p = strchr( p, '\0' ); - strcpy( p, "$" ); + strcat( p, "$" ); } /* compile the regex */ + p = (bigpat) ? bigpat : pat; slapd_re_lock(); - if ( (p = slapd_re_comp( pat )) != 0 ) { + if ( (tmpbuf = slapd_re_comp( p )) != 0 ) { LDAPDebug( LDAP_DEBUG_ANY, "re_comp (%s) failed (%s)\n", - pat, p, 0 ); + pat, tmpbuf, 0 ); slapd_re_unlock(); - return( -1 ); + if( bigpat != NULL ) { + slapi_ch_free((void**)&bigpat ); + } + return( LDAP_OPERATIONS_ERROR ); } else { - LDAPDebug( LDAP_DEBUG_TRACE, "re_comp (%s)\n", - escape_string( pat, ebuf ), 0, 0 ); + LDAPDebug( LDAP_DEBUG_TRACE, "re_comp (%s)\n", + escape_string( p, ebuf ), 0, 0 ); } /* * test the regex against each value */ rc = -1; tmpbuf = NULL; tmpbufsize = 0; @@ -289,16 +298,19 @@ rc = 0; break; } } slapd_re_unlock(); if ( tmpbuf != NULL ) { slapi_ch_free((void**)&tmpbuf ); } + if( bigpat != NULL ) { + slapi_ch_free((void**)&bigpat ); + } LDAPDebug( LDAP_DEBUG_FILTER, "<= string_filter_sub %d\n", rc, 0, 0 ); return( rc ); } int string_values2keys( Slapi_PBlock *pb, Slapi_Value **bvals,