Bug 182621 - Allow larger regex buffer to enable long substring filters
Allow larger regex buffer to enable long substring filters
Status: CLOSED ERRATA
Product: 389
Classification: Community
Component: Database - Indexes/Searches (Show other bugs)
1.0
All Linux
high Severity medium
: ---
: ---
Assigned To: Noriko Hosoi
Chandrasekar Kannan
:
: 443955 448900 (view as bug list)
Depends On:
Blocks: 249650 CVE-2008-1677
  Show dependency treegraph
 
Reported: 2006-02-23 13:36 EST by Nathan Kinder
Modified: 2015-01-04 18:19 EST (History)
3 users (show)

See Also:
Fixed In Version: RHSA-2008-0268
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2008-05-09 13:23:38 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:


Attachments (Terms of Use)
regex.c (3.88 KB, patch)
2006-02-23 18:48 EST, Ulf Weltman
no flags Details | Diff
string.c (3.89 KB, patch)
2006-02-23 18:51 EST, Ulf Weltman
no flags Details | Diff
stacktraces of the server when the srch scripts are executed (6.55 KB, text/plain)
2008-04-28 16:58 EDT, Noriko Hosoi
no flags Details
cvs commit messages (2.39 KB, text/plain)
2008-04-28 20:40 EDT, Noriko Hosoi
no flags Details
valgrind output when the above testcases are run (7.01 KB, text/plain)
2008-04-29 15:43 EDT, Noriko Hosoi
no flags Details
cvs commit message (Directory71RtmBranch) (1.65 KB, text/plain)
2008-04-30 17:03 EDT, Noriko Hosoi
no flags Details
Patch result (1.43 KB, application/x-reject)
2008-05-06 13:35 EDT, Paulo Alberto
no flags Details

  None (edit)
Description Nathan Kinder 2006-02-23 13:36:32 EST
If you search for ("sn=abcdefg...500 characters here...*"), DS returns no
entries and the result was LDAP_SUCCESS (0) even when there are entries with sn
that have a long value that matches the substring.  An error is logged in the
errors log.

Analysis:
The substring filter code uses a BUFSIZ char array to hold the pattern it
constructs from the filter value.  For example, searching for ("sn=foo*oo*bar*")
will set the pattern buffer to "^foo.*ooo.*bar.*".  This buffer is then passed
to the old regular expression compilation function slapd_re_comp, which compiles
it into a finite automaton that it sticks in a static buffer.  Then it calls
slapd_re_exec with values found in the entries, for example,
slapd_re_exec("fooABCooDEFbarXYZ").  If this matches, the filter was true and
the entry is returned to the client.

The reason the pattern buffer is of fixed size is because the static buffer in
the old regex code is of fixed size.  Simply expanding our pattern buffer will
cause the regex buffer to clobber the stack.  Enhancing the old regex code to
use a dynamic buffer isn't trivial, it's a little tricky to figure out how much
space we will need.  For example, a pattern like "foo[a-z]bar" uses something
like 22 bytes: 2 for each regular character, 20 for the character class. 

Instead of using POSIX regex, below are updates to the old regex to make the old
global buffer a global malloced buffer.  As I iterate over the pattern I check
whether we're approaching the end and double its size when we're 20 bytes away
(which is how much the NFA might grow in one iteration).

I moved the slapd_re_init function to the end so the globals I need for
initializing the buffer will be declared.  They were cozy all around the file
and I didn't want to move them up top.  Maybe they should be moved to regex.h,
or maybe this stuff should just be scrapped for a more standard implementation
when we're bored sometime...

Index: slapd/regex.c
===================================================================
RCS file: /repository/source/ldap2/ldapserver/ldap/servers/slapd/regex.c,v
retrieving revision 1.1.1.1
diff -U8 -r1.1.1.1 regex.c
--- regex.c     2005/07/08 23:25:38     1.1.1.1
+++ regex.c     2006/02/07 02:43:03
@@ -33,41 +33,16 @@
 *
 * Copyright (C) 2001 Sun Microsystems, Inc. Used by permission.
 * Copyright (C) 2005 Red Hat, Inc.
 * All rights reserved.
 * END COPYRIGHT BLOCK **/
#include "slap.h"      /* must come before regex.h */
#include "portable.h"

-static PRLock  *regex_mutex = NULL;
-
-int
-slapd_re_init( void )
-{
-       if ( NULL == regex_mutex ) {
-               regex_mutex = PR_NewLock();
-       }
-       return( NULL == regex_mutex ? -1 : 0 );
-}
-
-void
-slapd_re_lock( void )
-{
-       PR_ASSERT( NULL != regex_mutex );
-       PR_Lock( regex_mutex );
-}
-
-int
-slapd_re_unlock( void )
-{
-       PR_ASSERT( NULL != regex_mutex );
-       return( PR_Unlock( regex_mutex ) );
-}
-
#if defined( MACOS ) || defined( DOS ) || defined( _WIN32 ) || defined(
NEED_BSDREGEX )
#include "regex.h"

/*
 * regex - Regular expression pattern matching  and replacement
 *
 * By:  Ozan S. Yigit (oz)
 *      Dept. of Computer Science
@@ -437,17 +412,18 @@
#define ASCIIB 0177

typedef unsigned char UCHAR;
/* char, on the other hand, may be signed or unsigned;
 * it's platform-dependent.  A hard fact of life, in C.
 */

static int  tagstk[MAXTAG];             /* subpat tag stack..*/
-static UCHAR nfa[MAXNFA];              /* automaton..       */
+static UCHAR *nfa = NULL;              /* automaton..       */
+static int  nfasize = MAXNFA;          /* tracks size of nfa buffer */
static int  sta = NOP;                 /* status of lastpat */

static UCHAR bittab[BITBLK];           /* bit table for CCL */
                                       /* pre-set bits...   */
static UCHAR bitarr[] = {1,2,4,8,16,32,64,128};

#ifdef DEBUG
static void nfadump( UCHAR *ap);
@@ -481,16 +457,31 @@
               if (sta)
                       return 0;
               else
                       return badpat("No previous regular expression");
        }
       sta = NOP;

       for (p = (UCHAR*)pat; *p; p++) {
+               /* Check if we are approaching end of nfa buffer. 20 bytes is
+                  the max we might add to the nfa per loop. */
+               if (mp - (UCHAR*)nfa + 20 >= nfasize) {
+                       /* Save offsets */
+                       long mppos = mp - nfa;
+                       long sppos = sp - nfa;
+
+                       /* Double the nfa buffer size */
+                       nfasize *= 2;
+                       nfa = (UCHAR*)slapi_ch_realloc((char*)nfa, nfasize);
+
+                       /* Restore pointers into realloced space */
+                       mp = nfa + mppos;
+                       sp = nfa + sppos;
+               }
               lp = mp;
               switch(*p) {

               case '.':               /* match any char..  */
                       store(ANY);
                       break;

               case '^':               /* match beginning.. */
@@ -1082,8 +1073,40 @@
               default:
                       printf("bad nfa. opcode %o\n", ap[-1]);
                       exit(1);
                       break;
               }
}
#endif
#endif /* MACOS or DOS or NEED_BSDREGEX */
+
+static PRLock  *regex_mutex = NULL;
+
+int
+slapd_re_init( void )
+{
+       if ( NULL == regex_mutex ) {
+               regex_mutex = PR_NewLock();
+       }
+
+       if ( NULL == nfa ) {
+               nfa = (UCHAR*)slapi_ch_malloc( MAXNFA );
+       }
+
+       return( NULL == regex_mutex ? -1 : 0 );
+}
+
+void
+slapd_re_lock( void )
+{
+       PR_ASSERT( NULL != regex_mutex );
+       PR_Lock( regex_mutex );
+}
+
+int
+slapd_re_unlock( void )
+{
+       PR_ASSERT( NULL != regex_mutex );
+       return( PR_Unlock( regex_mutex ) );
+}
+
+

Index: plugins/syntaxes/string.c
===================================================================
RCS file:
/repository/source/ldap2/ldapserver/ldap/servers/plugins/syntaxes/string.c,v
retrieving revision 1.1.1.1
diff -U8 -r1.1.1.1 string.c
--- string.c    2005/07/08 23:25:37     1.1.1.1
+++ string.c    2006/02/07 02:26:58
@@ -178,87 +178,96 @@

       return( rc );
}

int
string_filter_sub( Slapi_PBlock *pb, char *initial, char **any, char *final,
    Slapi_Value **bvals, int syntax )
{
-       int             i, j, rc;
-       char            *p, *end, *realval, *tmpbuf;
+       int             i, j, rc, size=0;
+       char            *p, *end, *realval, *tmpbuf, *bigpat = NULL;
       size_t          tmpbufsize;
       char            pat[BUFSIZ];
       char            buf[BUFSIZ];
       char            ebuf[BUFSIZ];

       LDAPDebug( LDAP_DEBUG_FILTER, "=> string_filter_sub\n",
           0, 0, 0 );
-
       /*
        * construct a regular expression corresponding to the
        * filter and let regex do the work for each value
        * XXX should do this once and save it somewhere XXX
        */
       pat[0] = '\0';
       p = pat;
       end = pat + sizeof(pat) - 2;    /* leave room for null */
+
       if ( initial != NULL ) {
-               value_normalize( initial, syntax, 1 /* trim leading blanks */ );
-               strcpy( p, "^" );
-               p = strchr( p, '\0' );
-               /* 2 * in case every char is special */
-               if ( p + 2 * strlen( initial ) > end ) {
-                       LDAPDebug( LDAP_DEBUG_ANY, "not enough pattern space\n",
-                           0, 0, 0 );
-                       return( -1 );
+               size = strlen( initial ) + 1; /* add 1 for "^" */
+       }
+
+       if ( any != NULL ) {
+               i = 0;
+               while( any[i]) {
+                       size += strlen(any[i++]) + 2; /* add 2 for ".*" */
               }
+       }
+
+       if ( final != NULL ) {
+                size += strlen( final ) + 3; /* add 3 for ".*" and "$" */
+       }
+
+       size *= 2; /* doubled in case all filter chars need escaping */
+       size++; /* add 1 for null */
+
+       if ( p + size > end ) {
+               bigpat = slapi_ch_malloc( size );
+               p = bigpat;
+       }
+
+       if ( initial != NULL ) {
+               value_normalize( initial, syntax, 1 /* trim leading blanks */ );
+               *p++ = '^';
               filter_strcpy_special( p, initial );
               p = strchr( p, '\0' );
       }
       if ( any != NULL ) {
               for ( i = 0; any[i] != NULL; i++ ) {
                       value_normalize( any[i], syntax, 0 /* DO NOT trim leading
blanks */ );
                       /* ".*" + value */
-                       if ( p + 2 * strlen( any[i] ) + 2 > end ) {
-                               LDAPDebug( LDAP_DEBUG_ANY,
-                                   "not enough pattern space\n", 0, 0, 0 );
-                               return( -1 );
-                       }
                       strcpy( p, ".*" );
-                       p = strchr( p, '\0' );
+                       *p++ = '.';
+                       *p++ = '*';
                       filter_strcpy_special( p, any[i] );
                       p = strchr( p, '\0' );
               }
       }
       if ( final != NULL ) {
               value_normalize( final, syntax, 0 /* DO NOT trim leading blanks */ );
-               /* ".*" + value */
-               if ( p + 2 * strlen( final ) + 2 > end ) {
-                       LDAPDebug( LDAP_DEBUG_ANY, "not enough pattern space\n",
-                           0, 0, 0 );
-                       return( -1 );
-               }
-               strcpy( p, ".*" );
-               p = strchr( p, '\0' );
+               *p++ = '.';
+               *p++ = '*';
               filter_strcpy_special( p, final );
-               p = strchr( p, '\0' );
-               strcpy( p, "$" );
+               strcat( p, "$" );
       }

       /* compile the regex */
+       p = (bigpat) ? bigpat : pat;
       slapd_re_lock();
-       if ( (p = slapd_re_comp( pat )) != 0 ) {
+       if ( (tmpbuf = slapd_re_comp( p )) != 0 ) {
               LDAPDebug( LDAP_DEBUG_ANY, "re_comp (%s) failed (%s)\n",
-                   pat, p, 0 );
+                   pat, tmpbuf, 0 );
               slapd_re_unlock();
-               return( -1 );
+               if( bigpat != NULL ) {
+                       slapi_ch_free((void**)&bigpat );
+               }
+               return( LDAP_OPERATIONS_ERROR );
       } else {
-               LDAPDebug( LDAP_DEBUG_TRACE, "re_comp (%s)\n",
-                                  escape_string( pat, ebuf ), 0, 0 );
+               LDAPDebug( LDAP_DEBUG_TRACE, "re_comp (%s)\n",
+                                  escape_string( p, ebuf ), 0, 0 );
       }

       /*
        * test the regex against each value
        */
       rc = -1;
       tmpbuf = NULL;
       tmpbufsize = 0;
@@ -289,16 +298,19 @@
                       rc = 0;
                       break;
               }
       }
       slapd_re_unlock();
       if ( tmpbuf != NULL ) {
               slapi_ch_free((void**)&tmpbuf );
       }
+       if( bigpat != NULL ) {
+               slapi_ch_free((void**)&bigpat );
+       }

       LDAPDebug( LDAP_DEBUG_FILTER, "<= string_filter_sub %d\n",
           rc, 0, 0 );
       return( rc );
}

int
string_values2keys( Slapi_PBlock *pb, Slapi_Value **bvals,
Comment 1 Ulf Weltman 2006-02-23 18:48:54 EST
Created attachment 125147 [details]
regex.c

Updated fix to regex.c, use ptrdiff_t to store the offsets to be restored after
the realloc, and use a constant for the value of "how much the NFA buffer can
grow in one iteration on the pattern".
Comment 2 Ulf Weltman 2006-02-23 18:51:29 EST
Created attachment 125148 [details]
string.c

Updated fix to string.c, used wrong pointer (pat instead of p) in a debug
message, and performed an unneeded strcat of ".*"
Comment 5 Noriko Hosoi 2008-04-28 16:50:21 EDT
*** Bug 443955 has been marked as a duplicate of this bug. ***
Comment 6 Noriko Hosoi 2008-04-28 16:56:59 EDT
Steps to reproduce the problem
1. script to create a test entry; add the entry to the server
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
echo dn: uid=tuser1, dc=example,dc=com
echo objectClass: top
echo objectClass: person
echo objectClass: organizationalPerson
echo objectClass: inetOrgPerson
echo cn: test user1
echo sn: user0123456789012345678901234567890123456789012345678901234567890123456789
i=0
while [ $i -lt 100 ]
do
echo "
01234567890123456789012345678901234567890123456789012345678901234567890123456789"
i=`expr $i + 1`
done
echo uid: tuser1
echo givenName: test
echo description: test user 0
echo userPassword: tuser1tuser1
echo mail: tuser1@example.com
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

2. scripts to generate search commands:
# 2-1 output: srch10.sh
t="0123456789"
T=$t$t$t$t$t$t$t$t$t$t
FILTER="$T$T$T$T$T$T*"
echo ldapsearch -T -p \$\1 -D \"cn=Directory Manager\" -w \$2 -b
\"dc=example,dc=com\" \"\(sn=*$t*\)\"
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
# 2-2 output: srch100.sh
t="0123456789"
T=$t$t$t$t$t$t$t$t$t$t
FILTER="$T$T$T$T$T$T$T$T$T$T*"
echo ldapsearch -T -p \$\1 -D \"cn=Directory Manager\" -w \$2 -b
\"dc=example,dc=com\" \"\(sn=*$t*\)\"
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
# 2-3 output: srchcomplex.sh
t="0123456789"
T=$t$t$t$t$t$t$t$t$t$t
FILTER="$T$T$T$T$T$T$T$T$T$T*"
echo ldapsearch -T -p \$\1 -D \"cn=Directory Manager\" -w \$2 -b
\"dc=example,dc=com\"
\"\(\&\(\|\(objectClass=inetorgperson\)\(objectClass=person\)\)\(\|\(sn=*$FILTER*\)\(mail=*bogus*\)\)\)\"

3. run the srch scripts.
Attaching the stacktraces of the server next.
Comment 7 Noriko Hosoi 2008-04-28 16:58:19 EDT
Created attachment 304042 [details]
stacktraces of the server when the srch scripts are executed
Comment 8 Noriko Hosoi 2008-04-28 17:00:47 EDT
Applied Ulf's patches found in comment #1 and #2, the test cases in the comment
#6 do not crash the server and the server returns the matched entry.
Comment 9 Noriko Hosoi 2008-04-28 20:40:22 EDT
Created attachment 304058 [details]
cvs commit messages

Checked in the Ulf's patches into HEAD as well as Directory_Server_8_0_Branch.
Comment 10 Noriko Hosoi 2008-04-29 15:43:09 EDT
Created attachment 304153 [details]
valgrind output when the above testcases are run

No memory corruption nor leak from regex are observed.
Comment 13 Noriko Hosoi 2008-04-30 17:03:25 EDT
Created attachment 304274 [details]
cvs commit message (Directory71RtmBranch)

The patches have been checked in with the patch inf files.
Comment 14 Rich Megginson 2008-05-05 17:15:28 EDT
Paulo, would you be able to build from source and try to reproduce your crash
with the proposed patch?
Comment 15 Paulo Alberto 2008-05-06 13:35:19 EDT
Created attachment 304665 [details]
Patch result
Comment 16 Paulo Alberto 2008-05-06 13:36:51 EDT
I don`t know if I`m doing something wrong, but the regex.c patch result was:

--------------------------
|Index: string.c
|===================================================================
|RCS file:
/repository/source/ldap2/ldapserver/ldap/servers/plugins/syntaxes/string.c,v
|retrieving revision 1.1.1.1
|diff -U8 -r1.1.1.1 string.c
|--- string.c   2005/07/08 23:25:37     1.1.1.1
|+++ string.c   2006/02/23 23:47:17
--------------------------
File to patch:
/rpm/sources/fedora-ds-base-1.1.0/ldap/servers/plugins/syntaxes/string.c
patching file
/rpm/sources/fedora-ds-base-1.1.0/ldap/servers/plugins/syntaxes/string.c
Hunk #1 succeeded at 184 (offset 6 lines).





--------------------------
|Index: regex.c
|===================================================================
|RCS file: /repository/source/ldap2/ldapserver/ldap/servers/slapd/regex.c,v
|retrieving revision 1.1.1.1
|diff -U8 -r1.1.1.1 regex.c
|--- regex.c    2005/07/08 23:25:38     1.1.1.1
|+++ regex.c    2006/02/08 21:39:11
--------------------------
File to patch: /rpm/sources/fedora-ds-base-1.1.0/ldap/servers/slapd/regex.c
patching file /rpm/sources/fedora-ds-base-1.1.0/ldap/servers/slapd/regex.c
Hunk #1 FAILED at 33.
Hunk #2 succeeded at 386 (offset 12 lines).
Hunk #4 succeeded at 475 (offset 12 lines).
1 out of 5 hunks FAILED -- saving rejects to file
/rpm/sources/fedora-ds-base-1.1.0/ldap/servers/slapd/regex.c.rej
Comment 23 errata-xmlrpc 2008-05-09 13:23:38 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2008-0268.html
Comment 24 Rich Megginson 2008-06-24 13:01:09 EDT
*** Bug 448900 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.