With the directory server built with DEBUG to enable PR_ASSERT, the following assertion is triggered: void attr_syntax_free( struct asyntaxinfo *a ) { PR_ASSERT( a->asi_refcnt == 0 ); I triggered this by adding the sudo schema via ldapmodify - it was already present in the schema directory, so I was adding a duplicate. The refcnt was -2.
Created attachment 329387 [details] CVS Diffs We were calling attr_syntax_delete() on a syntax that we fetched using attr_syntax_get_by_name_locking_optional(). You are supposed to use attr_syntax_return() when you are finished with a syntax that you fetch in this way, which will take care of freeing it if it is required. There was already code that was doing the proper thing, so the call to attr_syntax_delete() can simply be removed.
Checked into ldapserver (HEAD). Thanks to Rich for his review! Checking in ldap/servers/slapd/attrsyntax.c; /cvs/dirsec/ldapserver/ldap/servers/slapd/attrsyntax.c,v <-- attrsyntax.c new revision: 1.9; previous revision: 1.8 done
There are other areas in the code where we trigger this same assertion. There look to be some inconsistencies in the usage of the attribute syntax API that are triggering these assertions. When we look up an attribute syntax from the global oid and name hashtables, we increment a reference count for the returned struct. When the caller is finished, they are supposed to release the struct by calling attr_syntax_return(). The problem that we have is that there are locking optional versions of the lookup functions that allow the caller to skip incrementing the reference count. Despite this, there is no associated way of releasing the returned syntax structs that bypasses decrementing the reference count. The solution is to not allow the increment of the reference count to be skipped. Whenever a syntax struct is fetched from the global hashtable, we should increment the reference count to prevent it from being free'd by another thread while we are still using it. We should always release the syntax struct with the appropriate return function that decrements the reference count. We should just refactor the functions that allow one to skip the reference count increment to take this ability away.
Created attachment 329503 [details] Additional CVS Diffs This patch refactors a handful of functions to remove a parameter that allows the reference count increment to be skipped. In addition, I also scanned through all of the server code to ensure that anytime we lookup a syntax from the global hashtables, we use the proper return function to release it.
Created attachment 329518 [details] Revised CVS Diffs This modifies the last patch by reverting the first patch attached to this bug. As described in the description of the last patch, the reference count was not being handled correctly, which was resulting in the value being negative. This was the real cause of the original assertion, not the fact that we were flagging the struct for deletion. We indeed want to flag the struct for deletion in the code from the first patch, since we are overriding the definition with a new one.
Checked into ldapserver (HEAD). Thanks to Rich for his reviews! Checking in ldap/servers/slapd/attr.c; /cvs/dirsec/ldapserver/ldap/servers/slapd/attr.c,v <-- attr.c new revision: 1.11; previous revision: 1.10 done Checking in ldap/servers/slapd/attrlist.c; /cvs/dirsec/ldapserver/ldap/servers/slapd/attrlist.c,v <-- attrlist.c new revision: 1.8; previous revision: 1.7 done Checking in ldap/servers/slapd/attrsyntax.c; /cvs/dirsec/ldapserver/ldap/servers/slapd/attrsyntax.c,v <-- attrsyntax.c new revision: 1.10; previous revision: 1.9 done Checking in ldap/servers/slapd/entry.c; /cvs/dirsec/ldapserver/ldap/servers/slapd/entry.c,v <-- entry.c new revision: 1.21; previous revision: 1.20 done Checking in ldap/servers/slapd/proto-slap.h; /cvs/dirsec/ldapserver/ldap/servers/slapd/proto-slap.h,v <-- proto-slap.h new revision: 1.45; previous revision: 1.44 done Checking in ldap/servers/slapd/pw.c; /cvs/dirsec/ldapserver/ldap/servers/slapd/pw.c,v <-- pw.c new revision: 1.22; previous revision: 1.21 done Checking in ldap/servers/slapd/schema.c; /cvs/dirsec/ldapserver/ldap/servers/slapd/schema.c,v <-- schema.c new revision: 1.20; previous revision: 1.19 done Checking in ldap/servers/slapd/slapi-private.h; /cvs/dirsec/ldapserver/ldap/servers/slapd/slapi-private.h,v <-- slapi-private.h new revision: 1.34; previous revision: 1.33 done
*** Bug 480262 has been marked as a duplicate of this bug. ***
Can you please add steps to verify - or would successful scal_02 run be sufficient in verifying? Thanks
Thia can't be verified with optimized bits, as assertions only happen with debug builds. I think it's safe to close this.
Scal_02 was run with optimized bits, and with debug bits without problems. A report of a successful run is here: https://wiki.idm.lab.bos.redhat.com/dirsec/archives-mp1/archives/ds/81/reliability/scal_02/03251426.html
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on therefore solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHEA-2009-0455.html