When processing certain memberOf related operations, a missing member entry can cause the memberOf plug-in to infinitely recurse, causing ns-slapd to crash when it runs out of stack space. The problem happens under heavy load, but it is very time-dependent, so it's not easy to reproduce. The problem can be reproduced by setting up some inconsistent data with the memberOf plug-in disabled, then enabling the plug-in to trigger the crash. Here are the reproduction steps: - Disable the memberOf plug-in and restart ns-slapd. - Add the following two entries: dn: cn=group1,dc=example,dc=com objectClass: groupOfNames objectClass: top objectClass: inetUser cn: group1 member: cn=group2,dc=example,dc=com member: uid=user1,dc=example,dc=com dn: cn=group2,dc=example,dc=com objectClass: groupOfNames objectClass: top objectClass: inetUser cn: group2 memberOf: cn=group1,dc=example,dc=com member: uid=user1,dc=example,dc=com - Enable the memberOf plug-in and restart ns-slapd. - Delete the "cn=group1,dc=example,dc=com" entry. This will cause ns-slapd to crash. The way this issue happens under normal operation is that the thread doing the deletion gets the memberOf operation lock, but another thread deletes the "uid=user1,dc=example,dc=com" entry before the thread holding the memberOf lock does any memberOf processing. This will result in the same data as used in the reproduction steps above.
There is a section of code in the memberof_modop_one_replace_r() function that is only triggered for a delete modify operation. This code attempts to remove a "memberOf" value from an entry that is listed in the "member" attribute of the group entry that is passed into the function. When the entry referred to by the "member" attribute doesn't exist, we have to do some extra work to avoid the possibility of missing some indirect member entries that may have inherited membership through this missing entry. When we detect this scenario, we call memberof_test_membership(), which is supposed to fixup all membership for entries that refer to the group DN that you pass to it. The problem is that we are calling this function with the original group DN. This causes us to get right back in the scenario where we are trying to remove our group DN from the memberOf attribute of the missing entry. This is what loops over and over. The correct thing to do in the situation where we encounter a missing member is to first try to determine if the missing member is a user or a group. The entry is gone, so we can't check exactly what the entry is. What we can do is check if any other entries refer to the missing entry in their "memberOf" attribute. If we don't find any of these entries, we can safely assume that the missing entry is a user (or a group with no members, which we treat the same as a user in this case). For a missing user entry, we can just ignore it. Since it no longer exists, there's no reason to worry about removing it's "memberOf" value. If there are other entries whose "memberOf" attribute refers to the missing entry, we can assume that it's a group. In this case, we need to fix up the membership for all of the entries that refer to this missing group. We can use the memberof_test_membership() function to do the fixup work here by passing the DN of the missing entry as the group DN. This fixes up the dangling entries properly without causing any sort of recursive looping.
Created attachment 310173 [details] CVS Diffs I've been testing with this fix, and so far so good under stress conditions. The fix follows the logic from the previous bug comment. In addition to what is described above, an extra check is needed in memberof_mod_attr_list_r(). We need to avoid trying to do memberOf operations where both the memberOf value and the entry to be modified are one and the same. The lack of this check causes a nearly identical infinite recursion.
Checked into ldapserver (HEAD). Thanks to Rich for his review! Checking in ldap/servers/plugins/memberof/memberof.c; /cvs/dirsec/ldapserver/ldap/servers/plugins/memberof/memberof.c,v <-- memberof.c new revision: 1.10; previous revision: 1.9 done
master: 2301f606528e731c831d02a58a0d4914dbb662ba second patch to fix some comment formatting issues: master: 72a3114a01d0c67cf3b8faf7b28da93e8a6f2de3
QA verification: pass. bug close Same test run by following Nathan's notes. After the test,the ipa server still functionally in good health. I have performed ipa-adduser, ipa-addgroup and ipa-modgroup, they all give right response. test steps: 1. stop dirsrv 2. disable memberof plugin, (by editing dse.ldif), then start dirsrv 3. create u101 with ipa-adduser 4. create grp001, grp002 with ldapmodify 5. stop dirsrv 6. enable memberof plugin, then start dirsrv 7. delete grp001 with ldapmodify -> verify if dirsrv still alive test data used below: cat < messitup.ldif dn: cn=grp001,cn=groups,cn=accounts,dc=ipaqa,dc=com objectClass: groupOfNames objectClass: top objectClass: inetUser cn: grp001 member: cn=grp002,cn=groups,cn=accounts,dc=ipaqa,dc=com member: uid=u101,cn=users,cn=accounts,dc=ipaqa,dc=com dn: cn=grp002,cn=groups,cn=accounts,dc=ipaqa,dc=com objectClass: groupOfNames objectClass: top objectClass: inetUser cn: grp002 memberOf: cn=grp001,cn=groups,cn=accounts,dc=ipaqa,dc=com member: uid=u101,cn=users,cn=accounts,dc=ipaqa,dc=com
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on therefore solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2008-0643.html