Bug 676407 - dirsrv crash segfault in need_new_pw()
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: 389-ds-base
Version: 6.1
Hardware: x86_64 Linux
Priority: unspecified Severity: medium
Target Milestone: rc
Assigned To: Rich Megginson
QA Contact: Chandrasekar Kannan
: screened
Depends On: 675853
Blocks: 639035 389_1.2.8 676871
Reported: 2011-02-09 13:39 EST by Nathan Kinder
Modified: 2015-01-04 18:46 EST (History)
Fixed In Version: 389-ds-base-1.2.8-0.3.a3.el6
Doc Type: Bug Fix
Clone Of: 675853
Last Closed: 2011-05-19 08:41:21 EDT
Description Nathan Kinder 2011-02-09 13:39:41 EST
+++ This bug was initially created as a clone of Bug #675853 +++

I am running 389-ds-base-1.2.6.1-2 on RHEL 5.5:

rpm -qi 389-ds-base
Name        : 389-ds-base                  Relocations: (not relocatable)
Version     : 1.2.6.1                           Vendor: Fedora Project
Release     : 2.el5                         Build Date: Thu 30 Sep 2010 09:15:13 AM EST
Install Date: Mon 18 Oct 2010 03:45:22 PM EST      Build Host: x86-02.phx2.fedoraproject.org
Group       : System Environment/Daemons    Source RPM: 389-ds-base-1.2.6.1-2.el5.src.rpm
Size        : 5855143                          License: GPLv2 with exceptions
Signature   : DSA/SHA1, Fri 01 Oct 2010 01:56:46 AM EST, Key ID 119cc036217521f6
Packager    : Fedora Project
URL         : http://port389.org/
Summary     : 389 Directory Server (base)
Description :
389 Directory Server is an LDAPv3 compliant server.  The base package includes
the LDAP server and command line utilities for server administration.


$ rpm -qa | grep 389
389-console-1.1.4-1.el5
389-dsgw-1.1.5-1.el5
389-admin-1.1.11-1.el5
389-ds-console-doc-1.2.3-1.el5
389-ds-base-1.2.6.1-2.el5
389-ds-console-1.2.3-1.el5
389-ds-1.2.1-1.el5
389-admin-console-1.1.5-1.el5
389-ds-base-debuginfo-1.2.6.1-2.el5
389-admin-console-doc-1.1.5-1.el5

We have been getting crashes every couple of weeks. See the output from the core file below (the core is 800 MB):

(gdb) where
#0  slapi_sdn_get_ndn (sdn=0x0) at ldap/servers/slapd/dn.c:1933
#1  0x000000000042113d in need_new_pw (pb=0x2aaab13f0f70, t=0x52b7ce78, e=0x0, pwresponse_req=0) at ldap/servers/slapd/pw_mgmt.c:71
#2  0x000000000040ecbf in do_bind (pb=0x2aaab13f0f70) at ldap/servers/slapd/bind.c:745
#3  0x000000000041336d in connection_threadmain () at ldap/servers/slapd/connection.c:553
#4  0x00000033c5a284ad in ?? () from /usr/lib64/libnspr4.so
#5  0x00000033c1a0673d in start_thread () from /lib64/libpthread.so.0
#6  0x00000033c12d3f6d in clone () from /lib64/libc.so.

(gdb) print *pb
$1 = {pb_backend = 0x1996ec00, pb_conn = 0x2aaaab81f750, pb_op = 0x2aaab29ed580, pb_plugin = 0x1992fb10, pb_opreturn = 0, pb_object = 0x0,
  pb_destroy_fn = 0, pb_requestor_isroot = 0, pb_config_fname = 0x0, pb_config_lineno = 0, pb_config_argc = 0, pb_config_argv = 0x0,
  pb_target_entry = 0x0, pb_existing_dn_entry = 0x0, pb_existing_uniqueid_entry = 0x0, pb_parent_entry = 0x0, pb_newparent_entry = 0x0,
  pb_pre_op_entry = 0x0, pb_post_op_entry = 0x0, pb_seq_type = 0, pb_seq_attrname = 0x0, pb_seq_val = 0x0, pb_ldif_file = 0x0, pb_removedupvals = 0,
  pb_db2index_attrs = 0x0, pb_ldif2db_noattrindexes = 0, pb_ldif_printkey = 0, pb_instance_name = 0x0, pb_task = 0x0, pb_task_flags = 0,
  pb_mr_filter_match_fn = 0, pb_mr_filter_index_fn = 0, pb_mr_filter_reset_fn = 0, pb_mr_index_fn = 0, pb_mr_oid = 0x0, pb_mr_type = 0x0,
  pb_mr_value = 0x0, pb_mr_values = 0x0, pb_mr_keys = 0x0, pb_mr_filter_reusable = 0, pb_mr_query_operator = 0, pb_mr_usage = 0,
  pb_pwd_storage_scheme_user_passwd = 0x0, pb_pwd_storage_scheme_db_passwd = 0x0, pb_managedsait = 0, pb_internal_op_result = 0,
  pb_plugin_internal_search_op_entries = 0x0, pb_plugin_internal_search_op_referrals = 0x0, pb_plugin_identity = 0x0, pb_parent_txn = 0x0, pb_txn = 0x0,
  pb_dbsize = 0, pb_ldif_files = 0x0, pb_ldif_include = 0x0, pb_ldif_exclude = 0x0, pb_ldif_dump_replica = 0, pb_ldif_dump_uniqueid = 0,
  pb_ldif_generate_uniqueid = 0, pb_ldif_namespaceid = 0x0, pb_ldif_encrypt = 0, pb_operation_notes = 0, pb_slapd_argc = 0, pb_slapd_argv = 0x0,
  pb_slapd_configdir = 0x0, pb_ctrls_arg = 0x0, pb_dse_dont_add_write = 0, pb_dse_add_merge = 0, pb_dse_dont_check_dups = 0, pb_dse_is_primary_file = 0,
  pb_schema_flags = 0, pb_result_code = 0, pb_result_text = 0x0, pb_result_matched = 0x0, pb_nentries = 0, urls = 0x0, pb_import_entry = 0x0,
  pb_import_state = 0, pb_destroy_content = 0, pb_dse_reapply_mods = 0, pb_urp_naming_collision_dn = 0x0, pb_urp_tombstone_uniqueid = 0x0,
  pb_server_running = 0, pb_backend_count = 1, pb_pwpolicy_ctrl = 0, pb_vattr_context = 0x0, pb_substrlens = 0x0, pb_plugin_enabled = 0,
  pb_search_ctrls = 0x0, pb_mr_index_sv_fn = 0}

We are not sure how this is occurring, as we have turned off the ability for users to change their own passwords.

If you need any extra info from the core file let me know.

--- Additional comment from daniel.appleby@deakin.edu.au on 2011-02-07 17:54:46 EST ---

Created attachment 477520 [details]
thread apply all bt output

--- Additional comment from nkinder@redhat.com on 2011-02-07 19:24:51 EST ---

The need_new_pw() function is called when a BIND operation is processed to see if a password is expired.  This crash is not being triggered by a password change operation.  This function does not expect to be passed a NULL pointer for the "e" variable, which should contain the entry that one is trying to bind as.

In do_bind (frame 2 of thread 1), what do the "dn", "rawdn", and "sdn" variables contain?

Do you use the chaining feature of 389?  Do you do any binds that are something other than a simple bind, such as SASL or client certificate authentication?  Are you using the LDAPI autobind feature?

--- Additional comment from daniel.appleby@deakin.edu.au on 2011-02-07 19:45:14 EST ---

(gdb) down
#2  0x000000000040ecbf in do_bind (pb=0x2aaab13f0f70)
    at ldap/servers/slapd/bind.c:745
(gdb) print dn
$1 = 0x2aaaae4aeba0 "uid=qx\t,ou=People,dc=deakin,dc=edu,dc=au"
(gdb) print rawdn
$2 = 0x2aaaae4aeba0 "uid=qx\t,ou=People,dc=deakin,dc=edu,dc=au"
(gdb) print sdn
$3 = {flag = 6 '\006',
  dn = 0x2aaaae4aeba0 "uid=qx\t,ou=People,dc=deakin,dc=edu,dc=au",
  ndn = 0x2aaaaebeb9f0 "uid=qx\t,ou=people,dc=deakin,dc=edu,dc=au",
  ndn_len = 40}
(gdb)

We have turned off all the password policies as we don't use them.

Is the \t in the dn normal?

--- Additional comment from daniel.appleby@deakin.edu.au on 2011-02-07 21:50:39 EST ---

We don't use chaining. LDAPI and autobind are off:

nsslapd-ldapifilepath: /var/run/slapd-auth-f.socket
nsslapd-ldapilisten: off
nsslapd-ldapiautobind: off

We use simple binds; however, could someone trying to connect with SASL cause an issue? Is there a way to turn it off on the server?

--- Additional comment from nkinder@redhat.com on 2011-02-08 11:52:33 EST ---

(In reply to comment #3)
> 
> We have turned off all the password policies as we don't use them.

This code path will still be hit without password policies turned on.

> 
> Is the \t in the dn normal?

No, this is not normal and may be a part of the problem.  I will run some tests.

--- Additional comment from nkinder@redhat.com on 2011-02-08 13:30:53 EST ---

Do you have an entry in your database that looks similar to "uid=qx\t,ou=people,dc=deakin,dc=edu,dc=au"?  Do you know which client application is attempting to bind as this DN?

It is possible that the current versions of 389-ds-base do not have this problem as there have been many changes around DN normalization.  I would recommend that you use a more recent version of 389-ds-base.  The latest version in EPEL5 is 389-ds-base-1.2.7.5-1.

I will continue testing to see if I can reproduce the issue on current code.

--- Additional comment from nkinder@redhat.com on 2011-02-08 14:41:28 EST ---

I can reproduce this problem.  It is triggered by a bind with a valid DN and password except that the RDN has a tab at the end of it when binding.  For example, you can add a user like this:

    dn: uid=foo,dc=example,dc=com
    objectclass: posixaccount
    cn: foo
    userpassword: secret
    uidnumber: 500
    gidnumber: 500
    homedirectory: /home/foo

If you then bind as that user like this, ns-slapd will crash:

    ldapsearch -x -D "uid=foo\09,dc=example,dc=com" -w secret -b "dc=example,dc=com" "objectclass=*"

This appears to be related to the entryrdn changes, as it doesn't affect versions prior to supporting modrdn with new superior.

The problem is that the call to get_entry() in do_bind() fails to find the bind target entry, but the call to the backend bind function succeeds, which puts us on a path to call need_new_pw() with a NULL entry.  Versions prior to the entryrdn changes do not have the problem because the backend bind function fails, reporting that the bind target doesn't exist.  At that point, do_bind() short-circuits and returns an error to the client.

--- Additional comment from nkinder@redhat.com on 2011-02-08 15:02:20 EST ---

The find_entry() function in the backend code does find the entry yet get_entry() in the frontend code fails to find the entry.  These functions differ in the way they locate the bind target entry.

The get_entry() function uses slapi_search_internal_get_entry() to locate the entry.

The find_entry() function uses dn2entry().  This ends up using entryrdn_index_read() to find the entry by consulting the entryrdn index.  This in turn calls slapi_rdn_init_all_sdn() to break the DN into RDNs so it can use the RDN to consult the index.  This ends up removing the tab character, so the left-most RDN is "uid=foo" instead of "uid=foo\t".  Since this truncated RDN does indeed exist, the index entry is found which results in fetching the entry from the database allowing the bind to succeed.
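
The mismatch between the two lookups can be sketched with a toy model. This is only an illustration in Python, not the server code: the dict, the naive comma splitting, and the function names frontend_get_entry(), backend_find_entry() and the simplified do_bind() are all assumptions made for the example.

```python
# Toy model only: a dict stands in for the database, and naive comma
# splitting stands in for real DN parsing (escaped commas are not handled).
DIRECTORY = {"uid=foo,dc=example,dc=com": {"userpassword": "secret"}}


def frontend_get_entry(bind_dn):
    """Stand-in for get_entry(): look the DN up exactly as the client sent it."""
    return DIRECTORY.get(bind_dn.lower())


def backend_find_entry(bind_dn):
    """Stand-in for find_entry(): split into RDNs and trim trailing
    whitespace (tabs included) from each RDN before the lookup."""
    rdns = [rdn.rstrip() for rdn in bind_dn.lower().split(",")]
    return DIRECTORY.get(",".join(rdns))


def do_bind(bind_dn, password):
    """Return a string describing which path the toy bind takes."""
    entry = frontend_get_entry(bind_dn)
    backend_entry = backend_find_entry(bind_dn)
    backend_ok = (backend_entry is not None
                  and backend_entry["userpassword"] == password)
    if not backend_ok:
        return "error returned to client"
    if entry is None:
        # The state that leads to need_new_pw() being called with e == NULL.
        return "bind succeeded with no frontend entry"
    return "bind succeeded"


print(do_bind("uid=foo\t,dc=example,dc=com", "secret"))
```

With the trailing tab, the toy backend lookup matches "uid=foo" while the toy frontend lookup returns nothing, which is the combination described above.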

--- Additional comment from nkinder@redhat.com on 2011-02-08 16:32:05 EST ---

The root cause of the problem is that slapi_ldap_explode_dn() is truncating the trailing tab character off of the RDN.  When ns-slapd is built against MozLDAP, we use the MozLDAP ldap_explode_dn() function, which trims the tab character.  When ns-slapd is built against OpenLDAP, we use our own mozldap_ldap_explode_dn() function which mimics the MozLDAP function.  We do this because OpenLDAP has no equivalent function available.

Both the ldap_explode_dn() and mozldap_ldap_explode_dn() functions use the ldap_utf8isspace() function to decide which trailing characters to trim.  This function considers tabs (along with other whitespace characters) to be a space in both the MozLDAP and 389 versions of the function.  This function is supposed to mimic the isspace() call, so I believe that it is behaving correctly.  I think we should not be using this function to trim trailing space (0x20) characters when exploding a DN.
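
The difference between isspace()-style trimming and trimming only the space character can be shown in a few lines. This is a Python sketch, not the attached patch; the helper names are hypothetical, and the character set assumes what C isspace() accepts in the default "C" locale.

```python
# Characters that C isspace() reports as whitespace in the "C" locale.
ISSPACE_CHARS = " \t\n\v\f\r"


def trim_like_isspace(rdn):
    """Trim every trailing whitespace character, tabs included."""
    return rdn.rstrip(ISSPACE_CHARS)


def trim_spaces_only(rdn):
    """Trim only trailing space (0x20) characters."""
    return rdn.rstrip(" ")


rdn_with_tab = "uid=foo\t"
print(repr(trim_like_isspace(rdn_with_tab)))  # tab is removed
print(repr(trim_spaces_only(rdn_with_tab)))   # tab is preserved
```

With the first helper the RDN collapses to an existing value; with the second it stays distinct, so a lookup would fail as it does against OpenLDAP in the test described in the next paragraph.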

As a compatibility test with OpenLDAP server, a bind using a trailing tab on the left-most RDN fails as if the user does not exist.  I believe that this is the correct behavior.

--- Additional comment from nkinder@redhat.com on 2011-02-08 17:15:52 EST ---

Created attachment 477699 [details]
Patch

--- Additional comment from daniel.appleby@deakin.edu.au on 2011-02-08 17:41:26 EST ---

Thanks Nathan.

So I have been unable to track which application is sending the tab in the bind. Do you know if the src ip of the connection is stored anywhere that I can access via the core file? maybe in pb variable somewhere?

--- Additional comment from nkinder@redhat.com on 2011-02-08 18:49:07 EST ---

(In reply to comment #11)
> Thanks Nathan.
> 
> So I have been unable to track which application is sending the tab in the
> bind. Do you know if the src ip of the connection is stored anywhere that I can
> access via the core file? maybe in pb variable somewhere?

The address is in pb->pb_conn->cin_addr, but it's not in a readable format.  It's stored as an NSPR PRNetAddr.  I'm not sure if there's an easy way to convert it within gdb, so it would require dumping memory from the core and trying to recreate the PRNetAddr in another test program.  You could then have that test program call PR_NetAddrToString() to get back a human-readable address.

--- Additional comment from nkinder@redhat.com on 2011-02-08 19:14:53 EST ---

Created attachment 477718 [details]
Address conversion program

This is a simple program which can be used to convert a dumped PRNetAddr to a human-readable network address string.

The source (netaddr.c) needs to be modified to make the "addr" array contain the dumped PRNetAddr that you want to convert.  To dump the PRNetAddr from your ns-slapd core, run the following command in gdb inside of the do_bind stack frame (be sure to dump the number of bytes returned by the call to sizeof on your system):

(gdb) call sizeof(PRNetAddr)
$20 = 112
(gdb) x/112x pb->pb_conn->cin_addr

At this point you will get output like the following from gdb:

0x1d22070:	0x0a	0x00	0xce	0x52	0x00	0x00	0x00	0x00
0x1d22078:	0x00	0x00	0x00	0x00	0x00	0x00	0x00	0x00
0x1d22080:	0x00	0x00	0xff	0xff	0x0a	0x0e	0x36	0x8c
0x1d22088:	0x00	0x00	0x00	0x00	0x00	0x00	0x00	0x00
0x1d22090:	0x00	0x00	0x00	0x00	0x00	0x00	0x00	0x00
0x1d22098:	0x00	0x00	0x00	0x00	0x00	0x00	0x00	0x00
0x1d220a0:	0x00	0x00	0x00	0x00	0x00	0x00	0x00	0x00
0x1d220a8:	0x00	0x00	0x00	0x00	0x00	0x00	0x00	0x00
0x1d220b0:	0x00	0x00	0x00	0x00	0x00	0x00	0x00	0x00
0x1d220b8:	0x00	0x00	0x00	0x00	0x00	0x00	0x00	0x00
0x1d220c0:	0x00	0x00	0x00	0x00	0x00	0x00	0x00	0x00

Eliminate the addresses (left hand column) and copy the bytes into the initialization of the "addr" array in the netaddr.c source.  The bytes will need to be separated by commas as you will see in the example address in the source code.

To build netaddr, run the build.sh script.  You will need to have the nspr-devel package installed.  Once the program is built, simply run netaddr and it will print out the address in a readable format like this:

[nkinder@localhost netaddr]$ ./netaddr 
Address is ::ffff:10.14.54.140.
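
As a cross-check that does not need NSPR, the example dump can also be decoded by hand. The Python sketch below is not the attached netaddr.c. It assumes the IPv6 form of PRNetAddr is laid out like a Linux sockaddr_in6 (2-byte family in host order, 2-byte port in network order, 4-byte flowinfo, 16-byte address) and that the host is little-endian.

```python
import ipaddress
import struct

# First 24 bytes of the example gdb dump above; the remaining bytes are zero.
DUMP = bytes([
    0x0a, 0x00, 0xce, 0x52, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0xff, 0xff, 0x0a, 0x0e, 0x36, 0x8c,
])


def decode_ipv6_netaddr(raw):
    """Decode family, port and address under the assumed layout."""
    (family,) = struct.unpack_from("<H", raw, 0)  # host order, little-endian assumed
    (port,) = struct.unpack_from(">H", raw, 2)    # network byte order
    address = ipaddress.IPv6Address(raw[8:24])    # bytes 4-7 are flowinfo
    return family, port, address


family, port, address = decode_ipv6_netaddr(DUMP)
# ipv4_mapped is the embedded IPv4 address for ::ffff:a.b.c.d, otherwise None.
print(family, port, address.ipv4_mapped)
```

Under those assumptions the embedded IPv4 address is 10.14.54.140, which agrees with the netaddr output shown above.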

--- Additional comment from daniel.appleby@deakin.edu.au on 2011-02-08 20:38:35 EST ---

This is what I get from gdb:

(gdb) call sizeof(PRNetAddr)
$1 = 112
(gdb) x/112x pb->pb_conn->cin_addr
0x2aaabd9b8fb0: 0x1b6c000a      0x00000000      0x00000000      0x00000000
0x2aaabd9b8fc0: 0xffff0000      0xf0a7b880      0x00000000      0x00000000
0x2aaabd9b8fd0: 0x00000000      0x00000000      0x00000000      0x00000000
0x2aaabd9b8fe0: 0x00000000      0x00000000      0x00000000      0x00000000
0x2aaabd9b8ff0: 0x00000000      0x00000000      0x00000000      0x00000000
0x2aaabd9b9000: 0x00000000      0x00000000      0x00000000      0x00000000
0x2aaabd9b9010: 0x00000000      0x00000000      0x00000000      0x00000000
0x2aaabd9b9020: 0x00000080      0x00000000      0x00000035      0x00000000
0x2aaabd9b9030: 0x31313032      0x39313130      0x35313530      0x005a3235
0x2aaabd9b9040: 0xbc27e8c0      0x00002aaa      0xbc4a6360      0x00002aaa
0x2aaabd9b9050: 0x00000030      0x00000000      0x00000035      0x00000000
0x2aaabd9b9060: 0xbc76d5d0      0x00002aaa      0x00000004      0x00000000
0x2aaabd9b9070: 0xb43009c0      0x00002aaa      0xb5121490      0x00002aaa
0x2aaabd9b9080: 0x00000000      0x00000000      0x00000025      0x00000000
0x2aaabd9b9090: 0x72746e65      0x00646979      0x5470756f      0x6c706972
0x2aaabd9b90a0: 0x3d630065      0x00007561      0x00000045      0x00000000
0x2aaabd9b90b0: 0x31642d66      0x7374692d      0x6d73632d      0x3736312d
0x2aaabd9b90c0: 0x74616e2d      0x6c6f6f70      0x74656e2e      0x6165642e
0x2aaabd9b90d0: 0x2e6e696b      0x2e756465      0x00007561      0x00000016
0x2aaabd9b90e0: 0x00000017      0x00000018      0x00000025      0x00000000
0x2aaabd9b90f0: 0x706d6973      0x0000656c      0x00000000      0x00000000
0x2aaabd9b9100: 0x00000020      0x00000000      0x00000045      0x00000000
0x2aaabd9b9110: 0x3d646975      0x7365706d      0x6f2c616b      0x65503d75
0x2aaabd9b9120: 0x656c706f      0x3d63642c      0x6b616564      0x642c6e69
0x2aaabd9b9130: 0x64653d63      0x63642c75      0x0075613d      0x00000000
0x2aaabd9b9140: 0x00000000      0x00000000      0x00000055      0x00000000
0x2aaabd9b9150: 0x00000001      0x00000000      0xb09fd220      0x00002aaa
0x2aaabd9b9160: 0xaf4fc350      0x00002aaa      0x00005290      0x00000000

Placing the data into the program I get:

$ ./netaddr
Address is ::.

--- Additional comment from nkinder@redhat.com on 2011-02-09 13:24:21 EST ---

(In reply to comment #14)
> This is what I get from gdb:
> 
> (gdb) call sizeof(PRNetAddr)
> $1 = 112
> (gdb) x/112x pb->pb_conn->cin_addr
> 0x2aaabd9b8fb0: 0x1b6c000a      0x00000000      0x00000000      0x00000000

This data is in the wrong format.  It turns out that gdb remembers the last size and format used for the examine (x) command, and I had used "xb" previously.  Try using x/112xb to get the data listed in single bytes.
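
If re-running gdb is inconvenient, the word-sized output already posted can be converted back to memory-order bytes. The Python sketch below is only an illustration: it assumes a little-endian host, so each 32-bit word is stored least significant byte first, and it assumes the address occupies bytes 8 through 23 of the structure.

```python
import ipaddress
import struct

# First six 32-bit words of the x/112x output quoted above.
WORDS = [0x1b6c000a, 0x00000000, 0x00000000, 0x00000000,
         0xffff0000, 0xf0a7b880]


def words_to_bytes(words):
    """Re-serialize little-endian 32-bit words into memory-order bytes."""
    return struct.pack("<%dI" % len(words), *words)


raw = words_to_bytes(WORDS)
# Print the first row in the byte-per-column style that x/112xb produces.
print(" ".join("0x%02x" % b for b in raw[:8]))
address = ipaddress.IPv6Address(raw[8:24])
print(address.ipv4_mapped)
```

The first row comes out as 0x0a 0x00 0x6c 0x1b followed by four zero bytes, the same shape as the byte dump in the earlier example, and the remaining bytes can be pasted into netaddr.c in the same way.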

--- Additional comment from nkinder@redhat.com on 2011-02-09 13:33:06 EST ---

Pushed patch to master.  Thanks to Noriko for her review!

Counting objects: 11, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (6/6), done.
Writing objects: 100% (6/6), 1.18 KiB, done.
Total 6 (delta 4), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
   10f6c0e..30cb812  master -> master
Comment 1 Nathan Kinder 2011-02-09 13:42:02 EST
The patch for this has been pushed to the 389-ds-base-1.2.8 branch.

Counting objects: 11, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (6/6), done.
Writing objects: 100% (6/6), 1.19 KiB, done.
Total 6 (delta 4), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
   e87f581..f0e39fd  128-local -> 389-ds-base-1.2.8
Comment 3 Chandrasekar Kannan 2011-04-17 21:22:44 EDT

works ok now. verified with 389-ds-base-1.2.8.2-1.el6.i686

[root@ds90-rhel6-32vm ~]# ldapsearch -x -D "uid=foo,dc=idm,dc=lab,dc=bos,dc=redhat,dc=com" -w secret123 -b "dc=idm,dc=lab,dc=bos,dc=redhat,dc=com" "objectclass=*" | grep ^dn | wc -l
210
[root@ds90-rhel6-32vm ~]# 
[root@ds90-rhel6-32vm ~]# 
[root@ds90-rhel6-32vm ~]# 
[root@ds90-rhel6-32vm ~]# ldapsearch -x -D "uid=foo\09,dc=idm,dc=lab,dc=bos,dc=redhat,dc=com" -w secret123 -b "dc=idm,dc=lab,dc=bos,dc=redhat,dc=com" "objectclass=*" | grep ^dn | wc -l
ldap_bind: No such object (32)
	matched DN: dc=idm,dc=lab,dc=bos,dc=redhat,dc=com
0
Comment 4 errata-xmlrpc 2011-05-19 08:41:21 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-0533.html
