Bug 432408

Summary: Bug in replication agreement connections (SSL)
Product: [Retired] 389
Component: Directory Server
Version: 1.1.0
Hardware: i386
OS: Linux
Status: CLOSED WONTFIX
Severity: medium
Priority: low
Reporter: Carlos Barrales Ruiz <cbarrales>
Assignee: Rich Megginson <rmeggins>
QA Contact: Chandrasekar Kannan <ckannan>
CC: benl
Target Milestone: ---
Target Release: ---
Doc Type: Bug Fix
Last Closed: 2008-02-28 04:16:27 UTC

Description Carlos Barrales Ruiz 2008-02-11 20:28:44 UTC
Description of problem:
Replication initialization behaves strangely on at least two of our servers (so far).
The segfault happens randomly. The trace below is from a server running FDS 1.1 built from CVS HEAD
(as of the date of this post) and compiled with debug symbols.

Starting program: /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-grs
[Thread debugging using libthread_db enabled]
[New Thread -1208707392 (LWP 25645)]
warning: Lowest section in /usr/lib/libicudata.so.36 is .gnu.hash at 051120b4
[New LWP 25682]
[New LWP 25685]
Program received signal SIGSEGV, Segmentation fault.
[Switching to LWP 25685]
0x01af2429 in toupper_w (val=83) at lib/util_unistr.c:422
422	lib/util_unistr.c: No such file or directory.
	in lib/util_unistr.c
(gdb) bt
#0  0x01af2429 in toupper_w (val=83) at lib/util_unistr.c:422
#1  0x01af30f6 in toupper_ascii (c=83) at lib/util_unistr.c:1062
#2  0x01af1041 in StrCaseCmp (s=0x1b32861 "SO_KEEPALIVE", t=0xafffc634 "SO_BROADCAST") at lib/util_str.c:235
#3  0x01af120f in strequal (s1=0x1b3b1ec "??\v", s2=0x0) at lib/util_str.c:290
#4  0x01afc928 in set_socket_options (fd=163, options=0x1b19e5d "") at lib/util_sock.c:233
#5  0x01a976a7 in _nss_wins_gethostbyname_r (hostname=0xae316b8 "xxxxxxxxxxxxxxx.yyyyy.zzzzz.tt", he=0xafffcc38, buffer=0xafffc900 "\177", buflen=512, h_errnop=0xafffcc60) at nsswitch/wins.c:74
#6  0x01a9792c in _nss_wins_gethostbyname2_r (name=0xae316b8 "xxxxxxxxxxxxxxx.yyyyy.zzzzz.tt", af=2, he=0xafffcc38, buffer=0xafffc900 "\177", buflen=512, h_errnop=0xafffcc60) at nsswitch/wins.c:380
#7  0x003377cd in gaih_inet () from /lib/libc.so.6
#8  0x00338860 in getaddrinfo () from /lib/libc.so.6
#9  0x001e5e00 in PR_GetAddrInfoByName () from /usr/lib/libnspr4.so
#10 0x0027ed9e in prldap_socket_arg_alloc () from /usr/lib/libprldap60.so
#11 0x00defcf4 in ldap_start_tls_s () from /usr/lib/libssldap60.so
#12 0x00249dd7 in nsldapi_connect_to_host () from /usr/lib/libldap60.so
#13 0x0024d7a7 in nsldapi_new_connection () from /usr/lib/libldap60.so
#14 0x002482b5 in nsldapi_open_ldap_defconn () from /usr/lib/libldap60.so
#15 0x0024e78b in nsldapi_send_server_request () from /usr/lib/libldap60.so
#16 0x0024eb8e in nsldapi_send_initial_request () from /usr/lib/libldap60.so
#17 0x00253af1 in ldap_simple_bind () from /usr/lib/libldap60.so
#18 0x06ed1a61 in do_simple_bind (conn=0xb0202958, ld=0x0, binddn=0x1b3b1ec "??\v", password=0xae2db28 "xxxxxxxxxxxx") at ../dirsec/ldapserver/ldap/servers/plugins/replication/repl5_connection.c:1650
#19 0x06ed3f6b in bind_and_check_pwp (conn=0xb0202958, binddn=0xae2ecd0 "cn=Replication Manager,cn=config", password=0xae2db28 "xxxxxxxxxxxx") at ../dirsec/ldapserver/ldap/servers/plugins/replication/repl5_connection.c:1561
#20 0x06ed4311 in conn_connect (conn=0xb0202958) at ../dirsec/ldapserver/ldap/servers/plugins/replication/repl5_connection.c:1004
#21 0x06edc386 in acquire_replica (prp=0xb0202a78, prot_oid=0x6f05057 "2.16.840.1.113730.3.6.1", ruv=0xafffd378) at ../dirsec/ldapserver/ldap/servers/plugins/replication/repl5_protocol_util.c:168
#22 0x06ed5b67 in repl5_inc_run (prp=0xb0202a78) at ../dirsec/ldapserver/ldap/servers/plugins/replication/repl5_inc_protocol.c:796
#23 0x06edb881 in prot_thread_main (arg=0xb02028d0) at ../dirsec/ldapserver/ldap/servers/plugins/replication/repl5_protocol.c:313
#24 0x001f319d in PR_JoinThread () from /usr/lib/libnspr4.so
#25 0x003c845b in start_thread () from /lib/libpthread.so.0
#26 0x0035124e in clone () from /lib/libc.so.6
(gdb) quit

Additional info:
78 replication agreements, most of them against servers that are not responding yet.

Comment 1 Rich Megginson 2008-02-11 21:05:30 UTC
Are you using pam_ldap and nss_ldap?  Can you post your /etc/nsswitch.conf?

Comment 2 Carlos Barrales Ruiz 2008-02-11 22:20:27 UTC
Not on the server, only on the clients.

nss_ldap-253-5.el5 (Centos 5.1)

/etc/nsswitch.conf :
passwd:     files ldap
shadow:     files ldap
group:      files ldap
(No more LDAP related configurations)

/etc/ldap.conf: (libnss issues)
currently misconfigured

Samba Services are also LDAP-enabled.

Thank you.
Regards.

Comment 3 Carlos Barrales Ruiz 2008-02-12 10:25:59 UTC
Sorry, I did not realize that you were asking for my /etc/nsswitch.conf
because of the wins hostname resolution. Initially I thought that the problem
was caused by an overwritten string or a buffer overflow in FDS, but I was
unable to trace it.

We had in /etc/nsswitch.conf:
hosts:      files dns wins

Now, with wins removed:
hosts:  files dns

And the server seems to start OK every time.
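
For anyone checking other hosts for the same setup, here is a minimal sketch that reports whether wins is listed as a source on the hosts line of /etc/nsswitch.conf. The function names hosts_sources and uses_wins are made up for this illustration, and the parsing is deliberately simple (it drops comments and bracketed action items such as [NOTFOUND=return]); it is not a full nsswitch.conf parser.

```python
import re


def hosts_sources(nsswitch_text):
    """Return the lookup sources on the 'hosts:' line, in order.

    Hypothetical helper for illustration only.
    """
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if line.startswith("hosts:"):
            rest = line[len("hosts:"):]
            # Remove action items like [NOTFOUND=return] before splitting
            rest = re.sub(r"\[[^\]]*\]", " ", rest)
            return rest.split()
    return []


def uses_wins(path="/etc/nsswitch.conf"):
    """True if 'wins' is one of the sources for host name lookups."""
    with open(path) as f:
        return "wins" in hosts_sources(f.read())


if __name__ == "__main__":
    print("wins enabled for hosts:", uses_wins())
```

With the original line (hosts: files dns wins) this reports wins as enabled; after the change above it does not.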


Comment 4 Rich Megginson 2008-02-12 16:18:15 UTC
That's odd.  I wonder what the problem is with using wins in the directory server?

I notice on my RHEL5 system:
rpm -q --whatprovides /lib/libnss_wins.so.2 
samba-common-3.0.23c-2.el5.2.0.2

I haven't heard of any problems with wins/samba code in the same process space
as directory server code.  I know there is a problem with pam_ldap/nss_ldap
because it uses openldap client libraries, but I'm not aware of any problems
with wins.

Comment 5 Carlos Barrales Ruiz 2008-02-12 20:51:09 UTC
That's it.

wins resolution breaks our systems.

On our system:
# rpm -q --whatprovides /lib/libnss_wins.so.2
samba3-winbind-3.0.26-35

Steps to reproduce (to help in other similar cases):
* Enable wins for host name resolution in /etc/nsswitch.conf, e.g.:
hosts: files dns wins
* Set up a replication agreement against a hostname that resolves through neither DNS nor wins.
* Start/stop dirsrv several times, try to initialize the consumer, or just wait.

--
We would be grateful if you could recommend what exactly to do with this kind of issue that is not
directly related to dirsrv. Maybe we should post them to the fds-devel list?
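
To try the lookup path from the steps above outside of dirsrv, a rough sketch follows. It assumes Python is available on the affected host; socket.getaddrinfo goes through the C library's getaddrinfo(), so it uses whatever sources the hosts line in /etc/nsswitch.conf lists. The hostname no-such-consumer.invalid and the function name resolve_repeatedly are placeholders, not taken from our configuration. A clean run does not rule the problem out, since the crash in the backtrace happened inside a multi-threaded ns-slapd process.

```python
import socket


def resolve_repeatedly(hostname, attempts=50):
    """Resolve hostname `attempts` times; return how many lookups failed.

    Hypothetical helper: each call goes through the system resolver,
    i.e. the sources on the 'hosts:' line of /etc/nsswitch.conf.
    """
    failures = 0
    for _ in range(attempts):
        try:
            # 389 is the standard LDAP port; AF_INET matches af=2 in the trace
            socket.getaddrinfo(hostname, 389, family=socket.AF_INET,
                               proto=socket.IPPROTO_TCP)
        except socket.gaierror:
            failures += 1
    return failures


if __name__ == "__main__":
    host = "no-such-consumer.invalid"  # placeholder unresolvable name
    print(resolve_repeatedly(host), "failed lookups for", host)
```

If the interpreter itself segfaults during these lookups with wins enabled, that would point at the NSS module rather than at dirsrv.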



Comment 6 Rich Megginson 2008-02-12 21:03:46 UTC
(In reply to comment #5)
> That's it.
> 
> wins resolution breaks our systems.
> 
> On our system:
> # rpm -q --whatprovides /lib/libnss_wins.so.2
> samba3-winbind-3.0.26-35
> 
> Steps to reproduce (to help in other similar cases):
> * Enable wins for host name resolution in /etc/nsswitch.conf, e.g.:
> hosts: files dns wins
> * Set up a replication agreement against a hostname that resolves through neither DNS nor wins.
> * Start/stop dirsrv several times, try to initialize the consumer, or just wait.
> 
> --
> We would be grateful if you could recommend what exactly to do with this kind of issue that is not
> directly related to dirsrv. Maybe we should post them to the fds-devel list?

Probably just the fedora-directory-users list to let people know there is a
problem mixing wins with fedora ds.

Comment 7 Rich Megginson 2008-02-28 04:16:27 UTC
I don't think there is much we can do about this bug in the directory server itself.