+++ This bug was initially created as a clone of Bug #154314 +++
Description of problem:
This is already the second time I've heard of nss_ldap replying with wrong
results and causing people's mail to be shown to the wrong people:
http://www.dovecot.org/list/dovecot/2005-March/006345.html
http://www.dovecot.org/list/dovecot/2005-April/006859.html
Something should really be done about this. At the very least I'm adding a check
to make sure getpwnam() returns the same user name that was requested, and
if it doesn't, to log some loud warnings about something being broken.
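
A minimal sketch of what such a check could look like. This is not dovecot's
actual code; the helper names passwd_entry_matches() and lookup_user_checked()
and the warning text are assumptions made for illustration only.

```c
#include <pwd.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: returns 1 if the passwd entry belongs to the
 * user name that was requested, 0 otherwise. */
int passwd_entry_matches(const char *requested, const struct passwd *pw)
{
    if (requested == NULL || pw == NULL || pw->pw_name == NULL)
        return 0;
    return strcmp(requested, pw->pw_name) == 0;
}

/* Hypothetical wrapper: look up a user and refuse an entry that was
 * returned for a different name than the one asked for. */
const struct passwd *lookup_user_checked(const char *requested)
{
    const struct passwd *pw = getpwnam(requested);

    if (pw == NULL)
        return NULL;                      /* unknown user or lookup error */
    if (!passwd_entry_matches(requested, pw)) {
        fprintf(stderr,
                "getpwnam(%s) returned an entry for %s; "
                "the NSS backend looks broken\n",
                requested, pw->pw_name);
        return NULL;                      /* never hand out the wrong user */
    }
    return pw;
}
```

A check like this only detects a mismatched reply after the fact; it does not
fix whatever is causing it.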
-- Additional comment from tjanouse on 2007-02-06 11:08 EST --
Oh yeah, got it.
The problem lies in the fact that if an application is linked against
pthread, makes an nss_ldap call and then forks, both processes share the LDAP
connection with no locking mechanism, and bad things happen. This is the
case with dovecot: dovecot-auth is linked with pthread and uses pam in
forked processes. A race condition causes dovecot-auth to receive a reply that
should have gone to pam.
The direct cause is the assumption that __pthread_once being non-null
(ldap-nss.c:1048) implies __pthread_atfork being non-null (ldap-nss.c:504,
bits/libc-lock.h:290), which is plain wrong. The two variables have no
connection to each other; each becomes non-null when the linker resolves it,
which happens when it is first called. And it has to be called from the
object that checks for it, so calling pthread_atfork in dovecot has no
effect on the value of __pthread_atfork in nss_ldap, and vice versa.
For some not-so-obvious reason, __pthread_once is non-null on entry to main()
while __pthread_atfork is null. This means that nss_ldap assumes pthreads is
working and calls __libc_atfork (ldap-nss.c:504), which is a no-op in this
case, so it never finds out about the forks.
The easiest solution would be to help nss_ldap's configure find pthreads
(by telling it to use -lpthread), which would make nss_ldap use pthreads
directly and avoid such crazy things. Using those libc internal functions is
bad anyway, but I'm not sure whether we should do it.
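
For illustration, using pthreads directly could look roughly like the sketch
below. It is not nss_ldap's code: the connection_valid flag and the function
names are hypothetical. Only pthread_atfork(), fork() and waitpid() are real
calls. The child handler marks the inherited connection as unusable, so the
child would have to open its own.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical flag standing in for "we hold a usable LDAP connection". */
static int connection_valid = 0;

/* Runs in the child right after fork(): forget the inherited connection. */
static void invalidate_in_child(void)
{
    connection_valid = 0;
}

/* Register the child handler; returns 0 on success like pthread_atfork(). */
int install_fork_handler(void)
{
    return pthread_atfork(NULL, NULL, invalidate_in_child);
}

/* Demo: fork once and report what the child saw.
 * Returns 1 if the child saw the flag cleared, 0 if not, -1 on error. */
int child_saw_invalidation(void)
{
    int status;
    pid_t pid;

    connection_valid = 1;                 /* pretend the parent connected */
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(connection_valid ? 1 : 0);  /* child: exit 0 if flag cleared */
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

Depending on the libc version, this may need to be linked with -lpthread,
which is exactly the dependency discussed above.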
Alternatively, we could fix it to check for both __pthread_once and
__pthread_atfork, but it would not find both of them and would fall back to the
pid-comparing method, which is probably slower.
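
The pid-comparing idea can be sketched as follows, assuming a hypothetical
conn_state structure (nss_ldap's real session structure is different): remember
the pid that opened the connection and compare it with getpid() before each
use. The extra getpid() call on every lookup is the likely source of the
slowdown.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical connection state, for illustration only. */
struct conn_state {
    pid_t owner_pid;   /* pid of the process that opened the connection */
    int   connected;   /* nonzero while the connection may be used */
};

/* Record the current process as the owner of a freshly opened connection. */
void conn_mark_open(struct conn_state *s)
{
    s->owner_pid = getpid();
    s->connected = 1;
}

/* Call before every use. Returns 1 if the connection was inherited across
 * a fork and has been dropped, so the caller must reconnect; 0 otherwise. */
int conn_check_fork(struct conn_state *s)
{
    if (s->connected && s->owner_pid != getpid()) {
        s->connected = 0;   /* do not reuse the parent's connection */
        return 1;
    }
    return 0;
}
```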
I hope this information helps :)
-- Additional comment from tjanouse on 2007-05-10 09:16 EST --
Upstream has accepted my two patches:
http://people.redhat.com/tjanouse/dovecot/154314/sent_upstream/