Bug 474181 - race in fork()
Summary: race in fork()
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: nss_ldap
Version: 5.2
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Nalin Dahyabhai
QA Contact: Ondrej Moriš
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-12-02 17:35 UTC by Dominik Strasser
Modified: 2011-01-13 23:31 UTC (History)
CC List: 9 users

Fixed In Version: nss_ldap-253-36.el5
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-01-13 23:31:56 UTC
Target Upstream Version:
Embargoed:


Links
System ID | Private | Priority | Status | Summary | Last Updated
CentOS 3280 | 0 | None | None | None | Never
PADL Software 422 | 0 | None | None | None | Never
Red Hat Product Errata RHBA-2011:0097 | 0 | normal | SHIPPED_LIVE | nss_ldap bug fix update | 2011-01-12 17:29:13 UTC

Description Dominik Strasser 2008-12-02 17:35:04 UTC
Description of problem:

It is generally considered unsafe to call any system functions between fork and exec, because deadlocks can happen during this time.

It seems that glibc has such a race itself:
#3  0xf6c4d96b in free (ptr=0xc4ba000)
#4  0xf682329a in _nss_ldap_mergeconfigfromdns () from /lib/libnss_ldap.so.2
#5  0xf680d205 in _nss_ldap_mergeconfigfromdns () from /lib/libnss_ldap.so.2
#6  0xf68026a3 in _nss_ldap_mergeconfigfromdns () from /lib/libnss_ldap.so.2
#7  0xf67eeb30 in _nss_ldap_test_initgroups_ignoreuser () from /lib/libnss_ldap.so.2
#8  0xf67f21f4 in _nss_ldap_leave () from /lib/libnss_ldap.so.2
#9  0x00951b52 in fork () from /lib/libc.so.6
#10 0x00a14424 in fork () from /lib/libpthread.so.0
#11 0x0a83c2e2 in TclpCreateProcess ()

This is part of a gdb backtrace from my application, which hung at this point trying to acquire a lock in free().

It seems that glibc calls _nss_ldap_leave in fork, after the actual fork has already happened.
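
The observation above (handler code running inside fork(), on the child side) can be checked with a minimal sketch. It assumes a POSIX system with pthread_atfork(); the names on_fork_child and child_handler_runs_inside_fork are hypothetical and are not taken from nss_ldap or glibc.

```c
#include <pthread.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Set by the child-side handler; a plain flag store is all the handler does. */
static volatile sig_atomic_t child_handler_ran = 0;

/* Hypothetical handler: it only stores to a flag, takes no locks, frees nothing. */
static void on_fork_child(void)
{
    child_handler_ran = 1;
}

/*
 * Returns 1 if the child saw the handler's flag already set when fork()
 * returned, 0 if it did not, and -1 on any error.
 */
int child_handler_runs_inside_fork(void)
{
    int status;
    pid_t pid;

    if (pthread_atfork(NULL, NULL, on_fork_child) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: report through the exit status using only _exit(). */
        _exit(child_handler_ran ? 0 : 1);
    }

    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```

Calling child_handler_runs_inside_fork() from a small main() and checking for a return value of 1 shows that the child handler has already run by the time fork() returns in the child, which matches frames #8 and #9 of the backtrace.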

Version-Release number of selected component (if applicable):
glibc-2.5-24
nss_ldap-253-13.el5_2.1


How reproducible:
Unfortunately only in my application. I tried to make a small test example but failed.


Comment 1 Dominik Strasser 2008-12-17 21:26:59 UTC
I've raised the priority because this issue leads to frequent hangs in our application.

Comment 2 Fredrik Carlsson 2008-12-29 08:11:20 UTC
Hi,

I can confirm this bug too. Usually it's sshd that is affected, but crond as well as postfix can be affected.

It's quite an annoying bug and affects the functioning of the servers, so it would be nice to have it fixed ;)

Regards
Fredrik

Comment 3 Ulrich Drepper 2008-12-30 16:40:46 UTC
This is a bug in nss_ldap which is not part of glibc.

Since fork can be called asynchronously, it is not allowed to call any function that is not async-signal-safe in the atfork handlers. nss_ldap's atfork handler calls free(), which is not async-signal-safe.
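
One way to keep an atfork handler free of such calls is to have it only set a flag and to defer the actual cleanup to the next ordinary entry point. The following is a minimal sketch of that pattern, not the fix that was shipped for nss_ldap; struct session, atfork_child, session_init, session_get and session_generation are all hypothetical names.

```c
#include <pthread.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cached state; stands in for whatever a library keeps per process. */
struct session {
    char *buf;
};

static struct session *current;                 /* live session, or NULL */
static volatile sig_atomic_t stale_after_fork;  /* set in the child by the handler */
static int generation;                          /* number of sessions created so far */

/* Child-side atfork handler: takes no locks and frees nothing, it only marks state. */
void atfork_child(void)
{
    stale_after_fork = 1;
}

/* Register the handler; returns 0 on success. Intended to be called once. */
int session_init(void)
{
    return pthread_atfork(NULL, NULL, atfork_child);
}

/* Ordinary entry point, where the caller is expected to be allowed to use malloc()/free(). */
const char *session_get(void)
{
    if (stale_after_fork) {
        /* Deferred cleanup of state inherited from the parent. */
        stale_after_fork = 0;
        if (current != NULL) {
            free(current->buf);
            free(current);
            current = NULL;
        }
    }
    if (current == NULL) {
        struct session *s = malloc(sizeof *s);
        if (s == NULL)
            return NULL;
        s->buf = malloc(16);
        if (s->buf == NULL) {
            free(s);
            return NULL;
        }
        strcpy(s->buf, "session");
        current = s;
        generation++;
    }
    return current->buf;
}

/* Lets a caller observe whether the session was rebuilt. */
int session_generation(void)
{
    return generation;
}
```

In this sketch the child never calls free() from inside fork(); the inherited session is dropped and rebuilt only when session_get() is next called. Whether that later point is itself safe still depends on the caller, for example a child that only execs never reaches it.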

Comment 4 Fredrik Carlsson 2009-01-08 13:42:50 UTC
Any news?

Comment 5 Dominik Strasser 2009-07-28 10:55:58 UTC
Any news on this issue?
It is now 8 months old, with no reaction.

Comment 6 Mirko Fit 2010-04-22 07:06:54 UTC
This issue is more than 2 years old now.
From an outside point of view the fix looks easy: find out why nss_ldap calls free() in its atfork handler and move the cleanup to a safer location.

Comment 18 errata-xmlrpc 2011-01-13 23:31:56 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0097.html

