Bug 426155 - nscd takes a long time to decide
nscd takes a long time to decide
Status: CLOSED WORKSFORME
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: nss_ldap (Show other bugs)
4.6
All Linux
low Severity medium
: rc
: ---
Assigned To: Nalin Dahyabhai
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2007-12-18 16:27 EST by Kaj J. Niemi
Modified: 2010-10-22 17:14 EDT (History)
8 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2010-02-22 11:37:11 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Kaj J. Niemi 2007-12-18 16:27:42 EST
Description of problem:
On RHEL 4.6, x86_64, nscd sometimes (frequently) locks and makes the user wait 5 seconds (first 
timeout) until returning. This does not seem to happen with glibc/nscd 2.3.4-2.36 (from RHEL 4.5).

nsswitch.conf has been configured to use ldap. There is nothing wrong with the ldap servers nor is 
there anything wrong with the connectivity to the same servers. Both have been ruled out as problems.


Version-Release number of selected component (if applicable):
nscd 2.3.4-2.39

How reproducible:
Semireliably

Steps to Reproduce:
1. start nscd
2. time id foo < 0.01s
3. wait until it decides to misbehave
4. time id foo > 5.0s always
  
Actual results:
% time id kajtzu
uid=501(kajtzu) gid=501(kajtzu) groups=500(basen),501(kajtzu)
id kajtzu  0.00s user 0.00s system 0% cpu 5.032 total

Expected results:
% time id kajtzu
uid=501(kajtzu) gid=501(kajtzu) groups=500(basen),501(kajtzu)
id kajtzu  0.00s user 0.00s system 0% cpu 0.003 total

Additional info:
nscd -d simply says:

13289: handle_request: request received (Version = 2) from PID 8071
13289:  GETFDPW
13289: provide access to FD 5, for passwd
13289: handle_request: request received (Version = 2) from PID 8071
13289:  GETPWBYNAME (kajtzu)
13289: Haven't found "kajtzu" in password cache!
13289: handle_request: request received (Version = 2) from PID 8071
13289:  GETFDHST   
13289: provide access to FD 9, for hosts
13289: handle_request: request received (Version = 2) from PID 8071
13289:  GETFDGR
13289: provide access to FD 7, for group

during that same time stracing it returns:

Process 13289 attached - interrupt to quit
time(NULL)                              = 1198012747
epoll_wait(11, {{EPOLLRDNORM, {u32=10, u64=10}}}, 100, 29988) = 1
time(NULL)                              = 1198012748
accept(10, 0, NULL)                     = 19
epoll_ctl(11, EPOLL_CTL_ADD, 19, {EPOLLRDNORM, {u32=19, u64=19}}) = 0
epoll_wait(11, {{EPOLLRDNORM, {u32=19, u64=19}}}, 100, 29976) = 1
time(NULL)                              = 1198012748
epoll_ctl(11, EPOLL_CTL_DEL, 19, {0, {u32=0, u64=0}}) = 0
futex(0x552abc25a4, FUTEX_WAKE, 1)      = 1
epoll_wait(11, {{EPOLLRDNORM, {u32=10, u64=10}}}, 100, 29988) = 1
time(NULL)                              = 1198012748
accept(10, 0, NULL)                     = 19
epoll_ctl(11, EPOLL_CTL_ADD, 19, {EPOLLRDNORM, {u32=19, u64=19}}) = 0
epoll_wait(11, {{EPOLLRDNORM, {u32=19, u64=19}}}, 100, 29976) = 1
time(NULL)                              = 1198012748
epoll_ctl(11, EPOLL_CTL_DEL, 19, {0, {u32=0, u64=0}}) = 0
futex(0x552abc25a4, FUTEX_WAKE, 1)      = 1
epoll_wait(11, {{EPOLLRDNORM, {u32=10, u64=10}}}, 100, 29988) = 1
time(NULL)                              = 1198012752
accept(10, 0, NULL)                     = 19
epoll_ctl(11, EPOLL_CTL_ADD, 19, {EPOLLRDNORM, {u32=19, u64=19}}) = 0
epoll_wait(11, {{EPOLLRDNORM, {u32=19, u64=19}}}, 100, 29976) = 1
time(NULL)                              = 1198012752
epoll_ctl(11, EPOLL_CTL_DEL, 19, {0, {u32=0, u64=0}}) = 0
futex(0x552abc25a4, FUTEX_WAKE, 1)      = 1
futex(0x552abc25a0, FUTEX_WAKE, 1)      = 1
epoll_wait(11, {{EPOLLRDNORM, {u32=10, u64=10}}}, 100, 29988) = 1
time(NULL)                              = 1198012752
accept(10, 0, NULL)                     = 19
epoll_ctl(11, EPOLL_CTL_ADD, 19, {EPOLLRDNORM, {u32=19, u64=19}}) = 0
epoll_wait(11, {{EPOLLRDNORM, {u32=19, u64=19}}}, 100, 29976) = 1
time(NULL)                              = 1198012752
epoll_ctl(11, EPOLL_CTL_DEL, 19, {0, {u32=0, u64=0}}) = 0
futex(0x552abc25a4, FUTEX_WAKE, 1)      = 1
futex(0x552abc25a0, FUTEX_WAKE, 1)      = 1
epoll_wait(11, {{EPOLLRDNORM, {u32=10, u64=10}}}, 100, 29988) = 1
time(NULL)                              = 1198012757
accept(10, 0, NULL)                     = 20
epoll_ctl(11, EPOLL_CTL_ADD, 20, {EPOLLRDNORM, {u32=20, u64=20}}) = 0
epoll_wait(11, {{EPOLLRDNORM, {u32=20, u64=20}}}, 100, 29976) = 1
time(NULL)                              = 1198012757
epoll_ctl(11, EPOLL_CTL_DEL, 20, {0, {u32=0, u64=0}}) = 0
mmap(NULL, 2101248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_32BIT, -1, 0) 
= 0x41209000
mprotect(0x41209000, 4096, PROT_NONE)   = 0
clone(child_stack=0x41409270, 
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_S
ETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|CLONE_DETACHED, 
parent_tidptr=0x414099f0, tls=0x41409960, child_tidptr=0x414099f0) = 8072
epoll_wait(11, {{EPOLLRDNORM, {u32=10, u64=10}}}, 100, 29988) = 1
time(NULL)                              = 1198012757
accept(10, 0, NULL)                     = 20
epoll_ctl(11, EPOLL_CTL_ADD, 20, {EPOLLRDNORM, {u32=20, u64=20}}) = 0
epoll_wait(11, {{EPOLLRDNORM, {u32=20, u64=20}}}, 100, 29976) = 1
time(NULL)                              = 1198012757
epoll_ctl(11, EPOLL_CTL_DEL, 20, {0, {u32=0, u64=0}}) = 0
futex(0x552abc25a4, FUTEX_WAKE, 1)      = 1
epoll_wait(11,  <unfinished ...>
Process 13289 detached
Comment 1 Eijiro Sumii 2008-01-02 00:02:21 EST
Ping?  Exactly the same is happening on my RHEL 4.6, 32-bit x86 box.  It is also
running nscd-2.3.4-2.39 with nss_ldap-226-20.  The LDAP server is working
completely well.  Any idea?
Comment 2 Mark D. Foster 2008-01-15 18:37:05 EST
This is also happening to us, on 32bit.
Please, update the hardware field?
Comment 3 Mark D. Foster 2008-01-17 11:01:50 EST
We have identified this as a problem with nss_ldap-226-20 and a downgrade to
nss_ldap-226-18 fixes it for us.
It seems the newer version of nss_ldap has a problem vis-a-vis nscd and
nonexistent users/groups in LDAP. 

Note You need to log in before you can comment on or make changes to this bug.