Created attachment 338949
sosreport-MFrohm.1907516-249173-7f48db.tar.bz2

> ##### General Escalation Information
>
> State the problem
>
> 1. Provide time and date of the problem

Sporadic

> 2. Indicate the platform(s) (architectures) the problem is being reported
> against.

RHEL 4.7 ES and AS i386

> 3. Provide clear and concise problem description as it is understood at the
> time of escalation
>
> * Observed behavior

nscd hangs on a futex call, and the nscd processes use 100% CPU on several of our machines. nscd does not respond at all: 'service nscd restart' cannot stop the process, and nscd responds only to 'kill -9'. We are currently restarting nscd from a daily cron job as a workaround.

We have also noticed that on the machines where the nscd processes use 100% CPU, 'lsof' shows two fds open on /var/run/nscd/socket, whereas on machines with a normal nscd 'lsof' shows only one.

This problem occurs on machines both with and without an LDAP connection.

>
> * Desired behavior

nscd should not use 100% CPU and should respond normally to kill signals etc.

> 4. State specific action requested of SEG

Analyse the problem and advise if we can gather any extra data.

> 5. State whether or not a defect in the product is suspected

This is suspected to be a bug in both RHEL 4.7 and CentOS. This customer and others have opened a bug directly in Bugzilla (which I have discouraged them from doing in future) and in the CentOS bug tracker as well:

> * Provide Bugzilla if one already exists

https://bugzilla.redhat.com/show_bug.cgi?id=492581
N.B. This has already been assigned to Jakub Jelinek
http://bugs.centos.org/view.php?id=3373

> 8. This is especially important for severity one and two issues. What is the
> impact to the customer when they experience this problem?

This happens frequently, affects users, and is frustrating the customer.

> ##### Provide supporting info
>
> 1. State other actions already taken in working the problem:
>
> * tech-list, google searches, fulltext, consulting with another engineer
>
> * Provide any relevant data found
>
> 2. Attach sosreport

Attached an sosreport from an example system. It looks like they might be using a custom kernel, so I'm going to ask if they can reproduce with the stock kernel. However, since they can reproduce on multiple architectures, on AS/ES and on CentOS, and other people have reported the same behaviour, I suspect it is not related to the kernel version and we should proceed without quibbling.

> 3. Attach other supporting data

See Bugzilla referenced above

>
> 4. Provide issue repro information:

None applicable

> 5. List any known hot-fix packages on the system

None

> 6. List any customer applied changes from the last 30 days

None
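
As one possible way to gather the extra data mentioned above, the 'lsof' observation could be captured with a small shell helper. This is only a sketch: the function name count_nscd_socket_fds is hypothetical, and it assumes the usual lsof column layout (COMMAND first, PID second) and the socket path quoted in the report.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original report): count how many
# open file descriptors each nscd process holds on the nscd socket.
# Reads `lsof` output on stdin and prints one "PID COUNT" line per process.
count_nscd_socket_fds() {
    # Socket path from the report; can be overridden with the first argument.
    sock="${1:-/var/run/nscd/socket}"
    awk -v sock="$sock" '
        # Assumption: column 1 is COMMAND and column 2 is PID. The path is
        # matched anywhere in the line because the NAME column layout can
        # differ between lsof versions.
        $1 == "nscd" && index($0, sock) > 0 { n[$2]++ }
        END { for (pid in n) print pid, n[pid] }
    ' | sort -n
}
```

It could be run as 'lsof -c nscd | count_nscd_socket_fds' on both a hung and a healthy machine; per the description above, the hung machines would be expected to report two fds on the socket and the healthy ones a single fd.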
*** This bug has been marked as a duplicate of bug 495082 ***