Bug 496201

Summary: [RHEL4] Nscd consumes many CPU resources (nearly 100%) continuously.
Product: Red Hat Enterprise Linux 4
Reporter: Veaceslav Falico <vfalico>
Component: kernel
Assignee: Jason Baron <jbaron>
Status: CLOSED ERRATA
QA Contact: Evan McNabb <emcnabb>
Severity: urgent
Priority: urgent
Version: 4.5
CC: alanm, cevich, cward, dhoward, djuran, drepper, fhirtz, jakub, james-p, jolsa, jskrabal, k.georgiou, kgraham, knoel, ndoane, ofourdan, peterm, qcai, sghosh, sputhenp, syeghiay, tao
Target Milestone: rc
Keywords: ZStream
Hardware: All
OS: Linux
Doc Type: Bug Fix
Doc Text:
Off-by-one errors in the time normalization code could have caused clock_gettime() to return one billion nanoseconds, rather than adding an extra second. This bug could have caused the name service cache daemon (nscd) to consume excessive CPU resources.
Last Closed: 2011-02-16 15:41:27 UTC
Bug Blocks: 501800, 507833    
Attachments:
  patch (flags: none)
  here's the patch I used on Jakub's suggestion (flags: none)

Description Veaceslav Falico 2009-04-17 08:23:41 UTC
Created attachment 339968 [details]
patch

< Top information >
 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                    
18221 nscd      16   0  108m  99m  99m S 98.0  1.6   2035:37 nscd                                                                       
16177 root      15   0  3160  952  684 R  2.0  0.0   0:00.01 top                                                                        
   1 root      16   0  3064  592  508 S  0.0  0.0   0:02.10 init                                                                       


pthread_cond_timedwait() may return EINVAL if the given timespec is invalid.
Please see the mark *2 in the source code excerpt below (pthread_cond_timedwait.c).
The core dump showed that EINVAL was indeed returned.

However, in that case nscd_run() just retries pthread_cond_timedwait() with
the same timespec, and falls into an infinite loop if readylist is NULL.
Please see the mark *1 in the source code excerpt below (connections.c).

I uploaded a patch that recalculates the timespec for pthread_cond_timedwait()
when it returns EINVAL.
Could you review this patch and incorporate it?

------------------------------------------------
nscd/connections.c
------------------------------------------------
/* This is the main loop.  It is replicated in different threads but the
  `poll' call makes sure only one thread handles an incoming connection.  */
static void *
__attribute__ ((__noreturn__))
nscd_run (void *p)
{
 const long int my_number = (long int) p;
 const int run_prune = my_number < lastdb && dbs[my_number].enabled;
 struct timespec prune_ts;
 int to = 0;
 char buf[256];

 if (run_prune)
   {
     setup_thread (&dbs[my_number]);

     /* We are running.  */
     dbs[my_number].head->timestamp = time (NULL);

     if (clock_gettime (timeout_clock, &prune_ts) == -1)
       /* Should never happen.  */
       abort ();

     /* Compute timeout time.  */
     prune_ts.tv_sec += CACHE_PRUNE_INTERVAL;
   }

 /* Initial locking.  */
 pthread_mutex_lock (&readylist_lock);

 /* One more thread available.  */
 ++nready;

 while (1)
   {
     while (readylist == NULL)
       {
         if (run_prune)
           {
             /* Wait, but not forever.  */
             to = pthread_cond_timedwait (&readylist_cond, &readylist_lock,
                                          &prune_ts);

             /* If we were woken and there is no work to be done,
                just start pruning.  */
             if (readylist == NULL && to == ETIMEDOUT)   <<####### (*1)
               {
                 --nready;

                 if (sighup_pending)
                   goto sighup_prune;

                 pthread_mutex_unlock (&readylist_lock);
                 goto only_prune;
               }
           }
         else
           /* No need to timeout.  */
           pthread_cond_wait (&readylist_cond, &readylist_lock);
       }

     if (sighup_pending)
       {
  :
< SNIP >
  :

------------------------------------------------
nptl/sysdeps/pthread/pthread_cond_timedwait.c
------------------------------------------------

int
__pthread_cond_timedwait (cond, mutex, abstime)
    pthread_cond_t *cond;
    pthread_mutex_t *mutex;
    const struct timespec *abstime;
{
 struct _pthread_cleanup_buffer buffer;
 struct _condvar_cleanup_buffer cbuffer;
 int result = 0;

 /* Catch invalid parameters.  */
 if (abstime->tv_nsec < 0 || abstime->tv_nsec >= 1000000000)
   return EINVAL;   <<###### (*2)
 :
< SNIP >
 :
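
For reference, a rough sketch of the recalculation approach described above,
applied inside the wait loop of nscd_run() shown in the first excerpt. This is
only an illustration of the idea, not the attached patch: when
pthread_cond_timedwait() returns EINVAL, re-read the clock and rebuild
prune_ts before waiting again.

              to = pthread_cond_timedwait (&readylist_cond, &readylist_lock,
                                           &prune_ts);

              /* Sketch: the stored timespec is invalid (tv_nsec out of
                 range), so recompute the absolute timeout from the current
                 clock instead of retrying with the same timespec.  */
              if (to == EINVAL)
                {
                  if (clock_gettime (timeout_clock, &prune_ts) == -1)
                    /* Should never happen.  */
                    abort ();
                  prune_ts.tv_sec += CACHE_PRUNE_INTERVAL;
                  continue;
                }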

How reproducible:
Unknown.

Steps to Reproduce:
There is no reliable way to reproduce it. However, the same condition can be forced with gdb as follows.
1. Stop at the entry to pthread_cond_timedwait().
2. Change tv_nsec to 1000000000.
3. Continue execution in gdb.

Actual results:
nscd consumes CPU resources due to the infinite loop.

Comment 1 Chris Ward 2009-05-13 18:02:43 UTC
Setting some flags to make sure this issue is reviewed.

Comment 2 James Pearson 2009-05-13 20:07:32 UTC
I've been running nscd with this patch on all my 4.7 systems for over a week now and have not seen any running at 100% CPU.

Comment 3 Jakub Jelinek 2009-05-13 21:11:38 UTC
The only case in which pthread_cond_timedwait returns EINVAL because of the timespec is if tv_nsec is < 0 or >= 1000000.  But the only places that touch this variable are clock_gettime, which supposedly should never fill it with an
invalid tv_nsec (that would be a kernel bug IMHO), then
prune_ts.tv_sec += CACHE_PRUNE_INTERVAL;
which doesn't change tv_nsec, and pthread_cond_timedwait, which only reads it.
clock_gettime is always run first, and if it fails the thread aborts, so it never reaches pthread_cond_timedwait.

So, does RHEL4 clock_gettime return invalid tv_nsec at times?

Comment 4 Jakub Jelinek 2009-05-13 21:23:16 UTC
I meant 1000000000 of course.

#include <time.h>

int
main (void)
{
  struct timespec ts;
  while (1)
    {
      if (clock_gettime (CLOCK_MONOTONIC, &ts) == -1)
        return 1;
      if (ts.tv_nsec < 0 || ts.tv_nsec >= 1000000000)
        return 2;
    }
  return 0;
}

or

#include <time.h>

int
main (void)
{
  struct timespec ts;
  while (1)
    {
      if (clock_gettime (CLOCK_REALTIME, &ts) == -1)
        return 1;
      if (ts.tv_nsec < 0 || ts.tv_nsec >= 1000000000)
        return 2;
    }
  return 0;
}

should both loop forever.
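
(Build note: on RHEL 4 glibc, clock_gettime() lives in librt, so these test
programs have to be linked with -lrt, e.g. gcc test.c -o test -lrt. An exit
status of 2 means an invalid tv_nsec was observed.)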

Comment 5 Kostas Georgiou 2009-05-13 21:47:02 UTC
With CLOCK_MONOTONIC the program exits in less than a second; printing the "bad" value shows it is always 1000000000.

Comment 6 Jakub Jelinek 2009-05-13 22:14:15 UTC
That is a severe kernel bug then and should be fixed in the kernel, not worked around in glibc.

I guess the fix is:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=3f39894d1b5c253b10fcb8fbbbcf65a330f6cdc7
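
For illustration only (not the actual kernel patch, and with made-up names):
the kind of off-by-one in timespec normalization that can leave tv_nsec at
exactly one billion nanoseconds, matching the value observed in comment #5.
A loop that uses '>' where it should use '>=' never carries an exact
one-billion value into tv_sec:

#include <stdio.h>

#define NSEC_PER_SEC 1000000000L

struct nstime { long tv_sec; long tv_nsec; };

/* Buggy normalization: '>' instead of '>=' fails to carry an exact
   one-billion nanosecond count into tv_sec.  */
static void
normalize_buggy (struct nstime *t)
{
  while (t->tv_nsec > NSEC_PER_SEC)
    {
      t->tv_nsec -= NSEC_PER_SEC;
      ++t->tv_sec;
    }
}

int
main (void)
{
  struct nstime t = { 0, NSEC_PER_SEC };
  normalize_buggy (&t);
  /* Prints: tv_sec=0 tv_nsec=1000000000 -- an invalid timespec.  */
  printf ("tv_sec=%ld tv_nsec=%ld\n", t.tv_sec, t.tv_nsec);
  return 0;
}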

Comment 9 Chris Ward 2009-05-14 06:51:33 UTC
Is this issue related to bug 495082?

Comment 10 Chris Ward 2009-05-14 06:53:05 UTC
I guess your statement regarding fixing this in the kernel rather than glibc hints that it is.

We need to have someone test the patch from comment #6. I'll try to track someone down.

Comment 15 Jason Baron 2009-05-14 17:47:34 UTC
OK, I've kicked off a brew build. I'll post a link when it completes.

Comment 16 Jason Baron 2009-05-14 20:32:42 UTC
OK, the kernel is building in dist-4E-scratch as 89.EL.time.2. I'm heading out and won't be back online until Monday, so hopefully this kernel will build :)

Comment 17 Jason Baron 2009-05-14 20:34:17 UTC
Created attachment 344044 [details]
here's the patch I used on Jakub's suggestion

Comment 18 Jason Baron 2009-05-14 20:35:56 UTC
OK, I've verified the patch on a test system using the supplied testcase: without the patch it fails instantly, but with the patch I've had it run successfully for several minutes.

Comment 28 Chris Ward 2009-05-18 15:11:52 UTC
Test kernel can be found here:

http://people.redhat.com/cward/test_kernels/kernel-2.6.9-89.EL.time.2/

Comment 29 Issue Tracker 2009-05-19 09:27:51 UTC
Hi, 

Here's a test kernel for this issue.
Could you please test whether the problem is fixed?

----
Test kernel can be found here:

http://people.redhat.com/cward/test_kernels/kernel-2.6.9-89.EL.time.2/
----

I've also attached the SRPM and the patch that is currently proposed.

Regards,
Masaki Furuta


This event sent from IssueTracker by mfuruta 
 issue 264249
it_file 221607

Comment 31 Jason Baron 2009-05-20 16:12:02 UTC
I've verified the fix with the test kernels, in case there is any confusion.

Comment 33 Vivek Goyal 2009-05-21 13:36:37 UTC
Committed in 89.1.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/

Comment 40 Andreas Schwab 2009-09-07 12:41:31 UTC
*** Bug 495082 has been marked as a duplicate of this bug. ***

Comment 49 Douglas Silas 2011-01-30 23:40:28 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Off-by-one errors in the time normalization code could have caused clock_gettime() to return one billion nanoseconds, rather than adding an extra second. This bug could have caused the name service cache daemon (nscd) to consume excessive CPU resources.

Comment 50 errata-xmlrpc 2011-02-16 15:41:27 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0263.html