Bug 1064059 - clock_nanosleep returns early with TIMER_ABSTIME
Summary: clock_nanosleep returns early with TIMER_ABSTIME
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kernel
Version: 7.0
Hardware: Unspecified
OS: Linux
medium
unspecified
Target Milestone: rc
: ---
Assignee: Stanislaw Gruszka
QA Contact: Qiao Zhao
URL:
Whiteboard:
Keywords:
Duplicates: 1163507
Depends On:
Blocks:
 
Reported: 2014-02-11 22:21 UTC by Patsy Franklin
Modified: 2015-12-01 07:57 UTC (History)
Clone Of:
Last Closed: 2015-11-19 20:02:57 UTC


Attachments
Reduced test case (3.58 KB, text/plain)
2014-09-10 13:30 UTC, Siddhesh Poyarekar


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2015:2152 normal SHIPPED_LIVE Important: kernel security, bug fix, and enhancement update 2015-11-20 00:56:02 UTC

Description Patsy Franklin 2014-02-11 22:21:23 UTC
Description of problem:
glibc test suite - tst-cpuclock2 test fails on:
i686, ppc, ppc64, x86_64

Version-Release number of selected component (if applicable):
2.17-48.el7

How reproducible:
Fails consistently on RHEL 7.0


Steps to Reproduce:
1. See build log

Comment 4 Siddhesh Poyarekar 2014-09-09 11:39:34 UTC
I haven't seen this failure on recent rawhide builds, so I looked at whether a recently fixed kernel bug could explain it.  This one looks closely related:

https://lkml.org/lkml/2014/6/24/20

Testing to see if I am right.

Comment 5 Siddhesh Poyarekar 2014-09-10 05:58:57 UTC
Nope, that wasn't it.  In fact, I checked manually on rawhide and the test still can fail:

live thread clock ffffffffffff98ee resolution 0.000000001
live thread before sleep => 0.000657509
self thread before sleep => 0.016908503
live thread after sleep => 0.490654902
self thread after sleep => 0.017012290
clock_nanosleep on process slept 99885847 (outside reasonable range)

Looking deeper, it seems that clock_nanosleep may be returning early and that clock_gettime is just reporting what it sees.  The clock_nanosleep wrapper in glibc is also quite minimal, so this still looks like a kernel bug to me.
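
To make the failure message concrete, here is a minimal sketch of the arithmetic behind it.  It assumes the test requested a 100 ms sleep (an assumption; the log above does not show the requested duration), and the helper names ts_to_ns and sleep_shortfall_ns are illustrative, not taken from the test suite.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Convert a timespec reading to a nanosecond count.  */
static int64_t
ts_to_ns (const struct timespec *ts)
{
  assert (ts != NULL);
  return (int64_t) ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/* Nanoseconds by which an observed sleep fell short of the request.
   Zero or a negative value means the sleep lasted at least as long
   as requested.  */
static int64_t
sleep_shortfall_ns (int64_t requested_ns, int64_t observed_ns)
{
  return requested_ns - observed_ns;
}
```

With the assumed 100000000 ns request and the 99885847 ns reported above, sleep_shortfall_ns returns 114153, i.e. the sleep would have ended roughly 0.11 ms early.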

Still working on it.

Comment 6 Siddhesh Poyarekar 2014-09-10 13:30:08 UTC
Created attachment 936152
Reduced test case

Compile with:

cc -o tst-cpuclock2 -std=gnu99 tst-cpuclock2.c -g -pthread -lrt -Wall

and run it like so:

while ./tst-cpuclock2; do true; done

Comment 7 Siddhesh Poyarekar 2014-09-10 13:32:48 UTC
Reassigning to kernel because this does not look like a glibc bug.  The first thing I verified is that clock_gettime(CLOCK_PROCESS_CPUTIME_ID) was monotonic using the following simple program:

~~~
#include <time.h>
#include <stdint.h>
#include <stdio.h>
#include <pthread.h>
#include <stdlib.h>

pthread_barrier_t barrier;

/* Help advance the clock.  */
static void *
chew_cpu (void *u)
{
  pthread_barrier_wait (&barrier);
  while (1);

  return NULL;
}

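/* Report if the clock reading A, taken after B, is earlier than B.  */
static void
verify_time (struct timespec *b, struct timespec *a)
{
  unsigned long long bi = 1000000000ULL * b->tv_sec + b->tv_nsec;
  unsigned long long ai = 1000000000ULL * a->tv_sec + a->tv_nsec;

  if (ai < bi)
    {
      /* Cast tv_sec to match the format; zero-pad the nanoseconds.  */
      printf ("clock went backwards from %lld.%09ld to %lld.%09ld\n",
              (long long) b->tv_sec, b->tv_nsec,
              (long long) a->tv_sec, a->tv_nsec);
    }
}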

int
main (void)
{
  struct timespec before, after;
  clock_gettime (CLOCK_PROCESS_CPUTIME_ID, &before);
  after = before;

  pthread_t th;   

  pthread_barrier_init (&barrier, NULL, 2);

  if (pthread_create (&th, NULL, chew_cpu, NULL) != 0)
    {
      perror ("pthread_create");
      return 1;
    }

  pthread_barrier_wait (&barrier);

  while (1)
    {
      clock_gettime (CLOCK_PROCESS_CPUTIME_ID, &after);
      verify_time (&before, &after);

      before = after;
    }
}
~~~

This runs for hours without printing any errors.  The only remaining possibility in the attached test case, then, is that the clock_nanosleep() syscall incorrectly returns early in the absolute (TIMER_ABSTIME) case.
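
The check for an early return can be sketched as follows.  This is not the attached test case, only a minimal illustration under stated assumptions: the helper names (ts_add_ns, ts_diff_ns, abs_sleep_shortfall_ns) are hypothetical, and the only library calls used are clock_gettime and clock_nanosleep.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/* Return TS advanced by NS nanoseconds (NS must not be negative),
   keeping tv_nsec in the range [0, 1e9).  */
static struct timespec
ts_add_ns (struct timespec ts, int64_t ns)
{
  assert (ns >= 0);
  ts.tv_sec += ns / NSEC_PER_SEC;
  ts.tv_nsec += ns % NSEC_PER_SEC;
  if (ts.tv_nsec >= NSEC_PER_SEC)
    {
      ts.tv_sec++;
      ts.tv_nsec -= NSEC_PER_SEC;
    }
  return ts;
}

/* Return A - B in nanoseconds (negative if A is earlier than B).  */
static int64_t
ts_diff_ns (struct timespec a, struct timespec b)
{
  return (int64_t) (a.tv_sec - b.tv_sec) * NSEC_PER_SEC
         + (a.tv_nsec - b.tv_nsec);
}

/* Sleep on CLK until now + DELTA_NS using TIMER_ABSTIME, then read
   the clock again.  Returns the number of nanoseconds by which the
   clock was still short of the deadline on wake-up, 0 if the deadline
   had been reached, or -1 if a call failed.  */
static int64_t
abs_sleep_shortfall_ns (clockid_t clk, int64_t delta_ns)
{
  struct timespec now, deadline, after;
  int rc;

  if (clock_gettime (clk, &now) != 0)
    return -1;
  deadline = ts_add_ns (now, delta_ns);

  /* clock_nanosleep returns the error number directly instead of
     setting errno.  On EINTR, retry with the same absolute deadline.  */
  do
    rc = clock_nanosleep (clk, TIMER_ABSTIME, &deadline, NULL);
  while (rc == EINTR);

  if (rc != 0 || clock_gettime (clk, &after) != 0)
    return -1;

  int64_t left = ts_diff_ns (deadline, after);
  return left > 0 ? left : 0;
}
```

To exercise the case described here, one would call abs_sleep_shortfall_ns (CLOCK_PROCESS_CPUTIME_ID, 100000000) while another thread burns CPU, as chew_cpu does above; without such a thread the process CPU clock does not advance and the call blocks.  A positive return value would indicate an early wake-up.  Older glibc versions may need -lrt when linking.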

Comment 10 Siddhesh Poyarekar 2014-11-13 05:32:54 UTC
*** Bug 1163507 has been marked as a duplicate of this bug. ***

Comment 16 Rafael Aquini 2015-07-03 14:17:09 UTC
Patch(es) available on kernel-3.10.0-290.el7

Comment 22 errata-xmlrpc 2015-11-19 20:02:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-2152.html

