Bug 1368886

Summary: Can we provide a high accuracy function to replace "nanosleep"
Product: Red Hat Enterprise Linux 6 Reporter: Xu Yin <xyin>
Component: kernel Assignee: Yauheni Kaliuta <ykaliuta>
kernel sub component: Scheduler QA Contact:
Status: CLOSED WONTFIX Docs Contact:
Severity: high    
Priority: unspecified CC: ashankar, fweimer, mnewsome, pfrankli, qguo
Version: 6.7   
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-02-01 12:32:42 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description                                Flags
3 tests                                    none
modified version without clock_gettime()   none

Description Xu Yin 2016-08-22 03:24:03 UTC
Description of problem:

The customer wants to know which function provides high-accuracy sleeping, because when they call nanosleep, it wakes up later than the time they specified.

In the customer's environment, they called nanosleep to sleep for 125 ms and observed that it woke up after 131 ms (about 5% over).

According to the nanosleep man page, this possible delay is clearly documented:
~~~
 If the interval specified in request is not an exact multiple of the granularity underlying clock (see time(7)), then the interval will be rounded up to the next multiple.  Furthermore, after the sleep completes, there may still be a delay before the CPU becomes free to once again execute the calling thread.
~~~
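To check whether clock granularity (the first cause named in the man page excerpt) could explain an overshoot of several milliseconds, one can query the clock resolution. The following is a minimal sketch, not part of the original report; the helper name monotonic_resolution_ns is hypothetical, and it assumes CLOCK_MONOTONIC, which is the clock the Linux nanosleep man page says the sleep interval is measured against. On RHEL 6 (glibc 2.12) this probably needs to be linked with -lrt.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: return the resolution the kernel reports for
 * CLOCK_MONOTONIC, in nanoseconds, or -1 if clock_getres() fails. */
long monotonic_resolution_ns(void)
{
    struct timespec res;

    if (clock_getres(CLOCK_MONOTONIC, &res) != 0)
        return -1;
    return res.tv_sec * 1000000000L + res.tv_nsec;
}

/* Print the resolution so it can be compared with the observed overshoot. */
void print_monotonic_resolution(void)
{
    long ns = monotonic_resolution_ns();

    assert(ns != 0); /* a zero resolution would make no sense */
    if (ns < 0)
        perror("clock_getres");
    else
        printf("CLOCK_MONOTONIC resolution: %ld ns\n", ns);
}
```

If the reported resolution is far below a millisecond, rounding to the clock granularity is unlikely to account for the overshoot, which leaves the second cause in the excerpt: the delay before the CPU runs the thread again.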

Can we provide a high-accuracy function to replace "nanosleep"?

Version-Release

Red Hat Enterprise Linux 6.7

How reproducible:

We can reproduce this issue with the following code:
~~~
[root@rhel6-repo test]# cat test.c 
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#if _POSIX_C_SOURCE >= 199309L
#include <time.h>   // for nanosleep
#else
#include <unistd.h> // for usleep
#endif

void sleep_ms(int milliseconds) // cross-platform sleep function
{
#if _POSIX_C_SOURCE >= 199309L
    struct timespec ts;
    ts.tv_sec = milliseconds / 1000;
    ts.tv_nsec = (milliseconds % 1000) * 1000000;
    nanosleep(&ts, NULL);
#else
    usleep(milliseconds * 1000);
#endif
}

int main()
{
	int i;

	for (i = 0; i < 8; i++) {
		printf("%d\n", i);
		sleep_ms(125);
	}
	return 0;
}
---------------------------------------------------
[root@rhel6-repo test]# time ./a.out 
0
1
2
3
4
5
6
7

real	0m1.002s
user	0m0.000s
sys	0m0.000s
~~~

Actual results:
0m1.002s

Expected results:
0m1.000s

Request:
1. Can nanosleep be improved?
2. Is there any other system call that does the same thing and sleeps accurately?

Many thanks!
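For request 2, one candidate worth testing is clock_nanosleep() with TIMER_ABSTIME. The sketch below is an illustration rather than a confirmed fix for this bug: it sleeps until absolute deadlines on CLOCK_MONOTONIC, so the wake-up latency of one iteration does not add to the next one. Each individual wake-up can still be late; only the accumulation across the 8 iterations is avoided. The names deadline_add_ms, sleep_until and run_periodic are hypothetical. On RHEL 6 this probably needs -lrt.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Advance an absolute deadline by ms milliseconds, keeping tv_nsec
 * inside [0, NSEC_PER_SEC). */
void deadline_add_ms(struct timespec *t, long ms)
{
    assert(ms >= 0);
    t->tv_sec += ms / 1000;
    t->tv_nsec += (ms % 1000) * 1000000L;
    if (t->tv_nsec >= NSEC_PER_SEC) {
        t->tv_sec += 1;
        t->tv_nsec -= NSEC_PER_SEC;
    }
}

/* Sleep until the absolute CLOCK_MONOTONIC time *t. clock_nanosleep()
 * returns the error number directly, so retry while it reports EINTR. */
int sleep_until(const struct timespec *t)
{
    int rc;

    do {
        rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, t, NULL);
    } while (rc == EINTR);
    return rc;
}

/* Run count periods of period_ms each. Returns 0 on success, -1 if the
 * clock cannot be read, or the clock_nanosleep() error number. */
int run_periodic(int count, long period_ms)
{
    struct timespec next;
    int i, rc;

    if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
        return -1;
    for (i = 0; i < count; i++) {
        /* The per-period work (the printf in the reproducer) goes here. */
        deadline_add_ms(&next, period_ms);
        rc = sleep_until(&next);
        if (rc != 0)
            return rc;
    }
    return 0;
}
```

Calling run_periodic(8, 125) corresponds to the loop in the reproducer. Because every deadline is computed from the initial timestamp, the total run time should stay close to 1 second plus the latency of the last wake-up only.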

Comment 2 Florian Weimer 2016-08-22 06:28:14 UTC
This is a kernel issue; all glibc does is call the nanosleep system call.

If the customer has such strict timing requirements, they should perhaps try the real-time kernel.

Comment 4 Yauheni Kaliuta 2016-08-26 18:21:07 UTC
Created attachment 1194424 [details]
3 tests

I guess "time" is not the best check. I've modified the test a bit, got the following results on my machine:

Diff: 1 sec 2687979 nanosec
Diff: 1 sec 2281648 nanosec
Diff: 1 sec 165657 nanosec

and 

Diff: 1 sec 508777 nanosec
Diff: 1 sec 120939 nanosec
Diff: 1 sec 20396 nanosec

on another machine. Is this latency acceptable?
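The attached test is not reproduced in this report. As a rough sketch of how such a "Diff" could be measured without the process start-up and exit overhead that "time" includes, one can take CLOCK_MONOTONIC timestamps directly around the sleep loop. The function names timespec_diff_ns and measure_sleep_loop are hypothetical and this is an assumption about the approach, not the attachment's code. On RHEL 6 this probably needs -lrt.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Return (b - a) in nanoseconds. */
long long timespec_diff_ns(const struct timespec *a, const struct timespec *b)
{
    return (long long)(b->tv_sec - a->tv_sec) * 1000000000LL
         + (b->tv_nsec - a->tv_nsec);
}

/* Sleep count times for ms milliseconds with nanosleep() and return the
 * total elapsed time in nanoseconds, or -1 on error. */
long long measure_sleep_loop(int count, long ms)
{
    struct timespec start, end, req, rem;
    int i;

    assert(count >= 0 && ms >= 0);
    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1;
    for (i = 0; i < count; i++) {
        req.tv_sec = ms / 1000;
        req.tv_nsec = (ms % 1000) * 1000000L;
        /* If a signal interrupts the sleep, continue with the remainder. */
        while (nanosleep(&req, &rem) != 0) {
            if (errno != EINTR)
                return -1;
            req = rem;
        }
    }
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1;
    return timespec_diff_ns(&start, &end);
}
```

For the reproducer's parameters, measure_sleep_loop(8, 125) returns the elapsed nanoseconds; dividing by 1000000000 and taking the remainder gives values in the same "sec / nanosec" form as the figures above.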

Comment 5 Yauheni Kaliuta 2016-08-26 19:43:35 UTC
Created attachment 1194455 [details]
modified version without clock_gettime()

Comment 7 Yauheni Kaliuta 2017-02-01 12:32:42 UTC
I haven't received any feedback on my measurements and do not see a way to improve latency for the RHEL6 kernel.