Description of problem:
When running the RHTS test /kernel/syscalls/gettimeofday on a RHEL6 ppc64 kernel,
gettimeofday() occasionally goes backwards. The machine is ibm-js22-vios-01-lp1.rhts.eng.bos.redhat.com.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. install rhts test /kernel/syscalls/gettimeofday
2. change into the test directory, and change the gtod_backwards loop count to
3. make run
***** Start gtod_backwards *****
***** Start loop number 1 *****
Test start time = 1273663133.038369s
Test end time = 1273663134.564814s
***** Done loop number 1 *****
***** Start loop number 2 *****
Test start time = 1273663134.567711s
start time = 1273663135.168261
end time = 1273663135.168260
FAIL: time went backwards -1000 nsec (-1.999999 )
***** Done loop number 2 *****
***** Start loop number 3 *****
Test start time = 1273663135.171130s
Test end time = 1273663136.692549s
***** Done loop number 3 *****
***** Start loop number 4 *****
Test start time = 1273663136.695444s
Test end time = 1273663138.217518s
***** Done loop number 4 *****
***** Start loop number 5 *****
Test start time = 1273663138.220447s
start time = 1273663139.168263
end time = 1273663139.168262
FAIL: time went backwards -1000 nsec (-1.999999 )
***** Done loop number 5 *****
There is a similar bug on s390x machines, which was fixed in 2.6.32-25; please see https://bugzilla.redhat.com/show_bug.cgi?id=575728
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release. Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release. This request is not yet committed for
------- Comment From firstname.lastname@example.org 2010-05-12 11:09 EDT-------
Reverse mirror of 591495 - gtod backwards when running rhts test /kernel/syscalls/gettimeofday on ppc64 machine
------- Comment From email@example.com 2010-05-12 20:44 EDT-------
Can we get the source for /kernel/syscalls/gettimeofday to aid in local reproduction?
Created attachment 413632
Created attachment 424434
Simple fix to stop gettimeofday() going backwards on ppc64
------- Comment on attachment From firstname.lastname@example.org 2010-06-16 08:51 EDT-------
So, what's happening on ppc64 is that integer truncation in computing the 'stamp_xsec' value used by the VDSO gettimeofday implementation occasionally makes the gettimeofday result go backwards by 1 microsecond when the kernel updates the VDSO data (which it does every tick).
The stamp_xsec value is the time as of the update in units of 1/2^20 seconds, approximately 0.954 microseconds, known as xsecs. With CONFIG_HZ = 100, the tick is 10000 microseconds long, or 10485.76 xsecs. That means that on successive ticks the stamp_xsec value will increment by either 10485 or 10486. If userspace does gettimeofday right near the end of a tick and the rounding happens just right (or wrong), it can get a value that is 10486 xsecs past the last update. If the kernel then updates the vdso data and stamp_xsec advances by only 10485, and userspace immediately does another gettimeofday(), it can get a value that is only 10485 xsecs past that earlier update. Under the right conditions, this can give a microseconds value that is 1 less than the previous value.
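The arithmetic above can be checked with a small model. This is only a sketch, not the kernel or VDSO code: the function names are made up for illustration, exact fractions stand in for the timebase, and the 0.3 xsec delay between the nominal tick boundary and the kernel publishing new data is an assumed figure chosen to trigger the case described.

```python
from fractions import Fraction
from math import floor

XSEC_PER_SEC = 1 << 20                   # one xsec is 1/2^20 of a second
HZ = 100                                 # CONFIG_HZ from the comment above
TICK_XSEC = Fraction(XSEC_PER_SEC, HZ)   # 10485.76 xsecs per tick

def stamp_xsec(tick):
    """Truncated time, in whole xsecs, published at update number `tick`."""
    return floor(tick * TICK_XSEC)

def user_xsec(now, tick):
    """Time userspace computes at exact time `now` (in xsecs) from the data of update `tick`."""
    elapsed = now - tick * TICK_XSEC     # stands in for the scaled timebase delta
    return stamp_xsec(tick) + floor(elapsed)

def xsec_to_usec(xsec):
    """Convert the sub-second part of an xsec count to whole microseconds."""
    return ((xsec & (XSEC_PER_SEC - 1)) * 1_000_000) >> 20

# Successive updates advance stamp_xsec by one of two values.
increments = sorted({stamp_xsec(n + 1) - stamp_xsec(n) for n in range(100)})
print(increments)                        # [10485, 10486]

# Assumed scenario: the data for tick 1 is published 0.3 xsec late.
late = Fraction(3, 10)
before = user_xsec(TICK_XSEC + late, tick=0)                     # still using old data
after = user_xsec(TICK_XSEC + late + Fraction(1, 100), tick=1)   # just after the update
print(before, after)                                  # 10486 10485
print(xsec_to_usec(before), xsec_to_usec(after))      # 10000 9999
```

In this model the later call returns a microsecond value one lower than the earlier call, which matches the -1000 nsec step in the test log.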
There are two ways to fix this. The first is very simple but slightly less than ideal -- it involves reducing the 'tb_to_xs' value that userspace uses to convert timebase counts to xsecs by a small amount (0.005% for CONFIG_HZ=100) so that the time computed by userspace will run very slightly slower during the tick and end up about 0.5 microseconds slow by the end of the tick. That's enough to avoid time going backwards due to integer truncation in computing stamp_xsec. Its disadvantage is that the time computed by gettimeofday() is very slightly inaccurate and may be behind what the kernel computes and uses internally by up to around 500 nanoseconds.
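The effect of the first alternative can be sketched in the same simplified model. Again this is not the patch itself: `scale` stands in for the reduced tb_to_xs, the helper names are hypothetical, and the 0.3 xsec publication delay is an assumed figure. The result only shows that the slowdown hides the truncation for that delay.

```python
from fractions import Fraction
from math import floor

TICK_XSEC = Fraction(1 << 20, 100)     # 10485.76 xsecs per tick at CONFIG_HZ=100
SLOWDOWN = Fraction(5, 100_000)        # 0.005%, the figure quoted above

def user_xsec(now, tick, scale):
    """Userspace time at `now` (xsecs) from the data of update `tick`, with elapsed time scaled."""
    elapsed = (now - tick * TICK_XSEC) * scale
    return floor(tick * TICK_XSEC) + floor(elapsed)

def worst_step(scale, late):
    """Smallest (new data - old data) result over ticks 0..99, read `late` xsecs past each boundary."""
    steps = []
    for n in range(100):
        now = (n + 1) * TICK_XSEC + late
        steps.append(user_xsec(now, n + 1, scale) - user_xsec(now, n, scale))
    return min(steps)

late = Fraction(3, 10)                 # assumed delay, in xsecs
unscaled = worst_step(Fraction(1), late)
scaled = worst_step(1 - SLOWDOWN, late)
lag_ns = TICK_XSEC * SLOWDOWN * 1_000_000_000 / (1 << 20)

print(unscaled)    # -1: time steps back by one xsec
print(scaled)      # 0: no backwards step with the slowdown applied
print(lag_ns)      # 500: nanoseconds behind by the end of a tick
```

The 500 ns figure is the cost noted above: by the end of each tick, userspace time trails the kernel's internal value by that amount.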
The second way involves more code change and needs a new field in the vdso data page structure, but gives a more accurate result. It involves changes to the code in the VDSO to use a different method, not involving stamp_xsec, to convert the timebase to the time of day. We can't actually remove the stamp_xsec field since the structure is exposed in /proc/ppc64/systemcfg and is part of the user/kernel ABI now.
The patch attached here implements the first alternative. I am still working on a patch for the second alternative, which I will send upstream, but it will be a more invasive patch.
Created attachment 424806
Proper fix for time going backwards
------- Comment on attachment From email@example.com 2010-06-17 09:01 EDT-------
This is the alternative patch, which fixes the problem properly by modifying the VDSO code to not use the stamp_xsec field and instead use a new field in the vdso_data that stores the nanoseconds as a 0.32 binary fraction. This is the patch which I will be sending upstream shortly to fix the problem in the mainline Linux kernel.
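For readers unfamiliar with the notation, a 0.32 binary fraction holds the sub-second part of the time as a 32-bit integer counting units of 1/2^32 second. The sketch below only illustrates that representation; the helper names are hypothetical and none of this is taken from the patch.

```python
FRAC_BITS = 32
NSEC_PER_SEC = 1_000_000_000

def nsec_to_frac(nsec):
    """Encode 0 <= nsec < 1e9 as a 0.32 binary fraction of a second (truncating)."""
    return (nsec << FRAC_BITS) // NSEC_PER_SEC

def frac_to_usec(frac):
    """Whole microseconds represented by a 0.32 fraction."""
    return (frac * 1_000_000) >> FRAC_BITS

def frac_to_nsec(frac):
    """Whole nanoseconds represented by a 0.32 fraction."""
    return (frac * NSEC_PER_SEC) >> FRAC_BITS

half = nsec_to_frac(500_000_000)
print(hex(half))             # 0x80000000, i.e. exactly half a second
print(frac_to_usec(half))    # 500000
print(frac_to_nsec(half))    # 500000000
```

One unit of this fraction is about 0.23 ns, against about 954 ns for one xsec, so truncating the published time to this granularity loses far less than a microsecond.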
The advantages of this patch are that it makes gettimeofday() and clock_gettime() slightly faster (gettimeofday() in a 64-bit process takes 32.2ns on a POWER7 with the patch compared to 37.4ns without) and it fixes the main problem without losing accuracy. The disadvantage of this patch compared to the other one is that it is larger and more invasive, so may present more risk, though I have checked and tested it thoroughly.
Note that you should not apply both patches; apply one or the other but not both.
One point in the patch is worth mentioning -- in testing I found instances where update_vsyscall() got called with xtime.tv_nsec = 1000000000 or 1000000001. That may indicate a bug in generic code.
posted to rh-kernel mailing list
------- Comment From firstname.lastname@example.org 2010-06-25 11:22 EDT-------
per rh planned for ss7
Patch(es) available on kernel-2.6.32-42.el6
Tested on ibm-js22-vios-01-lp1.rhts.eng.bos.redhat.com with kernel version 2.6.32-42.el6.ppc64. Ran /kernel/power-management/clock_gettime and /kernel/power-management/gettimeofday; time did not go backwards.
Setting it as verified.
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.