Bug 636255 - Discrepancy between gettimeofday and time
Summary: Discrepancy between gettimeofday and time
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 14
Hardware: i686
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Jiri Olsa
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-09-21 18:33 UTC by Stefan Ring
Modified: 2011-08-30 19:10 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 636241
Environment:
Last Closed: 2011-08-30 19:10:51 UTC
Type: ---
Embargoed:



Description Stefan Ring 2010-09-21 18:33:36 UTC
+++ This bug was initially created as a clone of Bug #636241 +++

Description of problem:

Interestingly, the problem described in the parent bug still persists on i686, even in F14. Presumably, it has not been fixed in the mainline kernel either and should be fixed upstream.


Version-Release number of selected component (if applicable):
2.6.35.4-28.fc14

Comment 1 Stefan Ring 2010-09-21 18:58:30 UTC
The test program does not fail on x86_64 since at least F10.

Comment 2 Stanislaw Gruszka 2010-09-23 19:37:11 UTC
I think this should be fixed by current upstream commit:

commit 8c73626ab28527b7eb7f3061c027fbfe530c488c
Author: John Stultz <johnstul.com>
Date:   Tue Jul 13 17:56:18 2010 -0700

    x86: Fix vtime/file timestamp inconsistencies

Comment 3 Stefan Ring 2010-09-23 20:19:38 UTC
True, this bug is then likely a duplicate of Bug #244697.

Comment 4 Stefan Ring 2010-09-23 20:36:28 UTC
No, actually not. Shouldn't this commit already be in the F14 kernel? As mentioned, the problem still exists in F14.

Comment 5 Stefan Ring 2010-09-23 21:06:07 UTC
(In reply to comment #4)
> Shouldn't this commit already be in the F14 kernel?

No, apparently not. It has only been merged for 2.6.36, not 2.6.35. Sorry for the noise.

Comment 6 Chuck Ebbert 2010-09-24 12:55:07 UTC
But that fix is for x86_64 only.

Comment 7 Stefan Ring 2010-09-24 13:16:13 UTC
True.

But there are a lot of related changes in the vicinity of this patch (branch timers-timekeeping-for-linus); I guess I'll just try it again with a 2.6.36 kernel.

Comment 8 Stefan Ring 2010-09-27 09:39:14 UTC
Still happening with 2.6.36-rc5 x86 (32 bit).

Comment 9 Stanislaw Gruszka 2010-09-29 08:07:34 UTC
Hi Jiri and John, could you please look at this bug report?

Comment 10 Jiri Olsa 2010-09-29 12:05:31 UTC
hi,

AFAIK the issue is not solved for the creat vs. time race.
The fix John did was for time vs. vtime race.

x86: Fix vtime/file timestamp inconsistencies
  commit 8c73626ab28527b7eb7f3061c027fbfe530c488c
  Author: John Stultz <johnstul.com>

I'm pasting here John's answer about the creat vs. time inconsistency:
(the whole thread is at http://lkml.org/lkml/2010/7/7/172)

wbr,
jirka

---
Right, so as long as the file timestamps are tick-granular (like other
tick-granular interfaces: current_kernel_time(), time(),
CLOCK_REALTIME_COARSE) you will have the possibility of inconsistencies
against the clocksource resolution interfaces (gettimeofday(),
CLOCK_REALTIME, etc).

But that is to be expected as a constraint of the granularity. So I
don't really see this as an issue.

Folks may want to increase the granularity of filesystem timestamps, but
that will come at the possibly very expensive cost of reading the
clocksource hardware (which can have different access latencies between
architectures and even machines of the same arch). I suspect it's not
worth it.

The concerning issue that you pointed out is the inconsistency that
could be seen between vsyscall time() and time() (or filesystem
timestamps). That is a problem, and my patch should resolve that one.
---

Comment 11 Stanislaw Gruszka 2010-09-29 12:28:05 UTC
(In reply to comment #10)
> AFAIK the issue is not solved for the creat vs. time race.
> The fix John did was for time vs. vtime race.

The problem here is the gettimeofday vs. time inconsistency on 32-bit. The problem is not solved in the current upstream 2.6.36-rc5 kernel.

Comment 12 john stultz 2010-09-29 18:41:36 UTC
So, I'm not sure what's being described is a bug.

gettimeofday() is a microsecond-resolution interface that is clocksource granular, so it will provide as smooth a flow of time as the underlying clocksource can provide (usually sub-usec to sub-nsec granularity).

time() is a second interface, which is updated with tick (1/HZ) granularity.

So if a tick lands at 10.999, time() will return 10 until the next tick, whereas gettimeofday() will show 10.999, then 11.000, then 11.001, and so on.

Because of this, interleaving time() and gettimeofday() calls is not recommended.
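The 10.999 example above can be illustrated with a small toy model. This is only a sketch and reads no real clock: time is kept in integer milliseconds, and the 4 ms tick period (HZ=250) and all function names are assumptions made for illustration.

```python
# Toy model only: integer milliseconds, no real clocks are read.
TICK_MS = 4  # assumed tick period (HZ=250); the real value depends on the kernel config


def fine_seconds(now_ms):
    """Clocksource-granular reading: seconds part of the current instant."""
    return now_ms // 1000


def coarse_seconds(now_ms, first_tick_ms):
    """Tick-granular reading: seconds part of the most recent tick at or before now."""
    ticks_elapsed = (now_ms - first_tick_ms) // TICK_MS
    return (first_tick_ms + ticks_elapsed * TICK_MS) // 1000


def simulate(first_tick_ms, start_ms, end_ms):
    """Return (now_ms, fine, coarse) for every millisecond in [start_ms, end_ms]."""
    return [
        (now, fine_seconds(now), coarse_seconds(now, first_tick_ms))
        for now in range(start_ms, end_ms + 1)
    ]


if __name__ == "__main__":
    # A tick lands at 10.999 s; with a 4 ms period the next one arrives at 11.003 s.
    for now, fine, coarse in simulate(10999, 10999, 11003):
        print("t=%.3f  fine=%d  coarse=%d" % (now / 1000.0, fine, coarse))
```

In this model the fine reading moves to 11 at t=11.000, while the coarse reading stays at 10 until the tick at t=11.003, which is the window in which interleaved calls appear to go backwards.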

This is historically how time() and gettimeofday() have functioned, with the exception of x86_64, where for a while the vsyscall vtime() method actually just returned the seconds portion of vgettimeofday(). This was a bug, and is corrected by the recent commit mentioned above.

Comment 13 Stefan Ring 2010-09-30 09:01:41 UTC
You may be right. I still think that it's very unintuitive behavior. Shouldn't the spec say that, when two functions return the same information ("seconds since the epoch"), they are allowed to return two completely unrelated pieces of data?

And it's still not clear why the test program fails (i.e., prints "time going backwards") on x86, but not on x86_64.

Comment 14 john stultz 2010-09-30 18:42:04 UTC
Stefan: The test program likely does not fail on x86_64 due to the vtime bug in x86_64 (see the quote in comment #10 above for details). In that case vtime calls vgettimeofday and returns the seconds portion, rather than returning the time at the last tick.

A more comparable test would be to use clock_gettime(CLOCK_REALTIME_COARSE,...) with time(). In that case, the seconds field should be identical, since they are both updated each tick.
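A rough sketch of that comparison, for Linux only. Several things here are assumptions rather than anything stated in this report: the numeric clock id 5 is the Linux value of CLOCK_REALTIME_COARSE (Python's time module has no named constant for it), the C library's time() is reached through ctypes, and the function names are made up for illustration.

```python
import ctypes
import time

# Assumption: Linux value of CLOCK_REALTIME_COARSE from <linux/time.h>.
# Python's time module does not export a named constant for it.
CLOCK_REALTIME_COARSE = 5

# Call the C library's time() directly; Python's time.time() does not use it.
_libc = ctypes.CDLL(None)
_libc.time.argtypes = [ctypes.c_void_p]
_libc.time.restype = ctypes.c_long  # time_t is a long on typical Linux/glibc builds


def coarse_seconds():
    """Whole seconds from the tick-granular realtime clock."""
    return time.clock_gettime_ns(CLOCK_REALTIME_COARSE) // 1000000000


def sample_pair():
    """Return (coarse_before, time_sec, coarse_after) in whole seconds."""
    before = coarse_seconds()
    t = _libc.time(None)
    after = coarse_seconds()
    return before, t, after


def count_mismatches(samples=100000):
    """Count samples where time() differs from an unchanged coarse second."""
    mismatches = 0
    for _ in range(samples):
        before, t, after = sample_pair()
        # Skip samples in which the coarse clock itself crossed a second boundary.
        if before == after and t != before:
            mismatches += 1
    return mismatches
```

If both interfaces are updated each tick as described, count_mismatches() would be expected to stay at zero; a non-zero count would suggest that time() is derived from a different source on that system.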

Comment 15 Stanislaw Gruszka 2010-10-11 16:34:19 UTC
Well, neither the POSIX spec nor the manuals state that time() and gettimeofday() should not be interleaved (although POSIX obsoletes gettimeofday in favor of clock_gettime). So I think closing this issue as NOTABUG is a bit wrong.

If anyone agrees, I'm going to close the bug as CANTFIX, since trying to fix the problem may cause other issues, and avoiding the problem in user space programs is easy.

Comment 16 Stefan Ring 2010-10-15 06:12:43 UTC
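import time
import math

while True:
    # Sleep until time.time() reports that the next whole second has begun.
    nextsecond = math.floor(time.time()) + 1
    d = nextsecond - time.time()
    while d > 0:
        time.sleep(d)
        d = nextsecond - time.time()
    # strftime() without a time argument reads the clock on its own.
    a = time.strftime("%H:%M:%S")
    b = time.strftime("%H:%M:%S", time.localtime(nextsecond))
    print(a, b)
    # a < b means the implicit clock read is still in the previous second.
    if a < b:
        break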

Comment 17 john stultz 2010-10-15 18:49:42 UTC
import time
import math

while True:
    nextsecond = math.floor(time.time()) + 1
    d = nextsecond - time.time()
    while d > 0:
        lastd = d  # remember the last sleep interval for the output below
        time.sleep(d)
        d = nextsecond - time.time()
    # Both timestamps are now derived from time.time() readings.
    a = time.strftime("%H:%M:%S", time.localtime(time.time()))
    b = time.strftime("%H:%M:%S", time.localtime(nextsecond))
    print(a, b, lastd)
    if a < b:
        break

Doesn't hit the issue. But yes, in the example in comment #16, Python ends up calling gettimeofday() for everything except the strftime() call made without a time argument, where it calls time() instead; this causes the inconsistent behavior.

Comment 18 Stefan Ring 2010-10-15 20:32:02 UTC
(In reply to comment #17)
> Doesn't hit the issue.

Not as often as I had hoped, but it does. I tried it before posting here this morning. Apparently only with a loaded CPU, though. As soon as I ran the test program from the parent case (bug #636241), the if condition fired and the program stopped almost every time.

Comment 19 Stefan Ring 2010-10-16 16:18:34 UTC
Actually I misread your previous comment because I had assumed that the Python code was just quoting mine, which is not the case. Sorry about that. It's true that your version will not have this problem, because only the one-argument time.strftime() will call time(), while time.time() uses gettimeofday().
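One way to sidestep the mismatch in user space, sketched under the assumption that every timestamp should come from time.time() alone: take the reading explicitly and pass it to strftime(), so that no call falls back to an implicit clock read. The helper names below are hypothetical.

```python
import time


def format_reading(ts, fmt="%H:%M:%S"):
    """Format an explicit timestamp instead of letting strftime() read the clock."""
    return time.strftime(fmt, time.localtime(ts))


def wait_for_next_second():
    """Sleep until time.time() crosses the next whole second.

    Returns (target, reading), where reading is the time.time() value
    that was observed at or after the target second.
    """
    target = int(time.time()) + 1
    now = time.time()
    while now < target:
        time.sleep(target - now)
        now = time.time()
    return target, now


if __name__ == "__main__":
    target, now = wait_for_next_second()
    # Both strings come from the same clock, so the first cannot lag the second.
    print(format_reading(now), format_reading(target))
```

Because the wait loop and the formatted output use the same clock, the comparison no longer depends on how time() is updated relative to gettimeofday().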

Comment 20 Josh Boyer 2011-08-30 19:10:51 UTC
I'm closing this as CANTFIX.  If there is some other appropriate resolution for the issue (an upstream fix, or NOTABUG) please let me know.

