Bug 816752

Summary: systemd v28 changes indirectly break date and ntpdate
Product: [Fedora] Fedora
Reporter: Martin Langhoff <martin>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED DUPLICATE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 19
CC: bugzilla, ddumas, dennis, gabriel, gansalmon, gholms, itamar, johannbg, jonathan, jpokorny, kernel-maint, kzak, laurent.rineau__fedora, lnykryn, lpoetter, madhu.chinakonda, metherid, mlichvar, mmaslano, mschmidt, msekleta, notting, nphilipp, pbrobinson, pertusus, phoenixV, plautrba, praiskup, prarit, rjones, sgallagh, stephent98, systemd-maint, tmraz, tshinnic, vpavlin
Target Milestone: ---
Keywords: Regression, Reopened
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Doc Type: Bug Fix
Last Closed: 2013-10-08 13:46:27 EDT Type: Bug

Description Martin Langhoff 2012-04-26 17:25:08 EDT
Description of problem:

The announcement of systemd v28 states

"At shutdown we no longer invoke "hwclock --systohc", i.e. do not write
the system clock back to the RTC. Why? In general there's not really a
reason to assume that the system clock was anymore correct than the RTC
so it's probably a good idea to leave the RTC untouched."

The GNOME control panel has been fixed to use systemd's services to set the time, but the ntpdate and date utilities (as of current Fedora 17 at least) do not use systemd's services, and they do not set the RTC in any way.

Many sysadmins and programs out there will be caught in this.

date, ntpdate and the halt scripts in all distros have conspired
for the longest of times to hide the difference between "OS time" and
"RTC clock time". "Setting the system time" means for most users
"setting the machine time", not just the OS/kernel time.

Version-Release number of selected component (if applicable):

Tested with systemd v44, date from coreutils-8.15-6

How reproducible:

100%

Steps to Reproduce:
1. Use date -s to set the time.
2. Reboot.
3. Check the time with date after boot.
  
Actual results:

Set time does not persist.

Expected results:

Set time persists.
Comment 1 Lennart Poettering 2012-09-14 10:24:50 EDT
Well, even before this change, ntpdate's changes would be lost if the machine was forcibly turned off. Really, these date changes should be synced down as they happen, not at some possibly really late point in time that might be too late or never happens.

Reassigning to ntpdate.

ntpdate maintainers! Could you please teach ntpdate to sync down the system clock changes to the RTC? Usually an "hwclock -w" should suffice, after ntpdate, or the equivalent ioctl in ntpdate itself.
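
The rule proposed here (set the system clock, then immediately push it to the RTC) could be sketched as a small wrapper. This is only an illustration in Python, not how ntpdate is implemented: the function name set_clock_and_sync_rtc and the injectable run parameter are hypothetical, and the only external command assumed is hwclock --systohc (the long form of hwclock -w). Running it for real requires root.

```python
import subprocess
from typing import Callable, Sequence


def set_clock_and_sync_rtc(
    set_cmd: Sequence[str],
    run: Callable[[Sequence[str]], int] = subprocess.call,
) -> bool:
    """Run a clock-setting command and, only if it succeeds, copy the
    system time to the RTC.

    `run` is injectable so the logic can be exercised without root or
    real hardware; by default it executes the commands.
    """
    if run(list(set_cmd)) != 0:
        # The system clock was not set, so leave the RTC untouched.
        return False
    # Write the (now corrected) system time down to the hardware clock.
    return run(["hwclock", "--systohc"]) == 0
```

A caller would pass the command it already uses to set the time, for example an ntpdate invocation with its usual server argument; the RTC write then happens at the moment of the change rather than at shutdown.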
Comment 2 Miroslav Lichvar 2012-09-17 09:17:05 EDT
There is an option called SYNC_HWCLOCK in /etc/sysconfig/ntpdate which can be enabled to make the ntpdate service call hwclock after successful NTP sync.

I think it can be enabled by default.
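
As a rough illustration of how such a switch is consulted, the Python sketch below reads a shell-style sysconfig file and reports whether SYNC_HWCLOCK is set to yes. The helper name sysconfig_flag, the sample file contents, and the assumption that the value is compared against the literal string "yes" are illustrative and not taken from the ntpdate package.

```python
import shlex


def sysconfig_flag(text: str, name: str, default: bool = False) -> bool:
    """Return True if a NAME=value line in `text` sets `name` to "yes".

    Handles blank lines, full-line comments, quoted values and trailing
    comments; the last assignment wins, as it would in a sourced file.
    """
    value = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        if key.strip() == name:
            parts = shlex.split(raw, comments=True)
            value = parts[0] if parts else ""
    if value is None:
        return default
    return value == "yes"


# Assumed sample contents, for illustration only.
sample = 'OPTIONS="-p 2"\nRETRIES=2\nSYNC_HWCLOCK=no\n'
print(sysconfig_flag(sample, "SYNC_HWCLOCK"))
```

Enabling the option by default would then amount to shipping the file with SYNC_HWCLOCK=yes.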
Comment 3 Lennart Poettering 2012-09-17 12:59:25 EDT
Sounds good to me! Please make that change!
Comment 4 Miroslav Lichvar 2012-09-18 10:30:45 EDT
Just an idea. It seems there are way too many programs which can set the system clock. What if it was timedated's responsibility to keep the hw clock correct? It would occasionally check the offset between CLOCK_REALTIME and CLOCK_MONOTONIC and if there was a larger change, it would sync the hw clock. It could also sync the hw clock on DST change if it's kept in local time. Perhaps later it could monitor the hw clock directly and include some of the hwclock --adjtime functionality and we could finally drop the horrible 11-minute kernel sync hack.

Lennart, what do you think?
Comment 5 Kay Sievers 2012-09-18 10:44:29 EDT
timedated is a D-bus activated service, that acts on behalf of a call-in from
user software, not so much meant as an active time-keeping mechanism.

How is the delta between CLOCK_REALTIME and CLOCK_MONOTONIC relevant to the
RTC's idea of the clock? I'm convinced, if there is no trustable time source,
we should never touch the RTC.

You mean the hwclock --adjtime time drift hackery? Why would that be useful
in systemd code?

What's wrong with the 11 minute mode? It's unfortunate, that it only adjusts
minutes, and never the full time, but so far, I don't see anything really
wrong with that logic.
Comment 6 Miroslav Lichvar 2012-09-18 11:09:55 EDT
I meant to watch the REALTIME-MONOTONIC offset to detect when the system clock is set. However, it seems there is a much better way via the TFD_TIMER_CANCEL_ON_SET timer flag.
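
The offset-watching idea can be sketched in a few lines of Python. CLOCK_MONOTONIC is not affected by clock steps, so a sudden change in the difference between the two clocks suggests that CLOCK_REALTIME was set. The function names and the 0.5 second threshold are assumptions for illustration; the two clock reads are not atomic, which is one reason an event such as TFD_TIMER_CANCEL_ON_SET is the cleaner mechanism.

```python
import time


def clock_offset() -> float:
    """Current difference between CLOCK_REALTIME and CLOCK_MONOTONIC, in seconds."""
    realtime = time.clock_gettime(time.CLOCK_REALTIME)
    monotonic = time.clock_gettime(time.CLOCK_MONOTONIC)
    return realtime - monotonic


def was_stepped(previous_offset: float, current_offset: float,
                threshold: float = 0.5) -> bool:
    """Return True if the offset moved by more than `threshold` seconds.

    Slewing by NTP changes the offset slowly; a step (date -s, ntpdate)
    changes it at once, so a coarse threshold separates the two.
    """
    return abs(current_offset - previous_offset) > threshold
```

A polling loop would store clock_offset() and call was_stepped() on the next sample; only when it returns True would the hw clock be written.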

The hwclock --adjtime feature is useful when the machine is powered off for longer periods of time. As timedated seems to be setting the hw clock in some cases, I don't see a reason why it couldn't compensate for its drift on boot.

The kernel 11-minute mode makes it impossible to estimate the RTC drift and it makes the adjtimex call slower. I believe this should be done in user-space, and it seems to me it would be better to do it in timedated rather than in all the NTP and PTP clients that can run on the machine.

It's just an idea, but I think it would allow us to finally have good RTC timekeeping, enabled by default.
Comment 7 Kay Sievers 2012-09-18 12:07:34 EDT
*** Bug 759071 has been marked as a duplicate of this bug. ***
Comment 8 Kay Sievers 2012-09-18 12:12:54 EDT
Comparing two timestamps is not an atomic operation and it easily gets messy.
Sure, it would not be a real problem for a clock with 1 sec granularity, but
TFD_TIMER_CANCEL_ON_SET is at least a proper event, not some polling-based
non-atomic operation. :)

Systemd does not really want to get into the adjtime guessing business,
we want to stay entirely away from that. Modern hosts have no significant
drifts that would really benefit from stateful userspace guessing anyway,
and the on-disk state + expectations about the validity of that data
usually create more problems today than they solve.

The kernel's 11 minute mode sounds like a pretty reasonable facility. I
doubt we would introduce something like that as a new feature today, but
I also don't think there is anything really wrong with it.

timedated is really more a library from its model of functioning than it
is an autonomous daemon. It is implemented as an on-demand activated
daemon to allow privileged operations on behalf of authenticated ordinary
users. It will terminate if it is not needed any more. So far, we like to
keep its on-demand model rather than making it a general timekeeping,
management facility.

The current idea of what we would like to see instead is that the "ntp daemon"
fully syncs the rtc *once* after startup right at the point where it knows
that the system time is correct up to the granularity of the rtc.

In the current model, the rule would be: Any tool that sets the system clock
should also do the sync to the rtc. In the ntp case, ntp should do that once
to apply larger time differences (possibly hours, days, years), and once
only, because all later syncs are covered by the kernel itself (covers deltas
of up to 15/30 min only).
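
To see why one full sync is still needed, consider a deliberately simplified model of an update that rewrites only the minute and second fields. It is written in Python for illustration and is not the kernel's code: the function name mmss_update is hypothetical, and the model ignores what happens when the two clocks straddle an hour boundary.

```python
from datetime import datetime, timedelta


def mmss_update(rtc: datetime, system: datetime) -> datetime:
    """Model an RTC write that copies only minutes and seconds from the
    system clock and leaves hour, day, month and year as they were."""
    return rtc.replace(minute=system.minute, second=system.second,
                       microsecond=0)


system = datetime(2012, 9, 18, 12, 5, 0)

# RTC less than two minutes behind: the partial update fixes it.
slightly_off = datetime(2012, 9, 18, 12, 3, 10)
print(system - mmss_update(slightly_off, system))

# RTC three hours behind: minutes and seconds match afterwards,
# but the three-hour error stays in place.
far_off = datetime(2012, 9, 18, 9, 3, 10)
print(system - mmss_update(far_off, system))
```

In this model the large error survives every periodic update, which is the gap a one-time full sync by the time-setting tool is meant to close.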

It should be reasonably easy to implement in "ntp", would not deviate too
much from the current model of operation, and would not involve any other
running daemon. It would basically just make the kernel's 11 minute
mode reliable as one would expect.

What do you think?
Comment 9 Miroslav Lichvar 2012-09-18 13:23:46 EDT
I don't think modern hosts have any better RTCs than hosts 20 years ago. If your laptop's system clock is usually synchronized to NTP, why should it be wrong by tens of seconds when it was off/suspended for a couple of days and now has no internet connection to fix it via NTP?

Please note that the kernel doesn't really synchronize the RTC as its frequency can't be adjusted, it just periodically sets it to the system time, which minimizes the error. Also, it does that only when an NTP daemon is running.

Instead of adding the code to sync the RTC to every program which can set the system clock I think it would be better to add it to timedated (or something else outside systemd) and disable CONFIG_GENERIC_CMOS_UPDATE.

BTW, why exactly was hwclock --systohc run on shutdown removed?
Comment 10 Kay Sievers 2012-09-18 19:39:22 EDT
I haven't seen any significant drift that is worth on-disk state files to
make guesses, on any system in the last couple of years.

The kernel always syncs minutes and seconds (never the hours, days) every
11 minutes as soon as an "external time source" is available, like when
running "ntp" on the machine.

There are not many programs accessing the RTC. If no user interaction is done
nothing will do it. If "ntp" is running only the kernel will do it, never
systemd.

The rule is: the one who changes the system time (like "ntp") is obligated
to sync it to the hardware too. A running ntpd, chronyd should do that once
it knows the time from the network.

Systemd must not call hwclock, systemd has no business in changing the
hardware clock without knowing the actual time it has to store. Blindly
running hwclock at shutdown is a very broken concept without a valid time
source (like "ntp"). 

timedated will do it when a user requests to set the time/timezone manually,
and in no other cases, it's not timedated's task to fiddle around with the
hardware without having any idea about the actual time.
We could discuss implementing an ntp client in timedated, but unless that
happens, timedated should stay out of the business of fiddling with the
hardware.

Only explicit user requests are allowed to touch the hardware RTC,
or something like "ntp" which is a reliable timesource.

Nothing without a reliable timesource (like systemd) should ever fiddle with
the hardware clock, we just don't know the time and we make stuff only worse
not better with guessing and assumptions like this.
Comment 11 Miroslav Lichvar 2012-09-19 05:00:55 EDT
If the RTC was an accurate clock, I'd agree with you: set it only when we know the time is correct and return to the accurate time on every boot. Unfortunately it still has the super cheap quartz crystal which can drift by 10 ppm or more and is highly sensitive to temperature. It's often worse than an uncorrected system clock. If you are running an NTP client, check the system log to see how large the first system clock correction after boot is.
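
For a sense of scale, the Python snippet below converts a frequency error in ppm into accumulated seconds. The 10 ppm figure is the one quoted above; the function name and the 30-day interval are only examples.

```python
SECONDS_PER_DAY = 86_400


def drift_seconds(ppm: float, days: float) -> float:
    """Accumulated clock error, in seconds, for a constant frequency
    error of `ppm` parts per million over `days` days."""
    return ppm * 1e-6 * SECONDS_PER_DAY * days


# 10 ppm is a little under one second per day, and roughly
# 26 seconds over a month without correction.
print(round(drift_seconds(10, 1), 3))
print(round(drift_seconds(10, 30), 2))
```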

I believe the purpose of the RTC is to backup the system time, not to keep an accurate time (even if it was possible). Its frequency can't be set, so we can either set its time often or on shutdown/suspend and hope the system won't be powered down too long for the RTC to drift away too much, or we can try to estimate its drift and correct for it on every boot. Or we can do a combination of both.

Instead of patching every program which can set the system clock (date, rdate, sntp, all NTP clients, all PTP clients, the desktop time utilities, etc) I think it's better to maintain the system clock backup by the system itself.
Comment 12 Lennart Poettering 2012-09-19 07:36:33 EDT
The adjtime offline drifting stuff is really something we should leave behind. To be useful it needs an accurate network clock anyway (so that it can actually estimate the RTC deviation from NTP), and if that's the case then well, maybe people should just use NTP.

I mean, let's not kid ourselves, the offline adjtime guessing stuff is a toy. If people want accurate clocks they should use NTP.

And anyway, who says that RTCs actually are wrong by the same constant rate? Presumably they change speed depending on outside conditions such as heat, age, and the battery voltage. These parameters tend to change especially if machines are offline for a longer time, which is precisely where the algorithm in question is supposed to help.
Comment 13 Lennart Poettering 2012-09-19 07:42:37 EDT
And regarding the syncing of the clock: it should be the duty of the code which adjusts the system clock to also make that reflected in RTC. It's completely wrong doing that at system shutdown only, at a point in time that is much much later, and doesn't even exist often, when the machine is powered off without a clean shutdown.

It's a very simple rule now: if you adjust the system clock, you also need to adjust the RTC. 

timedated can adjust the system clock, and it does sync it down to the RTC.

The kernel's NTP time adjustment also does this (11min mode), but of course is used only for smaller adjustments.

The only thing missing here is that chrony/ntp/ntpdate, when they make big jumps in time, also sync this down into the RTC.

Anything else is just broken!

I mean, you have two text editors: one does auto-saving of what you type. The other doesn't do that but saves the file only when the program exits. Which one is the better design? Obviously the one that does the auto-saving. It makes things more robust and is much more likely to recover your data. -- Now, for the RTC case it's exactly the same story: when the change is made it should be synced down to the RTC, it should not be delayed until a point is reached where it already might be too late...
Comment 14 Miroslav Lichvar 2012-09-19 08:52:08 EDT
Yes, the RTC drift changes, but usually less than its absolute error. By removing a long-term average from the drift you can easily get an improvement by a factor of 10. When you boot your computer without internet access, it really is useful. Also, you don't need an NTP server to estimate the drift: if you set the clock manually once a month, you can get a very good estimate of the drift and it will include the time when the machine was off. That's why it's better than measuring the RTC drift only when synchronized via NTP (e.g. the rtcfile option in chronyd).
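
The estimate described here is simple arithmetic, sketched below in Python. This is not hwclock's implementation: the function names are hypothetical, the sign convention (positive error means the RTC runs ahead) is an assumption, and the 26 seconds over 30 days is a made-up measurement.

```python
SECONDS_PER_DAY = 86_400


def estimate_drift_rate(error_seconds: float, elapsed_seconds: float) -> float:
    """Average RTC error accumulated per second between two clock
    settings (positive means the RTC runs fast)."""
    return error_seconds / elapsed_seconds


def corrected_rtc(rtc_reading: float, seconds_since_set: float,
                  drift_rate: float) -> float:
    """Remove the expected accumulated drift from a raw RTC reading."""
    return rtc_reading - drift_rate * seconds_since_set


# Measured once: RTC was 26 s ahead, 30 days after it was last set.
rate = estimate_drift_rate(26.0, 30 * SECONDS_PER_DAY)

# Later boot, 10 days after the RTC was set again. If the drift is
# unchanged, the raw reading is ahead by rate * 10 days.
true_time = 1_000_000.0
raw = true_time + rate * 10 * SECONDS_PER_DAY
print(abs(corrected_rtc(raw, 10 * SECONDS_PER_DAY, rate) - true_time) < 1e-6)
```

The correction only removes the long-term average; any change in drift since the measurement remains as residual error.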

Please understand that the moment you set the RTC, it will start drifting away. When you get to the shutdown, the RTC is already off. That's something very different from saving text files to disk.

Again, the RTC is just a backup of the system clock and not a particularly good one.

The amount of trouble the change has caused so far speaks for itself and it doesn't seem to improve anything. I'm reassigning this back to systemd and asking to call hwclock --systohc on shutdown again. It's far from a perfect solution, but I think it's better than what you are proposing here.
Comment 15 Stephen Gallagher 2012-09-19 09:42:58 EDT
Lennart, Kay: May I suggest a compromise solution? Let's enable the 'hwclock --systohc' on shutdown again for now (and for the lifetime of Fedora 17 and 18). Then let's create a Fedora 19 Feature Page for fixing up the individual time-management applications to manage the hardware time.

I agree that the shutdown solution is incomplete and somewhat of a hack. However, it has more-or-less worked in the past, and it's better to have a partial solution in place for now while we work towards a more complete solution. Right now, this behavior appears to users as a regression in functionality.
Comment 16 Martin Langhoff 2012-09-19 09:45:06 EDT
I will also note that, in the absence of NTP, system clocks are getting less reliable as systems suspend or sleep more aggressively. We have seen this on OLPC's ARM SoCs. So without NTP your system clock may see significant drift. Adjusting the RTC at shutdown may not be such a great idea. Guessing RTC drift based on system clock is even worse.

So TBH I am down on attempting to guess RTC drift unless you are on NTP (at which point, it is of very limited value).

If we do change all the tools that change system time to also change the rtc however, we need a way to indicate to those tools that this system uses this practice. Other OS builds that still want to monitor and adjust RTC drift will see their drift calculations borked.
Comment 17 Martin Langhoff 2012-09-19 09:50:05 EDT
My last paragraph is unreadable. What I propose is an explicit flag day in RTC mgmnt, so date, ntpdate and friends can check a configuration value somewhere (or a configure flag at build time?) that indicates "sync_hwclock_to_systemtime=yes_please".
Comment 18 Kay Sievers 2012-09-19 09:51:50 EDT
Nothing that has no idea about the actual time should automatically,
without a user request, ever change the RTC.

It's just madness, and we stopped following that broken model. Systemd
has no reference clock, so it can not touch the RTC.
Comment 19 Lennart Poettering 2012-09-19 09:56:52 EDT
I don't think there is anything to fix here in systemd. I am not interested in playing ping-pong with this bug. Closing.
Comment 20 Stephen Gallagher 2012-09-19 10:04:22 EDT
A change was made to systemd that regressed functionality of the system. This is absolutely a bug that needs to be addressed. Please consider restoring the old (incomplete) functionality for the short term.

Quite frankly, replacing "This doesn't work right in 100% of cases" with "This works right in 0% of cases" should be obviously wrong.
Comment 21 Martin Langhoff 2012-09-19 10:06:27 EDT
This is a wider system change. Without taking care of fixing ntpdate and date, or at least communicating with the relevant maintainers, the result is a broken system as per the reasonable expectations of sysadmins and users.

Especially sysadmins of headless systems.

I won't play ping pong, but please reopen or communicate with the relevant maintainers.
Comment 22 Kay Sievers 2012-09-19 10:09:01 EDT
Systemd can not touch the RTC, it has no idea what to store there.

Only NTP is a reliable time-source, so NTP should fix its limitations.

Even the kernel follows the logic not to touch the RTC when no proper
time source is in control. Systemd will not get into the way of that.

Closing this systemd bug again, it's not a systemd problem at all and
it will not become one. Thanks!
Comment 23 Miroslav Lichvar 2012-09-19 10:16:59 EDT
(In reply to comment #18)
> Nothing that has no idea about the actual time should automatically,
> without a user request, ever change the RTC.

Again, the RTC is just a backup of the system clock when the machine is powered off, you need to set it to the system clock, no matter how the system clock is wrong. I'm sure you don't want the system clock to be set after reboot to a time before the machine was rebooted.

> It's just madness, and we stopped following that broken model. Systemd
> has no reference clock, so it can not touch the RTC.

It has a reference clock, it's called CLOCK_REALTIME.

A bigger madness is trying to patch all programs which can call settimeofday, clock_settime or adjtimex.
Comment 24 Lennart Poettering 2012-09-19 10:19:18 EDT
If the  chrony/ntp/ntpdate maintainers are not interested in fixing this properly (i.e. sync the RTC down when they update the system clock), then they are welcome to ship a service that invokes hwclock --systohc at shutdown. 

It's ridiculous that the NTP maintainers believe it is a better idea to delay the syncing until shutdown. It's bad engineering, but then again they maintain NTP...

There's nothing to fix in systemd however, and I don't want to see any code like this in systemd.
Comment 25 Kay Sievers 2012-09-19 10:20:53 EDT
(In reply to comment #20)
> Quite frankly, replacing "This doesn't work right in 100% of cases" "This
> works right in 0% of cases" should be obviously wrong.

"This doesn't work right in 100% of cases" is just nonsense.

The current systems are all fine and properly synced as long as the RTC
is not farther away than 15/30 minutes. And this is a corner case. The
kernel by itself will take care of the usual syncing.

Touching the RTC without a reliable time source causes many problems, many
more than are solved by guessing and amateurish i-hope-i-do-the-right-thing
automatisms like an unconditional hwclock run.

The only missing piece is that NTP does the rtc sync of the hours/days if
the drift it detects is larger than the delta the kernel takes care of.

None of the above mentioned problems is reliably fixable in systemd, not sure
what's so hard to understand here. The problem is pretty clear I think.
Comment 26 Stephen Gallagher 2012-09-19 10:27:04 EDT
(In reply to comment #25)

> The current systems are all fine and properly synced as long as the RTC
> is not farther away than 15/30 minutes. And this is a corner case. The
> kernel by itself will take care of the usual syncing.
> 
> Touching the RTC without a reliable time source causes many problems, many
> more than are solved by guessing and amateurish i-hope-i-do-the-right-thing
> automatisms like an unconditional hwclock run.
> 
> The only missing piece is that NTP does the rtc sync of the hours/days if
> the drift it detects is larger than the delta the kernel takes care of.
> 
> None of the above mentioned problems is reliably fixable in systemd, not sure
> what's so hard to understand here. The problem is pretty clear I think.

I never said that it was *reliably* fixable in systemd. I said that while we know it's not a proper solution, from a user's perspective this changed a behavior that they relied on. I'm asking that we just turn back on this admittedly incorrect workaround so that we have something that works *some of the time* while we fix the more serious underlying problem.

And I agree, we need to address the problem at either the package level or possibly even at the glibc/kernel level to ensure that calls to settimeofday() set both clocks.

What we do *not* want to have is a computer that may boot up with its clock set to an earlier time than when it shut down, as this may cause serious issues with file access.
Comment 27 Lennart Poettering 2012-09-19 10:33:56 EDT
(In reply to comment #23)


> A bigger madness is trying to patch all programs which can call
> settimeofday, clock_settime or adjtimex.

adjtimex doesn't need that.

And the programs which invoke settimeofday/clock_settime are basically ntpdate/ntp/chrony/timedated. And timedated does sync things down to the RTC. Hence this bug was filed against NTP.
Comment 28 Miroslav Lichvar 2012-09-19 10:39:47 EDT
(In reply to comment #25)
> The current systems are all fine and properly synced as long as the RTC
> is not farther away than 15/30 minutes. And this is a corner case. The
> kernel by itself will take care of the usual syncing.

You forgot to mention the problem with DST changes when the RTC is kept in local time. Bad engineering is setting the RTC once and hoping it will not drift away too much from the system clock.

> Touching the RTC without a reliable time source causes many problems, many
> more than are solved by guessing and amateurish i-hope-i-do-the-right-thing
> automatisms like an unconditional hwclock run.

Care to name a few?

(In reply to comment #27)
> (In reply to comment #23)
> > A bigger madness is trying to patch all programs which can call
> > settimeofday, clock_settime or adjtimex.
> 
> adjtimex doesn't need that.

adjtimex(ADJ_SETOFFSET) steps the clock.

> And the programs which invoke settimeofday/clock_settime are basically
> ntpdate/ntp/chrony/timedated. And timedated does sync things down to the
> RTC. Hence this bug was filed against NTP.

There is also date, rdate, sntp and soon ptp4l and phc2sys from linuxptp and all other unpackaged NTP and PTP clients.

The whole idea of expecting that every application which sets the system clock knows about an RTC seems wrong.
Comment 29 Kay Sievers 2012-09-19 10:44:45 EDT
(In reply to comment #26)
> What we do *not* want to have is a computer that may boot up with its clock
> set to an earlier time than when it shut down, as this may cause serious
> issues with file access.

If we don't know the time, we just make things worse. There have been lots
of issues in the past, which we will not introduce again by playing silly
games with guessing and hoping it works.

There is no problem for the majority of boxes, because the time drift
between reboots is less than 30/15 minutes. The kernel takes care of the
syncing.

And because there is no pressing problem in the real world, we do not play
these amateurish games in systemd which people invented in the past.

As long as systemd does not ship its own NTP client, and as long as the kernel
"11 minute mode" does only sync the minutes and seconds, this bug is an NTP
issue, not a systemd one.
Comment 30 Miroslav Lichvar 2012-09-20 04:13:22 EDT
Perhaps the hwclock --systohc call could be in the initscripts package as a fedora specific service?
Comment 31 Kay Sievers 2012-09-20 07:49:30 EDT
(In reply to comment #30)
> Perhaps the hwclock --systohc call could be in the initscripts package as a
> fedora specific service?

You still need to make sure that this is never called on any Live-CD, rescue
system, stateless system, or a multi-boot environment with the RTC in local
time.

It breaks all sorts of things to just blindly touch the RTC, because the
booted system does not have an idea about the hardware and environment it is
booted on.
We removed that for a reason, and left all RTC setting to actions explicitly
triggered by the user, or to fixing only the minutes and seconds automatically
when the time is properly synced.
Comment 32 Miroslav Lichvar 2012-09-20 08:15:48 EDT
(In reply to comment #31)
> You still need to make sure that this is never called on any Live-CD, rescue
> system, stateless system, or a multi-boot environment with the RTC in local
> time.

How is that different from user setting the time manually and timedated writing to the RTC or ntpdate/ntpd setting the RTC after NTP sync? If there are multiple systems, they all have to agree on the UTC/LOCAL setting, there is no way around that. 

> It breaks all sorts of things to just blindly touch the RTC, because the
> booted system does not have an idea about the hardware and environment it is
> booted on.

Give us some examples, please.
Comment 33 Kay Sievers 2012-09-20 09:08:15 EDT
(In reply to comment #32)

> How is that different from user setting the time manually and timedated
> writing to the RTC or ntpdate/ntpd setting the RTC after NTP sync?

Explicit user requests are fine, trying-to-be-smart and messing up things
automatically are not.

> If there
> are multiple systems, they all have to agree on the UTC/LOCAL setting, there
> is no way around that. 

There is zero possibility for a Live-CD to "agree" on anything that is related
to the hardware or the installed system. Every boot of the Live-CD will skew
the clock by the timezone delta if it blindly updates the localtime RTC with
UTC data ...
Comment 34 Miroslav Lichvar 2012-09-20 09:17:59 EDT
(In reply to comment #33)
> Every boot of the Live-CD will skew
> the clock by the timezone delta if it blindly updates the localtime RTC with
> UTC data ...

How exactly? Unless the user or ntp sets the time, the system time will be off  by the timezone delta and hwclock --systohc will undo that error when writing back to RTC. If the user or ntp does set the time, the RTC will be shifted by that delta no matter which program writes to it (adjtimex on shutdown or from ntpdate, or timedated).
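
The two cases in this argument can be checked with a small Python model. Everything in it is illustrative: a two-hour zone offset, a one-hour session, and a live system that always treats the RTC as UTC, both when reading it at boot and when writing it back at shutdown.

```python
from datetime import datetime, timedelta
from typing import Optional

TZ_OFFSET = timedelta(hours=2)   # assumed: installed OS keeps the RTC in UTC+2
UPTIME = timedelta(hours=1)      # assumed length of the live session

true_utc_at_boot = datetime(2012, 9, 20, 12, 0, 0)
rtc_at_boot = true_utc_at_boot + TZ_OFFSET   # RTC holds local time


def live_session(rtc: datetime, uptime: timedelta,
                 stepped_to: Optional[datetime] = None) -> datetime:
    """Return the value written to the RTC at shutdown by a system that
    reads and writes the RTC as if it were UTC."""
    system = rtc + uptime            # boot from RTC, then run for `uptime`
    if stepped_to is not None:
        system = stepped_to          # user or NTP sets the true UTC time
    return system                    # written back unchanged


expected_local = true_utc_at_boot + UPTIME + TZ_OFFSET

# Nobody sets the time: the write-back undoes the read error.
print(live_session(rtc_at_boot, UPTIME) == expected_local)

# Clock stepped to true UTC: the RTC ends up shifted by the zone offset,
# whichever program performs the write.
stepped = live_session(rtc_at_boot, UPTIME, true_utc_at_boot + UPTIME)
print(expected_local - stepped == TZ_OFFSET)
```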
Comment 35 Martin Langhoff 2012-09-20 09:33:34 EDT
"It breaks all sorts of things to just blindly touch the RTC"

We have been living with that breakage for, say, 20 years. It clearly has not been a mainstream problem, or it would have been addressed long ago.

The removal of hwclock --systohc at shutdown also breaks things, not in corner cases but in their main usage, unless we go around userland fixing them all.
Comment 36 Kay Sievers 2012-09-20 09:36:48 EDT
With NTP, the kernel will only adjust the minutes/seconds, no harm done, all
config or timezone info, which is not available to the Live-CD, the rescue
system, does not matter. --> GOOD

If the user does not explicitly request anything, everything will be
fine. --> GOOD

If the user does set the time, the system does what the user has asked for,
the system itself will not screw up the hardware clock. --> GOOD

Fiddling with hardware without context or reliable time --> BAD.
Comment 37 Kay Sievers 2012-09-20 09:49:06 EDT
(In reply to comment #35)
> "It breaks all sorts of things to just blindly touch the RTC"
> 
> We have been living with that breakage for, say, 20 years. It clearly has
> not been a mainstream problem, or it would have been addressed long ago.

Maybe you did not get these problems reported, we surely did. And we do not
want to go back to that. We do not want to pretend we can solve a problem
that can by its very nature not be solved.

> The removal of hwclock --systohc at shutdown also breaks things, not in
> corner cases but in their main usage, unless we go around userland fixing
> them all.

Only corner cases are affected here. On the majority of boxes that care about
time, NTP runs, and the drift is less than 15/30 minutes. And with that, all
is synced just fine by the kernel.

Boxes that do not run NTP have no time source, so the RTC is not the "backup"
but the "authoritative" source. The running system must not fiddle with it,
because it has no idea what's right. Only toy software would try to be
smarter than it can be here.

Please stop making up artificial problems here, they do not really exist in the
field. Sure, RTCs which are off by more than 15/30 minutes and RTCs kept in
local time are not automatically fixed by the system, but that's a feature, not
a bug. The current behaviour in these cases is still the better deal from
the default Core OS setup's point of view.

This is all about correctness and predictability here, nothing else. We will
not get into the way of people enabling custom services that fit their needs,
but the Core OS should not do things like that by default.
Comment 38 Miroslav Lichvar 2012-09-20 10:04:31 EDT
(In reply to comment #37)
> Maybe you did not get these problems reported, we surely did. And we do not
> want to go back to that. We do not want to pretend we can solve a problem
> that can by its very nature not be solved.

Well, I haven't heard of any problems before and now people are constantly asking me why NTP steps their clocks so much after start.

> Boxes that do not run NTP have no time source, so the RTC is not the "backup"
> but the "authoritative" source. The running system must not fiddle with it,
> because it has no idea what's right.

That's the problem in your understanding. The RTC is no authoritative source, actually it's usually worse at timekeeping than the system clock. Setting a clock once doesn't mean it will be "right" forever. No clock is absolutely accurate, you can only estimate its maximum error increasing over time.

> Please stop making up artificial problems here, they do not really exist in
> the field.

How exactly are you planning to fix the problem with DST and RTC in local time again?
Comment 39 Martin Langhoff 2012-09-20 10:05:43 EDT
Kay, I am not making up artificial problems. I am not defending running hwclock --systohc on a LiveCD -- I agree that _that_ is broken.

What I _am_ saying, however, is that

 * Users of alternative desktop environments, with their own control panels, will see breakage setting the time.
 * Users of alternative system management UIs (think Webmin)
 * Users of any software that, in a perhaps misguided attempt at being useful, offers to set the system time.

Most sysadmins _don't know the difference between system time and RTC_. date -s <blah> has DTRT for them since time immemorial.

All those use cases break with this systemd change. Several sw packages need to learn to check whether we are in a LiveCD or similarly "don't mess with hw" mode, and if we are not, run hwclock --systohc.

_At least you need to publicize this change with the maintainers of those packages_.
Comment 40 Kay Sievers 2012-09-20 10:43:02 EDT
There is no way to "check for a Live CD", this also applies to all other
systems which are not 1:1 bound to a specific hardware and setup. Keying off
of information like that will just end up in a mess, and we can not seriously
recommend anything like that.

In general: we must not try to be smart, we aren't in this case by the very definition of the problem of information theory.

I really don't see this problem here, almost all boxes that care about a
properly set system time use NTP. I think you really are making up a problem
here that appears as one only in the "heat" of this discussion, it just
does not exist on any significant amount of running systems out there.
Comment 41 Tomas Mraz 2012-09-20 10:47:14 EDT
The problem is simple and defined as follows (but obviously "you don't care" for such systems):

1) you have a system that is disconnected from network -> no ntp

2) the system is running long enough and the rtc clock is slower than the system clock

3) reboot

4) clock goes backwards - We have a problem, Houston!
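
To put a rough number on step 4, here is a minimal sketch of the arithmetic. The 90-day uptime and the 20 ppm drift are made-up illustration values, not measurements from any system in this bug, and backward_step_seconds is a hypothetical helper name.

```python
def backward_step_seconds(uptime_seconds: float, rtc_drift_ppm: float) -> float:
    """Seconds the system clock jumps back at boot when it is reloaded from an
    RTC that ran rtc_drift_ppm slower than the system clock for uptime_seconds
    and was never written back."""
    return uptime_seconds * rtc_drift_ppm / 1_000_000


# Hypothetical box: 90 days of uptime, RTC 20 ppm slower than the system clock.
step = backward_step_seconds(90 * 86400, 20)
print(f"clock steps back by about {step:.1f} s at reboot")
```

Even a modest drift adds up to minutes over a long uptime, which is enough to produce timestamps "from the future" after the reboot.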
Comment 42 Kay Sievers 2012-09-20 11:03:35 EDT
We care about correctness and predictability, and that's what counts.

Assuming that the system's clock is any better than the RTC makes not much
sense.

If we do not have a reliable time source, the RTC is the one and only
authoritative source, and nothing by default should fiddle with the
authority without being sure that it makes things more correct than
they were before.

No NTP but RTC sync is a very custom requirement to cover a corner case,
and tools can surely provide that when enabled by an admin, but nothing
should do that by default.

I would welcome if people would stop pretending to solve problems nobody
can solve. This approach creates more new problems than it solves.
Comment 43 Thomas L. Shinnick 2012-09-21 15:27:49 EDT
> This approach creates more new problems than it solves.

Thank you for the apt description of your (the corporate body) handling of this issue, ducking this problem over two releases and at least a year's time.

Someone dropped a note on #750883 (F16 days).  I was on that Cc list b/c of https://bugzilla.redhat.com/show_bug.cgi?id=753642  Which mentioned https://bugzilla.redhat.com/show_bug.cgi?id=749516  I've lost my notes on the other threads, sorry.

Reading those old reports, Kay keeps saying sync-at-shutdown is crazy, and Lennart keeps saying sync should be done by any tool that changes time.  Since at least 2011/10 you've been repeating these statements.  So, okay, you're right, have always been right, and the product has been wrong for F16 and F17, and will be in F18 too?

You aren't fixing the problem for the users.  And you're fine with that?  NOTABUG?  Really?  

Maybe time for a wake up call?
Comment 44 Bill Nottingham 2012-09-25 01:05:51 EDT
Yah, I suspect shouting louder will help a lot.

In any case, moving to 'distribution' - this is a distribution level issue.

1) Assuming that the system clock is more reliable than the RTC does make sense - the system clock is a part that every system relies on for accurate timekeeping while it's up, while the RTC is a part dredged up off some dusty factory floor for one tenth of a cent somewhere whose main performance criterion is 'not completely awful because everyone who cares about timing will use some other source'.

2) That being said, interpreting on shutdown that you're the only owner of the RTC and should reset it is kind of rude, especially if you've gotten no user input that the time you read from the RTC is wrong (in the form of the user/admin resetting the time.)

3) In fact, the logic of doing it on shutdown because you don't trust the RTC not to drift is pretty easily extended to an argument for having 11-minute mode on unilaterally. We could certainly package a service/daemon that does so, but that does seem to be a hack. And it would break the situations from #2 that were broken before, and we did get bug reports on (without also modifying all the live image tools to disable such a service, etc.)

4) The behavior of setting the hwclock when the user requests a time change does seem to be proper.

5) Generally, the simplest solution is to fix it at the lowest level. That would be in the kernel. Above that would be libc. It's worth investigating both of those solutions.

6) If that fails, then you patch the tools.

7) There are plenty of people who are CC'd on this bug who look like they're capable of making such patches. So please do so.
Comment 45 Kay Sievers 2012-09-25 08:09:52 EDT
Let's collect some thoughts here:

* We like to touch the RTC only if we have a supposed-to-be-authoritative
time source, which is running NTP and get a sync, or have the user/admin
asking to set the time to a certain value. In all other cases we like to
leave the RTC alone, because we can not really know what to write there.

* To automatically sync inside the kernel at settimeofday() is difficult
because the kernel does not know the offset to UTC if the clock runs in
localtime. We would need to add infrastructure to the kernel to store
(and also later update) that value from userspace, or drop RTC-in-localtime
support entirely.

* To automatically sync from glibc *could* have unwanted side-effects because
syncing the RTC *can* take a long time, and might need to be synchronized
at the full second granularity.

* With NTP there is also one piece missing, the kernel's 11-minute mode
does only cover minutes and seconds to stay out of the time zone business.
If the time in the RTC is off by more than 15/30 minutes, the kernel will
not fix it. NTP (or another tool) would need to do a full sync here too, if
needed.
Comment 46 Bill Nottingham 2012-09-25 08:32:40 EDT
For the glibc angle, would it be possible to add an async kernel interface?

(In reply to comment #45)
> Let's collect some thoughts here:
> 
> * We like to touch the RTC only if we have a supposed-to-be-authoritative
> time source, which is running NTP and get a sync, or have the user/admin
> asking to set the time to a certain value. In all other cases we like to
> leave the RTC alone, because we can not really know what to write there.

Is there a reason we can't have a service that the administrator enables to sync the clock at either shutdown, or via 11-minute mode, even if it's not enabled by default? I can see from this bug that there are certain cases where even if we-the-system do not know if/what we should write there, the administrator may know enough to decide that the RTC is reasonable enough at startup but needs periodic syncing for drift.

> * To automatically sync inside the kernel at settimeofday() is difficult
> because the kernel does not know the offset to UTC if the clock runs in
> localtime. We would need to add infrastructure to the kernel to store
> (and also later update) that value from userspace, or drop RTC-in-localtime
> support entirely.

Is this something we can reasonably add?

> * To automatically sync from glibc *could* have unwanted side-effects because
> syncing the RTC *can* take a long time, and might need to be synchronized
> at the full second granularity.

I'd suggest having it as an async work item in the kernel, although in that case with standard libc interfaces you'd lose the error reporting if it failed. Not sure that the error handling is explicitly required there.

> * With NTP there is also one piece missing, the kernel's 11-minute mode
> does only cover minutes and seconds to stay out of the time zone business.
> If the time in the RTC is off by more than 15/30 minutes, the kernel will
> not fix it. NTP (or another tool) would need to do a full sync here too, if
> needed.

Is this already covered by the code that NTP has for handling DST changes?

The other question: what do other OSes in this space (Solaris, MacOS, Windows) do?
Comment 47 Kay Sievers 2012-09-25 09:08:19 EDT
(In reply to comment #46)
> Is there a reason we can't have a service that the administrator enables to
> sync the clock at either shutdown

Sure, it could be a simple unit that calls hwclock --systohc at shutdown.
We only would need to make sure that this is not enabled by default.

> or via 11-minute mode, even if it's not enabled by default?

You mean enabling of the 11-minute sync, even without NTP? And the minutes+
seconds adjust only, would be sufficient?

> > * To automatically sync inside the kernel at settimeofday() is difficult
> > because the kernel does not know the offset to UTC if the clock runs in
> > localtime. We would need to add infrastructure to the kernel to store
> > (and also later update) that value from userspace, or drop RTC-in-localtime
> > support entirely.
> 
> Is this something we can reasonably add?

It's quite a lot of new interfaces and assumptions to make. Certainly all
possible, but I guess we would have to argue why we need that in the kernel.

> > syncing the RTC *can* take a long time, and might need to be synchronized
> > at the full second granularity.
> 
> I'd suggest having it as an async work item in the kernel

We would have to have all the (above mentioned) infrastructure for the
(almost unsupportable) rtc-in-localtime mode, which will probably raise a
lot of good questions why we want that in the kernel, instead of getting
our act together in userspace. :)

> > * With NTP there is also one piece missing, the kernel's 11-minute mode
> > does only cover minutes and seconds to stay out of the time zone business.

> Is this already covered by the code that NTP has for handling DST changes?

The 11-minute mode is time zone and DST agnostic, it does only cover the
RTCs minutes and seconds in a 15/30 minute window, it has no idea about the
system's local time zone or the DST. How is NTP involved here in DST changes?

> The other question: what do other OSes in this space (Solaris, MacOS,
> Windows) do?

Windows and MacOS do not support multi-boot environments properly, they just
don't care. They don't really support two modes (LOCAL and UTC) for the RTC; they support only one mode and assume it, which simplifies things a lot.
All other OSes basically assume they own the box.

If Windows is set to RTC-in-UTC (by default LOCAL) with the unsupported
registry flag, Windows will never automatically touch/sync down to the
RTC again.
Comment 48 Miroslav Lichvar 2012-09-26 04:58:49 EDT
Here is a proposal to fix most of the issues mentioned in this bug:

- compile kernel without CONFIG_GENERIC_CMOS_UPDATE
- add a daemon mode to the hwclock tool
  - its purpose is to keep RTC reasonably close to the system clock
  - it can make a rough estimate of the drift and write to the RTC only as necessary (usually much longer interval than 11 minutes)
  - it waits for jumps in the system clock via TFD_TIMER_CANCEL_ON_SET
  - it periodically checks if there was a major change in the system clock via other means (e.g. large tick/freq/offset in adjtimex) by monitoring CLOCK_MONOTONIC and CLOCK_MONOTONIC_RAW (should be cheaper than RTC UIE)
  - in LOCAL mode it waits also for changes in DST (periodically calls tzset() to reload tzdata)
  - (?) it keeps the RTC device open so nothing else can write to it
- create a service for the hwclock daemon mode
- don't set RTC anywhere else, let hwclock be the only thing touching RTC (running in the daemon mode, run on shutdown or disabled)
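
A minimal sketch of the "rough estimate of the drift" idea from the list above, assuming the daemon can take two paired readings of the system clock and the RTC. The function names and the sample numbers are hypothetical; this is not hwclock code.

```python
def estimate_drift_ppm(sys_t0: float, rtc_t0: float,
                       sys_t1: float, rtc_t1: float) -> float:
    """RTC drift relative to the system clock in parts per million.
    Negative means the RTC runs slow."""
    elapsed_sys = sys_t1 - sys_t0
    elapsed_rtc = rtc_t1 - rtc_t0
    return (elapsed_rtc - elapsed_sys) / elapsed_sys * 1_000_000


def seconds_until_write(drift_ppm: float, max_error_s: float) -> float:
    """How long the daemon could sleep before the RTC error reaches max_error_s."""
    if drift_ppm == 0:
        return float("inf")
    return max_error_s / (abs(drift_ppm) / 1_000_000)


# Hypothetical readings one day apart: the RTC lost 1.728 s against the system clock.
drift = estimate_drift_ppm(0.0, 0.0, 86400.0, 86398.272)
print(f"drift {drift:.1f} ppm, next write in {seconds_until_write(drift, 0.5):.0f} s")
```

With numbers like these the write interval comes out in hours rather than minutes, which is the point of estimating the drift instead of writing on a fixed schedule.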
Comment 49 Lennart Poettering 2012-09-28 06:24:37 EDT
(In reply to comment #48)
> Here is a proposal to fix most of the issues mentioned in this bug:
> 
> - compile kernel without CONFIG_GENERIC_CMOS_UPDATE
> - add a daemon mode to the hwclock tool
>   - its purpose is to keep RTC reasonably close to the system clock
>   - it can make a rough estimate of the drift and write to the RTC only as
> necessary (usually much longer interval than 11 minutes)
>   - it waits for jumps in the system clock via TFD_TIMER_CANCEL_ON_SET
>   - it periodically checks if there was a major change in the system clock
> via other means (e.g. large tick/freq/offset in adjtimex) by monitoring
> CLOCK_MONOTONIC and CLOCK_MONOTONIC_RAW (should be cheaper than RTC UIE)
>   - in LOCAL mode it waits also for changes in DST (periodically calls
> tzset() to reload tzdata)
>   - (?) it keeps the RTC device open so nothing else can write to it
> - create a service for the hwclock daemon mode
> - don't set RTC anywhere else, let hwclock be the only thing touching RTC
> (running in the daemon mode, run on shutdown or disabled)

We need fewer daemons, not more. If people want accurate clocks they should use NTP, and NTP should simply sync big jumps down to the RTC.
Comment 50 Bill Nottingham 2012-09-28 16:26:15 EDT
It's a reasonable tradeoff as opposed to patching every single thing that might set the clock; the point of this proposal is (AFAIR) for those that don't run a NTP client (no public network?) but still trust their system clock more than their RTC.

The other option is, of course, to assume we own the box and do as we please to the RTC. But that's not polite.

That being said, a different way might be to have a hwclock.service that is kicked off by a timer unit for periodic syncing while booted; while that might be more inefficient overall than something that knows about any current slack and is only woken up when the time actually changes, it doesn't involve a daemon and is much simpler to code.
Comment 51 Steve Tyler 2012-09-29 15:11:49 EDT
(In reply to comment #50)
> ...  a hwclock.service that is
> kicked off by a timer unit for periodic syncing while booted; ...

Would the sync interval be configurable?
Could periodic syncing be disabled?
Could the service be configured to "set the hwclock on shutdown or reboot"?

And in reply to some earlier comments, here are a few quotes from an Intel technical document on RTCs, last revised April 2012:

"The voltage of the battery can affect the RTC accuracy." (p. 9)

"The crystal temperature itself will impact the RTC accuracy." (p. 15)

"Condensation from humidity can also affect the RTC accuracy ..." (p. 15)

"Note that the temperature dependency of crystal frequency is a parabolic relationship (ppm / degree squared). The effect of the crystal’s frequency when operating at 0 °C (25 °C below room temperature) is the same when operating at 50 °C (25 °C above room temperature)." (p. 12)

Intel® I/O Controller Hub (Intel® ICH) / Platform Controller Hub (PCH) Family
Real Time Clock (RTC)
Electrical, Mechanical, and Thermal Specification (EMTS) - AP-728
April 2012
http://www.intel.com/content/www/us/en/chipsets/ich-family-real-time-clock-accuracy-considerations-note.html

So it should be possible to model the RTC with enough inputs ... :-)
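
As a toy version of such a model, the parabolic relationship quoted above can be sketched like this. The coefficient of -0.034 ppm per degree squared and the 25 °C turnover point are assumptions typical of 32.768 kHz watch crystals; they are not taken from the Intel document.

```python
def crystal_deviation_ppm(temp_c: float,
                          turnover_c: float = 25.0,
                          k_ppm_per_c2: float = -0.034) -> float:
    """Frequency deviation of the RTC crystal in ppm at temp_c.
    k_ppm_per_c2 and turnover_c are assumed typical values."""
    return k_ppm_per_c2 * (temp_c - turnover_c) ** 2


def seconds_per_day(deviation_ppm: float) -> float:
    """Clock error accumulated over one day for a given deviation."""
    return deviation_ppm * 86400 / 1_000_000


for temp in (0, 25, 50):
    ppm = crystal_deviation_ppm(temp)
    print(f"{temp:3d} C: {ppm:7.2f} ppm, {seconds_per_day(ppm):6.3f} s/day")
```

The 0 °C and 50 °C rows come out identical, matching the symmetry the quoted passage describes.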
Comment 52 Steve Tyler 2012-09-29 16:04:43 EDT
(In reply to comment #31)
> (In reply to comment #30)
> > Perhaps the hwclock --systohc call could be in the initscripts package as a
> > fedora specific service?
> 
> You still need to make sure that this is never called on any Live-CD, rescue
> system, stateless system, or a multi-boot environment with the RTC in local
> time.
...

The hwclock.service could be disabled on a Live CD. kickstart supports that:
http://fedoraproject.org/wiki/Anaconda/Kickstart#services
http://git.fedorahosted.org/git/spin-kickstarts.git
Comment 53 Steve Tyler 2012-09-30 16:37:32 EDT
(In reply to comment #51)
> (In reply to comment #50)
> > ...  a hwclock.service that is
> > kicked off by a timer unit for periodic syncing while booted; ...
> 
> Would the sync interval be configurable?
> Could periodic syncing be disabled?
> Could the service be configured to "set the hwclock on shutdown or reboot"?

There used to be an hwclock-save.service in systemd. It was removed here:

drop hwclock-save.service
http://cgit.freedesktop.org/systemd/systemd/commit/?id=da2617378523e007ec0c6efe99d0cebb2be994e1

The kickstart file for the Live CDs disables the hwclock-save.service this way:

# Make sure we don't mangle the hardware clock on shutdown
ln -sf /dev/null /etc/systemd/system/hwclock-save.service

http://git.fedorahosted.org/cgit/spin-kickstarts.git/tree/fedora-live-base.ks

See also:
Bug 707877 - hwclock-save.service lacks [Install] section
Comment 54 Miroslav Lichvar 2012-10-15 04:46:35 EDT
*** Bug 865916 has been marked as a duplicate of this bug. ***
Comment 55 Miroslav Lichvar 2012-10-15 04:48:50 EDT
*** Bug 865917 has been marked as a duplicate of this bug. ***
Comment 56 Bill Nottingham 2012-10-15 16:27:44 EDT
Miroslav - opinions on comment #49/#50?

How does PTPD handle this?
Comment 57 Steve Tyler 2012-10-15 17:50:43 EDT
In reply to some earlier comments ...

After a clean, default, Gnome desktop F18 install, chrony is enabled and /etc/chrony.conf has these lines:

# Enable kernel RTC synchronization.
rtcsync

The chrony documentation says:

4.2.41 rtcsync

The rtcsync directive will enable a kernel mode where the system time is copied to the real time clock (RTC) every 11 minutes.

This directive is supported only on Linux and cannot be used when the normal RTC tracking is enabled, i.e. when the rtcfile directive is used. 

http://chrony.tuxfamily.org/manual.html#rtcsync-directive

Tested with:
Fedora-18-Beta-TC4-x86_64-DVD.iso
chrony-0:1.27-0.5.pre1.git1ca844.fc18.x86_64
Comment 58 Miroslav Lichvar 2012-10-16 05:41:08 EDT
I understand an extra daemon is not desirable. But is there any other way to do it if we want to handle all the possibilities? Perhaps patching the kernel to provide some kind of hook?

Karel, do you have any suggestions?

NTP doesn't and shouldn't care about RTC. It just sets a flag via adjtimex() that the system clock is synchronized. Some systems don't have an RTC, some may have more than one and on some archs the kernel sets the time fully as the RTC is always kept in UTC. These details really shouldn't be handled in every application which can set or adjust the system clock.

None of the PTP implementations I have seen care about RTC. The linuxptp implementation even has two separate services, one to synchronize the PTP hardware clock and other to synchronize the system clock. Similarly, I think it makes sense to have a service which synchronizes the RTC.

As for the problem with multiple systems installed on the machine, perhaps there could be a checkbox in system-config-date which sets whether the system owns the RTC and is allowed to touch it?

As a quick fix, I'd suggest reverting to calling hwclock -w on shutdown and perhaps also periodically from cron. I believe this will fix more issues than it reintroduces.
Comment 59 Karel Zak 2012-10-16 07:46:14 EDT
(In reply to comment #58)
> Karel, do you have any suggestions?

I think we should not ignore the fact that in some places a reliable time
source is unreachable, but HW clock inaccuracy is predictable.

IMHO:

short-term solution:

 - improve kernel CONFIG_GENERIC_CMOS_UPDATE to make it configurable via /sys
   (disable the update at all without kernel recompilation)

 - add hwclock.service, when enabled then disable CONFIG_GENERIC_CMOS_UPDATE
   (otherwise hwclock has no information about RTC drift), on shutdown or/and
   from cron call hwclock --systohc to calibrate /etc/adjtime

 - the service should be enabled on machines without reliable time source
 
long-term solution:

  - something like comment #48 to improve our ability to detect
    jumps in the system clock, DST changes when RTC is in LOCAL time, etc.
Comment 60 Lennart Poettering 2012-10-18 10:02:35 EDT
(In reply to comment #50)
> It's a reasonable tradeoff as opposed to patching every single thing that
> might set the clock; 

Which are like two or so? NTP and chrony? (And no, I don't count coreutils' date among these. That is a low-level tool, and only handles system clock and should stay that way. If people use this low-level tool to change the system clock, it's not much to ask to invoke "hwclock -w" too).

> the point of this proposal is (AFAIR) for those that
> don't run a NTP client (no public network?) but still trust their system
> clock more than their RTC.

If people want this (and I doubt there's many who do) then some package could provide a service that invokes "hwclock -w" at shutdown or another point in time. This shouldn't be part of the default.
Comment 61 Steve Tyler 2012-10-18 11:29:52 EDT
(In reply to comment #59)
> (In reply to comment #58)
> > Karel, do you have any suggestions?
> 
> I think we should not ignore the fact that in some places a reliable time
> source is unreachable, but HW clock inaccuracy is predictable.
> 
> IMHO:
> 
> short-term solution:
> 
>  - improve kernel CONFIG_GENERIC_CMOS_UPDATE to make it configurable via /sys
>    (disable the update at all without kernel recompilation)
> 
>  - add hwclock.service, when enabled then disable CONFIG_GENERIC_CMOS_UPDATE
>    (otherwise hwclock has no information about RTC drift), on shutdown
> or/and
>    from cron call hwclock --systohc to calibrate /etc/adjtime
> 
>  - the service should be enabled on machines without reliable time source
>  
> long-term solution:
> 
>   - something like comment #48 to improve our ability to detect
>     jumps in the system clock, DST changes when RTC is in LOCAL time, etc.

Excellent suggestions.

What would you make configurable via /sys?
What would the defaults be?
Comment 62 Steve Tyler 2012-10-18 12:23:47 EDT
(In reply to comment #59)
...
>  - add hwclock.service, when enabled then disable CONFIG_GENERIC_CMOS_UPDATE
>    (otherwise hwclock has no information about RTC drift), on shutdown
> or/and
>    from cron call hwclock --systohc to calibrate /etc/adjtime

I believe that systemd can do both with a timer unit:

systemd.timer — Timer unit configuration
http://0pointer.de/public/systemd-man/systemd.timer.html

"A unit configuration file whose name ends in .timer encodes information about a timer controlled and supervised by systemd, for timer-based activation."
...
"Unless DefaultDependencies= is set to false, timer units will implicitly have dependencies of type Conflicts= and Before= on shutdown.target. These ensure that timer units are stopped cleanly prior to system shutdown. Only timer units involved with early boot or late system shutdown should disable this option."
...
"OnUnitActiveSec= defines a timer relative to when the unit the timer is activating was last activated."
Comment 63 Miroslav Lichvar 2012-12-13 04:16:50 EST
*** Bug 885770 has been marked as a duplicate of this bug. ***
Comment 64 Lennart Poettering 2013-01-14 17:14:06 EST
*** Bug 753642 has been marked as a duplicate of this bug. ***
Comment 65 Chris Murphy 2013-02-05 01:24:40 EST
At the expense of reductio ad absurdum: real world example of how this is fundamentally broken from a big picture end user view point.

Fedora 18 right now has zero mechanism to update the hardware clock. By default, chrony is not enabled, therefore kernel rtcsync isn't enabled. Once the hardware clock is off by 15 minutes, even when chrony is enabled, it won't fix the hardware clock. So the fallout of this, which I just reproduced, is this bit of ridiculousness:

Hardware clock is wrong by 2 months. Chrony is enabled in systemd. I boot the computer with an internet connection, and system clock is correct. I disconnect the network connection and reboot, now the system clock is set to hardware clock date/time (which is two months ago).

Reboot with internet. Correct date/time.
Reboot without internet. Incorrect date/time.

This behavior doesn't happen with Windows or OS X of any version. As soon as their NTP service updates the date/time, subsequent reboots show the correct date/time regardless of internet connectivity.

So I get the arguments that systemd shouldn't do this, and that doing it on shutdown is fragile and not good engineering. But the current behavior is embarrassingly ridiculous.
Comment 66 Prarit Bhargava 2013-02-07 09:05:46 EST
(In reply to comment #65)

> 
> So I get the arguments that systemd shouldn't do this, and that doing it on
> shutdown is fragile and not good engineering. But the current behavior is
> embarrassingly ridiculous.

tl;dr Chris, this is something that needs to be fixed in the kernel because ntp's synchronization mechanism is executed from within the kernel.

Chris, I'm not in any way disputing that you have a bug.  What I do want to do is give you an explanation of how systemd and the kernel interact so that you have a better appreciation of what the bug is.

The way things work is this.  In early boot, systemd (by reading /etc/adjtime)  determines if the hardware clock is LOCAL or UTC.  systemd makes a settimeofday() call to set the timezone information based on that data.  After a few seconds, systemd starts more processes, one of which is NTP.  NTP signals the kernel (via a call to adjtimex) to start the synchronization process which is a task that runs every 11 minutes.  This process *is a kernel thread*.

The problem you're seeing, unfortunately, is that the system time is getting updated but the hardware clock is not. This is because the code that the kernel process calls limits changes to the hardware clock to a +/-15 minute window relative to the current hardware clock value; that is, you cannot change the hardware clock by more than 15 minutes.

For the record, I don't believe that is the correct thing to do.  As you point out, no other OS does this; Linux AFAICT is the only one.

I'm putting together a kernel patch that will change this behaviour on x86.  Other architectures (powerpc for example) do a full synchronization of the clock and I do not see any reason why x86 should be artificially limited to a 15 minute window.

Hope this better explains where the bug lies,

P.
Comment 67 Chris Murphy 2013-02-15 18:20:28 EST
Prarit I sincerely appreciate the response and explanation.
Comment 68 Fedora End Of Life 2013-07-04 02:21:52 EDT
This message is a reminder that Fedora 17 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 17. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '17'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 17's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 17 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora, you are encouraged to change the
'version' to a later Fedora version prior to Fedora 17's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.
Comment 69 Miroslav Lichvar 2013-07-04 04:01:09 EDT
The current rawhide kernel still seems to limit the RTC adjustment to +/-15 minutes on x86. Per comment #66, reassigning to the kernel component.
Comment 70 Prarit Bhargava 2013-07-07 19:33:54 EDT
(In reply to Miroslav Lichvar from comment #69)
> The current rawhide kernel still seems to limit the RTC adjustment to +/-15
> minutes on x86. Per comment #66, reassigning to the kernel component.

Hmm ... I could swear that this was resolved a while ago.  I'll double check with F19 + rawhide kernel this week.

P.
Comment 71 Prarit Bhargava 2013-07-08 16:20:43 EDT
AFAICT ... WORKSFORME:

[root@intel-knightscorner-01 linux]# yum -y install ntpdate
Resolving Dependencies
--> Running transaction check
---> Package ntpdate.x86_64 0:4.2.6p5-11.fc19 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package        Arch          Version                Repository            Size
================================================================================
Installing:
 ntpdate        x86_64        4.2.6p5-11.fc19        beaker-Fedora         80 k

Transaction Summary
================================================================================
Install  1 Package

Total download size: 80 k
Installed size: 117 k
Downloading packages:
ntpdate-4.2.6p5-11.fc19.x86_64.rpm                         |  80 kB   00:00     
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : ntpdate-4.2.6p5-11.fc19.x86_64                               1/1 
  Verifying  : ntpdate-4.2.6p5-11.fc19.x86_64                               1/1 

Installed:
  ntpdate.x86_64 0:4.2.6p5-11.fc19                                              

Complete!
[root@intel-knightscorner-01 linux]# systemctl enable ntpdate
ln -s '/usr/lib/systemd/system/ntpdate.service' '/etc/systemd/system/multi-user.target.wants/ntpdate.service'
[root@intel-knightscorner-01 linux]# date;hwclock
Mon Jul  8 16:13:35 EDT 2013
Mon 08 Jul 2013 03:13:02 PM EDT  -0.032038 seconds

^^^ hwclock is one hour behind...

[root@intel-knightscorner-01 linux]# reboot
Connection to intel-knightscorner-01.lab.bos.redhat.com closed by remote host.
Connection to intel-knightscorner-01.lab.bos.redhat.com closed.
[prarit@prarit linux-2.6]$ ssh root@intel-knightscorner-01.lab.bos.redhat.com
root@intel-knightscorner-01.lab.bos.redhat.com's password: 
Last login: Mon Jul  8 16:08:38 2013 from 10.18.17.119
[root@intel-knightscorner-01 ~]# date;hwclock
Mon Jul  8 16:18:10 EDT 2013
Mon 08 Jul 2013 04:18:11 PM EDT  -0.032005 seconds
[root@intel-knightscorner-01 ~]# 

^^^^ clock is up-to-date.

[root@intel-knightscorner-01 ~]# rpm -q kernel
kernel-3.9.9-301.fc19.x86_64
[root@intel-knightscorner-01 ~]# cat /etc/fedora-release 
Fedora release 19 (Schrödinger’s Cat)


P.
Comment 72 Steve Tyler 2013-07-08 17:15:00 EDT
(In reply to Prarit Bhargava from comment #71)
> AFAICT ... WORKSFORME:
> 
> [root@intel-knightscorner-01 linux]# yum -y install ntpdate
...

Thanks for looking into this.

The default NTP client for Fedora is chrony, and, AFAICT, with it enabled, the RTC is not adjusted when the RTC year is set ahead one year to 2014.

$ date; cat /proc/driver/rtc
Mon Jul  8 14:03:52 PDT 2013
rtc_time	: 14:03:52
rtc_date	: 2014-07-08
...

$ grep rtc /etc/chrony.conf 
rtcsync

chrony-1.28-0.1.pre1.fc19.i686
kernel-3.9.9-301.fc19.i686

NB: Tested on a system with the RTC set to local time (for dual-booting with Windows).
Comment 73 Steve Tyler 2013-07-09 05:52:17 EDT
(In reply to Prarit Bhargava from comment #71)
> AFAICT ... WORKSFORME:
> 
> [root@intel-knightscorner-01 linux]# yum -y install ntpdate
...
> [root@intel-knightscorner-01 linux]# systemctl enable ntpdate
...

The ntpdate.service invokes a wrapper function that optionally calls hwclock:
$ grep SYNC /usr/libexec/ntpdate-wrapper
[ $RETVAL -eq 0 ] && [ "$SYNC_HWCLOCK" = "yes" ] && /sbin/hwclock --systohc

The default value for SYNC_HWCLOCK is "no":
$ grep SYNC /etc/sysconfig/ntpdate 
SYNC_HWCLOCK=no

With ntpdate.service enabled and the default configuration, the RTC is not set to the system time after rebooting. NB: For this test, I disabled chrony.
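
The wrapper's decision can be mirrored in a few lines. This is only a sketch of the one condition shown in the grep output above; would_sync_hwclock is a hypothetical helper, and the parsing is a simplification of how the shell would read the file.

```python
def would_sync_hwclock(sysconfig_text: str, ntpdate_retval: int) -> bool:
    """Mirror of: [ $RETVAL -eq 0 ] && [ "$SYNC_HWCLOCK" = "yes" ] && hwclock --systohc
    sysconfig_text is the content of /etc/sysconfig/ntpdate."""
    sync = ""  # an unset shell variable compares unequal to "yes"
    for line in sysconfig_text.splitlines():
        line = line.strip()
        if line.startswith("SYNC_HWCLOCK="):
            # Simplified parsing: take the value and drop optional quotes.
            sync = line.split("=", 1)[1].strip().strip("\"'")
    return ntpdate_retval == 0 and sync == "yes"


print(would_sync_hwclock("SYNC_HWCLOCK=no\n", 0))   # default configuration
print(would_sync_hwclock("SYNC_HWCLOCK=yes\n", 0))  # admin opted in
```

With the default file the function returns False, so a successful ntpdate run alone leaves the RTC untouched.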

What do you show for this?
$ grep SYNC /etc/sysconfig/ntpdate
Comment 74 Miroslav Lichvar 2013-07-09 07:44:26 EDT
(In reply to Prarit Bhargava from comment #71)
> AFAICT ... WORKSFORME:
> 
> [root@intel-knightscorner-01 linux]# yum -y install ntpdate

Hm, enabling ntpdate and rebooting doesn't seem to help here. Unless the SYNC_HWCLOCK is set to yes in /etc/sysconfig/ntpdate, the service won't call hwclock and also ntpdate doesn't clear the adjtimex STA_UNSYNC flag, so the kernel shouldn't touch RTC either. Perhaps something else did set the RTC on the shutdown or boot?
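
For reference, the flag check amounts to a single bit test. A minimal sketch, assuming the STA_UNSYNC and STA_PLL values below match linux/timex.h (quoted from memory); kernel_rtc_sync_active is a hypothetical name and the status words are examples, not readings from a live system.

```python
STA_PLL = 0x0001     # PLL updates enabled (assumed value from linux/timex.h)
STA_UNSYNC = 0x0040  # clock unsynchronized (assumed value from linux/timex.h)


def kernel_rtc_sync_active(status: int) -> bool:
    """True if the adjtimex status word has STA_UNSYNC cleared, i.e. the
    kernel considers the system clock synchronized and may update the RTC."""
    return not (status & STA_UNSYNC)


print(kernel_rtc_sync_active(STA_PLL | STA_UNSYNC))  # flag still set
print(kernel_rtc_sync_active(STA_PLL))               # flag cleared by an NTP daemon
```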

Per comment #66, this should be fixed in the kernel by setting the RTC fully with STA_UNSYNC cleared instead of the limited +/-15 minutes range, so when ntpd or chrony is running, the system time will be periodically copied to the RTC. 

I don't see that happening here:

[root@nec-em6 ~]# hwclock --set --date '-65 minutes'
[root@nec-em6 ~]# date;hwclock
Tue Jul  9 10:04:24 UTC 2013
Tue Jul  9 08:59:25 2013  -0.672418 seconds
[root@nec-em6 ~]# systemctl start chronyd
[root@nec-em6 ~]# chronyc tracking | grep Reference
Reference ID    : 10.16.71.254 (10.16.71.254)
[root@nec-em6 ~]# while true; do date; hwclock; sleep 1m; done
Tue Jul  9 10:05:46 UTC 2013
Tue Jul  9 09:00:46 2013  -0.500576 seconds
Tue Jul  9 10:06:46 UTC 2013
Tue Jul  9 09:01:47 2013  -1.000548 seconds
Tue Jul  9 10:07:47 UTC 2013
Tue Jul  9 09:02:48 2013  -1.004972 seconds
Tue Jul  9 10:08:48 UTC 2013
Tue Jul  9 09:03:49 2013  -0.984908 seconds
Tue Jul  9 10:09:49 UTC 2013
Tue Jul  9 09:04:50 2013  -1.004909 seconds
...
Tue Jul  9 11:30:08 UTC 2013
Tue Jul  9 10:25:09 2013  -1.000536 seconds
Tue Jul  9 11:31:09 UTC 2013
Tue Jul  9 10:26:10 2013  -1.004975 seconds
Tue Jul  9 11:32:10 UTC 2013
Tue Jul  9 10:32:11 2013  -0.469276 seconds
Tue Jul  9 11:33:11 UTC 2013
Tue Jul  9 10:33:12 2013  -0.984894 seconds

As you can see, only minutes were corrected (and it took a very long time to do it).

I don't fully understand the kernel code, but in arch/x86/kernel/rtc.c I see:
        if (((abs(real_minutes - cmos_minutes) + 15)/30) & 1)
              real_minutes += 30;
        real_minutes %= 60;

Is this the code that needs to be removed?
Comment 75 Prarit Bhargava 2013-07-09 08:28:33 EDT
I did a plain install of F19 and AFAICT this works by installing ntpdate and starting the ntpdate service.

The steps I did were:

hwclock --set --date="7/08/13 15:08:30"
date;hwclock
reboot

After the reboot do

date;hwclock

You should see that the date & hwclock are different.

yum -y install ntpdate; systemctl enable ntpdate
reboot

After the second reboot do

date;hwclock


If this does not work with chrony then chrony has a problem, not the kernel. ;)  It is likely that chrony is NOT doing an immediate time fix @ boot.

The current F19 kernel and upstream both contain commit 
3195ef59cb42cda3aeeb24a7fd2ba1b900c4a3cc which removes the +/- 15 minute restriction.  Part of that commit does,

-       /*
-        * since we're only adjusting minutes and seconds,
-        * don't interfere with hour overflow. This avoids
-        * messing with unknown time zones but requires your
-        * RTC not to be off by more than 15 minutes
-        */
-       real_seconds = nowtime % 60;
-       real_minutes = nowtime / 60;
-       /* correct for half hour time zone */
-       if (((abs(real_minutes - cmos_minutes) + 15)/30) & 1)
-               real_minutes += 30;
-       real_minutes %= 60;

P.
Comment 76 Miroslav Lichvar 2013-07-09 09:00:42 EDT
(In reply to Prarit Bhargava from comment #75)
> I did a plain install of F19 and AFAICT this works by installing ntpdate and
> starting the ntpdate service.

The trouble is it shouldn't work with ntpdate unless the service is configured to call hwclock --systohc. Also, this wouldn't test the kernel RTC update.

> If this does not work with chrony then chrony has a problem, not the kernel.
> ;)  It is likely that chrony is NOT doing an immediate time fix @ boot.

chronyd just sets the adjtimex status and I see it's doing that. Not much can go wrong here.
 
> The current F19 kernel and upstream both contain commit 
> 3195ef59cb42cda3aeeb24a7fd2ba1b900c4a3cc which removes the +/- 15 minute
> restriction.  Part of that commit does,

Great. I see that commit in the upstream repo, but not in the f19 kernel (3.9.9-301.fc19).

$ git tag --contains 3195ef59cb42cda3aeeb24a7fd2ba1b900c4a3cc
v3.10
v3.10-rc1
v3.10-rc2
v3.10-rc3
v3.10-rc4
v3.10-rc5
v3.10-rc6
v3.10-rc7

Can you please include it in the f19 kernel?
Comment 77 Josh Boyer 2013-07-09 09:08:12 EDT
f19 should be moving to the 3.10 kernel shortly.
Comment 78 Steve Tyler 2013-07-09 12:29:26 EDT
(In reply to Prarit Bhargava from comment #75)
> I did a plain install of F19 and AFAICT this works by installing ntpdate and
> starting the ntpdate service.
...

OK, but we cannot reproduce your results.

What do you show for this?

$ grep SYNC /etc/sysconfig/ntpdate

Did you try testing with "ntp"? (It would be better to test with chrony, though, since it is the default NTP client for Fedora.)

# repoquery ntp ntpdate chrony kernel
chrony-0:1.28-0.1.pre1.fc19.x86_64
kernel-0:3.9.9-301.fc19.x86_64
ntp-0:4.2.6p5-11.fc19.x86_64
ntpdate-0:4.2.6p5-11.fc19.x86_64
Comment 79 Steve Tyler 2013-07-09 16:35:34 EDT
(In reply to Miroslav Lichvar from comment #74)
...
> As you can see, only minutes were corrected (and it took a very long time to
> do it).
...

Thanks for pointing that out.

With the RTC set about 20 minutes ahead and chrony enabled at boot, the RTC is adjusted after about 11 minutes. After the adjustment, the RTC is 30 minutes ahead.

NB: The test system has the RTC on local time.

$ while true; do echo '==='; date; grep rtc_time /proc/driver/rtc; uptime; sleep 1m; done
...
===
Tue Jul  9 12:49:20 PDT 2013
rtc_time	: 13:09:17
 12:49:20 up 10 min,  3 users,  load average: 0.01, 0.23, 0.26
===
Tue Jul  9 12:50:20 PDT 2013
rtc_time	: 13:10:17
 12:50:20 up 11 min,  3 users,  load average: 0.04, 0.21, 0.25
===
Tue Jul  9 12:51:20 PDT 2013
rtc_time	: 13:21:20
 12:51:20 up 12 min,  3 users,  load average: 0.01, 0.17, 0.23
===
Tue Jul  9 12:52:20 PDT 2013
rtc_time	: 13:22:20
 12:52:20 up 13 min,  3 users,  load average: 0.01, 0.14, 0.22
===
...

kernel-3.9.9-301.fc19.i686
chrony-1.28-0.1.pre1.fc19.i686
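
The 30-minute result is consistent with the "half hour time zone" correction quoted in comment 74. Below is a minimal Python model of that arithmetic, applied to the readings from comment 74 and from this comment. It is a sketch, not the kernel code: the function name is made up, and the final "< 30" range check is not quoted anywhere in this bug; it is an assumption about the surrounding pre-3.10 code. PDT is a whole-hour offset from UTC, so using UTC for the system time does not change the minute arithmetic.

```python
import calendar


def old_set_rtc_mmss(nowtime, cmos_minutes):
    """Model of the pre-3.10 minutes/seconds-only RTC update.

    nowtime is the system time in seconds since the epoch; cmos_minutes is
    the minutes field currently in the RTC. Returns the (minutes, seconds)
    that would be written, or None if the update would be refused.
    """
    real_seconds = nowtime % 60
    real_minutes = nowtime // 60
    # "correct for half hour time zone", as quoted in comment 74
    if ((abs(real_minutes - cmos_minutes) + 15) // 30) & 1:
        real_minutes += 30
    real_minutes %= 60
    # ASSUMPTION: this range check is not quoted in the bug.
    if abs(real_minutes - cmos_minutes) < 30:
        return real_minutes, real_seconds
    return None


# Comment 74: system time 11:32:10 UTC, RTC minutes field at about 27.
t74 = calendar.timegm((2013, 7, 9, 11, 32, 10))
print(old_set_rtc_mmss(t74, 27))  # (32, 10)

# Comment 79: system time 12:51:20 PDT (19:51:20 UTC), RTC minutes field at 11.
t79 = calendar.timegm((2013, 7, 9, 19, 51, 20))
print(old_set_rtc_mmss(t79, 11))  # (21, 20)
```

In the first case the hour is never written, so the RTC stays one hour behind (10:32 against 11:32). In the second, the 20-minute offset is treated as a half-hour time zone and rounded to 30 minutes, which matches the jump from 13:10 to 13:21 above.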
Comment 80 Chris Murphy 2013-07-17 12:40:20 EDT
This still is not working for me with 3.10.0-1. Filed new bug 985522 against kernel.
Comment 81 Josh Boyer 2013-10-08 13:46:27 EDT

*** This bug has been marked as a duplicate of bug 985522 ***