Description of problem:

There is a problem when network connections are handled by NetworkManager. Although the machine really is networked, there may be no connection yet when /etc/init.d/ntpd runs, so we get a long delay trying to do something which cannot succeed. Even worse, the startup sequence always stalls, at times quite substantially, waiting for clock-tickers and for a response from time servers.

This is an even bigger issue when the only connections which can be active in the given situation are wireless. With NetworkManager that means that somebody has to log in before an interface to the "outside world" comes up.

Here is a proposed solution. Add the following to /etc/sysconfig/ntpd:

    # Wait that many minutes for a network interface to show up.
    NDELAY=15
    # If NDELAY not given or "" then start immediately and synchronously;
    # otherwise do that in a background and success and failure messages
    # are suppressed.
    # If NDELAY is 0 then we wait indefinitely.

    # How often to try, in seconds, if NDELAY is not "".
    NGAP=60

    # A list of interfaces through which an ntpd server can be reached.
    # Used only when NDELAY is non-empty.
    # If not given then all non-loopback interfaces will be tried.
    # NLIST="eth0 wlan0 wifi0"

and modify /etc/init.d/ntpd as in the attached version.

With NDELAY="" this works as it does now, for those who really need ntpd up before proceeding with the rest of the startup sequence (which is not guaranteed in any case). Otherwise ntpd is started in the background once some "candidate" network interfaces are detected in an UP state. The list of interfaces to check matters for machines with multiple connections where an ntp server can be reached through only some of them.

The same issue also affects ntp startup in FC5 and RHEL.
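For readers without the attachment, the waiting logic described above might look roughly like the following. This is only a sketch under stated assumptions, not the attached init script: NDELAY and NGAP are the settings proposed above, while the function name wait_for_network and the idea of passing the interface probe in as a command are illustrative choices.

```shell
#!/bin/sh
# Sketch only, NOT the attached /etc/init.d/ntpd.
# wait_for_network is a hypothetical name; NDELAY (minutes) and NGAP
# (seconds) are the proposed /etc/sysconfig/ntpd settings and are
# assumed to be numeric when set.
#
# Usage: wait_for_network PROBE [ARGS...]
# PROBE is any command that succeeds once a usable interface exists.
wait_for_network() {
    # Empty or unset NDELAY: no waiting, the caller starts ntpd at once.
    if [ -z "${NDELAY:-}" ]; then
        return 0
    fi
    wfn_gap=${NGAP:-60}
    wfn_limit=$((NDELAY * 60))
    wfn_elapsed=0
    while :; do
        # Run the probe; stop waiting as soon as it succeeds.
        "$@" && return 0
        wfn_elapsed=$((wfn_elapsed + wfn_gap))
        # NDELAY=0 means wait indefinitely; otherwise give up at the limit.
        if [ "$NDELAY" -ne 0 ] && [ "$wfn_elapsed" -ge "$wfn_limit" ]; then
            return 1
        fi
        sleep "$wfn_gap"
    done
}

# Possible use inside the init script ("start" stands for whatever the
# script already does to launch ntpd; connection_up is the probe from
# the attachment):
#   if [ -z "$NDELAY" ]; then
#       start
#   else
#       ( wait_for_network connection_up $NLIST && start ) >/dev/null 2>&1 &
#   fi
```

Passing the probe as a command keeps the timing logic separate from the question of what counts as a usable interface.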
Created attachment 141594
modifies startup file for ntpd
This is a known problem, reported as bug #146884 and bug #206127. It will be fixed when the new version of ntp (ntp-4.2.4) is released, as it will be able to handle dynamic interfaces.
*** Bug 217423 has been marked as a duplicate of this bug. ***
Now it looks like ntp-4.2.2p4-2.fc6 **really** broke ntpd if your network connection may show up later (because, for example, it is handled by NetworkManager). True, in the startup sequence one sees "OK" from ntpd immediately even if the network is down. This gives the .LOCL. clock only, which does not buy very much so far. After the network becomes active the other servers are reported by 'ntpq -pn' and it appears that we are on our way, but this is really an illusion.

When starting with a network present those other servers are at stratum, say, 2 or 3 and 127.127.1.0 is at 10, so we are really synchronizing. If the network shows up later then the local clock is still stratum 10 but all other servers are stratum 16 and they stay that way. "reach", "delay", "offset" and "jitter" are all fixed at 0, so these other servers are really as good as dead. I waited for over 35 minutes, which is ridiculously long, and nothing changed. It also appears that the initial 'ntpdate' sync is lost, so if your clock is outside the drift limits it will never get synchronized.

For now a possible workaround seems to be to leave some process running which checks whether we are really getting some servers with a stratum lower than 10 and repeatedly restarts ntpd if this is not the case. Sigh! Restarting ntpd with the network active immediately shows outside servers which are really consulted for time.
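The "check and restart" workaround could be sketched as below. It assumes the usual 'ntpq -pn' layout (two header lines, stratum in the third column); the function name has_synced_peer and the 5-minute polling interval are made up for illustration.

```shell
#!/bin/sh
# Sketch of the watchdog workaround. has_synced_peer is a hypothetical
# helper: it reads `ntpq -pn` output on stdin and succeeds if at least
# one peer reports a stratum below 10.
has_synced_peer() {
    awk 'NR > 2 && $3 ~ /^[0-9]+$/ && $3 + 0 < 10 { found = 1 }
         END { exit (found ? 0 : 1) }'
}

# Watchdog loop, left commented out so that sourcing this file has no
# side effects. The 300-second interval is an arbitrary assumption.
#   while sleep 300; do
#       ntpq -pn 2>/dev/null | has_synced_peer || /sbin/service ntpd restart
#   done
```

The local clock at stratum 10 and unreachable servers at stratum 16 both fail the check, so ntpd keeps getting restarted until a real server is being used.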
It was always like this. A better workaround would be to restart the ntp daemon from a script executed by NetworkManagerDispatcher.
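Such a dispatcher script might look like the sketch below. It assumes that NetworkManagerDispatcher runs the scripts in /etc/NetworkManager/dispatcher.d with the interface name and the action ("up" or "down") as arguments; the file name 20-ntpd and the RESTART_CMD variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical /etc/NetworkManager/dispatcher.d/20-ntpd
# Assumed calling convention: <script> <interface> <action>
# RESTART_CMD is an illustrative override hook, not an existing setting.
RESTART_CMD=${RESTART_CMD:-/sbin/service ntpd restart}

handle_event() {
    # $1 = interface name (unused here), $2 = action
    case "${2:-}" in
        up)
            # An interface came up: restart ntpd so it sees the network.
            $RESTART_CMD
            ;;
    esac
}

handle_event "$@"
```

Restarting only on "up" avoids pointless restarts when an interface goes away.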
I definitely agree with the "better workaround". Right now it is better not to start ntpd at all. In the situation in question ntpd is really only chewing cycles without contributing anything useful. The problem is that there are other issues of that sort. See bug 218237 for further examples. Am I missing some obstacle that prevents an earlier startup of NetworkManager (and the Dispatcher)?
What I proposed in the attachment to comment #1 still works fine with ntp-4.2.2p4-2.fc6. Only a closer look reveals that a network interface can be marked as UP with no address assigned. NetworkManager at work? Hence the 'connection_up' function in that attachment should be modified as follows:

    connection_up () {
        if [ -z "$1" ] ; then
            # no list given: any non-loopback interface with an address
            ip -o addr | grep -wv lo | grep -qw 'inet6*'
        else
            for iface in "$@" ; do
                ip -o addr show dev "$iface" 2>/dev/null \
                    | grep -qw 'inet6*' && return 0
            done
            return 1
        fi
    }

This, obviously, does not guarantee that at least some time servers will be reachable, but it is still much better than the current situation.
ntp-4.2.4 is finally in rawhide. Please give it a try and check whether everything is OK with NetworkManager. I will make an update for FC6 in a week or so.
Hm, I still have some doubts about "fixed" in ntp-4.2.4-3.fc6 (the current update). If the external step-tickers are not reachable when /etc/init.d/ntpd runs, which is the normal situation when NetworkManager is in use, then synchronization happens on the .LOCL. clock and ntpd starts without blocking for a long time. So far so good. Also, once external servers become reachable, I see that at some point they are indeed promoted to a better stratum and ntpd synchronizes to them. Even better. Thanks!

The problem is that if the difference between network time and the local clock is big enough, which is not that unusual, then after the initial synchronization on the local clock ntpd will give up, as designed, and we are left with the wrong time and manual intervention as the only remaining option. Am I missing something?
Try removing everything from the step-tickers file, so that ntpd is started with the -g option, and removing the local clock from ntp.conf.
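As a rough illustration of the ntp.conf part, the helper below prints a copy of the configuration with the local clock lines commented out. It assumes the local clock is configured through "server 127.127.1.0" and "fudge 127.127.1.0 ..." lines; the function name is made up, and the file itself is left untouched so the result can be reviewed first.

```shell
#!/bin/sh
# Sketch: comment_out_local_clock is a hypothetical helper. It writes
# the given ntp.conf to stdout with the "Undisciplined Local Clock"
# lines (assumed to reference 127.127.1.0) commented out.
comment_out_local_clock() {
    sed -e '/^[[:space:]]*server[[:space:]][[:space:]]*127\.127\.1\.0/s/^/#/' \
        -e '/^[[:space:]]*fudge[[:space:]][[:space:]]*127\.127\.1\.0/s/^/#/' \
        "$1"
}

# Possible use (review the output before installing it):
#   comment_out_local_clock /etc/ntp.conf > /tmp/ntp.conf.new
#   : > /etc/ntp/step-tickers    # empty the step-tickers file
```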
Yes, I see. Emptying the /etc/ntp/step-tickers file and dropping the "Undisciplined Local Clock" driver from /etc/ntp.conf should indeed work by forcing the use of '-g', and with that everything now reaches the desired state fairly quickly once ntp servers can be reached. It would not have been clear to me that I had to change my old configuration this way without what you wrote in comment #10. Thanks again!