Bug 216351
Summary: | ntpd starts "too early" | |
---|---|---|---
Product: | [Fedora] Fedora | Reporter: | Michal Jaegermann <michal>
Component: | ntp | Assignee: | Miroslav Lichvar <mlichvar>
Status: | CLOSED ERRATA | QA Contact: | Brian Brock <bbrock>
Severity: | medium | Priority: | medium
Version: | 6 | CC: | jbarnes
Hardware: | All | OS: | Linux
Doc Type: | Bug Fix | Last Closed: | 2007-02-06 16:34:23 UTC
Description
Michal Jaegermann
2006-11-19 20:10:35 UTC
Created attachment 141594: modifies startup file for ntpd
This is a known problem, reported as bug #146884 and bug #206127. It will be fixed when a new version of ntp (ntp-4.2.4) is released, as it will be able to handle dynamic interfaces.

*** Bug 217423 has been marked as a duplicate of this bug. ***

Now it looks like ntp-4.2.2p4-2.fc6 **really** broke ntpd if your network connection may show up later (because, for example, it is handled by NetworkManager). True, in the startup sequence one sees "OK" from ntpd immediately even if the network is down. This gives the .LOCL. clock only, which does not buy very much so far. Once the network becomes active, the other servers are reported by 'ntpq -pn' and it appears that we are on our way, but this is really an illusion.

When starting with a network present, those other servers are stratum, say, 2 or 3 and 127.127.1.0 is 10, so we are really synchronizing. If the network shows up later, the local clock is still stratum 10 but all other servers are stratum 16 and they stay that way. "reach", "delay", "offset" and "jitter" are all fixed at 0, so these other servers are really as good as dead. I waited for over 35 minutes, which is ridiculously long, and nothing changed. It also appears that an initial 'ntpdate' sync is lost, so if your clock is outside drift limits it will never get synchronized.

For now a possible workaround seems to be to leave some process running which will check whether we are really getting some servers with a stratum lower than 10 and repeatedly restart ntpd if this is not the case. Sigh! Restarting ntpd with the network active immediately shows outside servers which are really consulted for time.

It was always like this. A better workaround would be restarting the ntp daemon in a script executed from NetworkManagerDispatcher.

I definitely agree with a "better workaround". Right now it is better not to start ntpd at all. In the situation in question ntpd is really only chewing cycles without contributing anything useful.
The problem is that there are other issues of that sort. See bug 218237 for further examples. Am I missing some obstacle preventing an earlier startup of NetworkManager (and Dispatcher)?

What I proposed in an attachment to comment #1 still works fine with ntp-4.2.2p4-2.fc6. Only a closer look reveals that a network interface can be marked as UP with no address assigned. NetworkManager at work? Hence the 'connection_up' function in that attachment should be modified as follows:

    connection_up () {
        if [ -z "$1" ] ; then
            # no interface named: any non-loopback interface with an address will do
            ip -o addr | grep -wv lo | grep -qw 'inet6*'
        else
            # succeed as soon as one of the named interfaces has an address
            for iface in "$@" ; do
                ip -o addr show dev "$iface" 2>/dev/null \
                    | grep -qw 'inet6*' && return
            done
        fi
    }

This, obviously, does not guarantee that at least some time servers will be reachable, but it is still much better than the current situation.

ntp-4.2.4 is finally in rawhide. Please give it a try and check whether everything is ok with NetworkManager. I will make an update for FC6 in a week or so.

Hm, I still have some doubts about "fixed" in ntp-4.2.4-3.fc6 (the current updates). If external step-tickers are not reachable when /etc/init.d/ntpd runs, which is a normal situation when NetworkManager is in use, then synchronization happens on the .LOCL. clock, ntpd starts without blocking there for a long time, and so far so good. Also, when after some time external servers become reachable, I see that indeed at some moment they are promoted to a higher stratum and ntpd synchronizes there. Even better. Thanks!

The problem is that if the difference between network time and the local clock is big enough, which is not that unusual, then after an initial synchronization on the local clock ntpd will give up, as designed, and we will be left with a wrong time and manual intervention as the only remaining option. Am I missing something?

Try removing everything from the step-tickers file, so ntpd will be started with the -g option, and removing the local clock from ntp.conf.

Yes, I see.
Dropping the /etc/ntp/step-tickers file and the "Undisciplined Local Clock" driver in /etc/ntp.conf indeed should work by forcing the use of '-g', and currently all this gets into the desired state relatively shortly once ntp servers can be reached. It would not have been that clear to me that I have to change my old configuration that way without what you wrote in comment #10. Thanks again!
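
For reference, the polling workaround mentioned earlier in this thread (check whether any server has a stratum lower than 10 and restart ntpd otherwise) could be sketched roughly as below. This is only an illustration, not what was shipped in any ntp package: the function name `has_synced_peer` is made up here, and it assumes the usual 'ntpq -pn' column layout, with the stratum ("st") in the third column.

```sh
#!/bin/sh
# has_synced_peer (hypothetical helper): reads 'ntpq -pn' output on stdin
# and succeeds if at least one listed peer reports a stratum below 10.
# Assumption: the stratum is the third whitespace-separated column.
has_synced_peer () {
    awk '
        # skip the header, the "====" ruler and anything else without
        # a numeric stratum in column 3
        $3 !~ /^[0-9]+$/ { next }
        # the local clock (stratum 10) and dead peers (stratum 16) do not count
        $3 + 0 < 10 { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}
```

Something like `ntpq -pn | has_synced_peer || /sbin/service ntpd restart`, run periodically or from a NetworkManagerDispatcher script, would then restart the daemon only while no usable server is present. With ntp-4.2.4 and the '-g' setup described above this should not be needed.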