Bug 1116474 - ntpd prints errors because of resolving before network is available
Summary: ntpd prints errors because of resolving before network is available
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: ntp
Version: 20
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
Assignee: Miroslav Lichvar
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: dualstack
Reported: 2014-07-04 23:04 UTC by Christian Stadelmann
Modified: 2015-06-29 21:28 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-06-29 21:28:22 UTC
Type: Bug
Embargoed:


Description Christian Stadelmann 2014-07-04 23:04:49 UTC
Description of problem:
Right after boot (and before any network connection is available according to kernel/journald), ntpd tries to establish connections to ntp servers. This must always fail, and it always produces errors in journald like these:
{date} {time} {hostname} ntpd_intres[{pid}]: host name not found: {hostname}

Note that this bug is not critical since ntpd continues to work – it just spams the journal.

Version-Release number of selected component (if applicable):
kernel 3.14, 3.15
ntpd 4.2.6p5
systemd 208

How reproducible:
always, on different computers with different network adapters (including LAN vs. WLAN)

Steps to Reproduce:
1. boot up
2. read logs using journalctl

Actual results:
ntpd prints error messages to journal because it cannot connect to ntp servers

Expected results:
ntpd should wait for a network connection before it tries to connect to ntp servers. If ntpd fails because of no network available it should log that no network is available.


Additional info:
According to the systemd service files, ntpdate.service is to be started after network.target and ntpd.service is to be started after ntpdate.service.
To make this work, ntpdate.service/ntpd.service needs to run after network-online.target (instead of network.target). This may cause trouble on offline machines where ntpd must fail.

Comment 1 Miroslav Lichvar 2014-07-07 09:39:56 UTC
The reason why ntpd is started so early is that it can use reference clocks (no network connection necessary) and also to minimize the time when the frequency of the system clock isn't set to the previously estimated drift.

To me, it would make more sense to try to disable or delay the "host name not found" message.

It looks like the message is printed only when getaddrinfo() returns EAI_NONAME. With EAI_AGAIN, it will not print the message and I think that's what it used to do earlier.

Pavel, do you know if it's correct for getaddrinfo() to return EAI_NONAME when /etc/resolv.conf is empty or the servers are unreachable?
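
The distinction between EAI_NONAME and EAI_AGAIN described in this comment can be illustrated with a minimal sketch. This is not ntpd's actual code: the helper names should_log_lookup_failure() and try_resolve() are hypothetical, and the policy (log only on EAI_NONAME, stay quiet on everything else including EAI_AGAIN) simply mirrors the behaviour described above.

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not from ntpd): decide whether a getaddrinfo()
 * error code should produce a "host name not found" log line.
 * Only EAI_NONAME is reported; EAI_AGAIN (temporary failure) and all
 * other codes are treated as "retry quietly later". */
int should_log_lookup_failure(int gai_err)
{
    return gai_err == EAI_NONAME;
}

/* Hypothetical helper: resolve a host for UDP port 123 and log according
 * to the policy above. Returns the raw getaddrinfo() error code
 * (0 on success). */
int try_resolve(const char *host)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int err;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;     /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_DGRAM;  /* NTP uses UDP */

    err = getaddrinfo(host, "123", &hints, &res);
    if (err == 0) {
        freeaddrinfo(res);
    } else if (should_log_lookup_failure(err)) {
        fprintf(stderr, "host name not found: %s (%s)\n",
                host, gai_strerror(err));
    }
    return err;
}
```

With such a policy, a resolver that reports EAI_AGAIN while the network is still down would cause no log output, whereas one that reports EAI_NONAME in the same situation would produce the message from the description.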

Comment 2 Pavel Šimerda (pavlix) 2014-07-09 08:43:55 UTC
(In reply to Miroslav Lichvar from comment #1)
> Pavel, do you know if it's correct for getaddrinfo() to return EAI_NONAME
> when /etc/resolv.conf is empty or the servers are unreachable?

POSIX[1] is not very specific here. And I don't believe the RFCs much, but we'll have to look there as well. I think this should be defined under the ongoing glibc name resolution redesign.

[1] http://linux.die.net/man/3/getaddrinfo

As for listening services with known IP addresses (a slightly different topic), my opinion is clear. The service should use either the wildcard address or IP_FREEBIND. With services communicating with the outside world, actual connectivity and working DNS resolution is needed. A service like ntp which can be used in any situation, connections coming and going, connectivity changing, I would say that failures may be just as regular as success.

There's a special case of service boot (typically on system boot) where similar networking services typically wait for connectivity. If you want a service that starts at early boot but accommodates a later situation, it might want to integrate with the systemd boot by requesting network-online.target and delay network activity (msekleta could tell you more). Otherwise you probably have to live with the errors.

This is not recommended for services started by default, as network-online.target extends the overall boot time (the time until all boot time services are started).

Comment 3 Orion Poplawski 2014-09-30 16:23:17 UTC
ntpd isn't being started particularly early if it waits for ntpdate to finish, so you aren't getting any drift benefit.  So it seems to me that ntpdate.service should have After=network-online.target.  Those folks who run directly attached reference clocks should disable ntpdate.service, which is not an unreasonable expectation for such a special configuration.  I see you have nss-lookup.target there, but that does not appear to be sufficient and I don't see it being activated on my system.

Also, the syslog.target has been deprecated and should be removed from the After= line.

Comment 4 Miroslav Lichvar 2014-10-01 12:52:54 UTC
Good point, ntpdate service should depend on the network-online target, but ntpd is normally used without ntpdate.

Comment 5 Orion Poplawski 2014-10-01 16:45:14 UTC
Also, the dependency according to systemd.special(7) should be added to "Wants="
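
The ordering discussed in comments 3 to 5 could be expressed roughly as the following unit fragment. This is only a sketch and not the change that was shipped in the package: the drop-in path in the comment is an assumption, while the Wants= plus After= pairing follows the systemd.special(7) recommendation for network-online.target.

```ini
# Hypothetical drop-in, e.g. /etc/systemd/system/ntpdate.service.d/network-online.conf
[Unit]
# Pull network-online.target into the boot transaction ...
Wants=network-online.target
# ... and start this unit only after it has been reached.
After=network-online.target
```

After= alone only orders the units if network-online.target is already part of the transaction, so Wants= is what actually requests it.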

Comment 6 Miroslav Lichvar 2014-11-04 16:48:00 UTC
The ntpdate and sntp services in ntp-4.2.6p5-24.fc22 were updated to use the network-online.target. The code in the ntpdate wrapper that ran ntpdate several times in exponentially increasing intervals was removed. The first try should now always work.

Comment 7 Orion Poplawski 2014-11-06 03:51:07 UTC
Looks good here.

Comment 8 Fedora End Of Life 2015-05-29 12:17:57 UTC
This message is a reminder that Fedora 20 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 20. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '20'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 20 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 9 Fedora End Of Life 2015-06-29 21:28:22 UTC
Fedora 20 changed to end-of-life (EOL) status on 2015-06-23. Fedora 20 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

