Bug 1388759 - Deferred name resolving of NTP servers fails [NEEDINFO]
Summary: Deferred name resolving of NTP servers fails
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: ntp
Version: 7.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Miroslav Lichvar
QA Contact: qe-baseos-daemons
URL:
Whiteboard:
Depends On:
Blocks: 1477664
Reported: 2016-10-26 06:35 UTC by Frank Büttner
Modified: 2021-09-09 11:59 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-11-16 16:43:52 UTC
Target Upstream Version:
mlichvar: needinfo?


Attachments
The stack traced. (90.00 KB, application/x-tar)
2017-04-20 05:27 UTC, Frank Büttner

Description Frank Büttner 2016-10-26 06:35:58 UTC
Description of problem:
ntp starts before the network is available.

Version-Release number of selected component (if applicable):
ntp-4.2.6p5-22

How reproducible:
Every time


Steps to Reproduce:
1. boot the server


Actual results:
The network comes up only after ntp has started.

Expected results:
ntp starts after the network is up.

Additional info:
The journal then fills up with:
ntpd_intres[3379]: recv() fails: No route to host
Only a systemctl restart ntpd stops the flood and gets ntpd working again.

temporary fix:
create /etc/systemd/system/ntpd.service.d/network.conf
with the content:
[Unit]
After=network.target syslog.target ntpdate.service sntp.service
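
The workaround above can be applied with a few shell commands. The following is a minimal sketch, not a tested fix: the function name write_ntpd_dropin is made up for illustration, and the directory argument exists only so the function can be tried against a scratch directory first.

```shell
#!/bin/sh
# write_ntpd_dropin [DIR]  (hypothetical helper name)
# Writes the drop-in from this report into DIR. DIR defaults to the
# path named in the report.
write_ntpd_dropin() {
    unit_dir="${1:-/etc/systemd/system/ntpd.service.d}"
    mkdir -p "$unit_dir" || return 1
    cat > "$unit_dir/network.conf" <<'EOF'
[Unit]
After=network.target syslog.target ntpdate.service sntp.service
EOF
}

# On a real system (as root) one would then run:
#   write_ntpd_dropin
#   systemctl daemon-reload
#   systemctl restart ntpd
```

systemd reads drop-in files only when its configuration is reloaded, so the daemon-reload step is needed before the new ordering takes effect.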

Comment 2 Vipul Agarwal 2016-10-26 19:16:12 UTC
I ran into this problem. The ntpd service has no dependency on the network or NetworkManager service. However, it does have an ordering dependency on ntpdate, which in turn depends on the network service. ntpdate is not enabled by default.

On boot (with only ntpd enabled), the boot order is wrong, with ntpd starting before the network service: http://vagarwal.net/f/systemd-analyze_plot-problem.svg

If one enables the ntpdate service, the boot order is fixed and the problem is resolved:
http://vagarwal.net/f/systemd-analyze_plot-fixed.svg
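
The ordering described in this comment can also be checked without a boot plot. The following is a minimal sketch: the helper has_after_dep is a made-up name, and it assumes the "After=unit1 unit2 ..." line format that systemctl show -p After prints.

```shell
#!/bin/sh
# has_after_dep PROPERTY_LINE UNIT  (hypothetical helper name)
# Succeeds if UNIT is listed in a line such as
# "After=syslog.target ntpdate.service".
has_after_dep() {
    deps="${1#After=}"          # strip the property name
    for dep in $deps; do        # unit names contain no whitespace
        [ "$dep" = "$2" ] && return 0
    done
    return 1
}

# Example on a live system:
#   line=$(systemctl show -p After ntpd.service)
#   has_after_dep "$line" network.target && echo "ordered after network.target"
```

systemd-analyze critical-chain ntpd.service gives a related view of what the unit actually waited for during the last boot.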

Comment 3 Miroslav Lichvar 2016-10-27 11:50:38 UTC
The ntpd service doesn't wait for network, because ntpd can be useful without network (e.g. with reference clocks) and also to not delay restoring the frequency of the system clock from the driftfile.

I don't see any recv() errors reported by ntpd in my log, just deferred resolving of hostnames. Can you please post your ntp.conf?

Comment 4 Frank Büttner 2016-10-28 04:42:09 UTC
It contains only these lines:

server foo.server
server bar.server
server xxx.server

Comment 8 Miroslav Lichvar 2017-04-13 14:45:55 UTC
(In reply to Frank Büttner from comment #0)
> The journal then fills up with:
> ntpd_intres[3379]: recv() fails: No route to host

This error indicates the resolving process of ntpd is not able to connect to 127.0.0.1. Does the loopback interface have an unusual configuration, e.g. no IPv4 address?
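
One way to answer the loopback question mechanically is to look for an IPv4 address in the output of ip. This is a minimal sketch under the assumption that the usual "inet ADDRESS/PREFIX" line format of ip addr is present; the function name lo_has_ipv4 is made up for illustration.

```shell
#!/bin/sh
# lo_has_ipv4  (hypothetical helper name)
# Reads `ip addr show dev lo` style text on stdin and succeeds if it
# contains an IPv4 address in 127.0.0.0/8.
lo_has_ipv4() {
    grep -Eq '^[[:space:]]*inet 127\.[0-9]+\.[0-9]+\.[0-9]+/'
}

# Example on a live system:
#   ip -4 addr show dev lo | lo_has_ipv4 && echo "lo has an IPv4 address"
```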

Comment 9 Frank Büttner 2017-04-18 05:29:38 UTC
No, it looks normal:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever

Comment 10 Miroslav Lichvar 2017-04-18 11:18:17 UTC
Interesting. I'm assuming ntpd is configured to listen on all interfaces (which is the default). Does the following command work for you?

# ntpdc -c authinfo 127.0.0.1

Comment 11 Frank Büttner 2017-04-19 07:07:13 UTC
It results in:
ntpdc -c authinfo 127.0.0.1
time since reset:     150
stored keys:          1
free keys:            11
key lookups:          9
keys not found:       1
uncached keys:        2
encryptions:          0
decryptions:          4
expired keys:         0

Comment 12 Miroslav Lichvar 2017-04-19 15:44:21 UTC
Hm, I'm running out of ideas. There is only one report for this error, so it's most likely something specific to the configuration of the machine.

I'm not sure how useful this will really be, but can you please consider running ntpd in strace and posting the logs? If you change the ExecStart line in the ntpd unit file to

/usr/bin/strace -ff -ttt -o /tmp/ntpd.strace /usr/sbin/ntpd -u ntp:ntp

the logs should be saved in /tmp/systemd-private-*-ntpd.service-*/tmp.
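
With -ff and -o, strace writes one file per process, named after the -o argument plus the PID. To find which process hit the error, one can search those files for EHOSTUNREACH, the errno behind "No route to host". This is a minimal sketch; the function name is made up, and the directory has to be filled in from the actual systemd-private path on the machine.

```shell
#!/bin/sh
# traces_with_ehostunreach DIR PREFIX  (hypothetical helper name)
# Prints the names of the per-process trace files DIR/PREFIX.<pid>
# that contain at least one EHOSTUNREACH result.
traces_with_ehostunreach() {
    grep -l 'EHOSTUNREACH' "$1"/"$2".* 2>/dev/null
}

# Example, with DIR set to the matching
# /tmp/systemd-private-*-ntpd.service-*/tmp directory:
#   traces_with_ehostunreach "$DIR" ntpd.strace
```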

Comment 13 Frank Büttner 2017-04-20 05:27:19 UTC
Created attachment 1272853
The stack traced.

I have created the strace logs.
I hope they help.

Comment 14 Miroslav Lichvar 2017-04-20 13:36:37 UTC
Thanks. Except for the recv() error, I don't see anything wrong in the logs. The main ntpd process is bound to 127.0.0.1:123 as expected. The name resolving process is trying to send messages to that port, but for some reason it fails and the main process doesn't get any messages.

Maybe it is an issue with the firewall configuration? If there was a rule using the owner match, it might explain why ntpq/ntpdc can connect to ntpd, but ntpd itself cannot. Does it work when the firewall is disabled?
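
A quick way to look for such rules is to filter the output of iptables-save for the owner match. This is a minimal sketch: the function name owner_rules is made up, and it assumes the "-m owner" spelling that iptables-save uses for that match.

```shell
#!/bin/sh
# owner_rules  (hypothetical helper name)
# Reads iptables-save output on stdin and prints only the rules
# that use the owner match.
owner_rules() {
    grep -E -e '-m owner( |$)'
}

# Example on a live system (as root):
#   iptables-save | owner_rules
```

No output would mean that no rule in the saved ruleset uses the owner match, which would make this explanation unlikely.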

Comment 15 Frank Büttner 2017-04-21 09:02:03 UTC
firewalld is not used here.
Only iptables is used, and for the lo interface all traffic is allowed.

Comment 17 Orion Poplawski 2017-06-30 19:47:32 UTC
Even ntpdate can fail because it can start before the network is up:

Jun 26 15:36:12 aspen ntpdate[641]: Can't find host ntp.cora.nwra.com: Name or service not known (-2)
Jun 26 15:36:23 aspen systemd: Reached target Network is Online.

It should have After=network-online.target instead of network.target, see https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/

Comment 18 Orion Poplawski 2017-06-30 20:19:15 UTC
Ah, I see. /usr/libexec/ntpdate-wrapper keeps trying until the network is up. Seems like that is just a workaround though. Sorry to hijack this report - thought it was more pertinent initially.
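
The retry approach mentioned here can be illustrated with a generic loop. This is only a sketch of the idea and not the contents of /usr/libexec/ntpdate-wrapper; the function name retry_until_ok and the try count and delay in the example are assumptions.

```shell
#!/bin/sh
# retry_until_ok TRIES DELAY CMD [ARGS...]  (hypothetical helper name)
# Runs CMD until it succeeds, at most TRIES times, sleeping DELAY
# seconds between attempts. Returns 0 on success, 1 if every try failed.
retry_until_ok() {
    tries="$1"; delay="$2"; shift 2
    n=1
    while :; do
        "$@" && return 0
        [ "$n" -ge "$tries" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
}

# Example: wait for name resolution before doing anything else.
#   retry_until_ok 10 2 getent hosts ntp.cora.nwra.com
```

Ordering the unit after network-online.target, as suggested in comment 17, moves the waiting into systemd instead of a loop like this.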

Comment 19 Tomáš Hozza 2018-11-16 16:43:52 UTC
Feel free to reopen with reproducer.

