Bug 240254

Summary:	ntpd does not obey -L or -I
Product:	[Fedora] Fedora	Reporter:	Curtis Doty <curtis>
Component:	ntp	Assignee:	Miroslav Lichvar <mlichvar>
Status:	CLOSED ERRATA	QA Contact:	Brian Brock <bbrock>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	6
Target Milestone:	---
Target Release:	---
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:	4.2.4p2-1.fc7	Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2007-06-29 14:03:59 UTC	Type:	---

Description Curtis Doty 2007-05-16 02:41:31 UTC

With either of the above switches set up properly in /etc/sysconfig/ntpd, the
daemon will eventually forget what it was told and start listening on other
interfaces and/or aliases. Could this be happening at the default 5-minute -U
update interval?

Comment 1 Curtis Doty 2007-05-17 00:05:38 UTC

An additional symptom is that /var/log/messages fills with periodic (5 minute
interval) chatter about "Deleting interface ..." and "Listening on interface
..." nonsense.

Oh...and also, these are all 802.1Q interfaces. HTH

Comment 2 Miroslav Lichvar 2007-05-17 14:00:09 UTC

Is ntpd really answering on other interfaces or aliases?

I couldn't reproduce that. When ntpd is started on console with -n -D 4, it
correctly outputs "ignore on (10) fd=23 from a.b.c.d".

The problem with spamming syslog on every interface update needs to be fixed though.

Comment 3 Curtis Doty 2007-05-17 15:32:46 UTC

Yes, in my test case with "-I eth0.1", it starts out fine...listening only on
that interface.

But then in exactly 5 minutes, it starts listening on all. It logs that change
showing all the other interfaces enabled. And does so every 5 minutes ad nauseam.

Comment 4 Miroslav Lichvar 2007-05-17 16:12:30 UTC

By listening you mean a socket listed in netstat output?

The manpage should have a better description. ntpd binds to all interfaces, even
when -I is specified. The option actually sets the interface on which ntpd
doesn't ignore packets. This prevents another ntp daemon from running and
tampering with the system clock.

The first 5 minutes, when ntpd doesn't have sockets on all interfaces, is a bug.

Comment 5 Curtis Doty 2007-05-21 13:20:05 UTC

Yes, netstat, but more specifically...

Upon first starting with -I eth0.8 it is indeed bound to that interface's addr.
Plus an entry for lo and the 0.0.0.0:123 blocker to prevent mischief.

# netstat -nulp |grep /ntpd |tr -s ' '
udp 0 0 10.0.8.1:123 0.0.0.0:* 11037/ntpd 
udp 0 0 127.0.0.1:123 0.0.0.0:* 11037/ntpd 
udp 0 0 0.0.0.0:123 0.0.0.0:* 11037/ntpd 

Seems sane enough.

Then in 5 minutes, after the first click of the -U update interval, it totally
changes to include specific bindings for all the other interfaces.

# netstat -nulp |grep /ntpd |tr -s ' '
udp 0 0 10.0.254.1:123 0.0.0.0:* 11037/ntpd 
udp 0 0 10.0.253.1:123 0.0.0.0:* 11037/ntpd 
udp 0 0 10.0.252.1:123 0.0.0.0:* 11037/ntpd 
... much more extra clutter here ...
udp 0 0 10.0.8.1:123 0.0.0.0:* 11037/ntpd 
udp 0 0 127.0.0.1:123 0.0.0.0:* 11037/ntpd 
udp 0 0 0.0.0.0:123 0.0.0.0:* 11037/ntpd 

On this host, there are *only* 802.1Q interfaces...and quite a few. So it's
just plain messy.
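
For anyone comparing the two snapshots above, here is a minimal Python sketch that diffs them. It assumes the squeezed six-column layout produced by the `tr -s ' '` pipeline shown here. The function names are made up for illustration, and the sample text contains only the lines actually pasted above (the elided entries are left out).

```python
def ntpd_bound_addresses(netstat_output):
    """Return the local address:port entries that belong to ntpd."""
    bound = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: proto recv-q send-q local foreign pid/program
        if (len(fields) >= 6
                and fields[0].startswith("udp")
                and fields[-1].endswith("/ntpd")):
            bound.append(fields[3])
    return bound


def newly_bound(before, after):
    """Addresses present in the later snapshot but not in the earlier one."""
    earlier = set(ntpd_bound_addresses(before))
    return [addr for addr in ntpd_bound_addresses(after) if addr not in earlier]


before = """\
udp 0 0 10.0.8.1:123 0.0.0.0:* 11037/ntpd
udp 0 0 127.0.0.1:123 0.0.0.0:* 11037/ntpd
udp 0 0 0.0.0.0:123 0.0.0.0:* 11037/ntpd
"""

after = """\
udp 0 0 10.0.254.1:123 0.0.0.0:* 11037/ntpd
udp 0 0 10.0.253.1:123 0.0.0.0:* 11037/ntpd
udp 0 0 10.0.252.1:123 0.0.0.0:* 11037/ntpd
udp 0 0 10.0.8.1:123 0.0.0.0:* 11037/ntpd
udp 0 0 127.0.0.1:123 0.0.0.0:* 11037/ntpd
udp 0 0 0.0.0.0:123 0.0.0.0:* 11037/ntpd
"""

# Sockets that appeared after the first interface update
print(newly_bound(before, after))
```

Feeding it two captures taken a little over 5 minutes apart should list exactly the bindings that appeared at the interface update.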

And I just noticed that ntpd doesn't even work with -I configured! All time
servers are stuck in .INIT. indefinitely.

Comment 6 Miroslav Lichvar 2007-05-21 14:28:01 UTC

Maybe it's messy, but that's how ntpd is designed. The first 5 minutes is a bug
and it's fixed by the same patch as the syslog spamming bug.

Can you provide more info about the problem with .INIT. state?

Also note that only the last -I option is used when more than one -I is
specified on the command line. That will be fixed in the next release too.

Comment 7 Curtis Doty 2007-05-21 16:23:14 UTC

Yes, I saw the bug with multiple -I options too. Thanks!

After running a single -I eth0.8 option for an hour:

ntpq> peer
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 go.through.ru   .INIT.          16 -    -  128    0    0.000    0.000   0.000
 tilia.zsx.hu    .INIT.          16 -    -  128    0    0.000    0.000   0.000
 a1.develooper.c .INIT.          16 -    -  128    0    0.000    0.000   0.000
*LOCAL(0)        .LOCL.          10 l   46   64  377    0.000    0.000   0.001
ntpq> readvar
assID=0 status=0544 leap_none, sync_local_proto, 4 events, event_peer/strat_chg,
version="ntpd 4.2.4p0 Wed Mar  7 18:52:44 UTC 2007 (1)",
processor="i686", system="Linux/2.6.20-1.2948.fc6", leap=00, stratum=11,
precision=-20, rootdelay=0.000, rootdispersion=11.641, peer=53903,
refid=LOCAL(0),
reftime=c9fc42bd.2a317b4f  Mon, May 21 2007  9:09:33.164, poll=10,
clock=c9fc42ed.f4eeea40  Mon, May 21 2007  9:10:21.956, state=4,
offset=0.000, frequency=-60.260, jitter=0.001, noise=0.001,
stability=0.000, tai=0
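
A rough Python sketch for spotting the stuck peers in a listing like the one above. It assumes the standard ten-column peers billboard with the one-character tally code kept in the first column, and it relies on the reach column being an octal shift register (377 means the last eight polls were all answered). The function name is hypothetical.

```python
def unreachable_peers(peers_output):
    """Return remotes whose refid is .INIT. and whose reach register is 0."""
    stuck = []
    for line in peers_output.splitlines():
        # Drop the tally code column (' ', '*', '+', ...), then split.
        fields = line[1:].split()
        # Anything without the ten billboard columns (prompt, separator) is skipped.
        if len(fields) != 10 or fields[0] == "remote":
            continue
        remote, refid, reach = fields[0], fields[1], fields[6]
        if refid == ".INIT." and int(reach, 8) == 0:
            stuck.append(remote)
    return stuck


sample = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 go.through.ru   .INIT.          16 -    -  128    0    0.000    0.000   0.000
 tilia.zsx.hu    .INIT.          16 -    -  128    0    0.000    0.000   0.000
 a1.develooper.c .INIT.          16 -    -  128    0    0.000    0.000   0.000
*LOCAL(0)        .LOCL.          10 l   46   64  377    0.000    0.000   0.001
"""

print(unreachable_peers(sample))
```

With the listing from this comment, all three pool servers are reported and LOCAL(0) is not.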

Is there anything else I should debug? The syslogs show only the usual 5-minute
interface spam. I really do suspect a local config issue. Are you able to
test/verify a vlan-interfaces-only setup?

Comment 8 Miroslav Lichvar 2007-05-21 17:11:26 UTC

ATM I can't. But I'm not sure it's related to the problem.

Does it work without -I option? Does ntpdate with a public server work?

If yes, please start ntpd from the command line with -I eth0.8 -n -D 4, wait a
few minutes and attach the output here.

Comment 9 Curtis Doty 2007-05-21 19:09:22 UTC

Yes it works fine without -I. And ntpdate works fine too.

And aha! Debugging the daemon in a console yields a couple interesting clues:

Found no interface for address 209.151.236.226 - returning wildcard 
newpeer: local interface currently not bound
newpeer: <null>->209.151.236.226 mode 3 vers 4 poll 6 10 flags 0x8001 0x1 ttl 0 key 00000000

In this case, the default route is required to access the *.pool.ntp.org
servers. But ntpd isn't listening on that interface. I guess it cannot therefore
pick a source address for the client queries?
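
To check that guess from outside ntpd, here is a small Python sketch. It asks the kernel which local IPv4 address it would choose for a given destination and compares that with the addresses ntpd is allowed to use. This is not ntpd's own interface lookup, only the kernel's routing decision. The helper names and the example address set are assumptions for illustration.

```python
import socket


def source_address_for(dest, port=123):
    """Ask the kernel which local IPv4 address it would use to reach dest."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        # connect() on a UDP socket only fixes the route; no packet is sent.
        # It raises OSError if there is no route to dest.
        s.connect((dest, port))
        return s.getsockname()[0]


def is_covered(dest, allowed_addresses):
    """True if the kernel-chosen source address is one ntpd may use."""
    return source_address_for(dest) in allowed_addresses


# Hypothetical set: the -I interface address plus loopback, as in the
# first netstat snapshot in this report.
allowed = {"10.0.8.1", "127.0.0.1"}

# Loopback is used here so the example runs without a default route.
print(source_address_for("127.0.0.1"), is_covered("127.0.0.1", allowed))
```

On the affected host, calling `is_covered("209.151.236.226", allowed)` would presumably return False if the default route leaves through an interface other than eth0.8, which would match the "Found no interface for address" message.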

If it's any consolation, this host has net.ipv4.icmp_errors_use_inbound_ifaddr=1
and it's where I caught this kernel bug:
http://git.kernel.org/?p=linux/kernel/git/davem/net-2.6.git;a=commit;h=d8cf27287ac7fb5cbfcc4139917a997c39d841ca
A different protocol, but a curiously similar conundrum: setting the source
address when multi-homed.

Maybe this isn't really a bug, but an RFE that should swim upstream?

Comment 10 Miroslav Lichvar 2007-05-22 08:46:24 UTC

Ok, that's not a bug. The restriction also applies to outgoing packets, and
that's a protocol requirement according to the following comment in upstream
bugzilla:

https://ntp.isc.org/bugs/show_bug.cgi?id=830#c5

Comment 11 Fedora Update System 2007-06-21 20:02:44 UTC

ntp-4.2.4p2-1.fc7 has been pushed to the Fedora 7 testing repository.  If problems still persist, please make note of it in this bug report.

Comment 12 Fedora Update System 2007-06-29 14:03:53 UTC

ntp-4.2.4p2-1.fc7 has been pushed to the Fedora 7 stable repository.  If problems still persist, please make note of it in this bug report.