Description of problem:
Any non-trivial cluster should have time synchronization turned on for failover and other cluster operations to work correctly. It also helps with debugging via system logs, etc. The OCP installer should default the openshift_clock_enabled variable to True.

Version-Release number of selected component (if applicable):
3.3.0.23
I think the intention of the original PR was to have this enabled by default: https://github.com/openshift/openshift-ansible/pull/1672#issuecomment-205493798 I think it's just a bug.
It's actually enabled everywhere by default. Should we add a forced sync early in the process?
On our 3.3 deployment ntp was not properly installed and enabled, which caused clock skew, which in turn triggered a false fail-over condition.
We'd need more info on that deployment to debug the issue. It's enabled by default and has been in all tags since early June. We need a reproducer.
Created attachment 1206909: Inventory to reproduce this issue
To reproduce (I used 3.4.0.5):
1. yum uninstall ntp
2. Run the ansible installer on an inventory with no openshift_clock variable (sample attached from my install)
3. After the install, run: yum list installed | grep ntp

There will be no ntp installed.
Sorry, "yum remove ntp"
Presumably only happens when chrony rpm is pre-installed. From an affected host:

root@dhcp5-210: ~ # rpm -q chrony
chrony-2.1.1-3.el7.x86_64
root@dhcp5-210: ~ # timedatectl
      Local time: Mon 2016-10-03 11:30:14 EDT
  Universal time: Mon 2016-10-03 15:30:14 UTC
        RTC time: Mon 2016-10-03 15:30:14
       Time zone: America/New_York (EDT, -0400)
     NTP enabled: yes
NTP synchronized: no
 RTC in local TZ: no
      DST active: yes
 Last DST change: DST began at
                  Sun 2016-03-13 01:59:59 EST
                  Sun 2016-03-13 03:00:00 EDT
 Next DST change: DST ends (the clock jumps one hour backwards) at
                  Sun 2016-11-06 01:59:59 EDT
                  Sun 2016-11-06 01:00:00 EST
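For anyone checking other hosts for the same state, here is a minimal sketch in Python that parses timedatectl output and flags the "NTP enabled: yes / NTP synchronized: no" combination seen on this host. The function names are made up for illustration, and the field labels are the ones this host printed; other systemd versions may label them differently.

```python
# Sketch only: parse_timedatectl and ntp_enabled_but_unsynced are
# hypothetical helper names, not part of any installer or library.

SAMPLE = """\
      Local time: Mon 2016-10-03 11:30:14 EDT
  Universal time: Mon 2016-10-03 15:30:14 UTC
        RTC time: Mon 2016-10-03 15:30:14
       Time zone: America/New_York (EDT, -0400)
     NTP enabled: yes
NTP synchronized: no
 RTC in local TZ: no
      DST active: yes
"""


def parse_timedatectl(text):
    """Turn 'Label: value' lines into a dict; lines without ': ' are skipped."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def ntp_enabled_but_unsynced(fields):
    """True when NTP is reported as enabled but the clock is not synchronized."""
    return (
        fields.get("NTP enabled") == "yes"
        and fields.get("NTP synchronized") == "no"
    )


if __name__ == "__main__":
    status = parse_timedatectl(SAMPLE)
    print("enabled but unsynced:", ntp_enabled_but_unsynced(status))
```

On a live host the text could come from running timedatectl through subprocess instead of the embedded sample.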
I will retest with no pre-installed chrony or ntp. Unfortunately, the original cluster this was reported on is long gone, so we can't do any forensic analysis there.
Comment #10 is for a different environment and in that case the host had a firewall blocking access to the configured nameservers.
Timeservers, not nameservers. *sigh*
This is working as described in comment 10. If neither ntp nor chrony is installed, ntp gets installed and configured. Will watch for this on the next horizontal cluster install. As I said, we no longer have access to CNCF to try to reconstruct what happened there.
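To summarize the behavior observed in this thread, here is a rough Python sketch of the package decision. It is not the installer's actual code: the function name is hypothetical, and the "install nothing when either package is already present" branch is inferred from the comments above rather than read from the playbooks.

```python
# Sketch of the behavior observed in this bug, not the openshift-ansible
# implementation. time_packages_to_install is a hypothetical name.

def time_packages_to_install(installed):
    """Return the time-sync packages a run would add, given installed RPM names."""
    # Assumption from the thread: a pre-installed chrony (or ntp) is left alone.
    if "ntp" in installed or "chrony" in installed:
        return []
    # Observed: with neither present, ntp gets installed and configured.
    return ["ntp"]


if __name__ == "__main__":
    print(time_packages_to_install(set()))       # neither present
    print(time_packages_to_install({"chrony"}))  # chrony pre-installed
```

Note that a host with chrony already present can still end up unsynchronized if its configured timeservers are unreachable, as in the firewall case mentioned earlier.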