+++ This bug was initially created as a clone of Bug #1429222 +++
+++ This bug was initially created as a clone of Bug #1429221 +++
Description of problem:
After rebooting overcloud nodes, the system time is no longer NTP synchronized. This could eventually lead to problems for the clustered services.
For example, the Ceph status reports a clock skew after the controller nodes have been rebooted.
Version-Release number of selected component (if applicable):
My tests (during upgrade testing) show that this affects both OSP10 and OSP11.
How reproducible:
100%
Steps to Reproduce:
1. Deploy overcloud with 3 controllers
2. Reboot one of the controllers
3. Wait for the controller to come back up
4. Check timedatectl
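The check in step 4 can be scripted. This is a minimal sketch, assuming the timedatectl output format quoted in this report; newer systemd releases may label the field differently. The function name ntp_synchronized is made up for illustration.

```shell
# Return 0 if the timedatectl output on stdin contains "NTP synchronized: yes".
# Assumption: field name and layout match the output pasted in this report.
ntp_synchronized() {
    grep -Eq '^[[:space:]]*NTP synchronized:[[:space:]]*yes[[:space:]]*$'
}
```

On a controller it could be used as `timedatectl | ntp_synchronized && echo "synchronized" || echo "NOT synchronized"`.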
Actual results:
[root@overcloud-controller-0 heat-admin]# timedatectl
Local time: Sun 2017-03-05 17:40:49 UTC
Universal time: Sun 2017-03-05 17:40:49 UTC
RTC time: Sun 2017-03-05 17:40:49
Time zone: n/a (UTC, +0000)
NTP enabled: yes
NTP synchronized: no
RTC in local TZ: no
DST active: n/a
Expected results:
NTP synchronized: yes
Additional info:
Before the controller reboot, ntpd was running and chronyd was not. After the reboot, ntpd is not running and chronyd is:
Before reboot:
[root@overcloud-controller-1 ~]# timedatectl
Local time: Sun 2017-03-05 17:41:57 UTC
Universal time: Sun 2017-03-05 17:41:57 UTC
RTC time: Sun 2017-03-05 17:41:56
Time zone: n/a (UTC, +0000)
NTP enabled: yes
NTP synchronized: yes
RTC in local TZ: no
DST active: n/a
[root@overcloud-controller-1 ~]# systemctl status ntpd
● ntpd.service - Network Time Service
Loaded: loaded (/usr/lib/systemd/system/ntpd.service; enabled; vendor preset: disabled)
Active: active (running) since Sun 2017-03-05 17:01:43 UTC; 40min ago
Main PID: 29574 (ntpd)
CGroup: /system.slice/ntpd.service
└─29574 /usr/sbin/ntpd -u ntp:ntp -g
Mar 05 17:05:00 overcloud-controller-1.localdomain ntpd[29574]: 0.0.0.0 c614 04 freq_mode
Mar 05 17:05:01 overcloud-controller-1.localdomain ntpd[29574]: 0.0.0.0 c618 08 no_sys_peer
Mar 05 17:05:09 overcloud-controller-1.localdomain ntpd[29574]: Listen normally on 25 vlan200 10.0.0.17 UDP 123
Mar 05 17:05:09 overcloud-controller-1.localdomain ntpd[29574]: new interface(s) found: waking up resolver
Mar 05 17:13:22 overcloud-controller-1.localdomain ntpd[29574]: Listen normally on 26 gre_sys fe80::4c33:66ff:fe1f:160d UDP 123
Mar 05 17:13:22 overcloud-controller-1.localdomain ntpd[29574]: new interface(s) found: waking up resolver
Mar 05 17:20:20 overcloud-controller-1.localdomain ntpd[29574]: 0.0.0.0 c612 02 freq_set kernel -26.091 PPM
Mar 05 17:20:20 overcloud-controller-1.localdomain ntpd[29574]: 0.0.0.0 c615 05 clock_sync
Mar 05 17:35:33 overcloud-controller-1.localdomain ntpd[29574]: Listen normally on 27 vlan200 10.0.0.10 UDP 123
Mar 05 17:35:33 overcloud-controller-1.localdomain ntpd[29574]: new interface(s) found: waking up resolver
[root@overcloud-controller-1 ~]# systemctl status chronyd
● chronyd.service - NTP client/server
Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Sun 2017-03-05 17:01:43 UTC; 40min ago
Main PID: 707 (code=exited, status=0/SUCCESS)
Mar 05 16:51:01 localhost.localdomain chronyd[707]: chronyd version 2.1.1 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +DEBUG +ASYNCDNS +IPV6 +SECHASH)
Mar 05 16:51:01 localhost.localdomain chronyd[707]: Generated key 1
Mar 05 16:51:01 localhost.localdomain systemd[1]: Started NTP client/server.
Mar 05 16:52:41 overcloud-controller-1.localdomain chronyd[707]: Source 216.229.0.50 offline
Mar 05 16:52:41 overcloud-controller-1.localdomain chronyd[707]: Source 2001:470:1f07:e17:250:43ff:fed4:102d offline
Mar 05 16:52:41 overcloud-controller-1.localdomain chronyd[707]: Source 66.228.59.187 offline
Mar 05 16:52:41 overcloud-controller-1.localdomain chronyd[707]: Source 104.131.53.252 offline
Mar 05 17:01:43 overcloud-controller-1.localdomain systemd[1]: Stopping NTP client/server...
Mar 05 17:01:43 overcloud-controller-1.localdomain chronyd[707]: chronyd exiting
Mar 05 17:01:43 overcloud-controller-1.localdomain systemd[1]: Stopped NTP client/server.
After reboot:
[root@overcloud-controller-1 ~]# timedatectl
Local time: Sun 2017-03-05 17:48:56 UTC
Universal time: Sun 2017-03-05 17:48:56 UTC
RTC time: Sun 2017-03-05 17:48:56
Time zone: n/a (UTC, +0000)
NTP enabled: yes
NTP synchronized: no
RTC in local TZ: no
DST active: n/a
[root@overcloud-controller-1 ~]# systemctl status ntpd
● ntpd.service - Network Time Service
Loaded: loaded (/usr/lib/systemd/system/ntpd.service; enabled; vendor preset: disabled)
Active: inactive (dead)
[root@overcloud-controller-1 ~]# systemctl status chronyd
● chronyd.service - NTP client/server
Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled; vendor preset: enabled)
Active: active (running) since Sun 2017-03-05 17:45:02 UTC; 4min 17s ago
Main PID: 668 (chronyd)
CGroup: /system.slice/chronyd.service
└─668 /usr/sbin/chronyd
Mar 05 17:45:02 overcloud-controller-1.localdomain systemd[1]: Starting NTP client/server...
Mar 05 17:45:02 overcloud-controller-1.localdomain chronyd[668]: chronyd version 2.1.1 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +DEBUG +ASYNCDNS +IPV6 +SECHASH)
Mar 05 17:45:02 overcloud-controller-1.localdomain chronyd[668]: Frequency 0.000 +/- 1000000.000 ppm read from /var/lib/chrony/drift
Mar 05 17:45:02 overcloud-controller-1.localdomain systemd[1]: Started NTP client/server.
Mar 05 17:48:49 overcloud-controller-1.localdomain chronyd[668]: Source 66.228.59.187 replaced with 216.218.254.202
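The before/after comparison above comes down to which of the two daemons is active. A small sketch of that check follows; the function name and the labels it prints are hypothetical, and it only interprets the strings that `systemctl is-active` prints.

```shell
# Map the `systemctl is-active` results for ntpd and chronyd to a short label.
# The label names are invented for this sketch.
classify_time_daemons() {
    ntpd_state=$1
    chronyd_state=$2
    case "$ntpd_state/$chronyd_state" in
        active/active) echo "both-running" ;;
        active/*)      echo "ntpd-only" ;;      # state seen before the reboot
        */active)      echo "chronyd-only" ;;   # state seen after the reboot
        *)             echo "none-running" ;;
    esac
}
```

It could be called as `classify_time_daemons "$(systemctl is-active ntpd)" "$(systemctl is-active chronyd)"`; with the states shown above this would print ntpd-only before the reboot and chronyd-only after it.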
--- Additional comment from Red Hat Bugzilla Rules Engine on 2017-03-05 12:52:55 EST ---
This bugzilla has been removed from the release and needs to be reviewed and Triaged for another Target Release.
--- Additional comment from Alex Schultz on 2017-03-06 15:59:33 EST ---
Workaround would be to run `systemctl stop chronyd; systemctl disable chronyd; systemctl start ntpd`
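The same workaround, wrapped as a function for running across several nodes. This is only a sketch: the SYSTEMCTL variable is a hypothetical override added here so the sequence can be dry-run with SYSTEMCTL=echo, and the steps are chained with && (instead of ;) so that a failing step stops the sequence.

```shell
# Apply the workaround: stop and disable chronyd, then start ntpd.
# SYSTEMCTL is a hypothetical override for dry runs; it defaults to systemctl.
apply_ntpd_workaround() {
    sc=${SYSTEMCTL:-systemctl}
    "$sc" stop chronyd &&
    "$sc" disable chronyd &&
    "$sc" start ntpd
}
```

Run `SYSTEMCTL=echo apply_ntpd_workaround` first to print the three commands without changing anything, then `apply_ntpd_workaround` as root on the affected node.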
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2017:1585