Bug 1430833 - After controllers reboot time is not NTP synchronized anymore
Summary: After controllers reboot time is not NTP synchronized anymore
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: z3
Target Release: 10.0 (Newton)
Assignee: Emilien Macchi
QA Contact: Gurenko Alex
Depends On: 1429221 1429222
Blocks: 1438367
Reported: 2017-03-09 16:48 UTC by Alex Schultz
Modified: 2020-09-10 10:18 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-5.2.0-17.el7ost
Clone Of: 1429222
Clones: 1438367
Environment:
Last Closed: 2017-06-28 14:46:07 UTC


Links
System | ID | Private | Priority | Status | Summary | Last Updated
OpenStack gerrit | 443334 | 0 | None | None | None | 2017-03-09 16:48:08 UTC
Red Hat Product Errata | RHBA-2017:1585 | 0 | normal | SHIPPED_LIVE | Red Hat OpenStack Platform 10 director Bug Fix Advisory | 2017-06-28 18:42:51 UTC

Description Alex Schultz 2017-03-09 16:48:08 UTC
+++ This bug was initially created as a clone of Bug #1429222 +++

+++ This bug was initially created as a clone of Bug #1429221 +++

Description of problem:
After rebooting overcloud nodes, the time is no longer NTP synchronized. This could eventually lead to problems for the clustered services.

For example, the Ceph status reports a clock skew after the controller nodes have been rebooted.

Version-Release number of selected component (if applicable):
My tests (during upgrade testing) show that this affects both OSP10 and OSP11.

How reproducible:
100%

Steps to Reproduce:
1. Deploy overcloud with 3 controllers
2. Reboot one of the controllers
3. Wait for the controller to come back up
4. Check timedatectl

Actual results:
[root@overcloud-controller-0 heat-admin]# timedatectl 
      Local time: Sun 2017-03-05 17:40:49 UTC
  Universal time: Sun 2017-03-05 17:40:49 UTC
        RTC time: Sun 2017-03-05 17:40:49
       Time zone: n/a (UTC, +0000)
     NTP enabled: yes
NTP synchronized: no
 RTC in local TZ: no
      DST active: n/a


Expected results:
NTP synchronized: yes
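
The check in step 4 could be scripted along these lines. This is only a sketch: the function name `ntp_sync_state` is made up for illustration, and it assumes the `NTP synchronized:` field label shown in the `timedatectl` output above (other systemd versions may label the field differently).

```shell
#!/bin/sh
# Hypothetical helper: reads `timedatectl` output on stdin and prints the
# value of the "NTP synchronized" field ("yes" or "no"), or "unknown" if
# the field is not present in the input.
ntp_sync_state() {
    awk -F': *' '
        /NTP synchronized:/ { print $2; found = 1 }
        END { if (!found) print "unknown" }
    '
}

# Example use on a node (not run here):
#   timedatectl | ntp_sync_state
```

On an affected controller this would be expected to print `no` after the reboot, matching the actual results above.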

Additional info:

Before the controller reboot, ntpd was running and chronyd was not; after the reboot, ntpd is not running and chronyd is:

Before reboot:

[root@overcloud-controller-1 ~]# timedatectl 
      Local time: Sun 2017-03-05 17:41:57 UTC
  Universal time: Sun 2017-03-05 17:41:57 UTC
        RTC time: Sun 2017-03-05 17:41:56
       Time zone: n/a (UTC, +0000)
     NTP enabled: yes
NTP synchronized: yes
 RTC in local TZ: no
      DST active: n/a

[root@overcloud-controller-1 ~]# systemctl status ntpd
● ntpd.service - Network Time Service
   Loaded: loaded (/usr/lib/systemd/system/ntpd.service; enabled; vendor preset: disabled)
   Active: active (running) since Sun 2017-03-05 17:01:43 UTC; 40min ago
 Main PID: 29574 (ntpd)
   CGroup: /system.slice/ntpd.service
           └─29574 /usr/sbin/ntpd -u ntp:ntp -g

Mar 05 17:05:00 overcloud-controller-1.localdomain ntpd[29574]: 0.0.0.0 c614 04 freq_mode
Mar 05 17:05:01 overcloud-controller-1.localdomain ntpd[29574]: 0.0.0.0 c618 08 no_sys_peer
Mar 05 17:05:09 overcloud-controller-1.localdomain ntpd[29574]: Listen normally on 25 vlan200 10.0.0.17 UDP 123
Mar 05 17:05:09 overcloud-controller-1.localdomain ntpd[29574]: new interface(s) found: waking up resolver
Mar 05 17:13:22 overcloud-controller-1.localdomain ntpd[29574]: Listen normally on 26 gre_sys fe80::4c33:66ff:fe1f:160d UDP 123
Mar 05 17:13:22 overcloud-controller-1.localdomain ntpd[29574]: new interface(s) found: waking up resolver
Mar 05 17:20:20 overcloud-controller-1.localdomain ntpd[29574]: 0.0.0.0 c612 02 freq_set kernel -26.091 PPM
Mar 05 17:20:20 overcloud-controller-1.localdomain ntpd[29574]: 0.0.0.0 c615 05 clock_sync
Mar 05 17:35:33 overcloud-controller-1.localdomain ntpd[29574]: Listen normally on 27 vlan200 10.0.0.10 UDP 123
Mar 05 17:35:33 overcloud-controller-1.localdomain ntpd[29574]: new interface(s) found: waking up resolver

[root@overcloud-controller-1 ~]# systemctl status chronyd
● chronyd.service - NTP client/server
   Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Sun 2017-03-05 17:01:43 UTC; 40min ago
 Main PID: 707 (code=exited, status=0/SUCCESS)

Mar 05 16:51:01 localhost.localdomain chronyd[707]: chronyd version 2.1.1 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +DEBUG +ASYNCDNS +IPV6 +SECHASH)
Mar 05 16:51:01 localhost.localdomain chronyd[707]: Generated key 1
Mar 05 16:51:01 localhost.localdomain systemd[1]: Started NTP client/server.
Mar 05 16:52:41 overcloud-controller-1.localdomain chronyd[707]: Source 216.229.0.50 offline
Mar 05 16:52:41 overcloud-controller-1.localdomain chronyd[707]: Source 2001:470:1f07:e17:250:43ff:fed4:102d offline
Mar 05 16:52:41 overcloud-controller-1.localdomain chronyd[707]: Source 66.228.59.187 offline
Mar 05 16:52:41 overcloud-controller-1.localdomain chronyd[707]: Source 104.131.53.252 offline
Mar 05 17:01:43 overcloud-controller-1.localdomain systemd[1]: Stopping NTP client/server...
Mar 05 17:01:43 overcloud-controller-1.localdomain chronyd[707]: chronyd exiting
Mar 05 17:01:43 overcloud-controller-1.localdomain systemd[1]: Stopped NTP client/server.

After reboot:

[root@overcloud-controller-1 ~]# timedatectl 
      Local time: Sun 2017-03-05 17:48:56 UTC
  Universal time: Sun 2017-03-05 17:48:56 UTC
        RTC time: Sun 2017-03-05 17:48:56
       Time zone: n/a (UTC, +0000)
     NTP enabled: yes
NTP synchronized: no
 RTC in local TZ: no
      DST active: n/a

[root@overcloud-controller-1 ~]# systemctl status ntpd
● ntpd.service - Network Time Service
   Loaded: loaded (/usr/lib/systemd/system/ntpd.service; enabled; vendor preset: disabled)
   Active: inactive (dead)

[root@overcloud-controller-1 ~]# systemctl status chronyd
● chronyd.service - NTP client/server
   Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled; vendor preset: enabled)
   Active: active (running) since Sun 2017-03-05 17:45:02 UTC; 4min 17s ago
 Main PID: 668 (chronyd)
   CGroup: /system.slice/chronyd.service
           └─668 /usr/sbin/chronyd

Mar 05 17:45:02 overcloud-controller-1.localdomain systemd[1]: Starting NTP client/server...
Mar 05 17:45:02 overcloud-controller-1.localdomain chronyd[668]: chronyd version 2.1.1 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +DEBUG +ASYNCDNS +IPV6 +SECHASH)
Mar 05 17:45:02 overcloud-controller-1.localdomain chronyd[668]: Frequency 0.000 +/- 1000000.000 ppm read from /var/lib/chrony/drift
Mar 05 17:45:02 overcloud-controller-1.localdomain systemd[1]: Started NTP client/server.
Mar 05 17:48:49 overcloud-controller-1.localdomain chronyd[668]: Source 66.228.59.187 replaced with 216.218.254.202

--- Additional comment from Red Hat Bugzilla Rules Engine on 2017-03-05 12:52:55 EST ---

This bugzilla has been removed from the release and needs to be reviewed and Triaged for another Target Release.

--- Additional comment from Alex Schultz on 2017-03-06 15:59:33 EST ---

Workaround would be to run `systemctl stop chronyd; systemctl disable chronyd; systemctl start ntpd`
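
A guarded version of that workaround might look like the sketch below. The function names `needs_workaround` and `apply_workaround` are invented for illustration; the only commands assumed are `systemctl is-active` and the three `systemctl` calls from the workaround itself. It has to be run as root on the affected node.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) when the service states match the
# post-reboot symptom from this report: chronyd active while ntpd is not.
# Arguments are the strings printed by `systemctl is-active <unit>`.
needs_workaround() {
    chronyd_state=$1
    ntpd_state=$2
    [ "$chronyd_state" = "active" ] && [ "$ntpd_state" != "active" ]
}

# Applies the workaround from the comment above, step by step.
apply_workaround() {
    systemctl stop chronyd
    systemctl disable chronyd
    systemctl start ntpd
}

# Example use on a node (not run here):
#   if needs_workaround "$(systemctl is-active chronyd)" \
#                       "$(systemctl is-active ntpd)"; then
#       apply_workaround
#   fi
```

The state check is kept separate from the `systemctl` calls so that nodes where ntpd is already running are left untouched.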

Comment 4 Amit Ugol 2017-04-05 11:40:36 UTC
All monolithic node types have ntpd enabled by default now.
If this changes because of an upgrade or update, please file a new bug accordingly.

Comment 7 errata-xmlrpc 2017-06-28 14:46:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1585

