Bug 1429221 - After controllers reboot time is not NTP synchronized anymore
Summary: After controllers reboot time is not NTP synchronized anymore
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: puppet-tripleo
Version: 11.0 (Ocata)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: rc
Target Release: 11.0 (Ocata)
Assignee: RHOS Maint
QA Contact: nlevinki
URL:
Whiteboard:
Depends On:
Blocks: 1429222 1430833 1438367
Reported: 2017-03-05 17:51 UTC by Marius Cornea
Modified: 2017-05-17 20:05 UTC (History)

Fixed In Version: puppet-tripleo-6.3.0-2.el7ost
Doc Text:
Clone Of:
Clones: 1429222
Environment:
Last Closed: 2017-05-17 20:05:10 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 442041 0 None MERGED Stop the chronyd service 2020-07-13 10:03:42 UTC
Red Hat Product Errata RHEA-2017:1245 0 normal SHIPPED_LIVE Red Hat OpenStack Platform 11.0 Bug Fix and Enhancement Advisory 2017-05-17 23:01:50 UTC

Description Marius Cornea 2017-03-05 17:51:33 UTC
Description of problem:
After overcloud nodes are rebooted, the system time is no longer NTP synchronized. This can eventually cause problems for clustered services.

For example, the Ceph status reports a clock skew after the controller nodes have been rebooted.

Version-Release number of selected component (if applicable):
My tests (during upgrade testing) show that this affects both OSP10 and OSP11.

How reproducible:
100%

Steps to Reproduce:
1. Deploy overcloud with 3 controllers
2. Reboot one of the controllers
3. Wait for the controller to come back up
4. Check timedatectl

Actual results:
[root@overcloud-controller-0 heat-admin]# timedatectl 
      Local time: Sun 2017-03-05 17:40:49 UTC
  Universal time: Sun 2017-03-05 17:40:49 UTC
        RTC time: Sun 2017-03-05 17:40:49
       Time zone: n/a (UTC, +0000)
     NTP enabled: yes
NTP synchronized: no
 RTC in local TZ: no
      DST active: n/a


Expected results:
NTP synchronized: yes
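
The check in step 4 can be automated. Below is a minimal Python sketch, not part of the original report, that reads the "NTP synchronized" field from timedatectl output. The function name ntp_synchronized is hypothetical, and the field label is taken from the output shown in this report; other systemd versions may label the field differently.

```python
def ntp_synchronized(timedatectl_output: str) -> bool:
    """Return True if the 'NTP synchronized' field reads 'yes'.

    Hypothetical helper: expects text in the format shown in this report.
    """
    for line in timedatectl_output.splitlines():
        # Split on the first colon only, so timestamps in values stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "NTP synchronized":
            return value.strip() == "yes"
    raise ValueError("no 'NTP synchronized' field found")


# Output copied from the 'Actual results' section of this report.
SAMPLE_AFTER_REBOOT = """\
      Local time: Sun 2017-03-05 17:40:49 UTC
  Universal time: Sun 2017-03-05 17:40:49 UTC
        RTC time: Sun 2017-03-05 17:40:49
       Time zone: n/a (UTC, +0000)
     NTP enabled: yes
NTP synchronized: no
 RTC in local TZ: no
      DST active: n/a
"""

if __name__ == "__main__":
    print(ntp_synchronized(SAMPLE_AFTER_REBOOT))
```

On a live node the same function could be fed the captured output of timedatectl instead of the sample string.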

Additional info:

Before the controller reboot, ntpd was running and chronyd was not. After the reboot, ntpd is not running and chronyd is. Note that both units are reported as enabled in the systemctl output below:

Before reboot:

[root@overcloud-controller-1 ~]# timedatectl 
      Local time: Sun 2017-03-05 17:41:57 UTC
  Universal time: Sun 2017-03-05 17:41:57 UTC
        RTC time: Sun 2017-03-05 17:41:56
       Time zone: n/a (UTC, +0000)
     NTP enabled: yes
NTP synchronized: yes
 RTC in local TZ: no
      DST active: n/a

[root@overcloud-controller-1 ~]# systemctl status ntpd
● ntpd.service - Network Time Service
   Loaded: loaded (/usr/lib/systemd/system/ntpd.service; enabled; vendor preset: disabled)
   Active: active (running) since Sun 2017-03-05 17:01:43 UTC; 40min ago
 Main PID: 29574 (ntpd)
   CGroup: /system.slice/ntpd.service
           └─29574 /usr/sbin/ntpd -u ntp:ntp -g

Mar 05 17:05:00 overcloud-controller-1.localdomain ntpd[29574]: 0.0.0.0 c614 04 freq_mode
Mar 05 17:05:01 overcloud-controller-1.localdomain ntpd[29574]: 0.0.0.0 c618 08 no_sys_peer
Mar 05 17:05:09 overcloud-controller-1.localdomain ntpd[29574]: Listen normally on 25 vlan200 10.0.0.17 UDP 123
Mar 05 17:05:09 overcloud-controller-1.localdomain ntpd[29574]: new interface(s) found: waking up resolver
Mar 05 17:13:22 overcloud-controller-1.localdomain ntpd[29574]: Listen normally on 26 gre_sys fe80::4c33:66ff:fe1f:160d UDP 123
Mar 05 17:13:22 overcloud-controller-1.localdomain ntpd[29574]: new interface(s) found: waking up resolver
Mar 05 17:20:20 overcloud-controller-1.localdomain ntpd[29574]: 0.0.0.0 c612 02 freq_set kernel -26.091 PPM
Mar 05 17:20:20 overcloud-controller-1.localdomain ntpd[29574]: 0.0.0.0 c615 05 clock_sync
Mar 05 17:35:33 overcloud-controller-1.localdomain ntpd[29574]: Listen normally on 27 vlan200 10.0.0.10 UDP 123
Mar 05 17:35:33 overcloud-controller-1.localdomain ntpd[29574]: new interface(s) found: waking up resolver

[root@overcloud-controller-1 ~]# systemctl status chronyd
● chronyd.service - NTP client/server
   Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Sun 2017-03-05 17:01:43 UTC; 40min ago
 Main PID: 707 (code=exited, status=0/SUCCESS)

Mar 05 16:51:01 localhost.localdomain chronyd[707]: chronyd version 2.1.1 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +DEBUG +ASYNCDNS +IPV6 +SECHASH)
Mar 05 16:51:01 localhost.localdomain chronyd[707]: Generated key 1
Mar 05 16:51:01 localhost.localdomain systemd[1]: Started NTP client/server.
Mar 05 16:52:41 overcloud-controller-1.localdomain chronyd[707]: Source 216.229.0.50 offline
Mar 05 16:52:41 overcloud-controller-1.localdomain chronyd[707]: Source 2001:470:1f07:e17:250:43ff:fed4:102d offline
Mar 05 16:52:41 overcloud-controller-1.localdomain chronyd[707]: Source 66.228.59.187 offline
Mar 05 16:52:41 overcloud-controller-1.localdomain chronyd[707]: Source 104.131.53.252 offline
Mar 05 17:01:43 overcloud-controller-1.localdomain systemd[1]: Stopping NTP client/server...
Mar 05 17:01:43 overcloud-controller-1.localdomain chronyd[707]: chronyd exiting
Mar 05 17:01:43 overcloud-controller-1.localdomain systemd[1]: Stopped NTP client/server.

After reboot:

[root@overcloud-controller-1 ~]# timedatectl 
      Local time: Sun 2017-03-05 17:48:56 UTC
  Universal time: Sun 2017-03-05 17:48:56 UTC
        RTC time: Sun 2017-03-05 17:48:56
       Time zone: n/a (UTC, +0000)
     NTP enabled: yes
NTP synchronized: no
 RTC in local TZ: no
      DST active: n/a

[root@overcloud-controller-1 ~]# systemctl status ntpd
● ntpd.service - Network Time Service
   Loaded: loaded (/usr/lib/systemd/system/ntpd.service; enabled; vendor preset: disabled)
   Active: inactive (dead)

[root@overcloud-controller-1 ~]# systemctl status chronyd
● chronyd.service - NTP client/server
   Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled; vendor preset: enabled)
   Active: active (running) since Sun 2017-03-05 17:45:02 UTC; 4min 17s ago
 Main PID: 668 (chronyd)
   CGroup: /system.slice/chronyd.service
           └─668 /usr/sbin/chronyd

Mar 05 17:45:02 overcloud-controller-1.localdomain systemd[1]: Starting NTP client/server...
Mar 05 17:45:02 overcloud-controller-1.localdomain chronyd[668]: chronyd version 2.1.1 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +DEBUG +ASYNCDNS +IPV6 +SECHASH)
Mar 05 17:45:02 overcloud-controller-1.localdomain chronyd[668]: Frequency 0.000 +/- 1000000.000 ppm read from /var/lib/chrony/drift
Mar 05 17:45:02 overcloud-controller-1.localdomain systemd[1]: Started NTP client/server.
Mar 05 17:48:49 overcloud-controller-1.localdomain chronyd[668]: Source 66.228.59.187 replaced with 216.218.254.202
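
The before/after comparison above can be reduced to a small check of which time service is active. This is a rough Python sketch under stated assumptions, not the fix that was merged: the helper names diagnose and query_states are hypothetical, and the state strings are those printed by `systemctl is-active` ("active", "inactive", and so on).

```python
import subprocess

# Unit names taken from the systemctl output in this report.
TIME_SERVICES = ("ntpd", "chronyd")


def diagnose(states):
    """Classify a mapping of service name -> 'systemctl is-active' state."""
    active = sorted(name for name, state in states.items() if state == "active")
    if active == ["ntpd"]:
        # The state seen before the reboot in this report.
        return "ok: only ntpd is active"
    if active == ["chronyd"]:
        # The state seen after the reboot in this report.
        return "mismatch: chronyd is active, ntpd is not"
    if not active:
        return "problem: no time service is active"
    return "conflict: " + ", ".join(active) + " are both active"


def query_states(services=TIME_SERVICES):
    """Ask systemd for each unit's state (only meaningful on a systemd host)."""
    states = {}
    for name in services:
        # is-active exits non-zero for inactive units, so do not use check=True.
        result = subprocess.run(
            ["systemctl", "is-active", name],
            stdout=subprocess.PIPE,
            universal_newlines=True,
        )
        states[name] = result.stdout.strip()
    return states


if __name__ == "__main__":
    print(diagnose(query_states()))
```

Keeping diagnose separate from query_states lets the classification be exercised with the states recorded in this report without needing a systemd host.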

Comment 1 Gurenko Alex 2017-04-02 15:18:45 UTC
Verified with build 2017-03-30.4. Here is the output from controller-1 after the reboot:

[heat-admin@controller-1 ~]$ timedatectl
      Local time: Sun 2017-04-02 15:16:16 UTC
  Universal time: Sun 2017-04-02 15:16:16 UTC
        RTC time: Sun 2017-04-02 15:16:16
       Time zone: UTC (UTC, +0000)
     NTP enabled: yes
NTP synchronized: yes
 RTC in local TZ: no
      DST active: n/a

[heat-admin@controller-1 ~]$ rpm -q puppet-tripleo
puppet-tripleo-6.3.0-6.el7ost.noarch

Comment 3 errata-xmlrpc 2017-05-17 20:05:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1245

