Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1884465

Summary: [Assisted-4.6] Installation fails when there is a time drift between the nodes
Product: OpenShift Container Platform
Component: assisted-installer
assisted-installer sub component: discovery-agent
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Version: 4.6
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Reporter: nshidlin <nshidlin>
Assignee: Igal Tsoiref <itsoiref>
QA Contact: Yuri Obshansky <yobshans>
Docs Contact:
CC: alazar, aos-bugs, lgamliel
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-10-08 04:18:09 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description nshidlin 2020-10-02 04:37:02 UTC
Description of problem:
When there is a time drift between the nodes in the cluster, the installation fails with the error:
Host master-0-0: updated status from "installing-in-progress" to "error" (Host failed to install because its installation stage Joined took longer than expected 20m0s)

Version-Release number of selected component (if applicable):

{
  'release_tag': 'v1.0.9.6-ds',
  'versions': {
    'assisted-ignition-generator': 'quay.io/ocpmetal/assisted-ignition-generator:v1.0.9.5',
    'assisted-installer': 'registry.redhat.io/openshift4/assisted-installer-rhel8:v4.6.0-19',
    'assisted-installer-controller': 'registry.redhat.io/openshift4/assisted-installer-reporter-rhel8:v4.6.0-15',
    'assisted-installer-service': 'quay.io/app-sre/assisted-service:bcea367',
    'discovery-agent': 'registry.redhat.io/openshift4/assisted-installer-agent-rhel8:v4.6.0-15',
    'image-builder': 'quay.io/app-sre/assisted-iso-create:bcea367'
  }
}

Steps to Reproduce:
1. Boot the nodes into the discovery ISO

2. Change the time on the nodes so that there is a multi-hour time difference between them:
# ssh core@master-0-0
date +%T -s "08:35:00"
# ssh core@master-0-1
date +%T -s "14:23:00"
# ssh core@master-0-2
date +%T -s "20:00:00"

3. Start the cluster installation
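
For reference, the clock offsets produced by step 2 can be worked out with a short Python sketch. It assumes all three times fall on the same calendar day and that the commands run at roughly the same moment; the helper name pairwise_drift is only illustrative and is not part of the installer.

```python
from datetime import datetime
from itertools import combinations

# Times set on each node in step 2 (assumed to be on the same calendar day).
node_times = {
    "master-0-0": "08:35:00",
    "master-0-1": "14:23:00",
    "master-0-2": "20:00:00",
}


def pairwise_drift(times):
    """Return {(host_a, host_b): absolute clock difference as a timedelta}."""
    parsed = {host: datetime.strptime(t, "%H:%M:%S") for host, t in times.items()}
    return {
        (a, b): abs(parsed[a] - parsed[b])
        for a, b in combinations(parsed, 2)
    }


drift = pairwise_drift(node_times)
for (a, b), delta in drift.items():
    print(f"{a} vs {b}: {delta}")
# master-0-0 vs master-0-1: 5:48:00
# master-0-0 vs master-0-2: 11:25:00
# master-0-1 vs master-0-2: 5:37:00
```

Every pair of nodes ends up several hours apart, with the largest gap (11h25m) between master-0-0 and master-0-2.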

Actual results:
Cluster installation fails

Expected results:


Additional info:
The cloud-credential-operator pod fails to start:
Oct 01 14:33:47 master-0-0 hyperkube[6947]: E1001 14:33:47.556970    6947 pod_workers.go:191] Error syncing pod 49bf00fccfe455708243ba513f26ede2 ("cloud-credential-operator-master-0-0_openshift-cloud-credential-operator(49bf00fccfe455708>

Comment 1 Ronnie Lazar 2020-10-04 13:53:28 UTC
Workaround for this issue: https://issues.redhat.com/browse/MGMT-2342
Full solution: https://issues.redhat.com/browse/MGMT-2322?

Comment 2 Igal Tsoiref 2020-10-06 15:13:17 UTC
We have implemented a validation that verifies there is no time drift between the hosts.
The allowed time difference is 4 minutes.
You can set the time with date +%T -s "hh:mm:ss" and verify that, if the time difference is more than 4 minutes, the installation does not start and an NTP validation error is returned.
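
The rule described above can be illustrated with a minimal Python sketch. This is not the assisted-service implementation: the function name clocks_synchronized, the input shape (a mapping of host name to a clock reading sampled at the same instant), and the sample timestamps are assumptions made for illustration. Only the 4-minute threshold comes from this comment.

```python
from datetime import datetime, timedelta, timezone

# Threshold stated in this comment; everything else here is hypothetical.
MAX_DRIFT = timedelta(minutes=4)


def clocks_synchronized(host_clocks, max_drift=MAX_DRIFT):
    """Return True when the spread between the earliest and latest
    host clock reading does not exceed max_drift."""
    if len(host_clocks) < 2:
        return True  # nothing to compare against
    readings = list(host_clocks.values())
    return max(readings) - min(readings) <= max_drift


# Hypothetical readings, all taken at the same real-world instant.
base = datetime(2020, 10, 6, 15, 0, 0, tzinfo=timezone.utc)

in_sync = {
    "master-0-0": base,
    "master-0-1": base + timedelta(minutes=1),
    "master-0-2": base + timedelta(minutes=3),
}
drifted = {
    "master-0-0": base,
    "master-0-1": base,
    "master-0-2": base + timedelta(minutes=5),
}

print(clocks_synchronized(in_sync))   # True: spread is 3 minutes
print(clocks_synchronized(drifted))   # False: spread is 5 minutes
```

Comparing the earliest and latest readings is equivalent to checking every pair of hosts, since the largest pairwise difference is always between those two.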

Comment 4 nshidlin 2020-10-06 17:11:08 UTC
Verified on staging:
{
  'release_tag': 'v1.0.9.8-ds',
  'versions': {
    'assisted-ignition-generator': 'quay.io/ocpmetal/assisted-ignition-generator:v1.0.9.5',
    'assisted-installer': 'registry-proxy.engineering.redhat.com/rh-osbs/openshift4-assisted-installer-rhel8:v4.6.0-21',
    'assisted-installer-controller': 'registry-proxy.engineering.redhat.com/rh-osbs/openshift4-assisted-installer-reporter-rhel8:v4.6.0-17',
    'assisted-installer-service': 'quay.io/app-sre/assisted-service:394a627',
    'discovery-agent': 'registry-proxy.engineering.redhat.com/rh-osbs/openshift4-assisted-installer-agent-rhel8:v4.6.0-17',
    'image-builder': 'quay.io/app-sre/assisted-iso-create:394a627'
  }
}
Scenario:
1. Boot nodes into ISO
2. Set the time of one of the nodes to a 5 difference from the other nodes
3. After the hosts are discovered, fill in the fields required to install the cluster
4. Cluster installation is disabled with message "Host clocks are not synchronized, please configure an NTP server via DHCP."
5. Fix time difference between the nodes
6. Cluster is ready to be installed
7. Cluster Installation Succeeds

Comment 6 errata-xmlrpc 2020-10-08 04:18:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 assisted installer), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:4199