Bug 1731441 - [DOCS] Openshift on AWS: Shutdown of running cluster and start after 1-2 days not working
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Documentation
Version: 4.1.z
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: 4.1.z
Assignee: Kathryn Alexander
QA Contact: Gaoyun Pei
Docs Contact: Vikram Goyal
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-07-19 12:28 UTC by szustkowski
Modified: 2020-02-11 10:18 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-11-01 15:01:39 UTC
Target Upstream Version:



Description szustkowski 2019-07-19 12:28:01 UTC
Description of problem:

A running OpenShift 4.1 cluster on AWS, whose EC2 instances were stopped to save money during a test phase, does not come up again when the instances are started after 1-2 days.

Version-Release number of the following components:

./openshift-install v4.1.6-201907101224-dirty
built from commit e8e6d8998bed2087244a14be16185235b43d6407
release image quay.io/openshift-release-dev/ocp-release@sha256:aa955a9ec40e55e5d9c0203a995b398e8c1031473dae24ed405efe9a95b43186


How reproducible:
Reproducible at least 2 times

Steps to Reproduce:
1. Install OpenShift 4.1 on installer-provisioned AWS, without an installer-config file
2. Stop all EC2 instances
3. Start them again immediately
4. Try to access the cluster: It works
5. Stop them again
6. Wait for 2 days
7. Start them again
8. Give them enough time to boot up: Around 30 minutes
9. Try to access the cluster: Doesn't work anymore

Actual results:
Open the Cluster Web Console in browser: Chrome shows ERR_CONNECTION_CLOSED

Expected results: The cluster is fully operational and the Web frontend is accessible.

Additional info:
Possibly related to https://github.com/openshift/installer/issues/818. Maybe it's an AWS infrastructure issue, though. However, in this case the openshift-installer should install the cluster in such a way that it survives restarts even if it is installed by an AWS noob.

Comment 3 Seth Jennings 2019-07-29 19:18:03 UTC
This is a documented requirement of the new product architecture: you must keep the cluster running for at least 24h so the components can rotate to their non-installation certificates.
https://docs.openshift.com/container-platform/4.1/installing/installing_bare_metal/installing-bare-metal.html#installation-generate-ignition-configs_installing-bare-metal
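
As a rough illustration of the 24h rule described above, the following Python sketch checks whether enough time has passed since installation before the instances are stopped. The function names and the idea of tracking the install completion time yourself are assumptions made for this example; it does not query the cluster and does not verify that certificate rotation actually happened.

```python
from datetime import datetime, timedelta, timezone

# Assumption: 24 hours, taken from the comment above, not read from the cluster.
MIN_UPTIME = timedelta(hours=24)


def earliest_stop_time(install_finished: datetime,
                       min_uptime: timedelta = MIN_UPTIME) -> datetime:
    """Earliest time at which stopping the instances should be considered."""
    return install_finished + min_uptime


def safe_to_stop(install_finished: datetime, now: datetime,
                 min_uptime: timedelta = MIN_UPTIME) -> bool:
    """True if the cluster has (nominally) been up for at least min_uptime.

    Hypothetical helper: it assumes the cluster ran continuously since
    install_finished and only compares timestamps.
    """
    return now >= earliest_stop_time(install_finished, min_uptime)


if __name__ == "__main__":
    installed = datetime(2019, 7, 17, 10, 0, tzinfo=timezone.utc)
    two_hours_later = installed + timedelta(hours=2)
    next_day = installed + timedelta(hours=25)
    print(safe_to_stop(installed, two_hours_later))  # False
    print(safe_to_stop(installed, next_day))         # True
```

In the reproduction steps from the description, the first stop happened right after installation, which a check like this would flag as too early.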

Comment 4 Ryan Howe 2019-07-30 20:24:39 UTC
We need the same note for the AWS IPI install, stating that after installation the cluster must stay up for 24 hrs.

Comment 5 Ryan Howe 2019-07-30 20:27:12 UTC
This warning needs to be added to every type of install:
  
    https://docs.openshift.com/container-platform/4.1/installing/*

Comment 7 Kathryn Alexander 2019-10-17 16:02:50 UTC
The PR to add the note to all of the installation assemblies is here: https://github.com/openshift/openshift-docs/pull/17424

@Gaoyun Pei, will you PTAL?

Comment 10 Gaoyun Pei 2019-10-21 08:24:47 UTC
PR https://github.com/openshift/openshift-docs/pull/17424 lgtm.

Comment 12 Kathryn Alexander 2019-10-21 13:53:30 UTC
I've merged the change and am waiting for it to go live.

