Bug 1700255 - [upi-on-aws] "Cleanup Machine API Resources" should be removed.
Summary: [upi-on-aws] "Cleanup Machine API Resources" should be removed.
Keywords:
Status: CLOSED DUPLICATE of bug 1698207
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.1.0
Assignee: Abhinav Dahiya
QA Contact: Johnny Liu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-04-16 07:52 UTC by Johnny Liu
Modified: 2019-04-18 19:42 UTC (History)
0 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-04-18 19:42:14 UTC
Target Upstream Version:
Embargoed:



Description Johnny Liu 2019-04-16 07:52:49 UTC
Description of problem:
Since the fix PR for BZ#1697236 landed, each master instance is created with the same name as its machine.

If a user follows "https://github.com/openshift/installer/blob/master/docs/user/aws/install_upi.md#cleanup-machine-api-resources" to delete the pre-defined master nodes, the running master instances are terminated.


Version-Release number of the following components:
4.0.0-0.nightly-2019-04-10-182914

How reproducible:
always

Steps to Reproduce:
1. Follow the upi-on-aws doc to set up a cluster.
2. After installation, check the machine names:
# oc get machine -n openshift-machine-api
NAME                         INSTANCE   STATE   TYPE        REGION      ZONE         AGE
jialiu-upi2-ml7bx-master-0                      m4.xlarge   us-east-2   us-east-2a   18s
jialiu-upi2-ml7bx-master-1                      m4.xlarge   us-east-2   us-east-2b   18s
jialiu-upi2-ml7bx-master-2                      m4.xlarge   us-east-2   us-east-2c   18s
3. Check the created instance names via the AWS CLI:
# aws ec2 describe-instances --filters Name=tag:kubernetes.io/cluster/jialiu-upi2-ml7bx,Values=owned | jq '.Reservations[].Instances[].Tags[] | select(.Key=="Name")'
{
  "Value": "jialiu-upi2-ml7bx-worker",
  "Key": "Name"
}
{
  "Value": "jialiu-upi2-ml7bx-master-2",
  "Key": "Name"
}
{
  "Value": "jialiu-upi2-ml7bx-master-1",
  "Key": "Name"
}
{
  "Value": "jialiu-upi2-ml7bx-worker",
  "Key": "Name"
}
{
  "Value": "jialiu-upi2-ml7bx-worker",
  "Key": "Name"
}
{
  "Value": "jialiu-upi2-ml7bx-master-0",
  "Key": "Name"
}
4. Follow https://github.com/openshift/installer/blob/master/docs/user/aws/install_upi.md#cleanup-machine-api-resources to delete the pre-defined master nodes.
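
The overlap between the machine names from step 2 and the instance Name tags from step 3 can be checked before running step 4. The following is a minimal sketch, not part of the original report: `matching_machines` is a hypothetical helper name, and the commented usage assumes `oc`, `aws`, and `jq` are installed and pointed at the affected cluster. Any name it prints belongs to a Machine that shares its name with an existing instance.

```shell
# Hypothetical helper: print each Machine name that also appears as an EC2 "Name" tag.
#   $1 = newline-separated Machine names
#   $2 = newline-separated instance Name tag values
matching_machines() {
  printf '%s\n' "$1" | while IFS= read -r name; do
    # Skip blank lines so an empty input never matches anything.
    [ -n "$name" ] || continue
    # -F: fixed string, -x: whole-line match, -q: quiet
    if printf '%s\n' "$2" | grep -Fxq -- "$name"; then
      printf '%s\n' "$name"
    fi
  done
}

# Example usage against a live cluster (assumes the cluster tag from this report):
#   machines=$(oc get machines -n openshift-machine-api \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}')
#   instances=$(aws ec2 describe-instances \
#     --filters Name=tag:kubernetes.io/cluster/jialiu-upi2-ml7bx,Values=owned \
#     | jq -r '.Reservations[].Instances[].Tags[] | select(.Key=="Name") | .Value')
#   matching_machines "$machines" "$instances"
```

With the names listed in steps 2 and 3, all three master machines match.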

Actual results:
Master instances are terminated, so the cluster API becomes unavailable.
# oc get machines --namespace openshift-machine-api
NAME                         INSTANCE   STATE   TYPE        REGION      ZONE         AGE
jialiu-upi2-ml7bx-master-0                      m4.xlarge   us-east-2   us-east-2a   46m
jialiu-upi2-ml7bx-master-1                      m4.xlarge   us-east-2   us-east-2b   46m
jialiu-upi2-ml7bx-master-2                      m4.xlarge   us-east-2   us-east-2c   46m

# oc delete machines --all --namespace openshift-machine-api
machine.machine.openshift.io "jialiu-upi2-ml7bx-master-0" deleted
machine.machine.openshift.io "jialiu-upi2-ml7bx-master-1" deleted
machine.machine.openshift.io "jialiu-upi2-ml7bx-master-2" deleted
Unable to connect to the server: http2: server sent GOAWAY and closed the connection; LastStreamID=23, ErrCode=NO_ERROR, debug=""

# oc get csr --no-headers | grep -i pending | awk '{print $1}' | xargs oc adm certificate approve
Unable to connect to the server: dial tcp 3.14.35.218:6443: i/o timeout

Expected results:
The cluster API remains available.

Additional info:
Please attach logs from ansible-playbook with the -vvv flag

Comment 1 Scott Dodson 2019-04-16 17:35:19 UTC

*** This bug has been marked as a duplicate of bug 1698207 ***

Comment 2 Johnny Liu 2019-04-17 01:48:02 UTC
I had already noticed BZ#1698207, but this bug focuses mainly on improving the upi-on-aws document. Until 1698207 has a final fix, I think we need to fix this issue in the doc.

Comment 3 Scott Dodson 2019-04-18 19:42:14 UTC
Right now the only thing we're proposing is documenting the need to remove manifests prior to bootstrapping. So I'm going to close this one as a dupe again. Once the engineering team has landed the in-repo docs and verified that this is safe, we'll outline the specific changes we're requesting to documentation and move that bug to the Documentation component.

If we conclude that we need to make code changes, we'll re-open this and track the two efforts separately; right now it's just one effort.
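
For illustration only, removing the manifests prior to bootstrapping could look like the sketch below. Everything specific here is an assumption rather than something confirmed in this bug: the helper name `remove_master_machine_manifests` is hypothetical, the manifests are assumed to have been generated with `openshift-install create manifests` into an `openshift/` subdirectory of the install directory, and the master Machine manifests are assumed to match the glob shown.

```shell
# Hypothetical helper: delete control-plane Machine manifests from an install directory
# before the ignition configs are generated.
#   $1 = install directory that contains the generated "openshift/" manifests
remove_master_machine_manifests() {
  # Assumption: master Machine manifests match this file name pattern.
  # rm -f exits successfully even when nothing matches the glob.
  rm -f "$1"/openshift/99_openshift-cluster-api_master-machines-*.yaml
}

# Example usage (the directory name is a placeholder):
#   remove_master_machine_manifests ./upi-install-dir
```

With no master Machine objects defined in the cluster, there would be nothing for the "Cleanup Machine API Resources" step to delete afterwards.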

*** This bug has been marked as a duplicate of bug 1698207 ***

