Bug 1932154
| Summary: | [AWS] machine stuck in provisioned phase, no warnings or errors | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Milind Yadav <miyadav> |
| Component: | Cloud Compute | Assignee: | Michael McCune <mimccune> |
| Cloud Compute sub component: | Other Providers | QA Contact: | Milind Yadav <miyadav> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | low | ||
| Priority: | unspecified | CC: | mimccune |
| Version: | 4.8 | ||
| Target Milestone: | --- | ||
| Target Release: | 4.8.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: |
Cause: Missing iamInstanceProfile in awsproviderconfig.openshift.io resource of MachineSet.
Consequence: Machine is not able to pass "Provisioning" phase and join the OpenShift cluster as a node.
Fix: A warning has been added in cases where the iamInstanceProfile is not provided.
Result: User has a clear indication of what has caused the Machine to fail to join the cluster.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-07-27 22:48:06 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Milind Yadav
2021-02-24 05:25:42 UTC
Hi Milind, I'm just starting to take a look at this. I am looking through the must-gather and I don't see any CertificateSigningRequests for the new machines you are creating. Would it be possible for you to check the cluster to see whether those machines ever made a CSR (oc get csr)? If no CSRs were generated, then I think this is an issue with the kubelet or node startup process; see [0] for more details.

[0] https://github.com/openshift/machine-api-operator/blob/master/docs/user/TroubleShooting.md#machine-status-phase-provisioned

Thanks Michael. I found that the YAML needs to include the following (apiVersion and iamInstanceProfile info):
apiVersion: awsproviderconfig.openshift.io/v1beta1
iamInstanceProfile:
  id: miyadav-oc48-2502-f5l7x-worker-profile
This is needed for a CSR to be generated. When we do not pass it, the CSR is not generated, but we don't get any error message or warning. After I added it to the MachineSet YAML and scaled the MachineSet, a new machine was provisioned successfully and a node was attached to it (in READY state).
Let me know if you wish to check anything more on this.
So basically we encountered this because we are trying to use default values, which do not include iamInstanceProfile.
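The missing field described above can be checked for before a MachineSet is created. What follows is a minimal sketch only, assuming the MachineSet manifest has already been loaded into a Python dict and that the provider config sits at spec.template.spec.providerSpec.value. The helper names `_dig` and `missing_instance_profile_warnings` are hypothetical and are not part of the machine-api-operator, whose own validation is written in Go.

```python
from typing import Any, List


def _dig(obj: Any, *keys: str) -> Any:
    """Walk nested dicts, returning None as soon as a key is absent."""
    for key in keys:
        if not isinstance(obj, dict):
            return None
        obj = obj.get(key)
    return obj


def missing_instance_profile_warnings(machineset: dict) -> List[str]:
    """Return a warning list if the AWS provider spec has no iamInstanceProfile.id.

    Hypothetical pre-flight check for illustration; it only inspects the dict
    and does not talk to a cluster.
    """
    profile_id = _dig(
        machineset,
        "spec", "template", "spec", "providerSpec", "value",
        "iamInstanceProfile", "id",
    )
    if not profile_id:
        return [
            "providerSpec.iamInstanceProfile: no IAM instance profile provided: "
            "nodes may be unable to join the cluster"
        ]
    return []
```

Calling the helper on a manifest whose provider config carries only apiVersion returns one warning. Once iamInstanceProfile.id is set, it returns an empty list.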
Looks like we need to add a warning if the iamInstanceProfile is missing, as we do for the service account on GCP. This would tell clients that the machine may not join the cluster if the instance profile is not provided and should be enough of a hint to users that they should include this.

Sounds good Joel. I wonder if there shouldn't be something in the product documentation about this too?

Just confirmed that these entries are in the product doc example. I'm going to work on adding the warning in a similar manner as the GCP provider.

Validated at:
[miyadav@miyadav ~]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-03-26-002831   True        False         4h50m   Error while reconciling 4.8.0-0.nightly-2021-03-26-002831: the cluster operator etcd is degraded
[miyadav@miyadav ~]$

Steps:
Created machineset without iamInstanceProfile in the YAML.

Actual and expected results:
[miyadav@miyadav ~]$ oc create -f rhv/aws/bugval.yml
W0326 16:03:02.423741 45575 warnings.go:67] providerSpec.iamInstanceProfile: no IAM instance profile provided: nodes may be unable to join the cluster
machineset.machine.openshift.io/miyadav-aws-26-sq5k5-worker-bug created

Additional info:
Warning message displayed as expected.

Moved to VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2021:2438