Description of problem:
Machine status should be "Failed" when creating a spot instance with price lower than spot instance price

Version-Release number of selected component (if applicable):
4.5.0-0.nightly-2020-04-15-223247

How reproducible:
Always

Steps to Reproduce:
1. Creating a spot instance with price lower than spot instance price
      providerSpec:
        value:
          spotMarketOptions:
            maxPrice: "0.01"
2. Check machines and logs

Actual results:
Machine stuck in Provisioning status

$ oc get machine
NAME                                        PHASE          TYPE        REGION      ZONE         AGE
zhsun416aws-rg88g-master-0                  Running        m4.xlarge   us-east-2   us-east-2a   6h53m
zhsun416aws-rg88g-master-1                  Running        m4.xlarge   us-east-2   us-east-2b   6h53m
zhsun416aws-rg88g-master-2                  Running        m4.xlarge   us-east-2   us-east-2c   6h53m
zhsun416aws-rg88g-worker-us-east-2a-txx9k   Running        m4.large    us-east-2   us-east-2a   6h39m
zhsun416aws-rg88g-worker-us-east-2b-9r4rx   Running        m4.large    us-east-2   us-east-2b   6h39m
zhsun416aws-rg88g-worker-us-east-2c-fxxdm   Provisioning                                        4m57s

  lastUpdated: "2020-04-16T09:43:52Z"
  phase: Provisioning
  providerStatus:
    conditions:
    - lastProbeTime: "2020-04-16T09:44:10Z"
      lastTransitionTime: "2020-04-16T09:44:10Z"
      message: 'error launching instance: Your Spot request price of 0.01 is lower than the minimum required Spot request fulfillment price of 0.0257.'
      reason: MachineCreationFailed
      status: "False"
      type: MachineCreation

I0416 09:46:31.346461 1 actuator.go:74] zhsun416aws-rg88g-worker-us-east-2c-fxxdm: actuator creating machine
I0416 09:46:31.347178 1 reconciler.go:38] zhsun416aws-rg88g-worker-us-east-2c-fxxdm: creating machine
E0416 09:46:31.347197 1 reconciler.go:221] NodeRef not found in machine zhsun416aws-rg88g-worker-us-east-2c-fxxdm
I0416 09:46:31.372053 1 instances.go:47] No stopped instances found for machine zhsun416aws-rg88g-worker-us-east-2c-fxxdm
I0416 09:46:31.372096 1 instances.go:145] Using AMI ami-0e888b699fa6e37e7
I0416 09:46:31.372108 1 instances.go:77] Describing security groups based on filters
I0416 09:46:31.583386 1 instances.go:122] Describing subnets based on filters
I0416 09:46:32.438067 1 instances.go:331] Error launching instance: SpotMaxPriceTooLow: Your Spot request price of 0.01 is lower than the minimum required Spot request fulfillment price of 0.0257. status code: 400, request id: 3e5331d8-d1e9-4034-833c-15f10ce599f4
E0416 09:46:32.438171 1 reconciler.go:69] zhsun416aws-rg88g-worker-us-east-2c-fxxdm: error creating machine: error launching instance: Your Spot request price of 0.01 is lower than the minimum required Spot request fulfillment price of 0.0257.
I0416 09:46:32.438187 1 machine_scope.go:134] zhsun416aws-rg88g-worker-us-east-2c-fxxdm: Updating status
I0416 09:46:32.438195 1 machine_scope.go:155] zhsun416aws-rg88g-worker-us-east-2c-fxxdm: finished calculating AWS status
I0416 09:46:32.438215 1 machine_scope.go:80] zhsun416aws-rg88g-worker-us-east-2c-fxxdm: patching machine
E0416 09:46:32.453533 1 actuator.go:65] zhsun416aws-rg88g-worker-us-east-2c-fxxdm error: failed to launch instance: error launching instance: Your Spot request price of 0.01 is lower than the minimum required Spot request fulfillment price of 0.0257.
W0416 09:46:32.453594 1 controller.go:311] zhsun416aws-rg88g-worker-us-east-2c-fxxdm: failed to create machine: failed to launch instance: error launching instance: Your Spot request price of 0.01 is lower than the minimum required Spot request fulfillment price of 0.0257.
E0416 09:46:32.453654 1 controller.go:258] controller-runtime/controller "msg"="Reconciler error" "error"="failed to launch instance: error launching instance: Your Spot request price of 0.01 is lower than the minimum required Spot request fulfillment price of 0.0257." "controller"="machine_controller" "request"={"Namespace":"openshift-machine-api","Name":"zhsun416aws-rg88g-worker-us-east-2c-fxxdm"}
I0416 09:46:32.453784 1 recorder.go:52] controller-runtime/manager/events "msg"="Warning" "message"="failed to launch instance: error launching instance: Your Spot request price of 0.01 is lower than the minimum required Spot request fulfillment price of 0.0257." "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"zhsun416aws-rg88g-worker-us-east-2c-fxxdm","uid":"6dedaf4b-12db-4741-8a26-5555ca8dd11e","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"134275"} "reason"="FailedCreate"

Expected results:
The machine phase is set "Failed"

Additional info:
I've tested this with the same build and have been unable to reproduce. Is there any more information you can provide?
I believe this issue was introduced by a refactor of the Cluster-API-Provider-AWS.

Machines only go into the Failed phase when the returned error is an `InvalidMachineConfigurationError` (see: https://github.com/openshift/machine-api-operator/blob/b9b4aaea428abe021d84477bd62a99f806fb64f2/pkg/controller/machine/controller.go#L312-L317).

The error you are seeing here is returned as that type (https://github.com/openshift/cluster-api-provider-aws/blob/025ec74aa743c3834020f4f6a45ac19c1acb76d2/pkg/actuators/machine/instances.go#L261); however, it is then wrapped (https://github.com/openshift/cluster-api-provider-aws/blob/025ec74aa743c3834020f4f6a45ac19c1acb76d2/pkg/actuators/machine/reconciler.go#L73), so it no longer matches the expected type.

The check for whether the error is an `InvalidMachineConfigurationError` (implemented: https://github.com/openshift/machine-api-operator/blob/b9b4aaea428abe021d84477bd62a99f806fb64f2/pkg/controller/machine/controller.go#L312-L317) does not currently support this wrapping, so it will need to be updated to handle wrapped errors.
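To illustrate why wrapping defeats a type check, here is a minimal sketch in plain Go. It is not the actual machine-api-operator or cluster-api-provider-aws code: `InvalidConfigError`, `launchInstance`, `createMachine`, and both check functions are hypothetical stand-ins, and the wrapping is assumed to preserve the error chain via `fmt.Errorf` with `%w`. The real fix may use a different mechanism.

```go
package main

import (
	"errors"
	"fmt"
)

// InvalidConfigError is a hypothetical stand-in for the
// invalid-machine-configuration error type discussed above.
type InvalidConfigError struct {
	Message string
}

func (e *InvalidConfigError) Error() string { return e.Message }

// launchInstance simulates the provider rejecting the request
// (for example, a spot price below the minimum).
func launchInstance() error {
	return &InvalidConfigError{Message: "error launching instance: spot price too low"}
}

// createMachine wraps the provider error, as the reconciler does.
// %w keeps the original error reachable through the wrap chain.
func createMachine() error {
	if err := launchInstance(); err != nil {
		return fmt.Errorf("failed to launch instance: %w", err)
	}
	return nil
}

// isInvalidConfigDirect only inspects the outermost error, so it
// misses an InvalidConfigError that has been wrapped.
func isInvalidConfigDirect(err error) bool {
	_, ok := err.(*InvalidConfigError)
	return ok
}

// isInvalidConfigUnwrapped walks the wrap chain with errors.As and
// therefore still finds the wrapped InvalidConfigError.
func isInvalidConfigUnwrapped(err error) bool {
	var target *InvalidConfigError
	return errors.As(err, &target)
}

func main() {
	err := createMachine()
	fmt.Println("direct type assertion:", isInvalidConfigDirect(err))
	fmt.Println("errors.As:", isInvalidConfigUnwrapped(err))
}
```

With a direct type assertion the wrapped error is not recognised, which matches the machine staying in Provisioning; a chain-aware check such as `errors.As` recognises it.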
clusterversion: 4.5.0-0.nightly-2020-04-25-170442

Machine status didn't become "Failed" with some invalid configuration, for example "invalid credentialsSecret".

1. Create a machine with an invalid AMI; the machine goes to Failed.

$ oc get machine
NAME                                        PHASE     TYPE        REGION      ZONE         AGE
zhsunaws426-rn574-master-0                  Running   m4.xlarge   us-east-2   us-east-2a   5h10m
zhsunaws426-rn574-master-1                  Running   m4.xlarge   us-east-2   us-east-2b   5h10m
zhsunaws426-rn574-master-2                  Running   m4.xlarge   us-east-2   us-east-2c   5h10m
zhsunaws426-rn574-worker-us-east-2a-kcpw2   Failed                                         137m
zhsunaws426-rn574-worker-us-east-2b-xr7m6   Running   m4.large    us-east-2   us-east-2b   3h16m
zhsunaws426-rn574-worker-us-east-2c-fttw7   Running   m4.large    us-east-2   us-east-2c   4h57m

2. Create a machine with an invalid credentialsSecret; the machine PHASE is empty.

      credentialsSecret:
        name: aws-cloud-credentials-invalid

$ oc get machine
NAME                                        PHASE     TYPE        REGION      ZONE         AGE
zhsunaws426-rn574-master-0                  Running   m4.xlarge   us-east-2   us-east-2a   5h50m
zhsunaws426-rn574-master-1                  Running   m4.xlarge   us-east-2   us-east-2b   5h50m
zhsunaws426-rn574-master-2                  Running   m4.xlarge   us-east-2   us-east-2c   5h50m
zhsunaws426-rn574-worker-us-east-2a-cpsrj                                                  91s
zhsunaws426-rn574-worker-us-east-2b-xr7m6   Running   m4.large    us-east-2   us-east-2b   3h56m
zhsunaws426-rn574-worker-us-east-2c-fttw7   Running   m4.large    us-east-2   us-east-2c   5h36m

I0426 06:51:48.974004 1 actuator.go:97] zhsunaws426-rn574-worker-us-east-2a-cpsrj: actuator checking if machine exists
E0426 06:51:48.974666 1 controller.go:269] zhsunaws426-rn574-worker-us-east-2a-cpsrj: failed to check if machine exists: zhsunaws426-rn574-worker-us-east-2a-cpsrj: failed to create scope for machine: failed to create aws client: aws credentials secret openshift-machine-api/aws-cloud-credentials-invalid: Secret "aws-cloud-credentials-invalid" not found not found
E0426 06:51:48.974751 1 controller.go:258] controller-runtime/controller "msg"="Reconciler error" "error"="zhsunaws426-rn574-worker-us-east-2a-cpsrj: failed to create scope for machine: failed to create aws client: aws credentials secret openshift-machine-api/aws-cloud-credentials-invalid: Secret \"aws-cloud-credentials-invalid\" not found not found" "controller"="machine_controller" "request"={"Namespace":"openshift-machine-api","Name":"zhsunaws426-rn574-worker-us-east-2a-cpsrj"}
> Machine status didn't become "Failed" with some invalid configuration, for example "invalid credentialsSecret".

Exists() will never succeed in that scenario, so the object is requeued right away. This is known and tracked here: https://bugzilla.redhat.com/show_bug.cgi?id=1805639

For this BZ we need to reproduce the scenario in the description:
"message: 'error launching instance: Your Spot request price of 0.01 is lower than the minimum required Spot request fulfillment price of 0.0257.'"
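The difference between the two scenarios can be sketched as follows. This is a simplified, hypothetical model of a reconcile loop, not the real controller: the `actuator` interface, the `reconcile` function, the fake actuators, and the returned phase strings are all assumptions made for illustration. It only shows that an error from the existence check returns before any phase is decided, while an invalid-configuration error from creation can be mapped to "Failed".

```go
package main

import (
	"errors"
	"fmt"
)

// actuator is a hypothetical, reduced interface for this sketch.
type actuator interface {
	Exists() (bool, error)
	Create() error
}

// errInvalidConfig is a hypothetical invalid-configuration error type.
type errInvalidConfig struct{ msg string }

func (e *errInvalidConfig) Error() string { return e.msg }

// reconcile returns the phase it would set and an error; a non-nil
// error means the object is requeued and retried.
func reconcile(a actuator) (string, error) {
	exists, err := a.Exists()
	if err != nil {
		// Returned before any phase is set: the machine is requeued
		// and its phase stays empty.
		return "", fmt.Errorf("failed to check if machine exists: %w", err)
	}
	if exists {
		// Update path omitted in this sketch.
		return "Running", nil
	}
	if err := a.Create(); err != nil {
		var invalid *errInvalidConfig
		if errors.As(err, &invalid) {
			// Terminal: no further reconciles are needed.
			return "Failed", nil
		}
		return "Provisioning", err
	}
	return "Provisioning", nil
}

// missingSecret models the invalid credentialsSecret case.
type missingSecret struct{}

func (missingSecret) Exists() (bool, error) { return false, errors.New("secret not found") }
func (missingSecret) Create() error         { return nil }

// lowSpotPrice models the spot price case from the description.
type lowSpotPrice struct{}

func (lowSpotPrice) Exists() (bool, error) { return false, nil }
func (lowSpotPrice) Create() error {
	return fmt.Errorf("failed to launch instance: %w", &errInvalidConfig{msg: "spot price too low"})
}

func main() {
	for _, a := range []actuator{missingSecret{}, lowSpotPrice{}} {
		phase, err := reconcile(a)
		fmt.Printf("phase=%q requeue=%v\n", phase, err != nil)
	}
}
```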
Verified
clusterversion: 4.5.0-0.nightly-2020-04-28-023400

Creating a spot instance with price lower than spot instance price

$ oc get machine
NAME                                        PHASE     TYPE        REGION      ZONE         AGE
zhsun428aws-5x69z-master-0                  Running   m4.xlarge   us-east-2   us-east-2a   57m
zhsun428aws-5x69z-master-1                  Running   m4.xlarge   us-east-2   us-east-2b   57m
zhsun428aws-5x69z-master-2                  Running   m4.xlarge   us-east-2   us-east-2c   57m
zhsun428aws-5x69z-worker-us-east-2a-zczxx   Running   m4.large    us-east-2   us-east-2a   43m
zhsun428aws-5x69z-worker-us-east-2b-8w79c   Running   m4.large    us-east-2   us-east-2b   43m
zhsun428aws-5x69z-worker-us-east-2c-b2w82   Failed                                         17s

E0428 06:58:23.712222 1 reconciler.go:68] zhsun428aws-5x69z-worker-us-east-2c-b2w82: error creating machine: error launching instance: Your Spot request price of 0.01 is lower than the minimum required Spot request fulfillment price of 0.0241.
I0428 06:58:23.712238 1 machine_scope.go:134] zhsun428aws-5x69z-worker-us-east-2c-b2w82: Updating status
I0428 06:58:23.712246 1 machine_scope.go:155] zhsun428aws-5x69z-worker-us-east-2c-b2w82: finished calculating AWS status
I0428 06:58:23.712261 1 machine_scope.go:80] zhsun428aws-5x69z-worker-us-east-2c-b2w82: patching machine
E0428 06:58:23.729450 1 actuator.go:65] zhsun428aws-5x69z-worker-us-east-2c-b2w82 error: failed to launch instance: error launching instance: Your Spot request price of 0.01 is lower than the minimum required Spot request fulfillment price of 0.0241.
W0428 06:58:23.729500 1 controller.go:312] zhsun428aws-5x69z-worker-us-east-2c-b2w82: failed to create machine: failed to launch instance: error launching instance: Your Spot request price of 0.01 is lower than the minimum required Spot request fulfillment price of 0.0241.
I0428 06:58:23.729520 1 controller.go:412] Actuator returned invalid configuration error: error launching instance: Your Spot request price of 0.01 is lower than the minimum required Spot request fulfillment price of 0.0241.
I0428 06:58:23.729531 1 controller.go:421] zhsun428aws-5x69z-worker-us-east-2c-b2w82: going into phase "Failed"
I0428 06:58:23.729907 1 recorder.go:52] controller-runtime/manager/events "msg"="Warning" "message"="failed to launch instance: error launching instance: Your Spot request price of 0.01 is lower than the minimum required Spot request fulfillment price of 0.0241." "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"zhsun428aws-5x69z-worker-us-east-2c-b2w82","uid":"feaf35a3-2ca9-4a31-88e8-5bc145ca6d24","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"32428"} "reason"="FailedCreate"
I0428 06:58:23.741457 1 controller.go:282] controller-runtime/controller "msg"="Successfully Reconciled" "controller"="machine_controller" "request"={"Namespace":"openshift-machine-api","Name":"zhsun428aws-5x69z-worker-us-east-2c-b2w82"}
I0428 06:58:23.741521 1 controller.go:166] zhsun428aws-5x69z-worker-us-east-2c-b2w82: reconciling Machine
W0428 06:58:23.741534 1 controller.go:263] zhsun428aws-5x69z-worker-us-east-2c-b2w82: machine has gone "Failed" phase. It won't reconcile
I0428 06:58:23.741552 1 controller.go:282] controller-runtime/controller "msg"="Successfully Reconciled" "controller"="machine_controller" "request"={"Namespace":"openshift-machine-api","Name":"zhsun428aws-5x69z-worker-us-east-2c-b2w82"}
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409