Bug 1882608 - Spot instance not getting created on AzureGovCloud
Summary: Spot instance not getting created on AzureGovCloud
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.7.0
Assignee: Alexander Demicev
QA Contact: Milind Yadav
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-09-25 06:58 UTC by Milind Yadav
Modified: 2021-02-24 15:22 UTC
CC List: 3 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-24 15:21:14 UTC
Target Upstream Version:
Embargoed:


Links
System                  ID                                       Private  Priority  Status  Summary                                                               Last Updated
Github                  openshift machine-api-operator pull 765  0        None      closed  Bug 1882608: Add warning when govcloud and spot are enabled in Azure  2021-02-11 15:21:39 UTC
Red Hat Product Errata  RHSA-2020:5633                           0        None      None    None                                                                  2021-02-24 15:22:07 UTC

Description Milind Yadav 2020-09-25 06:58:19 UTC
Description: Spot instances are not getting created on AzureGovCloud

Version: Cluster version is 4.6.0-0.nightly-2020-09-23-022756

m/m

Reproducible: Always

Steps:
1. Create a spot instance machineset using the YAML below. Set maxPrice above the current spot price; look up the latest price for the instance type named in the machineset.
https://gist.github.com/miyadav/ca5f85e78ff0cd98f7ddabb85607f110
Expected: the machineset should be created successfully

2. oc get machines
Expected: the machine should be provisioned successfully
Actual: machine creation failed
[miyadav@miyadav ManualRun]$ oc get machines
NAME                                          PHASE     TYPE              REGION          ZONE   AGE
miyadav-az-zzlxl-master-0                     Running   Standard_D8s_v3   usgovvirginia          44h
miyadav-az-zzlxl-master-1                     Running   Standard_D8s_v3   usgovvirginia          44h
miyadav-az-zzlxl-master-2                     Running   Standard_D8s_v3   usgovvirginia          44h
miyadav-az-zzlxl-worker-spot-cx9hs            Failed                                             5s
miyadav-az-zzlxl-worker-usgovvirginia-nd8gh   Running   Standard_D2s_v3   usgovvirginia          3h7m
miyadav-az-zzlxl-worker-usgovvirginia-qxmbr   Running   Standard_D2s_v3   usgovvirginia          20h
miyadav-az-zzlxl-worker-usgovvirginia-xdnjd   Running   Standard_D2s_v3   usgovvirginia          20h

Additional log:
I0925 05:53:12.606654       1 machine_scope.go:160] miyadav-az-zzlxl-worker-spot-hjxrs: patching machine
E0925 05:53:12.687288       1 actuator.go:78] Machine error: failed to reconcile machine "miyadav-az-zzlxl-worker-spot-hjxrs": failed to create vm miyadav-az-zzlxl-worker-spot-hjxrs: failure sending request for machine miyadav-az-zzlxl-worker-spot-hjxrs
W0925 05:53:12.687320       1 controller.go:315] miyadav-az-zzlxl-worker-spot-hjxrs: failed to create machine: failed to reconcile machine "miyadav-az-zzlxl-worker-spot-hjxrs": failed to create vm miyadav-az-zzlxl-worker-spot-hjxrs: failure sending request for machine miyadav-az-zzlxl-worker-spot-hjxrs
I0925 05:53:12.687329       1 controller.go:415] Actuator returned invalid configuration error: failed to reconcile machine "miyadav-az-zzlxl-worker-spot-hjxrs": failed to create vm miyadav-az-zzlxl-worker-spot-hjxrs: failure sending request for machine miyadav-az-zzlxl-worker-spot-hjxrs
I0925 05:53:12.687339       1 controller.go:424] miyadav-az-zzlxl-worker-spot-hjxrs: going into phase "Failed"
I0925 05:53:12.687543       1 recorder.go:52] controller-runtime/manager/events "msg"="Warning"  "message"="InvalidConfiguration: failed to reconcile machine \"miyadav-az-zzlxl-worker-spot-hjxrs\": failed to create vm miyadav-az-zzlxl-worker-spot-hjxrs: failure sending request for machine miyadav-az-zzlxl-worker-spot-hjxrs" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"miyadav-az-zzlxl-worker-spot-hjxrs","uid":"33ce149a-d069-4d43-9423-6f5a3289a91e","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"2274306"} "reason"="FailedCreate"

Comment 1 Joel Speed 2020-09-25 08:28:32 UTC
@Milind, I don't think I have access to GovCloud to reproduce this. Could you try again without setting a maxPrice, using just `spotVMOptions: {}` instead? It would be good to know whether this affects any Spot VM or only those with a maxPrice set.

Comment 2 Milind Yadav 2020-09-25 08:49:53 UTC
@joel - tried as you suggested:

      providerSpec:
        value:
          spotVMOptions: {}


Events:
  Type     Reason        Age   From              Message
  ----     ------        ----  ----              -------
  Warning  FailedCreate  18s   azure-controller  InvalidConfiguration: failed to reconcile machine "miyadav-az-zzlxl-worker-spot-fzd4r": failed to create vm miyadav-az-zzlxl-worker-spot-fzd4r: failure sending request for machine miyadav-az-zzlxl-worker-spot-fzd4r

Comment 3 Joel Speed 2020-09-25 09:11:21 UTC
I'm aware that on AWS, their GovCloud didn't support spot instances for a very long time, though I believe it does now.

I wonder if Azure just doesn't support creating spot instances on the GovCloud tenant.

I think we will need to do some research and try to work out if they should be supported on GovCloud or not.

Comment 4 Alexander Demicev 2020-09-29 11:38:07 UTC
I'm not able to launch any spot instance using the Azure GovCloud console. I think spot instances are not supported in GovCloud.

Comment 5 Joel Speed 2020-09-30 11:01:55 UTC
I spoke with Alex yesterday, who confirmed (as per https://bugzilla.redhat.com/show_bug.cgi?id=1882608#c4) that the same error occurs when trying to create a spot instance in Azure GovCloud independently of OCP.

Since we are aware that spot instances can be created in other regions, I suspect that the issue here is that Azure does not support Spot Instances in this region.
I've asked Alex to try and confirm if this is the case and hopefully we will have an answer to this question during the next sprint.

If this is the case, we may be able to return a clearer error message to improve UX.
Additionally, we should make sure the docs team is aware and can add a note that this region doesn't support spot (if that is the case).

Comment 6 Alexander Demicev 2020-11-12 17:07:14 UTC
Spot instances are most likely not supported on GovCloud.

Comment 7 Joel Speed 2020-11-13 11:44:49 UTC
I've reached out to some folks from Microsoft to try and get an answer as to whether these are supported or not.

My suggestion is that we add a warning via a webhook if we can see that the region is GovCloud and Spot VMs are requested, for example "Warning: Spot VMs may not be supported in this region".

I will update this issue if I get a response.

Comment 8 Joel Speed 2020-11-23 12:30:16 UTC
@ademicev I've had no response from MS about this. I think the route to closing this issue should be adding a webhook warning when spot instances are requested in the GovCloud region, something akin to "Warning: Spot VMs may not be supported in this region". Could you create a PR to add that please?
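
A minimal sketch of the check such a webhook could perform, written in Python purely for illustration. The actual change belongs in the Go-based machine-api-operator and may look quite different. The function name, the dict layout, the `location` key, and the idea of recognising GovCloud by a `usgov` name prefix are all assumptions; only `spotVMOptions`, the `usgovvirginia` region, and the warning wording come from this report.

```python
from typing import Any, Dict, List

# Assumption: GovCloud regions are recognised by this name prefix.
# "usgovvirginia" is the only GovCloud region named in this report.
GOVCLOUD_REGION_PREFIX = "usgov"

# Wording taken from the suggestion in this comment.
SPOT_GOVCLOUD_WARNING = "Spot VMs may not be supported in this region"


def spot_warnings(provider_spec: Dict[str, Any]) -> List[str]:
    """Return admission warnings for a provider spec (hypothetical helper)."""
    warnings: List[str] = []
    location = str(provider_spec.get("location", "")).lower()
    # `spotVMOptions: {}` still requests a Spot VM, so test for the presence
    # of the key rather than the truthiness of its value.
    wants_spot = provider_spec.get("spotVMOptions") is not None
    if wants_spot and location.startswith(GOVCLOUD_REGION_PREFIX):
        warnings.append(SPOT_GOVCLOUD_WARNING)
    return warnings


if __name__ == "__main__":
    print(spot_warnings({"location": "usgovvirginia", "spotVMOptions": {}}))
    print(spot_warnings({"location": "usgovvirginia"}))
```

The check only warns and never rejects the machine, since it is still unconfirmed whether Spot VMs are unsupported in GovCloud.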

Comment 9 Michael McCune 2020-12-04 21:20:37 UTC
The PR associated with this has all the necessary labels and is currently waiting on CI to pass.

Comment 11 Milind Yadav 2020-12-08 05:14:37 UTC
Validated on:
[miyadav@miyadav ~]$ oc get clusterversion
NAME      VERSION                        AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.ci-2020-12-08-014807   True        False         30m     Cluster version is 4.7.0-0.ci-2020-12-08-014807


Steps:
1. Create a machineset to provision a spot instance, as in the excerpt below
.
.
    spec:
      metadata: {}
      providerSpec:
        value:
          spotVMOptions:
            maxPrice: 0.225
.
.

Actual & Expected - machineset created successfully 

2. Monitor logs (oc logs -f machine-controller-pod -c machine-controller)

Actual & Expected - 

I1208 05:03:35.631580       1 actuator.go:168] Updating machine miyadav608az-zvd4x-worker-usgovvirginia-s-v9tlq
I1208 05:03:35.894049       1 machine_scope.go:144] miyadav608az-zvd4x-worker-usgovvirginia-s-v9tlq: status unchanged
I1208 05:03:35.894084       1 machine_scope.go:144] miyadav608az-zvd4x-worker-usgovvirginia-s-v9tlq: status unchanged
I1208 05:03:35.894089       1 machine_scope.go:160] miyadav608az-zvd4x-worker-usgovvirginia-s-v9tlq: patching machine
W1208 05:03:35.907643       1 warnings.go:67] spot VMs may not be supported when using GovCloud region
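
To pick the warning out of a long controller log, the lines can be filtered by severity. The following is a rough Python sketch, assuming the log uses the standard klog header (severity letter, MMDD date, time, thread id, file:line, then the message) as in the lines above. The function name and the regular expression are illustrative, not part of any OpenShift tooling.

```python
import re
from typing import Iterable, List

# klog header: severity letter, MMDD, hh:mm:ss.uuuuuu, thread id, file:line], message.
KLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<thread>\d+) (?P<source>[^\]]+)\] (?P<message>.*)$"
)


def warning_messages(lines: Iterable[str]) -> List[str]:
    """Return the message text of every warning-level (W) klog line."""
    messages: List[str] = []
    for line in lines:
        match = KLOG_LINE.match(line.strip())
        if match and match.group("severity") == "W":
            messages.append(match.group("message"))
    return messages


if __name__ == "__main__":
    sample = [
        "I1208 05:03:35.894089       1 machine_scope.go:160] "
        "miyadav608az-zvd4x-worker-usgovvirginia-s-v9tlq: patching machine",
        "W1208 05:03:35.907643       1 warnings.go:67] "
        "spot VMs may not be supported when using GovCloud region",
    ]
    for message in warning_messages(sample):
        print(message)
```

Lines that do not match the header, such as multi-line stack traces, are skipped.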


Additional info:
Moved to Verified

Comment 14 errata-xmlrpc 2021-02-24 15:21:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

