Bug 2021322 - cluster-api-provider-azure should populate purchase plan information
Summary: cluster-api-provider-azure should populate purchase plan information
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.10.0
Assignee: Patrick Dillon
QA Contact: sunzhaohua
URL:
Whiteboard:
Depends On:
Blocks: 2021513
Reported: 2021-11-08 20:27 UTC by Patrick Dillon
Modified: 2023-09-15 01:17 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2021513
Environment:
Last Closed: 2022-03-10 16:26:09 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-api-provider-azure pull 237 | 0 | None | open | Bug 2021322: Azure Marketplace Purchase Plan Info | 2021-11-08 20:36:22 UTC
Github | openshift cluster-api-provider-azure pull 239 | 0 | None | open | Revert "Bug 2021322: Azure Marketplace Purchase Plan Info" | 2021-11-11 11:02:58 UTC
Github | openshift cluster-api-provider-azure pull 242 | 0 | None | Merged | Add VM Support for Marketplace Purchase Plans | 2021-11-23 17:31:10 UTC
Red Hat Product Errata | RHSA-2022:0056 | 0 | None | None | None | 2022-03-10 16:26:27 UTC

Comment 1 sunzhaohua 2021-11-09 03:56:22 UTC
Set up a cluster using cluster-bot with https://github.com/openshift/cluster-api-provider-azure/pull/237, then tested creating a machineset after installation; the error “requires Plan information in the request” did not occur.

$ oc get clusterversion
NAME      VERSION                                                   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.ci.test-2021-11-09-012149-ci-ln-vlwy3yt-latest   True        False         92m     Cluster version is 4.10.0-0.ci.test-2021-11-09-012149-ci-ln-vlwy3yt-latest

steps:
1. Accept the Azure Marketplace terms so that the image can be used to create VMs.
 $ az vm image accept-terms --urn plesk:solution-server-wordpress:plsk-cent-pro-sol-azr-m:18.0.38
2. Create a new machineset. The machine is created successfully, but it is stuck in the Provisioned phase: it cannot join the cluster and no CSR is pending.
machine.yaml https://privatebin-it-iso.int.open.paas.redhat.com/?10479f0e43f38db7#HuhrgPxxkfeDQXMEmFVgzv8cna34cKWFEYBMucgfLoXC
$ oc get machine
NAME                                          PHASE         TYPE              REGION           ZONE   AGE
zhsun119-lg85r-master-0                       Running       Standard_D4s_v3   northcentralus          114m
zhsun119-lg85r-master-1                       Running       Standard_D4s_v3   northcentralus          114m
zhsun119-lg85r-master-2                       Running       Standard_D4s_v3   northcentralus          114m
zhsun119-lg85r-worker-northcentralus-2qlbz    Running       Standard_D4s_v3   northcentralus          109m
zhsun119-lg85r-worker-northcentralus-pp5pl    Running       Standard_D4s_v3   northcentralus          109m
zhsun119-lg85r-worker-northcentralus-q4259    Running       Standard_D4s_v3   northcentralus          109m
zhsun119-lg85r-worker-northcentralus1-lwbbc   Provisioned   Standard_D4s_v3   northcentralus          31m

      image:
        offer: solution-server-wordpress
        publisher: plesk
        resourceID: ""
        sku: plsk-cent-pro-sol-azr-m
        version: 18.0.38
...
status:
  addresses:
  - address: zhsun119-lg85r-worker-northcentralus1-lwbbc
    type: Hostname
  - address: zhsun119-lg85r-worker-northcentralus1-lwbbc
    type: InternalDNS
  - address: zhsun119-lg85r-worker-northcentralus1-lwbbc.13yt3anhfaqeha50bzd5szgnbd.ex.internal.cloudapp.net
    type: InternalDNS
  - address: 10.0.128.7
    type: InternalIP
  conditions:
  - lastTransitionTime: "2021-11-09T03:10:59Z"
    status: "True"
    type: InstanceExists
  lastUpdated: "2021-11-09T03:11:00Z"
  phase: Provisioned
  providerStatus:
    conditions:
    - lastProbeTime: "2021-11-09T03:10:59Z"
      lastTransitionTime: "2021-11-09T03:10:59Z"
      message: machine successfully created
      reason: MachineCreationSucceeded
      status: "True"
      type: MachineCreated
    metadata: {}
    vmId: /subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/zhsun119-lg85r-rg/providers/Microsoft.Compute/virtualMachines/zhsun119-lg85r-worker-northcentralus1-lwbbc
    vmState: Running

must-gather: https://file.rdu.redhat.com/~zhsun/must-gather.local.4681473631749557535.zip
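
For reference, the URN passed to `az vm image accept-terms` in step 1 has the form publisher:offer:sku:version, which lines up with the image fields in the machine spec above. The sketch below splits such a URN and derives purchase plan fields from it. The mapping (plan name = sku, plan product = offer) is an assumption based on the usual Azure Marketplace convention and is not confirmed anywhere in this bug; `parse_urn` and `purchase_plan` are hypothetical helper names, not part of cluster-api-provider-azure.

```python
from typing import Dict


def parse_urn(urn: str) -> Dict[str, str]:
    """Split an Azure image URN (publisher:offer:sku:version) into its fields."""
    parts = urn.split(":")
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected publisher:offer:sku:version, got {urn!r}")
    publisher, offer, sku, version = parts
    return {"publisher": publisher, "offer": offer, "sku": sku, "version": version}


def purchase_plan(image: Dict[str, str]) -> Dict[str, str]:
    """Derive plan fields from image fields.

    Assumed mapping (not taken from this bug): name=sku, product=offer.
    """
    return {
        "name": image["sku"],
        "product": image["offer"],
        "publisher": image["publisher"],
    }


image = parse_urn("plesk:solution-server-wordpress:plsk-cent-pro-sol-azr-m:18.0.38")
plan = purchase_plan(image)
print(image)
print(plan)
```

If the assumed mapping holds, these plan fields are the ones the VM create request would need to carry for an image that requires a plan.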

Comment 2 Joel Speed 2021-11-09 09:08:20 UTC
With the plesk image, it's expected that the machine wouldn't become a full node. It doesn't have ignition within the image to set up the system.

Do we have an RHCOS image on the Azure marketplace that we can use to verify this with?

Comment 3 Patrick Dillon 2021-11-09 11:30:54 UTC
> Do we have an RHCOS image on the Azure marketplace that we can use to verify this with?


No, not yet, unfortunately.

Comment 7 sunzhaohua 2021-11-11 06:33:56 UTC
Sure Joel, I am working on this and will post the results later. For the regression run, I want to confirm whether we should run only the cloud team's cases or all teams' cases. For now I am running all teams' cases.

Comment 9 sunzhaohua 2021-11-12 10:16:16 UTC
Hi Patrick, I set up a cluster using cluster-bot with https://github.com/openshift/cluster-api-provider-azure/pull/240 and WMCO, but the Windows machines failed.

$ oc get clusterversion
NAME      VERSION                                                   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.ci.test-2021-11-12-013426-ci-ln-r2wlr1b-latest   True        False         7h46m   Cluster version is 4.10.0-0.ci.test-2021-11-12-013426-ci-ln-r2wlr1b-latest

$ oc get machine
NAME                                      PHASE     TYPE              REGION      ZONE   AGE
windows-6fscv                             Failed                                         7h37m
windows-9lnwd                             Failed                                         7h37m
windows1-twbsv                            Failed                                         6h47m
zhsun1112-t8gbp-master-0                  Running   Standard_D4s_v3   centralus   2      8h
zhsun1112-t8gbp-master-1                  Running   Standard_D4s_v3   centralus   3      8h
zhsun1112-t8gbp-master-2                  Running   Standard_D4s_v3   centralus   1      8h
zhsun1112-t8gbp-worker-centralus1-9p6f2   Running   Standard_D4s_v3   centralus   1      8h
zhsun1112-t8gbp-worker-centralus2-vs5m8   Running   Standard_D4s_v3   centralus   2      8h
zhsun1112-t8gbp-worker-centralus3-66hdq   Running   Standard_D4s_v3   centralus   3      8h
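
When scanning a listing like the one above, it can help to pull out only the machines that are not Running. The following is a minimal sketch that reads the plain-text table and relies only on the first two columns, since TYPE, REGION and ZONE may be empty for failed machines. `non_running` is a hypothetical helper, and the sketch assumes NAME and PHASE are always populated.

```python
from typing import List, Tuple


def non_running(table: str) -> List[Tuple[str, str]]:
    """Return (name, phase) for every row whose PHASE is not Running.

    Assumes the first two whitespace-separated columns are NAME and PHASE.
    """
    rows = table.strip().splitlines()[1:]  # skip the header row
    result = []
    for row in rows:
        fields = row.split()
        if len(fields) >= 2 and fields[1] != "Running":
            result.append((fields[0], fields[1]))
    return result


sample = """\
NAME                                      PHASE     TYPE              REGION      ZONE   AGE
windows-6fscv                             Failed                                         7h37m
zhsun1112-t8gbp-master-0                  Running   Standard_D4s_v3   centralus   2      8h
"""
failed = non_running(sample)
print(failed)
```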

status:
  conditions:
  - lastTransitionTime: "2021-11-12T02:37:13Z"
    message: Instance has not been created
    reason: InstanceNotCreated
    severity: Warning
    status: "False"
    type: InstanceExists
  errorMessage: 'failed to reconcile machine "windows-6fscv": compute.VirtualMachineImagesClient#Get: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code="InvalidParameter" Message="The value of parameter version is invalid." Target="version"'
  errorReason: InvalidConfiguration
  lastUpdated: "2021-11-12T02:37:14Z"
  phase: Failed
  providerStatus:
    conditions:
    - lastProbeTime: "2021-11-12T02:37:14Z"
      lastTransitionTime: "2021-11-12T02:37:14Z"
      message: 'failed to create vm windows-6fscv: failed to create or get machine: compute.VirtualMachineImagesClient#Get: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code="InvalidParameter" Message="The value of parameter version is invalid." Target="version"'
      reason: MachineCreationFailed

$ oc logs -f windows-machine-config-operator-867cfb76d7-gbx8q -n openshift-windows-machine-config-operator
2021-11-12T02:37:13.632Z	DEBUG	controller.windowsmachine	invalid Machine	{"name": "windows-6fscv", "error": "no IP addresses defined", "errorVerbose": "no IP addresses defined\ngithub.com/openshift/windows-machine-config-operator/controllers.getInternalIPAddress\n\t/build/windows-machine-config-operator/controllers/windowsmachine_controller.go:515\ngithub.com/openshift/windows-machine-config-operator/controllers.(*WindowsMachineReconciler).isValidMachine\n\t/build/windows-machine-config-operator/controllers/windowsmachine_controller.go:203\ngithub.com/openshift/windows-machine-config-operator/controllers.(*WindowsMachineReconciler).SetupWithManager.func2\n\t/build/windows-machine-config-operator/controllers/windowsmachine_controller.go:114\nsigs.k8s.io/controller-runtime/pkg/predicate.Funcs.Update\n\t/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/predicate/predicate.go:87\nsigs.k8s.io/controller-runtime/pkg/source/internal.EventHandler.OnUpdate\n\t/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/source/internal/eventsource.go:88\nk8s.io/client-go/tools/cache.(*processorListener).run.func1\n\t/build/windows-machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:775\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90\nk8s.io/client-go/tools/cache.(*processorListener).run\n\t/build/windows-machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:771\nk8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:73\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1371"}

$ oc logs -f machine-api-controllers-79b49bc5cb-sjwwr -c machine-controller | grep windows-6fscv
E1112 02:37:14.778599       1 actuator.go:78] Machine error: failed to reconcile machine "windows-6fscv": compute.VirtualMachineImagesClient#Get: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code="InvalidParameter" Message="The value of parameter version is invalid." Target="version"
W1112 02:37:14.778633       1 controller.go:367] windows-6fscv: failed to create machine: failed to reconcile machine "windows-6fscv": compute.VirtualMachineImagesClient#Get: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code="InvalidParameter" Message="The value of parameter version is invalid." Target="version"
I1112 02:37:14.778654       1 controller.go:471] Actuator returned invalid configuration error: failed to reconcile machine "windows-6fscv": compute.VirtualMachineImagesClient#Get: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code="InvalidParameter" Message="The value of parameter version is invalid." Target="version"
I1112 02:37:14.778665       1 controller.go:483] windows-6fscv: going into phase "Failed"
I1112 02:37:14.779503       1 recorder.go:104] controller-runtime/manager/events "msg"="Warning"  "message"="InvalidConfiguration: failed to reconcile machine \"windows-6fscv\": compute.VirtualMachineImagesClient#Get: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code=\"InvalidParameter\" Message=\"The value of parameter version is invalid.\" Target=\"version\"" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"windows-6fscv","uid":"adabd6f0-2519-4409-869c-dc230f048523","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"31512"} "reason"="FailedCreate"
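
The Azure error details are buried at the end of each of the messages above. A rough sketch for pulling the Status, Code, Message and Target fields out of such a string is shown below; the regular expression is written against the message format seen in this bug only, and `azure_error_fields` is a hypothetical helper name.

```python
import re
from typing import Dict, Union

# Matches Code="...", Message="..." and Target="..." pairs.
# The optional backslashes tolerate the escaped form (Code=\"...\") seen in event logs.
PAIR = re.compile(r'(Code|Message|Target)=\\?"(.*?)\\?"')


def azure_error_fields(message: str) -> Dict[str, Union[str, int]]:
    """Extract Status, Code, Message and Target from an autorest error string."""
    fields: Dict[str, Union[str, int]] = dict(PAIR.findall(message))
    status = re.search(r"Status=(\d+)", message)  # does not match "StatusCode="
    if status:
        fields["Status"] = int(status.group(1))
    return fields


sample = (
    'failed to reconcile machine "windows-6fscv": '
    "compute.VirtualMachineImagesClient#Get: Failure responding to request: "
    "StatusCode=400 -- Original Error: autorest/azure: Service returned an error. "
    'Status=400 Code="InvalidParameter" '
    'Message="The value of parameter version is invalid." Target="version"'
)
fields = azure_error_fields(sample)
print(fields)
```

For the message in this comment, the extracted Target is `version`, which points at the image version field of the failing Windows machineset.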

Comment 10 Patrick Dillon 2021-11-12 12:25:54 UTC
@sunzhaohua can you share the machineconfigs or the image info for the Windows test?

Thanks for testing.

Comment 11 sunzhaohua 2021-11-12 15:43:18 UTC
Patrick, 
must-gather: https://file.rdu.redhat.com/~zhsun/must-gather.local.4933594194145624634.zip

Comment 14 sunzhaohua 2021-12-03 06:18:03 UTC
Verified
We used marketplace images to set up IPI and UPI clusters, then created a new machineset after installation; everything worked as expected. Regression testing found no issues.
clusterversion: 4.10.0-0.nightly-2021-12-01-164437

Comment 17 errata-xmlrpc 2022-03-10 16:26:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

Comment 18 Red Hat Bugzilla 2023-09-15 01:17:06 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days

