Set up a cluster using cluster-bot with https://github.com/openshift/cluster-api-provider-azure/pull/237 and tested creating a machineset after installation; did not encounter the error “requires Plan information in the request”.

$ oc get clusterversion [11:40:46]
NAME      VERSION                                                   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.ci.test-2021-11-09-012149-ci-ln-vlwy3yt-latest   True        False         92m     Cluster version is 4.10.0-0.ci.test-2021-11-09-012149-ci-ln-vlwy3yt-latest

Steps:
1. Accept the Azure Marketplace terms so that the image can be used to create VMs.
$ az vm image accept-terms --urn plesk:solution-server-wordpress:plsk-cent-pro-sol-azr-m:18.0.38
2. Create a new machineset. The machine was created successfully, but it is stuck in the Provisioned phase and cannot join the cluster; no CSR is pending.

machine.yaml: https://privatebin-it-iso.int.open.paas.redhat.com/?10479f0e43f38db7#HuhrgPxxkfeDQXMEmFVgzv8cna34cKWFEYBMucgfLoXC

$ oc get machine [11:42:09]
NAME                                           PHASE         TYPE              REGION           ZONE   AGE
zhsun119-lg85r-master-0                        Running       Standard_D4s_v3   northcentralus          114m
zhsun119-lg85r-master-1                        Running       Standard_D4s_v3   northcentralus          114m
zhsun119-lg85r-master-2                        Running       Standard_D4s_v3   northcentralus          114m
zhsun119-lg85r-worker-northcentralus-2qlbz     Running       Standard_D4s_v3   northcentralus          109m
zhsun119-lg85r-worker-northcentralus-pp5pl     Running       Standard_D4s_v3   northcentralus          109m
zhsun119-lg85r-worker-northcentralus-q4259     Running       Standard_D4s_v3   northcentralus          109m
zhsun119-lg85r-worker-northcentralus1-lwbbc    Provisioned   Standard_D4s_v3   northcentralus          31m

image:
  offer: solution-server-wordpress
  publisher: plesk
  resourceID: ""
  sku: plsk-cent-pro-sol-azr-m
  version: 18.0.38
...
status:
  addresses:
  - address: zhsun119-lg85r-worker-northcentralus1-lwbbc
    type: Hostname
  - address: zhsun119-lg85r-worker-northcentralus1-lwbbc
    type: InternalDNS
  - address: zhsun119-lg85r-worker-northcentralus1-lwbbc.13yt3anhfaqeha50bzd5szgnbd.ex.internal.cloudapp.net
    type: InternalDNS
  - address: 10.0.128.7
    type: InternalIP
  conditions:
  - lastTransitionTime: "2021-11-09T03:10:59Z"
    status: "True"
    type: InstanceExists
  lastUpdated: "2021-11-09T03:11:00Z"
  phase: Provisioned
  providerStatus:
    conditions:
    - lastProbeTime: "2021-11-09T03:10:59Z"
      lastTransitionTime: "2021-11-09T03:10:59Z"
      message: machine successfully created
      reason: MachineCreationSucceeded
      status: "True"
      type: MachineCreated
    metadata: {}
    vmId: /subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/zhsun119-lg85r-rg/providers/Microsoft.Compute/virtualMachines/zhsun119-lg85r-worker-northcentralus1-lwbbc
    vmState: Running

must-gather: https://file.rdu.redhat.com/~zhsun/must-gather.local.4681473631749557535.zip
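The URN passed to `az vm image accept-terms` and the `image` block of the machine carry the same four values. A minimal Python sketch of that mapping, assuming the `publisher:offer:sku:version` URN layout seen in the command above; the function name `urn_to_image` is hypothetical and not part of any tool used in this report.

```python
def urn_to_image(urn: str) -> dict:
    """Split a marketplace URN into the fields of the machine `image` block."""
    parts = urn.split(":")
    # Assumption: a URN always has exactly four non-empty, colon-separated parts.
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected publisher:offer:sku:version, got {urn!r}")
    publisher, offer, sku, version = parts
    return {
        "publisher": publisher,
        "offer": offer,
        "sku": sku,
        "version": version,
        # Left empty here, matching the machine spec shown in this comment.
        "resourceID": "",
    }


image = urn_to_image("plesk:solution-server-wordpress:plsk-cent-pro-sol-azr-m:18.0.38")
print(image)
```

Comparing this dict with the `image` block of a machineset is a quick way to confirm that the terms were accepted for the same image the machineset references.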
With the plesk image, it's expected that the machine wouldn't become a full node. It doesn't have ignition within the image to set up the system. Do we have an RHCOS image on the Azure marketplace that we can use to verify this with?
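A machine in this state shows up as phase `Provisioned` with no node reference in its status. The sketch below assumes the output of `oc get machine -n openshift-machine-api -o json` has been loaded into a dict; `machines_without_node` is a hypothetical helper, and the sample data is trimmed to the fields it reads (the `nodeRef` value for the running worker is illustrative).

```python
import json


def machines_without_node(machine_list: dict) -> list:
    """Return names of machines that are Provisioned but have no node reference."""
    stuck = []
    for item in machine_list.get("items", []):
        status = item.get("status", {})
        # A machine that joined the cluster carries status.nodeRef.
        if status.get("phase") == "Provisioned" and not status.get("nodeRef"):
            stuck.append(item["metadata"]["name"])
    return stuck


# Trimmed sample; a real list holds full Machine objects.
sample = json.loads("""
{
  "items": [
    {
      "metadata": {"name": "zhsun119-lg85r-worker-northcentralus-2qlbz"},
      "status": {
        "phase": "Running",
        "nodeRef": {"kind": "Node", "name": "zhsun119-lg85r-worker-northcentralus-2qlbz"}
      }
    },
    {
      "metadata": {"name": "zhsun119-lg85r-worker-northcentralus1-lwbbc"},
      "status": {"phase": "Provisioned"}
    }
  ]
}
""")

print(machines_without_node(sample))
```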
> Do we have an RHCOS image on the Azure marketplace that we can use to verify this with?

No, not yet, unfortunately.
Sure Jole, I am working on this and will post the result later. For the regression run, I want to confirm whether we should run only the cloud team's cases or all teams' cases. For now I am running all teams' cases.
Hi Patrick, I set up a cluster using cluster-bot with https://github.com/openshift/cluster-api-provider-azure/pull/240, with WMCO, but the Windows machines failed.

$ oc get clusterversion [18:13:28]
NAME      VERSION                                                   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.ci.test-2021-11-12-013426-ci-ln-r2wlr1b-latest   True        False         7h46m   Cluster version is 4.10.0-0.ci.test-2021-11-12-013426-ci-ln-r2wlr1b-latest

$ oc get machine [18:13:34]
NAME                                       PHASE     TYPE              REGION      ZONE   AGE
windows-6fscv                              Failed                                         7h37m
windows-9lnwd                              Failed                                         7h37m
windows1-twbsv                             Failed                                         6h47m
zhsun1112-t8gbp-master-0                   Running   Standard_D4s_v3   centralus   2      8h
zhsun1112-t8gbp-master-1                   Running   Standard_D4s_v3   centralus   3      8h
zhsun1112-t8gbp-master-2                   Running   Standard_D4s_v3   centralus   1      8h
zhsun1112-t8gbp-worker-centralus1-9p6f2    Running   Standard_D4s_v3   centralus   1      8h
zhsun1112-t8gbp-worker-centralus2-vs5m8    Running   Standard_D4s_v3   centralus   2      8h
zhsun1112-t8gbp-worker-centralus3-66hdq    Running   Standard_D4s_v3   centralus   3      8h

status:
  conditions:
  - lastTransitionTime: "2021-11-12T02:37:13Z"
    message: Instance has not been created
    reason: InstanceNotCreated
    severity: Warning
    status: "False"
    type: InstanceExists
  errorMessage: 'failed to reconcile machine "windows-6fscv": compute.VirtualMachineImagesClient#Get: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code="InvalidParameter" Message="The value of parameter version is invalid." Target="version"'
  errorReason: InvalidConfiguration
  lastUpdated: "2021-11-12T02:37:14Z"
  phase: Failed
  providerStatus:
    conditions:
    - lastProbeTime: "2021-11-12T02:37:14Z"
      lastTransitionTime: "2021-11-12T02:37:14Z"
      message: 'failed to create vm windows-6fscv: failed to create or get machine: compute.VirtualMachineImagesClient#Get: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code="InvalidParameter" Message="The value of parameter version is invalid." Target="version"'
      reason: MachineCreationFailed

$ oc logs -f windows-machine-config-operator-867cfb76d7-gbx8q -n openshift-windows-machine-config-operator
2021-11-12T02:37:13.632Z DEBUG controller.windowsmachine invalid Machine {"name": "windows-6fscv", "error": "no IP addresses defined", "errorVerbose": "no IP addresses defined\ngithub.com/openshift/windows-machine-config-operator/controllers.getInternalIPAddress\n\t/build/windows-machine-config-operator/controllers/windowsmachine_controller.go:515\ngithub.com/openshift/windows-machine-config-operator/controllers.(*WindowsMachineReconciler).isValidMachine\n\t/build/windows-machine-config-operator/controllers/windowsmachine_controller.go:203\ngithub.com/openshift/windows-machine-config-operator/controllers.(*WindowsMachineReconciler).SetupWithManager.func2\n\t/build/windows-machine-config-operator/controllers/windowsmachine_controller.go:114\nsigs.k8s.io/controller-runtime/pkg/predicate.Funcs.Update\n\t/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/predicate/predicate.go:87\nsigs.k8s.io/controller-runtime/pkg/source/internal.EventHandler.OnUpdate\n\t/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/source/internal/eventsource.go:88\nk8s.io/client-go/tools/cache.(*processorListener).run.func1\n\t/build/windows-machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:775\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90\nk8s.io/client-go/tools/cache.(*processorListener).run\n\t/build/windows-machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:771\nk8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:73\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1371"}

$ oc logs -f machine-api-controllers-79b49bc5cb-sjwwr -c machine-controller | grep windows-6fscv
E1112 02:37:14.778599 1 actuator.go:78] Machine error: failed to reconcile machine "windows-6fscv": compute.VirtualMachineImagesClient#Get: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code="InvalidParameter" Message="The value of parameter version is invalid." Target="version"
W1112 02:37:14.778633 1 controller.go:367] windows-6fscv: failed to create machine: failed to reconcile machine "windows-6fscv": compute.VirtualMachineImagesClient#Get: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code="InvalidParameter" Message="The value of parameter version is invalid." Target="version"
I1112 02:37:14.778654 1 controller.go:471] Actuator returned invalid configuration error: failed to reconcile machine "windows-6fscv": compute.VirtualMachineImagesClient#Get: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code="InvalidParameter" Message="The value of parameter version is invalid." Target="version"
I1112 02:37:14.778665 1 controller.go:483] windows-6fscv: going into phase "Failed"
I1112 02:37:14.779503 1 recorder.go:104] controller-runtime/manager/events "msg"="Warning" "message"="InvalidConfiguration: failed to reconcile machine \"windows-6fscv\": compute.VirtualMachineImagesClient#Get: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code=\"InvalidParameter\" Message=\"The value of parameter version is invalid.\" Target=\"version\"" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"windows-6fscv","uid":"adabd6f0-2519-4409-869c-dc230f048523","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"31512"} "reason"="FailedCreate"
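The same Azure error is repeated in the machine status and in each controller log line, always with the `Code`, `Message`, and `Target` fields in `Key="value"` form. A minimal Python sketch that extracts those fields for triage; `parse_azure_error` is a hypothetical helper, and the regular expressions are only assumed to cover messages shaped like the one in this comment.

```python
import re

# Matches the quoted fields of the service error, e.g. Code="InvalidParameter".
_FIELD = re.compile(r'(Code|Message|Target)="([^"]*)"')
# Matches the HTTP status reported by the client, e.g. StatusCode=400.
_STATUS = re.compile(r'\bStatusCode=(\d+)')


def parse_azure_error(message: str) -> dict:
    """Pull StatusCode, Code, Message and Target out of an autorest error string."""
    fields = {key: value for key, value in _FIELD.findall(message)}
    status = _STATUS.search(message)
    if status:
        fields["StatusCode"] = int(status.group(1))
    return fields


error = (
    'compute.VirtualMachineImagesClient#Get: Failure responding to request: '
    'StatusCode=400 -- Original Error: autorest/azure: Service returned an error. '
    'Status=400 Code="InvalidParameter" '
    'Message="The value of parameter version is invalid." Target="version"'
)
print(parse_azure_error(error))
```

Here the `Target` field points at the image `version` parameter, which is why the image details of the Windows machineset are the next thing to look at.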
@sunzhaohua Can you share the machineconfigs or the image info for the Windows test? Thanks for testing.
Patrick, must-gather: https://file.rdu.redhat.com/~zhsun/must-gather.local.4933594194145624634.zip
Verified. We used marketplace images to set up IPI and UPI clusters, then created a new machineset after installation; everything worked as expected. We also ran regression tests and found no issues. clusterversion: 4.10.0-0.nightly-2021-12-01-164437
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days