Description of problem:
I set up a 4.5 vSphere cluster and created a machineset with an AWS provider spec. After I deleted the machineset, the machine was stuck in Deleting status. I then upgraded the cluster from 4.5 all the way to 4.11 and tried to remove the finalizer, but the machine still couldn't be deleted because the validating webhook rejected the update.

Version-Release number of selected component (if applicable):
4.11.0-0.nightly-2022-05-18-171831

How reproducible:
Always

Steps to Reproduce:
1. Set up a 4.5 vSphere cluster.
2. Create a new machineset in the vSphere cluster with an AWS provider spec.
3. Delete the machineset; the machine is stuck in Deleting status.
4. Upgrade the cluster from 4.5 all the way to 4.11, then remove the finalizer; the machine still can't be deleted.

$ oc edit machine zhsunvs2-g6mkw-worker1-rxf47
Warning: providerSpec.numCPUs: 0 is less than the minimum value (2): the minimum value will be used instead
Warning: providerSpec.memoryMiB: 0 is less than the recommended minimum value (2048): nodes may not boot correctly
Warning: providerSpec.diskGiB: 0 is less than the recommended minimum (120): nodes may fail to start if disk size is too low
error: machines.machine.openshift.io "zhsunvs2-g6mkw-worker1-rxf47" could not be patched: admission webhook "validation.machine.machine.openshift.io" denied the request: [providerSpec.template: Required value: template must be provided, providerSpec.workspace: Required value: workspace must be provided, providerSpec.network.devices: Required value: at least 1 network device must be provided]

$ oc get machine
NAME                            PHASE      TYPE   REGION   ZONE   AGE
zhsunvs19-bdttx-master-0        Running                           10h
zhsunvs19-bdttx-master-1        Running                           10h
zhsunvs19-bdttx-master-2        Running                           10h
zhsunvs19-bdttx-worker-d99jf    Running                           10h
zhsunvs19-bdttx-worker-mtjgx    Running                           10h
zhsunvs19-bdttx-worker2-7wfh2   Deleting                          9h

Actual results:
After removing the finalizer, the machine couldn't be deleted.

Expected results:
After removing the finalizer, the machine can be deleted.

Additional info:
must-gather: https://drive.google.com/file/d/1ulYQI5yR2LTgnnRnC4GcWzkxmk1Hin6x/view?usp=sharing

This is for https://issues.redhat.com/browse/OCPCLOUD-1426

This still doesn't work with the same steps as described in the bug. After upgrading the cluster to 4.12 and then removing the finalizer, the machine's finalizers couldn't be updated.

$ oc get clusterversion                                                [20:45:51]
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.12.0-0.nightly-2022-07-20-030220   True        False         157m    Cluster version is 4.12.0-0.nightly-2022-07-20-030220

$ oc get machine                                                       [20:40:42]
NAME                           PHASE      TYPE   REGION   ZONE   AGE
zhsunvs4-ntvg6-master-0        Running                           12h
zhsunvs4-ntvg6-master-1        Running                           12h
zhsunvs4-ntvg6-master-2        Running                           12h
zhsunvs4-ntvg6-worker-mf2gz    Running                           12h
zhsunvs4-ntvg6-worker-v26s8    Running                           12h
zhsunvs4-ntvg6-worker1-tjwtt   Deleting                          11h

$ oc edit machine zhsunvs4-ntvg6-worker1-tjwtt                         [20:45:59]
Warning: providerSpec.numCPUs: 0 is missing or less than the minimum value (2): nodes may not boot correctly
Warning: providerSpec.memoryMiB: 0 is missing or less than the recommended minimum value (2048): nodes may not boot correctly
Warning: providerSpec.diskGiB: 0 is missing or less than the recommended minimum (120): nodes may fail to start if disk size is too low
Warning: providerSpec.credentialsSecret: Invalid value: "aws-cloud-credentials": not found. Expected CredentialsSecret to exist
error: machines.machine.openshift.io "zhsunvs4-ntvg6-worker1-tjwtt" could not be patched: admission webhook "validation.machine.machine.openshift.io" denied the request: [providerSpec.template: Required value: template must be provided, providerSpec.workspace: Required value: workspace must be provided, providerSpec.network.devices: Required value: at least 1 network device must be provided]
You can run `oc replace -f /var/folders/0m/7xwxpmks77n3dm5rr8x8g92r0000gn/T/oc-edit-3841744707.yaml` to try this update again.

Hmmm... I was able to delete the finalizer using these commands:

❯ oc delete machines -nopenshift-machine-api ci-ln-sc52m12-76ef8-2c4ss-worker-us-west-2b-5wl29
machine.machine.openshift.io "ci-ln-sc52m12-76ef8-2c4ss-worker-us-west-2b-5wl29" deleted
^C

❯ oc get machines -nopenshift-machine-api ci-ln-sc52m12-76ef8-2c4ss-worker-us-west-2b-5wl29 -oyaml
apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
  annotations:
    machine.openshift.io/instance-state: running
  creationTimestamp: "2022-08-23T15:05:26Z"
  deletionGracePeriodSeconds: 0
  deletionTimestamp: "2022-08-23T15:41:05Z"
  finalizers:
  - machine.machine.openshift.io
  generateName: ci-ln-sc52m12-76ef8-2c4ss-worker-us-west-2b-
  generation: 3
...

❯ oc patch machines -nopenshift-machine-api ci-ln-sc52m12-76ef8-2c4ss-worker-us-west-2b-5wl29 -p '{"metadata":{"finalizers":null}}' --type=merge
machine.machine.openshift.io/ci-ln-sc52m12-76ef8-2c4ss-worker-us-west-2b-5wl29 patched

❯ oc get machines -nopenshift-machine-api ci-ln-sc52m12-76ef8-2c4ss-worker-us-west-2b-5wl29 -oyaml
Error from server (NotFound): machines.machine.openshift.io "ci-ln-sc52m12-76ef8-2c4ss-worker-us-west-2b-5wl29" not found
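
For repeated testing, the merge patch used above could be wrapped in a small shell helper. This is only a sketch: the function name `remove_machine_finalizers` and the `OC` override variable are hypothetical conveniences, not part of `oc`. It issues the same `oc patch ... --type=merge` call shown in the comment, so it will still be rejected on builds where the validating webhook denies the update.

```shell
#!/bin/sh
# Hypothetical helper: clear all finalizers on a Machine with a merge patch.
# $1 = machine name, $2 = namespace (defaults to openshift-machine-api).
# OC is an assumed override for the client binary; set OC=echo to print
# the arguments instead of contacting a cluster.
remove_machine_finalizers() {
  machine="$1"
  namespace="${2:-openshift-machine-api}"
  ${OC:-oc} patch machines -n "$namespace" "$machine" \
    --type=merge -p '{"metadata":{"finalizers":null}}'
}
```

Example: `remove_machine_finalizers ci-ln-sc52m12-76ef8-2c4ss-worker-us-west-2b-5wl29`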

@mfedosin I tried again and still couldn't delete the finalizer. Please take a look at whether we need to support this case.

Steps:
1. Set up a 4.5 vSphere cluster (only in 4.5 can we create a machineset with an AWS providerSpec; from 4.6 on, the webhook doesn't allow creating one).
2. Create a new machineset in the vSphere cluster with an AWS provider spec.
3. Delete the machine; the machine is stuck in Deleting status.
4. Upgrade the cluster from 4.5 all the way to 4.12, then remove the finalizer.

After upgrading to 4.12, remove the finalizer:

$ oc patch machines -nopenshift-machine-api zhsun-v25-gl6hs-worker1-jfnrm -p '{"metadata":{"finalizers":null}}' --type=merge
Warning: providerSpec.numCPUs: 0 is missing or less than the minimum value (2): nodes may not boot correctly
Warning: providerSpec.memoryMiB: 0 is missing or less than the recommended minimum value (2048): nodes may not boot correctly
Warning: providerSpec.diskGiB: 0 is missing or less than the recommended minimum (120): nodes may fail to start if disk size is too low
Warning: providerSpec.credentialsSecret: Invalid value: "aws-cloud-credentials": not found. Expected CredentialsSecret to exist
Error from server ([providerSpec.template: Required value: template must be provided, providerSpec.workspace: Required value: workspace must be provided, providerSpec.network.devices: Required value: at least 1 network device must be provided]): admission webhook "validation.machine.machine.openshift.io" denied the request: [providerSpec.template: Required value: template must be provided, providerSpec.workspace: Required value: workspace must be provided, providerSpec.network.devices: Required value: at least 1 network device must be provided]

$ oc get machine
NAME                            PHASE      TYPE   REGION   ZONE   AGE
zhsun-v25-gl6hs-master-0        Running                           28h
zhsun-v25-gl6hs-master-1        Running                           28h
zhsun-v25-gl6hs-master-2        Running                           28h
zhsun-v25-gl6hs-worker-cnw54    Running                           27h
zhsun-v25-gl6hs-worker-jc2cl    Running                           27h
zhsun-v25-gl6hs-worker1-jfnrm   Deleting                          27h
zhsun-v25-gl6hs-worker1-v8mqq                                     27h
zhsun-v25-gl6hs-worker2-6cq9n                                     27h
zhsun-v25-gl6hs-worker2-k96pd   Deleting                          27h
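
When several machines are stuck, as in the listing above, the names in the Deleting phase can be picked out of the default `oc get machine` table with a small filter. This is a minimal sketch: the function name `stuck_machines` is hypothetical, and it assumes the default column layout shown in this report, where PHASE is the second whitespace-separated field whenever it is set.

```shell
#!/bin/sh
# Hypothetical filter: read `oc get machine` table output on stdin and
# print the NAME of every row whose second field is exactly "Deleting".
# Rows with an empty PHASE have the AGE in field 2, so they never match.
stuck_machines() {
  awk 'NR > 1 && $2 == "Deleting" { print $1 }'
}
```

Example: `oc get machine -n openshift-machine-api | stuck_machines`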

Verified
clusterversion: 4.12.0-0.ci-2022-09-09-121216

Same steps as Comment 4; the machine could be deleted.

$ oc get machine
NAME                            PHASE      TYPE   REGION   ZONE   AGE
zhsunvs99-vkmns-master-0        Running                           13h
zhsunvs99-vkmns-master-1        Running                           13h
zhsunvs99-vkmns-master-2        Running                           13h
zhsunvs99-vkmns-worker-8cqnc    Running                           13h
zhsunvs99-vkmns-worker-99v99    Running                           13h
zhsunvs99-vkmns-worker1-2swvx                                     13h
zhsunvs99-vkmns-worker1-h5nn2   Deleting                          13h
zhsunvs99-vkmns-worker2-2dbmd   Deleting                          13h
zhsunvs99-vkmns-worker2-ffxwp                                     13h

$ oc patch machines -nopenshift-machine-api zhsunvs99-vkmns-worker1-h5nn2 -p '{"metadata":{"finalizers":null}}' --type=merge
machine.machine.openshift.io/zhsunvs99-vkmns-worker1-h5nn2 patched

$ oc get machine
NAME                            PHASE      TYPE   REGION   ZONE   AGE
zhsunvs99-vkmns-master-0        Running                           13h
zhsunvs99-vkmns-master-1        Running                           13h
zhsunvs99-vkmns-master-2        Running                           13h
zhsunvs99-vkmns-worker-8cqnc    Running                           13h
zhsunvs99-vkmns-worker-99v99    Running                           13h
zhsunvs99-vkmns-worker1-2swvx                                     13h
zhsunvs99-vkmns-worker2-2dbmd   Deleting                          13h
zhsunvs99-vkmns-worker2-ffxwp                                     13h

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:7399