Bug 1820421
Summary: | [OSP] Update machine with an invalid flavor, machine becomes 'Failed' | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Jianwei Hou <jhou> |
Component: | Installer | Assignee: | Mike Fedosin <mfedosin> |
Installer sub component: | OpenShift on OpenStack | QA Contact: | David Sanz <dsanzmor> |
Status: | CLOSED ERRATA | Docs Contact: | |
Severity: | low | ||
Priority: | low | CC: | m.andre, mfedosin, pprinett |
Version: | 4.4 | ||
Target Milestone: | --- | ||
Target Release: | 4.6.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: |
Cause: Cluster API Provider OpenStack didn't validate flavors before updating machines.
Consequence: Machines updated with an invalid flavor failed to boot.
Fix: Validate that the flavor exists before updating a machine, and return an error immediately if it does not.
Result: If the flavor is invalid, Cluster API Provider OpenStack returns an error to the user immediately and does not update the machine.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2020-10-27 15:57:43 UTC | Type: | Bug |
Description
Jianwei Hou
2020-04-03 03:24:49 UTC
I'm not sure I understand this bug report. What do you suggest should be the status of the instance if you pass an invalid flavor in the providerSpec? It seems to match the definition of "Failed": Create() returns an invalidConfigurationMachineError type, or Exists() is False and the machine has a providerID/address.

What happens on OSP is that when updating this machine, there were 2 events (from oc describe machine): a create following a delete. Because we set an invalid value for the flavor, the creation failed, leaving this machine in the Failed phase. On other cloud providers (aws, gcp, azure), there is only an update event (no delete/create) and the machine does not become 'Failed'. This is reported because the same operation on OSP has a different result.

Considering the priority assigned to this bug and our team capacity, we are deferring this bug to an upcoming sprint. Please let us know if there are reasons for us to reprioritize.

Verified on 4.6.0-0.nightly-2020-08-27-005538

Changed the machine's flavor; the log shows:

I0827 11:41:47.735711 1 utils.go:99] Cloud provider CA cert not provided, using system trust bundle
I0827 11:41:48.406889 1 controller.go:285] mrnd-13-46-gckws-worker-0-qks9l: reconciling machine triggers idempotent update
I0827 11:41:48.419032 1 utils.go:99] Cloud provider CA cert not provided, using system trust bundle
E0827 11:41:48.730625 1 actuator.go:550] Machine error mrnd-13-46-gckws-worker-0-qks9l: Can't find a flavor with name ci.m1.xlargee: Unable to find flavor with name ci.m1.xlargee
E0827 11:41:48.730663 1 controller.go:287] mrnd-13-46-gckws-worker-0-qks9l: error updating machine: Can't find a flavor with name ci.m1.xlargee: Unable to find flavor with name ci.m1.xlargee
I0827 11:41:49.731079 1 controller.go:172] mrnd-13-46-gckws-worker-0-qks9l: reconciling Machine
I0827 11:41:49.754563 1 utils.go:99] Cloud provider CA cert not provided, using system trust bundle
I0827 11:41:50.660861 1 controller.go:285] mrnd-13-46-gckws-worker-0-qks9l: reconciling machine triggers idempotent update
I0827 11:41:50.677234 1 utils.go:99] Cloud provider CA cert not provided, using system trust bundle
E0827 11:41:51.033382 1 actuator.go:550] Machine error mrnd-13-46-gckws-worker-0-qks9l: Can't find a flavor with name ci.m1.xlargee: Unable to find flavor with name ci.m1.xlargee
E0827 11:41:51.033564 1 controller.go:287] mrnd-13-46-gckws-worker-0-qks9l: error updating machine: Can't find a flavor with name ci.m1.xlargee: Unable to find flavor with name ci.m1.xlargee
I0827 11:41:52.033998 1 controller.go:172] mrnd-13-46-gckws-worker-0-qks9l: reconciling Machine
I0827 11:41:52.054118 1 utils.go:99] Cloud provider CA cert not provided, using system trust bundle
I0827 11:41:53.095999 1 controller.go:285] mrnd-13-46-gckws-worker-0-qks9l: reconciling machine triggers idempotent update

But the machine keeps its Running status:

[morenod@morenod-laptop ~]$ oc get nodes | grep mrnd-13-46-gckws-worker-0-qks9l
mrnd-13-46-gckws-worker-0-qks9l Ready worker 23m v1.19.0-rc.2+f71a7ab-dirty
[morenod@morenod-laptop ~]$ oc get machines -A | grep mrnd-13-46-gckws-worker-0-qks9l
openshift-machine-api mrnd-13-46-gckws-worker-0-qks9l Running ci.m1.xlarge regionOne nova 27m

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2020:4196

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days