Bug 1983612
| Summary: | When using boot-from-volume "image", InstanceCreate leaks volumes in case machine-controller is rebooted | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Pierre Prinetti <pprinett> |
| Component: | Cloud Compute | Assignee: | Pierre Prinetti <pprinett> |
| Cloud Compute sub component: | OpenStack Provider | QA Contact: | Itzik Brown <itbrown> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | low | | |
| Priority: | medium | CC: | adduarte, egarcia, itbrown, m.andre, mfedosin, pprinett |
| Version: | 4.8 | Keywords: | Triaged |
| Target Milestone: | --- | | |
| Target Release: | 4.9.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-10-18 17:39:54 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |

Doc Text:

- Cause: cluster-api-provider-openstack created instances in a non-idempotent manner with respect to volumes.
- Consequence: if cluster-api-provider-openstack crashed suddenly after creating a boot volume but before creating the corresponding instance, the volume was left unused in OpenStack.
- Fix: when creating a new instance, cluster-api-provider-openstack now deletes any volume carrying the target name before creating a new one.
- Result: cluster-api-provider-openstack instance creation is idempotent with respect to root volumes.
Description

Pierre Prinetti 2021-07-19 08:32:56 UTC

One solution could be to look for a volume by name before creating it in InstanceCreate. Since any existing volume is likely the result of an interrupted operation, we could delete and recreate it. An alternative would be to match the image checksum against the volume checksum, but this might be a terribly slow operation with a low chance of being useful. The question to be solved is whether it is fine to delete all volumes named after the machine upon machine creation.

Verified on:
OCP 4.9.0-0.nightly-2021-07-27-181211
OSP HOS-16.1-RHEL-8-20210604.n.0

Scaled the worker-0 machineset:

```
$ oc scale --replicas=2 machineset ostest-bhssb-worker-0 -n openshift-machine-api
```

Verified a new volume was created:

```
(shiftstack) [stack@undercloud-0 ~]$ openstack volume list | grep worker-0
| b1197586-047d-46d7-b702-7125d305b8ac | ostest-bhssb-worker-0-749v9              | in-use |  25 | Attached to ostest-bhssb-worker-0-749v9 on /dev/vda |
| c16d25e9-4cbd-476b-9963-c4f30b58ae50 | pvc-f110027a-f981-4e95-9964-a0e5d93b0f93 | in-use | 100 | Attached to ostest-bhssb-worker-0-zvtw4 on /dev/vdb |
| 330546f5-661d-4b8a-8349-7febedc82f79 | ostest-bhssb-worker-0-zvtw4              | in-use |  25 | Attached to ostest-bhssb-worker-0-zvtw4 on /dev/vda |
```

Deleted the machine-api-controllers pod:

```
$ oc delete pod -n openshift-machine-api machine-api-controllers-f85979566-wkjlf
```

Checked that the new instance was created:

```
$ openstack server list | grep worker-0
| c75e5125-1960-49cf-98ad-8e94e167e266 | ostest-bhssb-worker-0-749v9 | ACTIVE | StorageNFS=172.17.5.199; ostest-bhssb-openshift=10.196.1.242 | | |
| 4e5b73e1-760e-4c26-bc0c-80a205ff653f | ostest-bhssb-worker-0-zvtw4 | ACTIVE | StorageNFS=172.17.5.176; ostest-bhssb-openshift=10.196.0.104 | | |
```

Checked that a new worker was created and no new volume got created:

```
$ oc get nodes
NAME                          STATUS   ROLES    AGE   VERSION
ostest-bhssb-master-0         Ready    master   24h   v1.21.1+8268f88
ostest-bhssb-master-1         Ready    master   24h   v1.21.1+8268f88
ostest-bhssb-master-2         Ready    master   24h   v1.21.1+8268f88
ostest-bhssb-worker-0-749v9   Ready    worker   71s   v1.21.1+8268f88
ostest-bhssb-worker-0-zvtw4   Ready    worker   22h   v1.21.1+8268f88
ostest-bhssb-worker-1-6hgcn   Ready    worker   24h   v1.21.1+8268f88
ostest-bhssb-worker-2-7plmn   Ready    worker   24h   v1.21.1+8268f88
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759