2061833 – A single worker can be assigned to multiple baremetal hosts

Bug 2061833 - A single worker can be assigned to multiple baremetal hosts

Summary: A single worker can be assigned to multiple baremetal hosts

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Bare Metal Hardware Provisioning
Sub Component:
Version:	4.11
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	4.11.0
Assignee:	Steven Hardy
QA Contact:	Eldar Weiss
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+	depends on / blocked

Reported:	2022-03-08 15:31 UTC by Stephen Benjamin
Modified:	2022-08-10 10:52 UTC (History)
CC List:	4 users (show)
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:	job=periodic-ci-openshift-release-master-nightly-4.11-e2e-metal-ipi-ovn-ipv6=all
Last Closed:	2022-08-10 10:52:44 UTC
Target Upstream Version:
Embargoed:

Attachments	(Terms of Use)

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift cluster-api-provider-baremetal pull 167	0	None	open	Bug 2061833: Delay after ensuring annotation	2022-03-09 13:03:05 UTC
Red Hat Product Errata	RHSA-2022:5069	0	None	None	None	2022-08-10 10:52:54 UTC

Description Stephen Benjamin 2022-03-08 15:31:59 UTC

TRT has noticed an uptick in failures on certain metal jobs. One of the failure cases seems to be worker not getting provisioned.

See https://search.ci.openshift.org/?search=6+machines.*node&maxAge=168h&context=1&type=bug%2Bjunit&name=.*4.1.*metal.*&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job


In this particular job: https://prow.ci.openshift.org/view/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.11-e2e-metal-ipi-ovn-ipv6/1500945414994530304

A worker machine is stuck "provisioning" because there's no available bare metal host:

    status:
      conditions:
      - lastTransitionTime: "2022-03-07T22:21:31Z"
        message: Instance has not been created
        reason: InstanceNotCreated
        severity: Warning
        status: "False"
        type: InstanceExists
      errorMessage: No available BareMetalHost found


When looking at at the BMH, you can see ostest-worker-1 and ostest-worker-2 are consumed by the same worker:

    $ oc get bmh -A
    NAMESPACE               NAME              STATE                    CONSUMER                      ONLINE   ERROR   AGE
    openshift-machine-api   ostest-worker-2   provisioned              ostest-rbt8c-worker-0-cpkx7   true             17h
    openshift-machine-api   ostest-worker-1   provisioned              ostest-rbt8c-worker-0-cpkx7   true             17h
    openshift-machine-api   ostest-master-1   externally provisioned   ostest-rbt8c-master-1         true             17h
    openshift-machine-api   ostest-master-2   externally provisioned   ostest-rbt8c-master-2         true             17h
    openshift-machine-api   ostest-master-0   externally provisioned   ostest-rbt8c-master-0         true             17h
    openshift-machine-api   ostest-worker-0   provisioned              ostest-rbt8c-worker-0-8l4tx   true             17h


I am not sure what logic cluster-api-provider-baremetal is using to assign a BMH to a worker, but there seems to be some kind of race where 2 BMH can be consumed by the same machine.

Comment 1 Zane Bitter 2022-03-08 16:13:48 UTC

2022-03-07T22:28:10.310571787Z 2022/03/07 22:28:10 Creating machine ostest-rbt8c-worker-0-cpkx7
2022-03-07T22:28:10.310862634Z 2022/03/07 22:28:10 Host 'ostest-worker-0' matched hostSelector for Machine 'ostest-rbt8c-worker-0-cpkx7'
2022-03-07T22:28:10.310862634Z 2022/03/07 22:28:10 Host 'ostest-worker-1' matched hostSelector for Machine 'ostest-rbt8c-worker-0-cpkx7'
2022-03-07T22:28:10.310862634Z 2022/03/07 22:28:10 Host 'ostest-worker-2' matched hostSelector for Machine 'ostest-rbt8c-worker-0-cpkx7'
2022-03-07T22:28:10.310862634Z 2022/03/07 22:28:10 3 hosts available while choosing host for machine 'ostest-rbt8c-worker-0-cpkx7'
2022-03-07T22:28:10.310862634Z 2022/03/07 22:28:10 Associating machine ostest-rbt8c-worker-0-cpkx7 with host ostest-worker-2
2022-03-07T22:28:10.328955287Z 2022/03/07 22:28:10 setting annotation for ostest-rbt8c-worker-0-cpkx7 to metal3.io/BareMetalHost="openshift-machine-api/ostest-worker-2"
2022-03-07T22:28:10.328955287Z 2022/03/07 22:28:10 setting annotation for ostest-rbt8c-worker-0-cpkx7 to machine.openshift.io/instance-state="available"
2022-03-07T22:28:10.338793000Z W0307 22:28:10.338744       1 controller.go:366] ostest-rbt8c-worker-0-cpkx7: failed to create machine: requeue in: 0s
2022-03-07T22:28:10.338793000Z I0307 22:28:10.338772       1 controller.go:460] Actuator returned requeue-after error: requeue in: 0s

2022-03-07T22:28:10.697696323Z 2022/03/07 22:28:10 Creating machine ostest-rbt8c-worker-0-cpkx7
2022-03-07T22:28:10.698005837Z 2022/03/07 22:28:10 Host 'ostest-worker-2' matched hostSelector for Machine 'ostest-rbt8c-worker-0-cpkx7'
2022-03-07T22:28:10.698005837Z 2022/03/07 22:28:10 Host 'ostest-worker-0' matched hostSelector for Machine 'ostest-rbt8c-worker-0-cpkx7'
2022-03-07T22:28:10.698005837Z 2022/03/07 22:28:10 Host 'ostest-worker-1' matched hostSelector for Machine 'ostest-rbt8c-worker-0-cpkx7'
2022-03-07T22:28:10.698005837Z 2022/03/07 22:28:10 3 hosts available while choosing host for machine 'ostest-rbt8c-worker-0-cpkx7'
2022-03-07T22:28:10.698021319Z 2022/03/07 22:28:10 Associating machine ostest-rbt8c-worker-0-cpkx7 with host ostest-worker-1
2022-03-07T22:28:10.720600833Z 2022/03/07 22:28:10 setting annotation for ostest-rbt8c-worker-0-cpkx7 to metal3.io/BareMetalHost="openshift-machine-api/ostest-worker-1"
2022-03-07T22:28:10.720600833Z 2022/03/07 22:28:10 setting annotation for ostest-rbt8c-worker-0-cpkx7 to machine.openshift.io/instance-state="available"
2022-03-07T22:28:10.725309867Z W0307 22:28:10.725253       1 controller.go:366] ostest-rbt8c-worker-0-cpkx7: failed to create machine: failed to update machine annotation: Operation cannot be fulfilled on machines.machine.openshift.io "ostest-rbt8c-worker-0-cpkx7": the object has been modified; please apply your changes to the latest version and try again
2022-03-07T22:28:10.725339030Z E0307 22:28:10.725305       1 controller.go:304] controller-runtime/manager/controller/machine_controller "msg"="Reconciler error" "error"="failed to update machine annotation: Operation cannot be fulfilled on machines.machine.openshift.io \"ostest-rbt8c-worker-0-cpkx7\": the object has been modified; please apply your changes to the latest version and try again" "name"="ostest-rbt8c-worker-0-cpkx7" "namespace"="openshift-machine-api" 

It looks like we're requeuing before the cache has been updated with either the Consumer ref in the BMH ostest-worker-2 or the annotation on the Machine, and thus it is selecting another Machine and also claiming that one. We fail to update the annotation, but by then it is too late.

It's not clear to me what could possibly have changed recently to cause this?

The easiest fix is probably to add a delay at https://github.com/openshift/cluster-api-provider-baremetal/blob/master/pkg/cloud/baremetal/actuators/machine/actuator.go#L672
We don't actually want to requeue at all - we should just let the reconcile be triggered when the change to the annotations shows up in the cache - but we have to return RequeueAfterError{} to prevent the controller deciding we have completed the Create phase prematurely.

Comment 2 Devan Goodwin 2022-05-02 17:32:10 UTC

PR sitting for a few weeks: https://github.com/openshift/cluster-api-provider-baremetal/pull/167 needs an approver.

Comment 7 errata-xmlrpc 2022-08-10 10:52:44 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

Note You need to log in before you can comment on or make changes to this bug.