Bug 2096445 - Assisted service POD keeps crashing after a bare metal host is created
Summary: Assisted service POD keeps crashing after a bare metal host is created
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Advanced Cluster Management for Kubernetes
Classification: Red Hat
Component: Infrastructure Operator
Version: rhacm-2.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: rhacm-2.6
Assignee: Eran Cohen
QA Contact: Chad Crum
Docs Contact: Derek
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-06-13 20:41 UTC by Eran Cohen
Modified: 2022-09-06 22:31 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-09-06 22:30:54 UTC
Target Upstream Version:
Embargoed:
cbynum: rhacm-2.6+
cbynum: rhacm-2.6.z+


Links
System ID Private Priority Status Summary Last Updated
Github openshift assisted-service pull 3938 0 None Merged BZ-2096460: Spoke BMH stuck inspecting when deployed via the converged workflow 2022-06-19 12:24:45 UTC
Github stolostron backlog issues 23213 0 None None None 2022-06-13 22:37:12 UTC
Red Hat Issue Tracker MGMTBUGSM-433 0 None None None 2022-06-13 20:49:34 UTC
Red Hat Product Errata RHSA-2022:6370 0 None None None 2022-09-06 22:31:11 UTC

Description Eran Cohen 2022-06-13 20:41:47 UTC
Description of the problem:
Unable to deploy a spoke cluster on an OCP 4.11 hub: the spoke BMHs are stuck in the inspecting state and the assisted-service pod is in CrashLoopBackOff.


ernal/controller/controllers/preprovisioningimage_controller.go:78" go-id=733 preprovisioning_image=ostest-extraworker-0 preprovisioning_image_namespace=openshift-machine-api request_id=0921967b-ab58-4c39-a618-95fb8ba57d02
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x273619f]

goroutine 733 [running]:
github.com/openshift/assisted-service/internal/controller/controllers.(*PreprovisioningImageReconciler).AddIronicAgentToInfraEnv(0xc001d08900, {0x36c75d0, 0xc0006b5c50}, {0x375b410, 0xc0011c9b90}, 0xc001168b40)
	/go/src/github.com/openshift/origin/internal/controller/controllers/preprovisioningimage_controller.go:292 +0x17f
github.com/openshift/assisted-service/internal/controller/controllers.(*PreprovisioningImageReconciler).Reconcile(0xc001d08900, {0x36c75d0, 0xc0006b5c20}, {{{0xc001150f30, 0x2ee5940}, {0xc001150f18, 0x30}}})
	/go/src/github.com/openshift/origin/internal/controller/controllers/preprovisioningimage_controller.go:102 +0xb8b
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0xc00080e210, {0x36c75d0, 0xc0006b5bc0}, {{{0xc001150f30, 0x2ee5940}, {0xc001150f18, 0x415694}}})
	/go/src/github.com/openshift/origin/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:114 +0x26f
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc00080e210, {0x36c7528, 0xc000ca9fc0}, {0x2cc5420, 0xc0008d3ea0})
	/go/src/github.com/openshift/origin/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:311 +0x33e
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc00080e210, {0x36c7528, 0xc000ca9fc0})
	/go/src/github.com/openshift/origin/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266 +0x205
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
	/go/src/github.com/openshift/origin/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227 +0x85
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
	/go/src/github.com/openshift/origin/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:223 +0x357

Release version:
- Latest upstream assisted-service-operator
- OCP 4.11 on hub (4.11.0-0.nightly-2022-05-25-193227)

Operator snapshot version:

OCP version:

Browser Info:

Steps to reproduce:
1. Create an infraEnv (without clusterRef)
2. Create BMH
3.
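
The trace above shows a nil pointer dereference inside AddIronicAgentToInfraEnv, and the reproduction uses an infraEnv without a clusterRef. One plausible reading (an assumption, not confirmed by this report) is that an optional cluster reference was dereferenced without a nil check. The Go sketch below illustrates that class of bug and a defensive guard. The types and the clusterKey helper are simplified, hypothetical stand-ins, not the actual assisted-service code or the fix merged in the linked pull request.

```go
package main

import (
	"errors"
	"fmt"
)

// ClusterReference, InfraEnvSpec and InfraEnv are hypothetical, simplified
// stand-ins for the real API types; only the optional pointer matters here.
type ClusterReference struct {
	Name      string
	Namespace string
}

type InfraEnvSpec struct {
	// ClusterRef is optional: it stays nil when the InfraEnv is created
	// without a clusterRef, as in step 1 of the reproduction.
	ClusterRef *ClusterReference
}

type InfraEnv struct {
	Name string
	Spec InfraEnvSpec
}

// ErrNoClusterRef signals that the InfraEnv is not bound to a cluster.
var ErrNoClusterRef = errors.New("infraEnv has no clusterRef")

// clusterKey returns "namespace/name" for the referenced cluster.
// It returns ErrNoClusterRef instead of panicking when the InfraEnv
// itself or its optional reference is nil.
func clusterKey(infraEnv *InfraEnv) (string, error) {
	if infraEnv == nil || infraEnv.Spec.ClusterRef == nil {
		return "", ErrNoClusterRef
	}
	ref := infraEnv.Spec.ClusterRef
	return ref.Namespace + "/" + ref.Name, nil
}

func main() {
	// An InfraEnv without a clusterRef: the guarded path returns an error.
	unbound := &InfraEnv{Name: "unbound"}
	if _, err := clusterKey(unbound); err != nil {
		fmt.Println("skipping cluster lookup:", err)
	}

	// An InfraEnv with a clusterRef: the reference is safe to read.
	bound := &InfraEnv{
		Name: "bound",
		Spec: InfraEnvSpec{ClusterRef: &ClusterReference{Name: "spoke", Namespace: "spoke-ns"}},
	}
	if key, err := clusterKey(bound); err == nil {
		fmt.Println("cluster:", key)
	}
}
```

Returning an error from a reconcile helper lets the controller log and requeue or skip the object, whereas an unguarded dereference panics and takes the whole pod down, which matches the CrashLoopBackOff symptom described here.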

Actual results:

Expected results:

Additional info:

Comment 1 Chad Crum 2022-07-19 17:16:36 UTC
QE is no longer seeing the assisted-service pod crash or BMHs stuck in inspecting with recent builds. The spoke deploys properly end to end with the converged flow enabled.

Comment 4 errata-xmlrpc 2022-09-06 22:30:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Advanced Cluster Management 2.6.0 security updates and bug fixes), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:6370

