1882101 – machine-api-operator fails to deploy due to security constraint

Status:	CLOSED DUPLICATE of bug 1883458
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	kube-apiserver
Version:	4.6
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	4.6.0
Assignee:	David Eads
QA Contact:	Ke Wang

Reported:	2020-09-23 19:08 UTC by Michael Gugino
Modified:	2020-09-30 14:49 UTC
CC List:	6 users
Last Closed:	2020-09-30 14:49:08 UTC

Description Michael Gugino 2020-09-23 19:08:10 UTC

Some CI runs are failing on Azure, for example: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-e2e-azure-4.6/1308571357872656384

The first indication is that there are no worker nodes. On further investigation, the machine-controller is not deployed because the machine-api-operator itself is not deployed. Checking artifacts/deployment.json reveals:

{
    "lastTransitionTime": "2020-09-23T01:15:10Z",
    "lastUpdateTime": "2020-09-23T01:15:10Z",
    "message": "Deployment does not have minimum availability.",
    "reason": "MinimumReplicasUnavailable",
    "status": "False",
    "type": "Available"
},
{
    "lastTransitionTime": "2020-09-23T01:15:10Z",
    "lastUpdateTime": "2020-09-23T01:15:10Z",
    "message": "pods \"machine-api-operator-5c99d74d58-\" is forbidden: unable to validate against any security context constraint: []",
    "reason": "FailedCreate",
    "status": "True",
    "type": "ReplicaFailure"
},
{
    "lastTransitionTime": "2020-09-23T01:25:11Z",
    "lastUpdateTime": "2020-09-23T01:25:11Z",
    "message": "ReplicaSet \"machine-api-operator-5c99d74d58\" has timed out progressing.",
    "reason": "ProgressDeadlineExceeded",
    "status": "False",
    "type": "Progressing"
}
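
To scan an artifact like this for unhealthy Deployments, something along the following lines might help. It is only a sketch: it assumes the file is either a single Deployment object or a Kubernetes-style List with an "items" array, which was not verified against the real artifacts/deployment.json layout. The function names and the SAMPLE data are made up for illustration, with the condition values copied from the excerpt above.

```python
import json

# Status each standard Deployment condition type has when the Deployment is healthy.
HEALTHY = {"Available": "True", "Progressing": "True", "ReplicaFailure": "False"}


def failing_conditions(deployment):
    """Return the conditions of one Deployment whose status is not the healthy one."""
    conditions = deployment.get("status", {}).get("conditions", [])
    return [
        c for c in conditions
        if c.get("type") in HEALTHY and c.get("status") != HEALTHY[c.get("type")]
    ]


def report(doc):
    """Summarize failing conditions for a single Deployment or a List of them."""
    items = doc.get("items", [doc])  # assumption: a List wraps objects in "items"
    lines = []
    for dep in items:
        name = dep.get("metadata", {}).get("name", "<unknown>")
        for c in failing_conditions(dep):
            lines.append(
                f"{name}: {c.get('type')}={c.get('status')} "
                f"({c.get('reason')}): {c.get('message')}"
            )
    return lines


# Hypothetical input built from the conditions quoted in this report.
SAMPLE = {
    "metadata": {"name": "machine-api-operator"},
    "status": {"conditions": [
        {"type": "Available", "status": "False",
         "reason": "MinimumReplicasUnavailable",
         "message": "Deployment does not have minimum availability."},
        {"type": "ReplicaFailure", "status": "True",
         "reason": "FailedCreate",
         "message": "unable to validate against any security context constraint: []"},
        {"type": "Progressing", "status": "False",
         "reason": "ProgressDeadlineExceeded",
         "message": "ReplicaSet has timed out progressing."},
    ]},
}

if __name__ == "__main__":
    print("\n".join(report(SAMPLE)))
```

For a real run, the SAMPLE dictionary would be replaced by the result of json.load() on the downloaded artifact.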


Many operators seem broken, including kube-apiserver:
Operator unavailable (StaticPods_ZeroNodesActive): StaticPodsAvailable: 0 nodes are active; 3 nodes are at revision 0; 0 nodes have achieved new revision 2

Unsure what the root cause is.

For reference, the initial 3 master machines are created by the installer and join the bootstrap cluster.  The machine-api has no control over the initial creation or configuration of the master machines.

Comment 1 David Eads 2020-09-23 19:18:15 UTC

I would like to see an [early] test that checks whether we have the events and flake. I think this may happen more often on Azure: something about the LB, perhaps?

Also, I think I see logs that indicate an io timeout from CVO to internal LB.

Comment 3 David Eads 2020-09-30 14:49:08 UTC


*** This bug has been marked as a duplicate of bug 1883458 ***
