Bug 1882101 - machine-api-operator fails to deploy due to security constraint
Summary: machine-api-operator fails to deploy due to security constraint
Keywords:
Status: CLOSED DUPLICATE of bug 1883458
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-apiserver
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.6.0
Assignee: David Eads
QA Contact: Ke Wang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-09-23 19:08 UTC by Michael Gugino
Modified: 2020-09-30 14:49 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-09-30 14:49:08 UTC
Target Upstream Version:
Embargoed:



Description Michael Gugino 2020-09-23 19:08:10 UTC
Some CI runs are failing on Azure, such as: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-e2e-azure-4.6/1308571357872656384

The first indication is that there are no worker nodes.  Upon further investigation, the machine-controller is not deployed because the machine-api-operator is not deployed.  Checking artifacts/deployment.json reveals:

{
    "lastTransitionTime": "2020-09-23T01:15:10Z",
    "lastUpdateTime": "2020-09-23T01:15:10Z",
    "message": "Deployment does not have minimum availability.",
    "reason": "MinimumReplicasUnavailable",
    "status": "False",
    "type": "Available"
},
{
    "lastTransitionTime": "2020-09-23T01:15:10Z",
    "lastUpdateTime": "2020-09-23T01:15:10Z",
    "message": "pods \"machine-api-operator-5c99d74d58-\" is forbidden: unable to validate against any security context constraint: []",
    "reason": "FailedCreate",
    "status": "True",
    "type": "ReplicaFailure"
},
{
    "lastTransitionTime": "2020-09-23T01:25:11Z",
    "lastUpdateTime": "2020-09-23T01:25:11Z",
    "message": "ReplicaSet \"machine-api-operator-5c99d74d58\" has timed out progressing.",
    "reason": "ProgressDeadlineExceeded",
    "status": "False",
    "type": "Progressing"
}
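
For triaging similar runs, the conditions above can be scanned mechanically. The following is a minimal sketch, not part of the CI tooling: the helper name failing_conditions is hypothetical, and it assumes the input is a single Deployment object with the conditions under status.conditions (the exact layout of artifacts/deployment.json may differ, for example a list of Deployments).

```python
def failing_conditions(deployment):
    """Return (type, reason, message) tuples for unhealthy Deployment conditions.

    Assumes `deployment` is one Deployment object as a dict, with the
    conditions under status.conditions (an assumption about the artifact).
    """
    # "Available" and "Progressing" are healthy when status is "True";
    # "ReplicaFailure" signals a problem when status is "True".
    healthy_when_true = {"Available", "Progressing"}
    failure_when_true = {"ReplicaFailure"}

    problems = []
    for cond in deployment.get("status", {}).get("conditions", []):
        ctype = cond.get("type")
        is_true = cond.get("status") == "True"
        if (ctype in healthy_when_true and not is_true) or (
            ctype in failure_when_true and is_true
        ):
            problems.append((ctype, cond.get("reason"), cond.get("message")))
    return problems


# Conditions copied from the report above, wrapped in the assumed layout.
sample = {
    "status": {
        "conditions": [
            {
                "type": "Available",
                "status": "False",
                "reason": "MinimumReplicasUnavailable",
                "message": "Deployment does not have minimum availability.",
            },
            {
                "type": "ReplicaFailure",
                "status": "True",
                "reason": "FailedCreate",
                "message": 'pods "machine-api-operator-5c99d74d58-" is forbidden: '
                "unable to validate against any security context constraint: []",
            },
            {
                "type": "Progressing",
                "status": "False",
                "reason": "ProgressDeadlineExceeded",
                "message": 'ReplicaSet "machine-api-operator-5c99d74d58" has timed out progressing.',
            },
        ]
    }
}

for ctype, reason, message in failing_conditions(sample):
    print(f"{ctype}: {reason}: {message}")
```

With the conditions from this report, all three are flagged; the FailedCreate message is the one that names the security context constraint failure.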


Many operators seem broken, including kube-apiserver:
Operator unavailable (StaticPods_ZeroNodesActive): StaticPodsAvailable: 0 nodes are active; 3 nodes are at revision 0; 0 nodes have achieved new revision 2

Unsure what the root cause is.

For reference, the initial 3 master machines are created by the installer and join the bootstrap cluster.  The machine-api has no control over the initial creation or configuration of the master machines.

Comment 1 David Eads 2020-09-23 19:18:15 UTC
I would like to see an [early] test that checks to see if we have the events and flake.  I think this may happen more often on Azure: something about the LB, perhaps?

Also, I think I see logs that indicate an i/o timeout from the CVO to the internal LB.

Comment 3 David Eads 2020-09-30 14:49:08 UTC

*** This bug has been marked as a duplicate of bug 1883458 ***

