Bug 1961925 - New ManagementCPUsOverride admission plugin blocks pod creation in clusters with no nodes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.8.0
Assignee: Artyom
QA Contact: Weinan Liu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-05-19 04:04 UTC by Seth Jennings
Modified: 2021-07-27 23:09 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-07-27 23:09:09 UTC
Target Upstream Version:

Links
System ID Private Priority Status Summary Last Updated
Github openshift kubernetes pull 756 0 None open Bug 1961925: UPSTREAM: <carry>: Does not prevent pod creation because of no nodes reason when it runs under the regular ... 2021-05-19 18:01:12 UTC
Red Hat Product Errata RHSA-2021:2438 0 None None None 2021-07-27 23:09:28 UTC

Description Seth Jennings 2021-05-19 04:04:04 UTC
Description of problem:
New ManagementCPUsOverride admission plugin blocks pod creation in clusters with no nodes

Version-Release number of selected component (if applicable):
4.8.0-fc.2 and later

How reproducible:
Always

Steps to Reproduce:
1. Create a Hypershift cluster with no nodes
2. Observe that pods cannot be created

Actual results:
FailedCreate        replicaset/network-operator-6d876489f9   Error creating: pods "network-operator-6d876489f9-" is forbidden: autoscaling.openshift.io/ManagementCPUsOverride the cluster does not have any nodes

Expected results:
The pods should be created and wait in Pending state

Additional info:
When a control plane is provided external to the nodes in the cluster, it is possible to create a cluster with a control plane but no nodes.  While such a cluster is not able to actually run pods, a user should be able to create pods such that they will be scheduled as soon as a Node becomes available.

With the new behavior, the Pod, typically created by a ReplicaSet, is rejected at admission time.  The ReplicaSet controller does not expect this rejection and, not understanding its cause, falls back to generic backoff.  When Nodes become available, it can be many minutes before the ReplicaSet controller attempts to create the Pods again, depending on how long the backoff has been running.
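To illustrate why the retry can lag by minutes, here is a minimal sketch of capped exponential backoff. The function name and the base and cap values are hypothetical and chosen only for illustration; they are not the ReplicaSet controller's actual settings.

```python
from typing import List


def backoff_delays(base_seconds: float, cap_seconds: float, failures: int) -> List[float]:
    """Return the wait before each retry under capped exponential backoff.

    The delay doubles after every failed attempt until it reaches the cap.
    """
    return [min(base_seconds * 2 ** i, cap_seconds) for i in range(failures)]


if __name__ == "__main__":
    # Hypothetical parameters (assumption): 1 s base delay, 5 min cap.
    delays = backoff_delays(base_seconds=1.0, cap_seconds=300.0, failures=12)
    for attempt, delay in enumerate(delays, start=1):
        print(f"after failure {attempt}: wait {delay:.0f}s")
```

Under these assumed values, a controller that has been rejected for a while waits the full cap before its next attempt, even if a Node joined just after the last rejection.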

With the old behavior, the Pods could be created and placed in a Pending (i.e. unscheduled) state.  The kube-scheduler watches for Nodes to be added and schedules the Pending pods as soon as Nodes become present.

This PR introduced the regression:
https://github.com/openshift/kubernetes/pull/632

Current thinking is that any topology that doesn't explicitly need this admission plugin should be exempt from its Pod-blocking admission policy.
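The proposed exemption could be sketched roughly as follows. This is not the plugin's actual code (the real plugin lives in openshift/kubernetes and is written in Go); `ClusterState`, `needs_management_cpus_override`, and `admit_pod` are hypothetical names used only to show the shape of the decision.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ADMIT = "admit"
    REJECT = "reject"


@dataclass
class ClusterState:
    node_count: int
    # Hypothetical flag (assumption): True only for topologies that
    # explicitly need the ManagementCPUsOverride plugin.
    needs_management_cpus_override: bool


def admit_pod(state: ClusterState) -> Decision:
    """Decide whether a Pod passes the no-nodes check."""
    # Topologies that do not need the plugin are exempt, so the Pod is
    # created and simply waits in Pending until a Node appears.
    if not state.needs_management_cpus_override:
        return Decision.ADMIT
    # Only topologies that rely on the plugin keep the blocking check.
    if state.node_count == 0:
        return Decision.REJECT
    return Decision.ADMIT


if __name__ == "__main__":
    # A cluster with an external control plane and no nodes (assumed exempt).
    print(admit_pod(ClusterState(node_count=0, needs_management_cpus_override=False)))
```

With this shape, a cluster with no nodes that does not need the plugin admits the Pod instead of rejecting it, which restores the old Pending behavior for that case.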

Comment 6 errata-xmlrpc 2021-07-27 23:09:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438
