Bug 1998552
| Summary: | Enforce OpenShift's defined kubelet version skew policies | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Luis Sanchez <sanchezl> |
| Component: | kube-apiserver | Assignee: | Luis Sanchez <sanchezl> |
| Status: | CLOSED ERRATA | QA Contact: | Rahul Gangwar <rgangwar> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 4.9 | CC: | aos-bugs, mfojtik, rgangwar, wking, xxia |
| Target Milestone: | --- | ||
| Target Release: | 4.9.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-10-18 17:49:29 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 2001244 | ||
Description
Luis Sanchez
2021-08-27 14:45:09 UTC
One possible verification sketch:

1. Install $VERSION_1.
2. Pause the compute MachineConfigPool.
3. Update to $VERSION_2 -> $VERSION_3 -> 4.9.0-rc.1 (or another recent 4.9).
4. Check Upgradeable on the kube-apiserver ClusterOperator.

For example, installing 4.7.30, pausing the pool, updating to 4.8.11, and then updating to 4.9.0-rc.1 would give you a kubelet skew of 2, which for the odd 4.9 release is outside the acceptable skew of 0 or 1 [1], so it should get Upgradeable=False, reason=KubeletMinorVersionUnsupported, with a message like:

Unsupported kubelet minor versions on nodes $NODES are too far behind the target API server version ($4_9_API_SERVER_VERSION).

Getting at KubeletVersionUnknown might be harder; maybe the node folks have ideas on how you could set bogus information in the Node resource?

[1]: https://github.com/openshift/cluster-kube-apiserver-operator/pull/1199/files#diff-22001281e3b968448f2558fd87069f7dbe886ce349047d0270433e17ece4372aR37

The upgrade from 4.7 to 4.9 succeeded with the worker machine-config-pool paused, and the kube-apiserver operator reports the expected message for an odd OCP version:
'KubeletMinorVersionUpgradeable: Unsupported kubelet minor versions on nodes ip-10-0-134-126.us-east-2.compute.internal, ip-10-0-177-196.us-east-2.compute.internal, and ip-10-0-197-69.us-east-2.compute.internal are too far behind the target API server version (1.22.1).'
reason: KubeletMinorVersion_KubeletMinorVersionUnsupported
status: "False"
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.9.0-rc.1 True False 80m Cluster version is 4.9.0-rc.1
oc get node -A
NAME STATUS ROLES AGE VERSION
ip-10-0-131-50.us-east-2.compute.internal Ready master 5h1m v1.22.0-rc.0+75ee307
ip-10-0-134-126.us-east-2.compute.internal Ready worker 4h56m v1.20.0+9689d22
ip-10-0-170-97.us-east-2.compute.internal Ready master 5h2m v1.22.0-rc.0+75ee307
ip-10-0-177-196.us-east-2.compute.internal Ready worker 4h57m v1.20.0+9689d22
ip-10-0-197-69.us-east-2.compute.internal Ready worker 4h56m v1.20.0+9689d22
ip-10-0-216-152.us-east-2.compute.internal Ready master 5h2m v1.22.0-rc.0+75ee307
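As a rough cross-check of the skew in the node listing above, here is a minimal Python sketch that parses the kubelet versions and compares their minor numbers against the API server version (1.22.1). The helper names `minor` and `too_far_behind` and the `max_skew=1` default are illustrative assumptions, based only on the 0-or-1 skew mentioned for the odd 4.9 release; this is not the operator's actual implementation.

```python
import re


def minor(version: str) -> int:
    """Return the minor number of a version like 'v1.20.0+9689d22' or '1.22.1'."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unparseable version: {version!r}")
    return int(match.group(2))


def too_far_behind(kubelets: dict, api_server: str, max_skew: int = 1) -> list:
    """List nodes whose kubelet minor version trails the API server by more than max_skew.

    max_skew=1 is an assumption taken from the 0-or-1 skew quoted for odd releases.
    """
    api_minor = minor(api_server)
    return sorted(
        node for node, version in kubelets.items()
        if api_minor - minor(version) > max_skew
    )


# Kubelet versions copied from the `oc get node` output above.
nodes = {
    "ip-10-0-131-50.us-east-2.compute.internal": "v1.22.0-rc.0+75ee307",
    "ip-10-0-134-126.us-east-2.compute.internal": "v1.20.0+9689d22",
    "ip-10-0-170-97.us-east-2.compute.internal": "v1.22.0-rc.0+75ee307",
    "ip-10-0-177-196.us-east-2.compute.internal": "v1.20.0+9689d22",
    "ip-10-0-197-69.us-east-2.compute.internal": "v1.20.0+9689d22",
    "ip-10-0-216-152.us-east-2.compute.internal": "v1.22.0-rc.0+75ee307",
}

for node in too_far_behind(nodes, "1.22.1"):
    print(node)
```

With the listing above, this flags the three worker nodes (kubelet 1.20 against API server 1.22, a skew of 2), which matches the nodes named in the Upgradeable message.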
oc get co kube-apiserver -o yaml
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  annotations:
    exclude.release.openshift.io/internal-openshift-hosted: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
  creationTimestamp: "2021-09-17T10:58:58Z"
  generation: 1
  managedFields:
  - apiVersion: config.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:exclude.release.openshift.io/internal-openshift-hosted: {}
          f:include.release.openshift.io/self-managed-high-availability: {}
          f:include.release.openshift.io/single-node-developer: {}
      f:spec: {}
      f:status:
        .: {}
        f:extension: {}
    manager: cluster-version-operator
    operation: Update
    time: "2021-09-17T10:58:58Z"
  - apiVersion: config.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:relatedObjects: {}
    manager: cluster-kube-apiserver-operator
    operation: Update
    time: "2021-09-17T11:54:03Z"
  - apiVersion: config.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:conditions: {}
        f:versions: {}
    manager: cluster-kube-apiserver-operator
    operation: Update
    subresource: status
    time: "2021-09-17T13:46:53Z"
  name: kube-apiserver
  resourceVersion: "135703"
  uid: ac30d522-6486-44be-97fc-aa77f283ae12
spec: {}
status:
  conditions:
  - lastTransitionTime: "2021-09-17T14:48:44Z"
    message: 'NodeControllerDegraded: All master nodes are ready'
    reason: AsExpected
    status: "False"
    type: Degraded
  - lastTransitionTime: "2021-09-17T13:46:53Z"
    message: 'NodeInstallerProgressing: 3 nodes are at revision 10'
    reason: AsExpected
    status: "False"
    type: Progressing
  - lastTransitionTime: "2021-09-17T11:10:17Z"
    message: 'StaticPodsAvailable: 3 nodes are active; 3 nodes are at revision 10'
    reason: AsExpected
    status: "True"
    type: Available
  - lastTransitionTime: "2021-09-17T13:27:45Z"
    message: 'KubeletMinorVersionUpgradeable: Unsupported kubelet minor versions on nodes ip-10-0-134-126.us-east-2.compute.internal, ip-10-0-177-196.us-east-2.compute.internal, and ip-10-0-197-69.us-east-2.compute.internal are too far behind the target API server version (1.22.1).'
    reason: KubeletMinorVersion_KubeletMinorVersionUnsupported
    status: "False"
    type: Upgradeable
  extension: null
  relatedObjects:
  - group: operator.openshift.io
    name: cluster
    resource: kubeapiservers
  - group: apiextensions.k8s.io
    name: ""
    resource: customresourcedefinitions
  - group: security.openshift.io
    name: ""
    resource: securitycontextconstraints
  - group: ""
    name: openshift-config
    resource: namespaces
  - group: ""
    name: openshift-config-managed
    resource: namespaces
  - group: ""
    name: openshift-kube-apiserver-operator
    resource: namespaces
  - group: ""
    name: openshift-kube-apiserver
    resource: namespaces
  - group: admissionregistration.k8s.io
    name: ""
    resource: mutatingwebhookconfigurations
  - group: admissionregistration.k8s.io
    name: ""
    resource: validatingwebhookconfigurations
  - group: controlplane.operator.openshift.io
    name: ""
    namespace: openshift-kube-apiserver
    resource: podnetworkconnectivitychecks
  - group: apiserver.openshift.io
    name: ""
    resource: apirequestcounts
  versions:
  - name: raw-internal
    version: 4.9.0-rc.1
  - name: kube-apiserver
    version: 1.22.1
  - name: operator
    version: 4.9.0-rc.1
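To check the Upgradeable condition from a script rather than by reading the full dump, something like the following Python sketch could be used. It assumes the ClusterOperator has been exported as JSON (for example with `oc get co kube-apiserver -o json`) and loaded into a dict; `find_condition` is a hypothetical helper, and the sample data is trimmed from the output above.

```python
import json


def find_condition(cluster_operator: dict, cond_type: str):
    """Return the status condition with the given type, or None if it is absent."""
    for cond in cluster_operator.get("status", {}).get("conditions", []):
        if cond.get("type") == cond_type:
            return cond
    return None


# Trimmed sample built from the ClusterOperator output above (illustration only).
sample = json.loads("""
{
  "kind": "ClusterOperator",
  "metadata": {"name": "kube-apiserver"},
  "status": {
    "conditions": [
      {"lastTransitionTime": "2021-09-17T11:10:17Z",
       "reason": "AsExpected", "status": "True", "type": "Available"},
      {"lastTransitionTime": "2021-09-17T13:27:45Z",
       "reason": "KubeletMinorVersion_KubeletMinorVersionUnsupported",
       "status": "False", "type": "Upgradeable"}
    ]
  }
}
""")

upgradeable = find_condition(sample, "Upgradeable")
print(upgradeable["status"], upgradeable["reason"])
```

Returning `None` for a missing condition keeps the caller responsible for deciding whether an absent Upgradeable condition should be treated as a failure.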
After unpausing the worker machine-config-pool, the kubelet skew drops from 2 to 0 and the operator gradually returns to Upgradeable=True.
NAME STATUS ROLES AGE VERSION
ip-10-0-131-50.us-east-2.compute.internal Ready master 5h20m v1.22.0-rc.0+75ee307
ip-10-0-134-126.us-east-2.compute.internal Ready worker 5h15m v1.22.0-rc.0+75ee307
ip-10-0-170-97.us-east-2.compute.internal Ready master 5h21m v1.22.0-rc.0+75ee307
ip-10-0-177-196.us-east-2.compute.internal Ready worker 5h15m v1.22.0-rc.0+75ee307
ip-10-0-197-69.us-east-2.compute.internal Ready worker 5h15m v1.22.0-rc.0+75ee307
ip-10-0-216-152.us-east-2.compute.internal Ready master 5h21m v1.22.0-rc.0+75ee307
oc get co kube-apiserver -o yaml
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  annotations:
    exclude.release.openshift.io/internal-openshift-hosted: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
  creationTimestamp: "2021-09-17T10:58:58Z"
  generation: 1
  managedFields:
  - apiVersion: config.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:exclude.release.openshift.io/internal-openshift-hosted: {}
          f:include.release.openshift.io/self-managed-high-availability: {}
          f:include.release.openshift.io/single-node-developer: {}
      f:spec: {}
      f:status:
        .: {}
        f:extension: {}
    manager: cluster-version-operator
    operation: Update
    time: "2021-09-17T10:58:58Z"
  - apiVersion: config.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:relatedObjects: {}
    manager: cluster-kube-apiserver-operator
    operation: Update
    time: "2021-09-17T11:54:03Z"
  - apiVersion: config.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:conditions: {}
        f:versions: {}
    manager: cluster-kube-apiserver-operator
    operation: Update
    subresource: status
    time: "2021-09-17T13:46:53Z"
  name: kube-apiserver
  resourceVersion: "165739"
  uid: ac30d522-6486-44be-97fc-aa77f283ae12
spec: {}
status:
  conditions:
  - lastTransitionTime: "2021-09-17T14:48:44Z"
    message: 'NodeControllerDegraded: All master nodes are ready'
    reason: AsExpected
    status: "False"
    type: Degraded
  - lastTransitionTime: "2021-09-17T13:46:53Z"
    message: 'NodeInstallerProgressing: 3 nodes are at revision 10'
    reason: AsExpected
    status: "False"
    type: Progressing
  - lastTransitionTime: "2021-09-17T11:10:17Z"
    message: 'StaticPodsAvailable: 3 nodes are active; 3 nodes are at revision 10'
    reason: AsExpected
    status: "True"
    type: Available
  - lastTransitionTime: "2021-09-17T16:25:24Z"
    message: 'KubeletMinorVersionUpgradeable: Kubelet and API server minor versions are synced.'
    reason: AsExpected
    status: "True"
    type: Upgradeable
  extension: null
  relatedObjects:
  - group: operator.openshift.io
    name: cluster
    resource: kubeapiservers
  - group: apiextensions.k8s.io
    name: ""
    resource: customresourcedefinitions
  - group: security.openshift.io
    name: ""
    resource: securitycontextconstraints
  - group: ""
    name: openshift-config
    resource: namespaces
  - group: ""
    name: openshift-config-managed
    resource: namespaces
  - group: ""
    name: openshift-kube-apiserver-operator
    resource: namespaces
  - group: ""
    name: openshift-kube-apiserver
    resource: namespaces
  - group: admissionregistration.k8s.io
    name: ""
    resource: mutatingwebhookconfigurations
  - group: admissionregistration.k8s.io
    name: ""
    resource: validatingwebhookconfigurations
  - group: controlplane.operator.openshift.io
    name: ""
    namespace: openshift-kube-apiserver
    resource: podnetworkconnectivitychecks
  - group: apiserver.openshift.io
    name: ""
    resource: apirequestcounts
  versions:
  - name: raw-internal
    version: 4.9.0-rc.1
  - name: kube-apiserver
    version: 1.22.1
  - name: operator
    version: 4.9.0-rc.1
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759