Bug 1908145 - kube-scheduler-recovery-controller container crash loop when router pod is co-scheduled
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-scheduler
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 4.7.0
Assignee: Mike Dame
QA Contact: RamaKasturi
Duplicates: 1910417
Depends On:
Reported: 2020-12-15 22:48 UTC by Seth Jennings
Modified: 2021-02-24 15:45 UTC
4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
[sig-arch] Managed cluster should have no crashlooping pods in core namespaces over four minutes [Suite:openshift/conformance/parallel]
Last Closed: 2021-02-24 15:44:54 UTC
Target Upstream Version:

Attachments

System ID Private Priority Status Summary Last Updated
Github openshift cluster-kube-scheduler-operator pull 311 0 None closed Bug 1908145: Change recovery-controller port to avoid conflicts 2021-02-19 05:23:11 UTC
Github openshift enhancements pull 569 0 None closed Bug 1908145: Add workloads localhost ports for recovery-controllers 2021-02-19 05:23:11 UTC
Red Hat Product Errata RHSA-2020:5633 0 None None None 2021-02-24 15:45:16 UTC

Description Seth Jennings 2020-12-15 22:48:42 UTC
Description of problem:

On installs where the masters are schedulable and the ingress router can be scheduled on a master, the router already binds port 10443, so the new kube-scheduler-recovery-controller container, which waits for that port to be free, crash loops.


      name: kube-scheduler-recovery-controller
        - /bin/bash
        - '-euxo'
        - pipefail
        - '-c'
        - >
          timeout 3m /bin/bash -exuo pipefail -c 'while [ -n "$(ss -Htanop \(
          sport = 10443 \))" ]; do sleep 1; done'

          exec cluster-kube-scheduler-operator cert-recovery-controller
          --namespace=${POD_NAMESPACE} --listen= -v=2
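The wait loop above can never succeed while the router holds the port. A minimal shell sketch of the failure mode (the always-true stub stands in for the real `ss -Htanop \( sport = 10443 \)` check, and a short timeout replaces the container's `timeout 3m`):

```shell
#!/bin/bash
# Stub standing in for: [ -n "$(ss -Htanop \( sport = 10443 \))" ]
# i.e. a co-scheduled router pod that never releases the port.
# GNU timeout kills the loop when the deadline passes and exits 124.
timeout 2 bash -c 'port_busy() { true; }; while port_busy; do sleep 1; done'
status=$?
echo "wait loop exit status: $status"   # 124 = the port never freed in time
```

Because the container runs bash with `-e`, that nonzero status terminates it before the `exec` line is ever reached, and kubelet restarts it, producing the CrashLoopBackOff shown below.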

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Install 3-node cluster, make masters schedulable (i.e. masters have both master and worker roles).

Actual results:
$ oc get pod | grep kube-sche
openshift-kube-scheduler-master-0.ocp-dev.variantweb.net   2/3     CrashLoopBackOff   20         140m
openshift-kube-scheduler-master-1.ocp-dev.variantweb.net   3/3     Running            1          142m
openshift-kube-scheduler-master-2.ocp-dev.variantweb.net   2/3     CrashLoopBackOff   21         146m

$ oc get pod -n openshift-ingress -owide
NAME                              READY   STATUS    RESTARTS   AGE    IP             NODE                              NOMINATED NODE   READINESS GATES
router-default-86dcd458d8-7vbj2   1/1     Running   0          152m   master-0.ocp-dev.variantweb.net   <none>           <none>
router-default-86dcd458d8-vvxgb   1/1     Running   0          152m   master-2.ocp-dev.variantweb.net   <none>           <none>

Expected results:

kube-scheduler pod should be able to run successfully on the same node as the router

Additional info:

Comment 1 W. Trevor King 2020-12-15 22:58:09 UTC
Both components should probably also register their ports in [1] or somewhere in that doc.

[1]: https://github.com/openshift/enhancements/blob/5f2529a2a02a73aad17620d643e89eed189f14e3/enhancements/network/host-port-registry.md#localhost-only

Comment 2 Maciej Szulik 2020-12-16 13:00:55 UTC
Mike, we'll probably need to pick a different port for recovery controller, sync with Tomas if in doubt. 

I'm marking this a blocker+ since this is affecting the stability of the product when we're running 
in a schedulable masters configuration.
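When picking a replacement port, a quick local pre-check can be sketched with bash's `/dev/tcp` redirection (an illustration only, chosen to avoid a dependency on `ss`; 11443 is used here as the candidate value):

```shell
#!/bin/bash
# Hypothetical pre-check: a refused connect on 127.0.0.1 suggests nothing
# on this node is listening on the candidate port.
candidate=11443
if (exec 3<>"/dev/tcp/127.0.0.1/${candidate}") 2>/dev/null; then
  exec 3>&-
  echo "port ${candidate} is in use"
else
  echo "port ${candidate} appears free"
fi
```

This only checks the node it runs on; registering the port in the host-port registry (comment 1) is still what prevents conflicts with components that may be scheduled there later.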

Comment 3 Seth Jennings 2020-12-16 15:22:06 UTC
FYI: the port used by the router has been updated in the port registry.

Comment 4 Mike Dame 2020-12-17 18:08:54 UTC
I opened 2 PRs: 
- https://github.com/openshift/cluster-kube-scheduler-operator/pull/311, to change the port in kube-scheduler to 11443 (just a guess, need to confirm this value works)
- https://github.com/openshift/enhancements/pull/569, to add that, and the kube-controller-manager port for the same controller, to the registry

Comment 5 W. Trevor King 2020-12-23 21:41:14 UTC
*** Bug 1910417 has been marked as a duplicate of this bug. ***

Comment 6 W. Trevor King 2020-12-23 21:46:46 UTC
Dropping a reference to at least one of the e2e test-cases this kills (for compact clusters), to make this issue more discoverable in Sippy.

Comment 7 RamaKasturi 2021-01-06 07:07:00 UTC
A similar issue was hit when performing an upgrade from 4.2 to a 4.7 nightly build; per the discussion with dev, the PR here should fix the issue.

Comment 9 RamaKasturi 2021-01-11 18:00:44 UTC
Verified with the latest build below: I see that the port has been changed from 10443 to 11443. Also tried an upgrade from 4.2 to 4.7, which failed before the fix; it now passes with the latest 4.7 build.

[knarra@knarra openshift-client-linux-4.7.0-0.nightly-2021-01-10-070949]$ ./oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2021-01-10-070949   True        False         6h7m    Cluster version is 4.7.0-0.nightly-2021-01-10-070949

Post action: # oc get co
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
01-11 20:54:42  authentication                             4.7.0-0.nightly-2021-01-10-070949   True        False         False      3m42s
01-11 20:54:42  baremetal                                  4.7.0-0.nightly-2021-01-10-070949   True        False         False      25m
01-11 20:54:42  cloud-credential                           4.7.0-0.nightly-2021-01-10-070949   True        False         False      4h19m
01-11 20:54:42  cluster-autoscaler                         4.7.0-0.nightly-2021-01-10-070949   True        False         False      4h11m
01-11 20:54:42  config-operator                            4.7.0-0.nightly-2021-01-10-070949   True        False         False      138m
01-11 20:54:42  console                                    4.7.0-0.nightly-2021-01-10-070949   True        False         False      23m

port before the fix:
      name: cert-dir
  - args:
    - |
      timeout 3m /bin/bash -exuo pipefail -c 'while [ -n "$(ss -Htanop \( sport = 10443 \))" ]; do sleep 1; done'

      exec cluster-kube-scheduler-operator cert-recovery-controller --kubeconfig=/etc/kubernetes/static-pod-resources/configmaps/kube-scheduler-cert-syncer-kubeconfig/kubeconfig  --namespace=${POD_NAMESPACE} --listen= -v=2

port after the fix:
- args:
    - |
      timeout 3m /bin/bash -exuo pipefail -c 'while [ -n "$(ss -Htanop \( sport = 11443 \))" ]; do sleep 1; done'

      exec cluster-kube-scheduler-operator cert-recovery-controller --kubeconfig=/etc/kubernetes/static-pod-resources/configmaps/kube-scheduler-cert-syncer-kubeconfig/kubeconfig  --namespace=${POD_NAMESPACE} --listen= -v=2
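The before/after diff can be spot-checked by extracting the watched port from the rendered pod spec. A sketch (the inline sample line stands in for live output; the idea of piping `oc -n openshift-kube-scheduler get pod <name> -o yaml` into it is an assumption for illustration):

```shell
#!/bin/bash
# Sample line as it appears in the fixed pod spec; against a live cluster one
# could pipe `oc -n openshift-kube-scheduler get pod <name> -o yaml` instead.
manifest='timeout 3m /bin/bash -exuo pipefail -c '\''while [ -n "$(ss -Htanop \( sport = 11443 \))" ]; do sleep 1; done'\'''
# Extract the port number the wait loop watches.
port=$(grep -o 'sport = [0-9]*' <<<"$manifest" | awk '{print $3}')
echo "recovery-controller waits on port: ${port}"   # 11443 after the fix
```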

Comment 10 RamaKasturi 2021-01-12 10:44:56 UTC
Based on comment 9 moving the bug to verified state.

Comment 11 RamaKasturi 2021-01-12 10:47:34 UTC
(In reply to RamaKasturi from comment #10)
> Based on comment 9 moving the bug to verified state. Also tried both UPI & IPI installs where node has both master & worker role but could not reproduce the crash.

Comment 14 errata-xmlrpc 2021-02-24 15:44:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

