Bug 1926579

Summary: Setting .spec.policy is deprecated and will be removed eventually. Please use .spec.profile instead is being logged every 3 seconds in scheduler operator log
Product: OpenShift Container Platform Reporter: RamaKasturi <knarra>
Component: kube-schedulerAssignee: Mike Dame <mdame>
Status: CLOSED ERRATA QA Contact: RamaKasturi <knarra>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.7CC: aos-bugs, mfojtik
Target Milestone: ---   
Target Release: 4.8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-07-27 22:42:29 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description RamaKasturi 2021-02-09 05:58:53 UTC
Description of problem:
When spec.policy is set i see that deprecation message about this is logged every 3 seconds in scheduler operator logs even when the cluster is idle

Version-Release number of selected component (if applicable):
4.7.0-0.nightly-2021-02-06-084550

How reproducible:
Always

Steps to Reproduce:
Create policy.cfg with the contents below
Cat policy.cfg
{
"kind" : "Policy",
"apiVersion" : "v1",
"predicates" : [
{"name" : "HostName"}
],
"priorities" : [ 
]
}

Run the command to create configmap “oc create configmap scheduler-policy --from-file=policy.cfg=/tmp/policy.cfg -n openshift-config”
Run the command to patch scheduler cluster “oc patch Scheduler cluster --type='merge' -p '{"spec":{"policy":{"name":"scheduler-policy"}}}' --type=merge”
Wait for kube-scheduler pods to restart
Now check the scheduler operator logs

Actual Results:
We see that deprecated message about spec.policy is being logged every 3 seconds in the logs.

Expected Results:
Message should not be logged every 3 seconds i.e frequency should be lesser.

Additional Info:
snippet of logs: http://pastebin.test.redhat.com/938160

Comment 1 Maciej Szulik 2021-02-09 09:43:41 UTC
Mike, maybe we could suppress that message somehow and not print it that frequently.
It should definitely be back-ported to 4.7

Comment 2 Mike Dame 2021-02-09 16:42:39 UTC
Yeah, the message should only print once (when the scheduler config CRD is updated). It seems like there is some hotloop in the targetconfigreconciler's syncing

For reference, the message is coming from https://github.com/openshift/cluster-kube-scheduler-operator/blob/bf94794/pkg/operator/targetconfigcontroller/targetconfigcontroller.go#L294

Comment 3 Mike Dame 2021-02-10 20:05:27 UTC
I added some debugging to this, by printing the updates for each event that triggers the eventHandler sync, and I see updates every ~2s to the kube-scheduler configmap updating `renewTime`. Is this normal? It appears to be what's triggering the constant re-syncs so far

```
Old object:
&ConfigMap{ObjectMeta:{kube-scheduler  openshift-kube-scheduler /api/v1/namespaces/openshift-kube-scheduler/configmaps/kube-scheduler 9cf42b9c-4c91-44b0-9941-a591a0f8ef00 481420 0 2021-02-09 18:37:15 +0000 UTC <nil> <nil> map[] map[control-plane.alpha.kubernetes.io/leader:{"holderIdentity":"ip-10-0-156-77_0a901d62-797e-4a41-9082-e8b062f9e614","leaseDurationSeconds":15,"acquireTime":"2021-02-10T19:40:55Z","renewTime":"2021-02-10T20:01:17Z","leaderTransitions":16}] [] []  [{kube-scheduler Update v1 2021-02-09 18:37:15 +0000 UTC FieldsV1 {"f:metadata":{"f:annotations":{".":{},"f:control-plane.alpha.kubernetes.io/leader":{}}}}}]},Data:map[string]string{},BinaryData:map[string][]byte{},Immutable:nil,}

New object:
&ConfigMap{ObjectMeta:{kube-scheduler  openshift-kube-scheduler /api/v1/namespaces/openshift-kube-scheduler/configmaps/kube-scheduler 9cf42b9c-4c91-44b0-9941-a591a0f8ef00 481424 0 2021-02-09 18:37:15 +0000 UTC <nil> <nil> map[] map[control-plane.alpha.kubernetes.io/leader:{"holderIdentity":"ip-10-0-156-77_0a901d62-797e-4a41-9082-e8b062f9e614","leaseDurationSeconds":15,"acquireTime":"2021-02-10T19:40:55Z","renewTime":"2021-02-10T20:01:19Z","leaderTransitions":16}] [] []  [{kube-scheduler Update v1 2021-02-09 18:37:15 +0000 UTC FieldsV1 {"f:metadata":{"f:annotations":{".":{},"f:control-plane.alpha.kubernetes.io/leader":{}}}}}]},Data:map[string]string{},BinaryData:map[string][]byte{},Immutable:nil,}
```

Diff is "renewTime":"2021-02-10T20:01:17Z" and "renewTime":"2021-02-10T20:01:19Z"

```
$ oc get -o yaml cm/kube-scheduler -n openshift-kube-scheduler
apiVersion: v1
kind: ConfigMap
metadata:
  annotations:
    control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"ip-10-0-156-77_0a901d62-797e-4a41-9082-e8b062f9e614","leaseDurationSeconds":15,"acquireTime":"2021-02-10T19:40:55Z","renewTime":"2021-02-10T20:02:01Z","leaderTransitions":16}'
  creationTimestamp: "2021-02-09T18:37:15Z"
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:control-plane.alpha.kubernetes.io/leader: {}
    manager: kube-scheduler
    operation: Update
    time: "2021-02-09T18:37:15Z"
  name: kube-scheduler
  namespace: openshift-kube-scheduler
  resourceVersion: "481640"
  selfLink: /api/v1/namespaces/openshift-kube-scheduler/configmaps/kube-scheduler
  uid: 9cf42b9c-4c91-44b0-9941-a591a0f8ef00
```

Comment 4 Mike Dame 2021-02-11 20:11:07 UTC
I opened https://github.com/openshift/cluster-kube-scheduler-operator/pull/330 to move this log line to a more controlled location.

This isn't really a hot loop, the leader-election configmap does frequently update. This is normally harmless, though the events do trigger no-op resyncs. I explained this in the PR, but I think adding logic to handle every edge case event which doesn't really require a re-sync isn't viable compared to the cost of the no-ops. I am open to any feedback on this though

Comment 6 RamaKasturi 2021-02-15 10:24:59 UTC
Did not see any nightly build today as well, but verified the bug in the ci payload and i do not see the spec.policy being deprecated message seen for every 3 seconds. In the span of two hours the message has been printed only once.

[knarra@knarra openshift-client-linux-4.8.0-0.ci-2021-02-14-191057]$ oc logs openshift-kube-scheduler-operator-67689944bf-f45bk -n openshift-kube-scheduler-operator | grep "deprecated"
W0215 08:49:17.834694       1 targetconfigcontroller.go:380] Setting .spec.policy is deprecated and will be removed eventually. Please use .spec.profile instead.

will move the bug to verified once a nightly accepted build is available.

Comment 7 RamaKasturi 2021-02-22 15:15:20 UTC
Verified bug with the payload below and i see that the deprecation message is seen only once and does not get logged for every 3 seconds.

[knarra@knarra openshift-client-linux-4.8.0-0.nightly-2021-02-21-102854]$ ./oc logs openshift-kube-scheduler-operator-5b84469f46-2p6j2 -n openshift-kube-scheduler-operator | grep ".spec.policy"
W0222 12:32:58.121016       1 targetconfigcontroller.go:406] Setting .spec.policy is deprecated and will be removed eventually. Please use .spec.profile instead.
[knarra@knarra openshift-client-linux-4.8.0-0.nightly-2021-02-21-102854]$ date -u
Mon Feb 22 15:13:34 UTC 2021


Based on the above moving bug to verified state.

Comment 10 errata-xmlrpc 2021-07-27 22:42:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438