1685439 – can not find policy-configmap in namespace openshift-config when modify scheduler policy

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Node
Sub Component:
Version:	4.1.0
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	4.1.0
Assignee:	Seth Jennings
QA Contact:	Weinan Liu
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	1688674
Depends On:
Blocks:

Reported:	2019-03-05 08:29 UTC by MinLi
Modified:	2019-06-04 10:45 UTC
CC List:	6 users
Fixed In Version:
Doc Type:	No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed:	2019-06-04 10:45:04 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2019:0758	0	None	None	None	2019-06-04 10:45:12 UTC

Description MinLi 2019-03-05 08:29:35 UTC

Description of problem:
When modifying the scheduler policy, the configmap "policy-configmap" cannot be found in the namespace openshift-config.

Refer to the documentation: https://docs.openshift.com/container-platform/4.0/nodes/scheduling/nodes-scheduler-default.html#nodes-scheduler-default-modifying_nodes-scheduler-default

Version-Release number of selected component (if applicable):
[root@localhost lyman]# oc get clusterversion 
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE     STATUS
version   4.0.0-0.nightly-2019-03-04-234414   True        False         3h4m      Cluster version is 4.0.0-0.nightly-2019-03-04-234414

[core@ip-10-0-162-11 ~]$ oc version              
oc v4.0.0-0.182.0
kubernetes v1.12.4+4dd65df23d
features: Basic-Auth GSSAPI Kerberos SPNEGO


How reproducible:
Always

Steps to Reproduce:
1. oc get cm -n openshift-config

Actual results:
1. No configmap named "policy-configmap" is listed.

Expected results:
1. A configmap named "policy-configmap" is listed.

Additional info:

Comment 1 Seth Jennings 2019-03-05 14:26:30 UTC

MinLi,

The configmap is not created by default.  The cluster admin has to create it.

However, this is being reworked atm.  I'm going to be working on it today.

https://github.com/openshift/cluster-kube-scheduler-operator/pull/70

The new way is that the cluster admin will create a configmap _and_ set the `policy` field of the Scheduler global config resource to point to that configmap so we aren't hardcoding the configmap name.
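
As a rough illustration of how the two objects relate, here is a minimal Python sketch that builds both manifests as plain dictionaries. The namespace, the `policy.cfg` key, and the Scheduler fields are taken from the commands and manifests posted later in this bug; the function name `build_policy_objects` is hypothetical, and this is not the operator's own code.

```python
import json


def build_policy_objects(configmap_name, policy):
    """Return (ConfigMap, Scheduler) manifests as plain dicts (illustrative only)."""
    configmap = {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": configmap_name, "namespace": "openshift-config"},
        # The policy is stored as a JSON string under the "policy.cfg" key.
        "data": {"policy.cfg": json.dumps(policy, indent=2)},
    }
    scheduler = {
        "apiVersion": "config.openshift.io/v1",
        "kind": "Scheduler",
        "metadata": {"name": "cluster"},
        # The Scheduler resource points at the configmap by name,
        # so the configmap name is not hardcoded anywhere else.
        "spec": {"policy": {"name": configmap_name}},
    }
    return configmap, scheduler
```

The only link between the two objects is the configmap name, which is why renaming the configmap also requires updating `spec.policy.name`.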

I'll let you know when this is done.

Comment 3 MinLi 2019-03-14 08:24:54 UTC

@Seth Jennings,

When I create a configmap and a Scheduler resource for the policy, how can I confirm that it takes effect?
Can I see it in the scheduler pod log? Do you have a detailed specification?
I opened a bug to track one specific policy: https://bugzilla.redhat.com/show_bug.cgi?id=1688674

Comment 5 ravig 2019-03-18 21:23:26 UTC

@MinLi,

You can check the scheduler log; the startup section lists the predicates and priorities that are enabled. For example, it might look like this:

 Creating scheduler with fit predicates 'map[GeneralPredicates:{} CheckNodeUnschedulable:{} NoVolumeZoneConflict:{} MaxCSIVolumeCountPred:{} MatchInterPodAffinity:{} NoDiskConflict:{} MaxGCEPDVolumeCount:{} MaxAzureDiskVolumeCount:{} MaxEBSVolumeCount:{} PodToleratesNodeTaints:{} CheckVolumeBinding:{}]' and priority functions 'map[SelectorSpreadPriority:{} InterPodAffinityPriority:{} LeastRequestedPriority:{} BalancedResourceAllocation:{} NodePreferAvoidPodsPriority:{} NodeAffinityPriority:{} TaintTolerationPriority:{} ImageLocalityPriority:{}]'

You can tell that the configmap is being used if the section above lists the same set of predicates and priorities that you specified in policy.cfg.
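
The comparison can also be scripted. The following is a minimal Python sketch, assuming the log line has the format quoted above and that policy.cfg is the JSON policy file; the names `names_from_log` and `compare_policy_with_log` are hypothetical helpers, not part of any OpenShift tooling.

```python
import json
import re

# Matches the startup summary line printed by the scheduler, as quoted above.
SUMMARY = re.compile(
    r"fit predicates 'map\[(?P<predicates>[^\]]*)\]' "
    r"and priority functions 'map\[(?P<priorities>[^\]]*)\]'"
)


def names_from_log(line):
    """Extract the predicate and priority names from one scheduler log line."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError("line does not contain the predicate/priority summary")

    def names(group):
        # Entries look like "NoDiskConflict:{}"; keep the part before the colon.
        return {entry.split(":", 1)[0] for entry in match.group(group).split()}

    return names("predicates"), names("priorities")


def compare_policy_with_log(policy_text, line):
    """Report which policy entries are missing from, or extra in, the log line."""
    policy = json.loads(policy_text)
    want_predicates = {p["name"] for p in policy.get("predicates", [])}
    want_priorities = {p["name"] for p in policy.get("priorities", [])}
    got_predicates, got_priorities = names_from_log(line)
    return {
        "missing_predicates": want_predicates - got_predicates,
        "extra_predicates": got_predicates - want_predicates,
        "missing_priorities": want_priorities - got_priorities,
        "extra_priorities": got_priorities - want_priorities,
    }
```

Extra predicates are not necessarily a failure: the scheduler log in this bug warns that PodToleratesNodeTaints is mandatory when TaintNodesByCondition is enabled, so it may appear even if the policy omits it. Missing entries, or a long list of extras, suggest the policy was not picked up.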

Comment 6 MinLi 2019-03-19 07:33:02 UTC

@ravig, from what I can see, the scheduler policy was not updated.

Version info:
4.0.0-0.nightly-2019-03-18-200009

Steps:
1. Create a configmap:
# oc create configmap -n openshift-config --from-file=policy.cfg mypolicy
policy.cfg looks like this:
{
"kind" : "Policy",
"apiVersion" : "v1",
"predicates" : [
        {"name" : "NoDiskConflict"},
        {"name" : "NoVolumeZoneConflict"}
        ],
"priorities" : [
        {"name" : "ImageLocalityPriority", "weight" : 100}
        ]
}

2. Create a Scheduler resource named "cluster":
# oc create -f scheduler.yaml
scheduler.yaml looks like this:
apiVersion: config.openshift.io/v1
kind: Scheduler
metadata:
  name: cluster
spec:
  policy:
    name: mypolicy

3. Check the scheduler pod logs in the namespace "openshift-kube-scheduler" (the log only shows "Creating scheduler from algorithm provider 'DefaultProvider'" and does not change after I create the new policy):
# oc logs openshift-kube-scheduler-ip-172-31-143-245.us-east-2.compute.internal -n openshift-kube-scheduler
...
I0319 06:05:11.338618       1 flags.go:33] FLAG: --use-legacy-policy-config="false"
I0319 06:05:11.338626       1 flags.go:33] FLAG: --v="2"
I0319 06:05:11.338633       1 flags.go:33] FLAG: --version="false"
I0319 06:05:11.338644       1 flags.go:33] FLAG: --vmodule=""
I0319 06:05:11.338653       1 flags.go:33] FLAG: --write-config-to=""
I0319 06:05:11.346412       1 server.go:128] Version: v1.12.4+5ba7aff
W0319 06:05:11.346458       1 defaults.go:217] TaintNodesByCondition is enabled, PodToleratesNodeTaints predicate is mandatory
I0319 06:05:11.346709       1 factory.go:1107] Creating scheduler from algorithm provider 'DefaultProvider'
I0319 06:05:11.346731       1 factory.go:1207] Creating scheduler with fit predicates 'map[MaxEBSVolumeCount:{} NoDiskConflict:{} NoVolumeZoneConflict:{} MatchInterPodAffinity:{} GeneralPredicates:{} CheckNodeUnschedulable:{} MaxGCEPDVolumeCount:{} PodToleratesNodeTaints:{} CheckVolumeBinding:{} MaxAzureDiskVolumeCount:{} MaxCSIVolumeCountPred:{}]' and priority functions 'map[InterPodAffinityPriority:{} LeastRequestedPriority:{} BalancedResourceAllocation:{} NodePreferAvoidPodsPriority:{} NodeAffinityPriority:{} TaintTolerationPriority:{} ImageLocalityPriority:{} SelectorSpreadPriority:{}]'
W0319 06:05:11.347419       1 authorization.go:47] Authorization is disabled
W0319 06:05:11.347436       1 authentication.go:55] Authentication is disabled
I0319 06:05:11.347445       1 deprecated_insecure_serving.go:48] Serving healthz insecurely on [::]:10251
I0319 06:05:12.249178       1 controller_utils.go:1027] Waiting for caches to sync for scheduler controller
I0319 06:05:12.349416       1 controller_utils.go:1034] Caches are synced for scheduler controller
I0319 06:05:12.349468       1 leaderelection.go:205] attempting to acquire leader lease  kube-system/kube-scheduler...
E0319 06:06:11.357339       1 reflector.go:237] k8s.io/kubernetes/cmd/kube-scheduler/app/server.go:178: Failed to watch *v1.Pod: Get https://localhost:6443/api/v1/pods?fieldSelector=status.phase%21%3DFailed%2Cstatus.phase%21%3DSucceeded&resourceVersion=210087&timeoutSeconds=505&watch=true: dial tcp [::1]:6443: connect: connection refused
E0319 06:06:11.357469       1 reflector.go:237] k8s.io/client-go/informers/factory.go:131: Failed to watch *v1beta1.PodDisruptionBudget: Get https://localhost:6443/apis/policy/v1beta1/poddisruptionbudgets?resourceVersion=195448&timeoutSeconds=353&watch=true: dial tcp [::1]:6443: connect: connection refused
E0319 06:06:11.357655       1 reflector.go:237] k8s.io/client-go/informers/factory.go:131: Failed to watch *v1.ReplicationController: Get https://localhost:6443/api/v1/replicationcontrollers?resourceVersion=195439&timeoutSeconds=571&watch=true: dial tcp [::1]:6443: connect: connection refused
E0319 06:06:11.357759       1 reflector.go:237] k8s.io/client-go/informers/factory.go:131: Failed to watch *v1.Node: Get https://localhost:6443/api/v1/nodes?resourceVersion=210020&timeoutSeconds=519&watch=true: dial tcp [::1]:6443: connect: connection refused
E0319 06:06:11.357818       1 reflector.go:237] k8s.io/client-go/informers/factory.go:131: Failed to watch *v1.Service: Get https://localhost:6443/api/v1/services?resourceVersion=207986&timeoutSeconds=421&watch=true: dial tcp [::1]:6443: connect: connection refused
E0319 06:06:11.358338       1 reflector.go:237] k8s.io/client-go/informers/factory.go:131: Failed to watch *v1.ReplicaSet: Get https://localhost:6443/apis/apps/v1/replicasets?resourceVersion=209044&timeoutSeconds=343&watch=true: dial tcp [::1]:6443: connect: connection refused
E0319 06:06:11.358411       1 reflector.go:237] k8s.io/client-go/informers/factory.go:131: Failed to watch *v1.PersistentVolumeClaim: Get https://localhost:6443/api/v1/persistentvolumeclaims?resourceVersion=195437&timeoutSeconds=447&watch=true: dial tcp [::1]:6443: connect: connection refused
...
E0319 06:06:12.358514       1 reflector.go:125] k8s.io/kubernetes/cmd/kube-scheduler/app/server.go:178: Failed to list *v1.Pod: Get https://localhost:6443/api/v1/pods?fieldSelector=status.phase%21%3DFailed%2Cstatus.phase%21%3DSucceeded&limit=500&resourceVersion=0: dial tcp [::1]:6443: connect: connection refused
E0319 06:06:12.359519       1 reflector.go:125] k8s.io/client-go/informers/factory.go:131: Failed to list *v1beta1.PodDisruptionBudget: Get https://localhost:6443/apis/policy/v1beta1/poddisruptionbudgets?limit=500&resourceVersion=0: dial tcp [::1]:6443: connect: connection refused
E0319 06:06:12.369569       1 reflector.go:125] k8s.io/client-go/informers/factory.go:131: Failed to list *v1.ReplicationController: Get https://localhost:6443/api/v1/replicationcontrollers?limit=500&resourceVersion=0: dial tcp [::1]:6443: connect: connection refused
E0319 06:06:12.376103       1 reflector.go:125] k8s.io/client-go/informers/factory.go:131: Failed to list *v1.Node: Get https://localhost:6443/api/v1/nodes?limit=500&resourceVersion=0: dial tcp [::1]:6443: connect: connection refused
E0319 06:06:12.377229       1 reflector.go:125] k8s.io/client-go/informers/factory.go:131: Failed to list *v1.Service: Get https://localhost:6443/api/v1/services?limit=500&resourceVersion=0: dial tcp [::1]:6443: connect: connection refused
E0319 06:06:12.378572       1 reflector.go:125] k8s.io/client-go/informers/factory.go:131: Failed to list *v1.ReplicaSet: Get https://localhost:6443/apis/apps/v1/replicasets?limit=500&resourceVersion=0: dial tcp [::1]:6443: connect: connection refused
E0319 06:06:12.380439       1 reflector.go:125] k8s.io/client-go/informers/factory.go:131: Failed to list *v1.PersistentVolumeClaim: Get https://localhost:6443/api/v1/persistentvolumeclaims?limit=500&resourceVersion=0: dial tcp [::1]:6443: connect: connection refused
E0319 06:06:12.381105       1 reflector.go:125] k8s.io/client-go/informers/factory.go:131: Failed to list *v1.StorageClass: Get https://localhost:6443/apis/storage.k8s.io/v1/storageclasses?limit=500&resourceVersion=0: dial tcp [::1]:6443: connect: connection refused
E0319 06:06:12.381893       1 reflector.go:125] k8s.io/client-go/informers/factory.go:131: Failed to list *v1.PersistentVolume: Get https://localhost:6443/api/v1/persistentvolumes?limit=500&resourceVersion=0: dial tcp [::1]:6443: connect: connection refused
E0319 06:06:12.382981       1 reflector.go:125] k8s.io/client-go/informers/factory.go:131: Failed to list *v1.StatefulSet: Get https://localhost:6443/apis/apps/v1/statefulsets?limit=500&resourceVersion=0: dial tcp [::1]:6443: connect: connection refused
E0319 06:06:12.871424       1 leaderelection.go:270] error retrieving resource lock kube-system/kube-scheduler: Get https://localhost:6443/api/v1/namespaces/kube-system/endpoints/kube-scheduler?timeout=10s: dial tcp [::1]:6443: connect: connection refused
E0319 06:06:19.537783       1 reflector.go:125] k8s.io/client-go/informers/factory.go:131: Failed to list *v1.PersistentVolume: persistentvolumes is forbidden: User "system:kube-scheduler" cannot list resource "persistentvolumes" in API group "" at the cluster scope
E0319 06:06:19.538257       1 reflector.go:125] k8s.io/client-go/informers/factory.go:131: Failed to list *v1.StorageClass: storageclasses.storage.k8s.io is forbidden: User "system:kube-scheduler" cannot list resource "storageclasses" in API group "storage.k8s.io" at the cluster scope
E0319 06:06:19.538384       1 reflector.go:125] k8s.io/kubernetes/cmd/kube-scheduler/app/server.go:178: Failed to list *v1.Pod: pods is forbidden: User "system:kube-scheduler" cannot list resource "pods" in API group "" at the cluster scope
E0319 06:06:19.538450       1 reflector.go:125] k8s.io/client-go/informers/factory.go:131: Failed to list *v1.Service: services is forbidden: User "system:kube-scheduler" cannot list resource "services" in API group "" at the cluster scope
E0319 06:06:19.538535       1 reflector.go:125] k8s.io/client-go/informers/factory.go:131: Failed to list *v1.ReplicationController: replicationcontrollers is forbidden: User "system:kube-scheduler" cannot list resource "replicationcontrollers" in API group "" at the cluster scope
E0319 06:06:19.538593       1 reflector.go:125] k8s.io/client-go/informers/factory.go:131: Failed to list *v1.Node: nodes is forbidden: User "system:kube-scheduler" cannot list resource "nodes" in API group "" at the cluster scope
E0319 06:06:19.538655       1 reflector.go:125] k8s.io/client-go/informers/factory.go:131: Failed to list *v1.ReplicaSet: replicasets.apps is forbidden: User "system:kube-scheduler" cannot list resource "replicasets" in API group "apps" at the cluster scope
E0319 06:06:19.538730       1 reflector.go:125] k8s.io/client-go/informers/factory.go:131: Failed to list *v1.StatefulSet: statefulsets.apps is forbidden: User "system:kube-scheduler" cannot list resource "statefulsets" in API group "apps" at the cluster scope
E0319 06:06:19.551736       1 reflector.go:125] k8s.io/client-go/informers/factory.go:131: Failed to list *v1.PersistentVolumeClaim: persistentvolumeclaims is forbidden: User "system:kube-scheduler" cannot list resource "persistentvolumeclaims" in API group "" at the cluster scope
E0319 06:06:19.552101       1 reflector.go:125] k8s.io/client-go/informers/factory.go:131: Failed to list *v1beta1.PodDisruptionBudget: poddisruptionbudgets.policy is forbidden: User "system:kube-scheduler" cannot list resource "poddisruptionbudgets" in API group "policy" at the cluster scope
E0319 06:06:19.565703       1 leaderelection.go:270] error retrieving resource lock kube-system/kube-scheduler: endpoints "kube-scheduler" is forbidden: User "system:kube-scheduler" cannot get resource "endpoints" in API group "" in the namespace "kube-system"
E0319 06:06:30.736753       1 factory.go:740] scheduler cache UpdatePod failed: pod 1838aea6-4a0d-11e9-83b9-0aceb08c3fb0 is not added to scheduler cache, so cannot be updated
E0319 06:06:31.727207       1 factory.go:740] scheduler cache UpdatePod failed: pod 1838aea6-4a0d-11e9-83b9-0aceb08c3fb0 is not added to scheduler cache, so cannot be updated
E0319 06:06:37.882941       1 factory.go:740] scheduler cache UpdatePod failed: pod 1838aea6-4a0d-11e9-83b9-0aceb08c3fb0 is not added to scheduler cache, so cannot be updated
E0319 06:07:00.862593       1 factory.go:740] scheduler cache UpdatePod failed: pod 1838aea6-4a0d-11e9-83b9-0aceb08c3fb0 is not added to scheduler cache, so cannot be updated
E0319 06:07:14.436749       1 factory.go:740] scheduler cache UpdatePod failed: pod 1838aea6-4a0d-11e9-83b9-0aceb08c3fb0 is not added to scheduler cache, so cannot be updated
I0319 06:07:25.749502       1 leaderelection.go:214] successfully acquired lease kube-system/kube-scheduler
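
A quick way to spot this symptom in a saved log is to look for the "algorithm provider" line shown above. The following is a minimal Python sketch; the function name `algorithm_provider` is hypothetical, and the pattern is based only on the log line in this comment.

```python
import re

# Pattern taken from the factory.go line in the log above.
PROVIDER = re.compile(r"Creating scheduler from algorithm provider '([^']+)'")


def algorithm_provider(log_text):
    """Return the provider name if the scheduler was built from one, else None."""
    match = PROVIDER.search(log_text)
    return match.group(1) if match else None
```

If this returns 'DefaultProvider', the scheduler started with its built-in defaults. A result of None only means the line was not found; it does not by itself prove that the custom policy was applied.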

Comment 7 Seth Jennings 2019-03-20 15:32:07 UTC

*** Bug 1688674 has been marked as a duplicate of this bug. ***

Comment 9 ravig 2019-03-21 23:20:45 UTC

https://github.com/openshift/cluster-kube-scheduler-operator/pull/76

Comment 12 errata-xmlrpc 2019-06-04 10:45:04 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758
