1611247 – update scheduler.json of master can not take effect

Summary: update scheduler.json of master can not take effect

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Node
Sub Component:
Version:	3.11.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	medium
Target Milestone:	---
Target Release:	4.1.0
Assignee:	Seth Jennings
QA Contact:	Xiaoli Tian
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2018-08-02 08:15 UTC by MinLi
Modified:	2019-06-04 10:40 UTC
CC List:	5 users
Fixed In Version:
Doc Type:	No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed:	2019-06-04 10:40:22 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2019:0758	0	None	None	None	2019-06-04 10:40:28 UTC

Description MinLi 2018-08-02 08:15:52 UTC

Description of problem:
Changes to scheduler.json on the master do not take effect.

Version-Release number of selected component (if applicable):
oc v3.11.0-0.10.0
openshift v3.11.0-0.10.0
kubernetes v1.11.0+d4cacc0


How reproducible:
always

Steps to Reproduce:
1. Modify /etc/origin/master/master-config.yaml, adding:
kubernetesMasterConfig:
....
  schedulerArguments:
    feature-gates:
    - BalanceAttachedNodeVolumes=true


2. Modify /etc/origin/master/scheduler.json as follows:
{
    "apiVersion": "v1",
    "kind": "Policy",
    "predicates": [
        {
            "name": "GeneralPredicates"
        }
    ],
    "priorities": [
        {
            "name": "BalancedResourceAllocation",
            "weight": 4
        }
    ]
}


3. Modify /etc/origin/master/master.env:
DEBUG_LOGLEVEL=10

4. Restart the master controllers:
#master-restart controllers

5. Create a PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ebs
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  persistentVolumeReclaimPolicy: Retain


6. Capture the controllers log, then deploy a pod:
#master-logs controllers controllers 1> ./controller.txt   2>&1

kind: Pod
apiVersion: v1
metadata:
  name: mypod
  labels:
    name: frontendhttp
spec:
  containers:
    - name: myfrontend
      image: jhou/hello-openshift
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/tmp"
          name: aws
  volumes:
    - name: aws
      persistentVolumeClaim:
        claimName: ebs


Actual results:
The controller log shows the default policy rather than the one from scheduler.json:

I0802 05:55:32.739516       1 server.go:126] Version: v1.11.0+d4cacc0
I0802 05:55:32.746712       1 factory.go:960] Creating scheduler from configuration: {{ } [{NoVolumeZoneConflict <nil>} {MaxEBSVolumeCount <nil>} {MaxGCEPDVolumeCount <nil>} {MaxAzureDiskVolumeCount <nil>} {MatchInterPodAffinity <nil>} {NoDiskConflict <nil>} {GeneralPredicates <nil>} {PodToleratesNodeTaints <nil>} {CheckNodeMemoryPressure <nil>} {CheckNodeDiskPressure <nil>} {CheckVolumeBinding <nil>} {Region 0xc421019520}] [{SelectorSpreadPriority 1 <nil>} {InterPodAffinityPriority 1 <nil>} {LeastRequestedPriority 1 <nil>} {BalancedResourceAllocation 1 <nil>} {NodePreferAvoidPodsPriority 10000 <nil>} {NodeAffinityPriority 1 <nil>} {TaintTolerationPriority 1 <nil>} {Zone 2 0xc42063baa0}] [] 0 false}
I0802 05:55:32.746757       1 factory.go:977] Registering predicate: NoVolumeZoneConflict
I0802 05:55:32.746770       1 plugins.go:224] Predicate type NoVolumeZoneConflict already registered, reusing.
I0802 05:55:32.746776       1 factory.go:977] Registering predicate: MaxEBSVolumeCount
I0802 05:55:32.746782       1 plugins.go:224] Predicate type MaxEBSVolumeCount already registered, reusing.
I0802 05:55:32.746787       1 factory.go:977] Registering predicate: MaxGCEPDVolumeCount
I0802 05:55:32.746793       1 plugins.go:224] Predicate type MaxGCEPDVolumeCount already registered, reusing.
I0802 05:55:32.746798       1 factory.go:977] Registering predicate: MaxAzureDiskVolumeCount
I0802 05:55:32.746803       1 plugins.go:224] Predicate type MaxAzureDiskVolumeCount already registered, reusing.
I0802 05:55:32.746808       1 factory.go:977] Registering predicate: MatchInterPodAffinity
I0802 05:55:32.746814       1 plugins.go:224] Predicate type MatchInterPodAffinity already registered, reusing.
I0802 05:55:32.746819       1 factory.go:977] Registering predicate: NoDiskConflict
I0802 05:55:32.746824       1 plugins.go:224] Predicate type NoDiskConflict already registered, reusing.
I0802 05:55:32.746829       1 factory.go:977] Registering predicate: GeneralPredicates
I0802 05:55:32.746835       1 plugins.go:224] Predicate type GeneralPredicates already registered, reusing.
I0802 05:55:32.746840       1 factory.go:977] Registering predicate: PodToleratesNodeTaints
I0802 05:55:32.746845       1 plugins.go:224] Predicate type PodToleratesNodeTaints already registered, reusing.
I0802 05:55:32.746851       1 factory.go:977] Registering predicate: CheckNodeMemoryPressure
I0802 05:55:32.746856       1 plugins.go:224] Predicate type CheckNodeMemoryPressure already registered, reusing.
I0802 05:55:32.746863       1 factory.go:977] Registering predicate: CheckNodeDiskPressure
I0802 05:55:32.746868       1 plugins.go:224] Predicate type CheckNodeDiskPressure already registered, reusing.
I0802 05:55:32.746874       1 factory.go:977] Registering predicate: CheckVolumeBinding
I0802 05:55:32.746879       1 plugins.go:224] Predicate type CheckVolumeBinding already registered, reusing.
I0802 05:55:32.746884       1 factory.go:977] Registering predicate: Region
I0802 05:55:32.746897       1 factory.go:992] Registering priority: SelectorSpreadPriority
I0802 05:55:32.746904       1 plugins.go:336] Priority type SelectorSpreadPriority already registered, reusing.
I0802 05:55:32.746916       1 factory.go:992] Registering priority: InterPodAffinityPriority
I0802 05:55:32.746922       1 plugins.go:336] Priority type InterPodAffinityPriority already registered, reusing.
I0802 05:55:32.746930       1 factory.go:992] Registering priority: LeastRequestedPriority
I0802 05:55:32.746935       1 plugins.go:336] Priority type LeastRequestedPriority already registered, reusing.
I0802 05:55:32.746942       1 factory.go:992] Registering priority: BalancedResourceAllocation
I0802 05:55:32.746948       1 plugins.go:336] Priority type BalancedResourceAllocation already registered, reusing.
I0802 05:55:32.746955       1 factory.go:992] Registering priority: NodePreferAvoidPodsPriority
I0802 05:55:32.746961       1 plugins.go:336] Priority type NodePreferAvoidPodsPriority already registered, reusing.
I0802 05:55:32.746968       1 factory.go:992] Registering priority: NodeAffinityPriority
I0802 05:55:32.746975       1 plugins.go:336] Priority type NodeAffinityPriority already registered, reusing.
I0802 05:55:32.746982       1 factory.go:992] Registering priority: TaintTolerationPriority
I0802 05:55:32.746988       1 plugins.go:336] Priority type TaintTolerationPriority already registered, reusing.
I0802 05:55:32.746995       1 factory.go:992] Registering priority: Zone
t:{} MaxAzureDiskVolumeCount:{} NoDiskConflict:{} PodToleratesNodeTaints:{} CheckNodeDiskPressure:{} CheckVolumeBinding:{} NoVolumeZoneConflict:{} MaxEBSVolumeCount:{} MatchInterPodAffinity:{} GeneralPredicates:{}]' and priority functions 'map[LeastRequestedPriority:{} BalancedResourceAllocation:{} NodePreferAvoidPodsPriority:{} NodeAffinityPriority:{} TaintTolerationPriority:{} Zone:{} SelectorSpreadPriority:{} InterPodAffinityPriority:{}]'



Expected results:
The controller log shows the scheduler policy from the updated scheduler.json.
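
The comparison between the logged policy and scheduler.json can be sketched in code. This is a rough illustration, assuming the "Creating scheduler from configuration" line keeps the struct-dump layout seen in this report (a bracketed predicate list followed by a bracketed priority list). `parse_scheduler_config_line` is a hypothetical helper, and the regular expressions are fitted to these log lines only.

```python
import re

MARKER = "Creating scheduler from configuration: "


def parse_scheduler_config_line(line):
    """Return (predicates, priorities) from a scheduler configuration log line.

    predicates is a list of names; priorities maps name -> weight.
    Assumes the layout: {{ } [{Pred arg} ...] [{Prio weight arg} ...] [] 0 false}
    """
    if MARKER not in line:
        raise ValueError("not a scheduler configuration line")
    tail = line.split(MARKER, 1)[1]
    groups = re.findall(r"\[(.*?)\]", tail)  # first group: predicates, second: priorities
    predicates = re.findall(r"\{(\w+) [^{}]*\}", groups[0])
    priorities = {
        name: int(weight)
        for name, weight in re.findall(r"\{(\w+) (\d+) [^{}]*\}", groups[1])
    }
    return predicates, priorities


# The configuration line from "Actual results" in this report.
LOG_LINE = (
    "I0802 05:55:32.746712       1 factory.go:960] Creating scheduler from configuration: "
    "{{ } [{NoVolumeZoneConflict <nil>} {MaxEBSVolumeCount <nil>} {MaxGCEPDVolumeCount <nil>} "
    "{MaxAzureDiskVolumeCount <nil>} {MatchInterPodAffinity <nil>} {NoDiskConflict <nil>} "
    "{GeneralPredicates <nil>} {PodToleratesNodeTaints <nil>} {CheckNodeMemoryPressure <nil>} "
    "{CheckNodeDiskPressure <nil>} {CheckVolumeBinding <nil>} {Region 0xc421019520}] "
    "[{SelectorSpreadPriority 1 <nil>} {InterPodAffinityPriority 1 <nil>} "
    "{LeastRequestedPriority 1 <nil>} {BalancedResourceAllocation 1 <nil>} "
    "{NodePreferAvoidPodsPriority 10000 <nil>} {NodeAffinityPriority 1 <nil>} "
    "{TaintTolerationPriority 1 <nil>} {Zone 2 0xc42063baa0}] [] 0 false}"
)

# What scheduler.json from step 2 asks for.
expected_predicates = ["GeneralPredicates"]
expected_priorities = {"BalancedResourceAllocation": 4}

predicates, priorities = parse_scheduler_config_line(LOG_LINE)
policy_applied = (predicates == expected_predicates and priorities == expected_priorities)
print("policy applied:", policy_applied)  # False for the line above
```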

Additional info:

Comment 1 MinLi 2018-08-03 07:35:07 UTC

This problem happens consistently in the AWS environment; template: http://git.app.eng.bos.redhat.com/git/openshift-misc.git/plain/v3-launch-templates/functionality-testing/aos-3_11/vars-aws/rpm-rhel7-s3_registry-aws-cloudprovider

It does not happen in other environments, such as GCE.

Comment 2 Avesh Agarwal 2018-08-06 20:16:49 UTC

I am not able to reproduce this; everything works as expected. I followed the steps below, and the logs show that the scheduler started with the updated scheduler.json:

1. Updated debug level in /etc/origin/master/master.env:
DEBUG_LOGLEVEL=10

2. Used the following scheduler.json:

{
    "apiVersion": "v1",
    "kind": "Policy",
    "predicates": [
        {
            "name": "GeneralPredicates"
        }
    ],
    "priorities": [
        {
            "name": "BalancedResourceAllocation",
            "weight": 4
        }
    ]
}

3. Restarted controllers on master:
#/usr/local/bin/master-restart controllers

4. Here are the logs:
I0806 20:07:21.733455       1 factory.go:960] Creating scheduler from configuration: {{ } [{GeneralPredicates <nil>}] [{BalancedResourceAllocation 4 <nil>}] [] 0 false}
I0806 20:07:21.733522       1 factory.go:977] Registering predicate: GeneralPredicates
I0806 20:07:21.733556       1 plugins.go:224] Predicate type GeneralPredicates already registered, reusing.
I0806 20:07:21.733584       1 factory.go:992] Registering priority: BalancedResourceAllocation
I0806 20:07:21.733632       1 plugins.go:336] Priority type BalancedResourceAllocation already registered, reusing.
I0806 20:07:21.733670       1 factory.go:1049] Creating scheduler with fit predicates 'map[GeneralPredicates:{}]' and priority functions 'map[BalancedResourceAllocation:{}]'


I have my 3.11 cluster running in AWS in case you want to verify.

Comment 4 MinLi 2018-08-15 07:56:35 UTC

I tested again in the AWS cluster.
When I run step 4 ("master-restart controllers"), it prints "2". I think this is why scheduler.json does not take effect, because the same command prints nothing in the GCE cluster.
Do you see the same behavior?
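
To tell whether the "2" is ordinary output, an error message, or accompanied by a non-zero exit status, the restart could be run with stdout, stderr, and the exit code captured separately. This is a minimal sketch. `run_and_capture` is a hypothetical helper, and the stand-in command only imitates a program that prints "2"; what the real command's output means is not established in this report.

```python
import subprocess
import sys


def run_and_capture(argv):
    """Run a command and return (exit_code, stdout, stderr) as text."""
    result = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.returncode, result.stdout, result.stderr


# Stand-in command for illustration. On a master host the argument list
# would be ["master-restart", "controllers"] as in step 4.
code, out, err = run_and_capture([sys.executable, "-c", "print(2)"])
print("exit code:", code)
print("stdout:", repr(out))
print("stderr:", repr(err))
```

Running the same capture on the AWS and GCE masters would show whether the two clusters differ in exit status as well as in printed output.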

Comment 5 Avesh Agarwal 2018-08-15 13:37:22 UTC

Can I get access to your AWS cluster?

Comment 9 MinLi 2018-08-30 03:04:48 UTC

It seems this issue no longer happens.

Comment 10 DeShuai Ma 2018-08-30 05:22:35 UTC

Per comment 9, moving to verified.

Comment 12 MinLi 2018-12-18 07:33:38 UTC

Verified on version:

oc v3.11.57
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-18-15-99.ec2.internal:8443
openshift v3.11.57
kubernetes v1.11.0+d4cacc0

Comment 16 errata-xmlrpc 2019-06-04 10:40:22 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758
