Description of problem:
Checked cluster capacity and found it always fails with "Failed to set default scheduler config: Error in opening default scheduler config file: open : no such file or directory", even when an existing path is given for --default-config:

/bin/sh -ec /bin/cluster-capacity --default-config=/test-s/scheduler.json --podspec=/test-pod/pod.yaml --verbose

$ ls /test-*
/test-pod:
pod.yaml

/test-s:
scheduler.json

Version-Release number of selected component (if applicable):
atomic-openshift-cluster-capacity-3.9.0-0.21.0.git.0.2a50d06.el7.x86_64
# openshift version
openshift v3.9.0-0.20.0
kubernetes v1.9.1+a0ce1bc657
etcd 3.2.8

How reproducible:
always

Steps to Reproduce:
1. Create a pod spec as a ConfigMap:

oc create configmap cluster-capacity-configmap --from-file=pod.yaml=pod.yaml -n default

# cat pod.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  name: cluster-capacity-stub-container
  namespace: cluster-capacity
spec:
  containers:
  - image: gcr.io/google_containers/pause:2.0
    imagePullPolicy: Always
    name: cluster-capacity-stub-container
    resources:
      limits:
        cpu: 200m
        memory: 100Mi
      requests:
        cpu: 100m
        memory: 80Mi
  dnsPolicy: Default
  nodeSelector:
    load: high
    region: hpc
  restartPolicy: OnFailure
  schedulerName: default-scheduler
status: {}

2. Create a Secret holding the scheduler config:

oc create secret generic sf --from-file=scheduler.json=/etc/origin/master/scheduler.json -n default

3. Create an RC for cluster-capacity:

apiVersion: v1
kind: ReplicationController
metadata:
  creationTimestamp: 2018-01-18T07:30:32Z
  generation: 5
  labels:
    run: cluster-capacity
  name: cluster-capacity
  namespace: default
  resourceVersion: "22043"
  selfLink: /api/v1/namespaces/default/replicationcontrollers/cluster-capacity
  uid: 778a86ed-fc21-11e7-80da-fa163ead188e
spec:
  replicas: 1
  selector:
    run: cluster-capacity
  template:
    metadata:
      creationTimestamp: null
      labels:
        run: cluster-capacity
    spec:
      containers:
      - command:
        - /bin/sh
        - -ec
        - |
          /bin/cluster-capacity --default-config=/test-s/scheduler.json --podspec=/test-pod/pod.yaml --verbose;while true;do sleep 10;done
        env:
        - name: CC_INCLUSTER
          value: "true"
        image: brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/ose-cluster-capacity
        imagePullPolicy: Always
        name: cluster-capacity
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /test-pod
          name: test-volume
        - mountPath: /test-s
          name: ss
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: cluster-capacity-sa
      serviceAccountName: cluster-capacity-sa
      terminationGracePeriodSeconds: 30
      volumes:
      - configMap:
          defaultMode: 420
          name: cluster-capacity-configmap
        name: test-volume
      - name: ss
        secret:
          defaultMode: 420
          secretName: sf

Actual results:
# oc logs -n default -f cluster-capacity-bgjh5
Failed to set default scheduler config: Error in opening default scheduler config file: open : no such file or directory

Expected results:
cluster-capacity should work.

Additional info:
(In reply to weiwei jiang from comment #0)

[...]

> 2. oc create secret generic sf
> --from-file=scheduler.json=/etc/origin/master/scheduler.json -n default

Why are you creating a secret from the scheduler config file?

[...]

> /bin/cluster-capacity --default-config=/test-s/scheduler.json

Why are you mounting the scheduler.json config file as a secret?

[...]

Also in general, you need to node provide scheduler.json unless your cluster us using some customize default scheduler.
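For reference, if the scheduler policy file is not considered sensitive, it could be mounted from a ConfigMap instead of a Secret. An illustrative fragment of the RC's volumes section, assuming a hypothetical ConfigMap named scheduler-config:

```yaml
# Hypothetical alternative: create the ConfigMap from the master's policy file:
#   oc create configmap scheduler-config \
#     --from-file=scheduler.json=/etc/origin/master/scheduler.json -n default
volumes:
- name: ss
  configMap:
    defaultMode: 420
    name: scheduler-config
```

The existing volumeMounts entry for /test-s would stay unchanged; only the volume source differs.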
Just to correct my previous comment: in general, you need NOT provide scheduler.json unless your cluster is using a customized default scheduler.
Also, can you show me your scheduler.json? I just tested --default-config locally and it seemed to work. I will try it in a pod now.
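For context, an OpenShift 3.x /etc/origin/master/scheduler.json is a kube-scheduler Policy file, shaped roughly like this (abridged and illustrative; the actual predicates, priorities, and weights vary per cluster):

```json
{
  "kind": "Policy",
  "apiVersion": "v1",
  "predicates": [
    {"name": "NoVolumeZoneConflict"},
    {"name": "MatchInterPodAffinity"},
    {"name": "NoDiskConflict"},
    {"name": "GeneralPredicates"},
    {"name": "PodToleratesNodeTaints"}
  ],
  "priorities": [
    {"name": "SelectorSpreadPriority", "weight": 1},
    {"name": "InterPodAffinityPriority", "weight": 1},
    {"name": "LeastRequestedPriority", "weight": 1}
  ]
}
```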
I can reproduce the issue. I am working on a fix.
PR: https://github.com/openshift/origin/pull/18198

In case you want to test it before the fix lands, you can use this image: docker.io/aveshagarwal/cluster-capacity
Checked with atomic-openshift-cluster-capacity-3.9.0-0.24.0.git.0.fc8ad63.el7.x86_64, and it works now.

# oc logs -f cluster-capacity-g9xfc
cluster-capacity-stub-container pod requirements:
	- CPU: 100m
	- Memory: 80Mi
	- NodeSelector: load=high,region=hpc

The cluster can schedule 0 instance(s) of the pod cluster-capacity-stub-container.

Termination reason: Unschedulable: 0/5 nodes are available: 1 NodeUnschedulable, 5 MatchNodeSelector.
According to https://bugzilla.redhat.com/show_bug.cgi?id=1535940#c6, move to verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0489