Bug 1535940
Summary: | Cluster capacity cannot run after update for kube rebase 1.9 | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | weiwei jiang <wjiang> |
Component: | Node | Assignee: | Avesh Agarwal <avagarwa> |
Status: | CLOSED ERRATA | QA Contact: | weiwei jiang <wjiang> |
Severity: | medium | Docs Contact: | |
Priority: | high | ||
Version: | 3.9.0 | CC: | aos-bugs, jokerman, mmccomas, sjenning |
Target Milestone: | --- | ||
Target Release: | 3.9.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: |
Cause: Inside a pod, no kubeconfig is supplied because the in-cluster configuration is used, but cluster-capacity incorrectly treated the missing kubeconfig as a fatal error and exited.
Consequence:
Cluster capacity stopped working when run in a pod.
Fix:
The absence of a kubeconfig is now ignored when running in a pod, because the in-cluster configuration is used instead.
Result:
Cluster capacity now works in a pod.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2018-03-28 14:20:40 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
weiwei jiang
2018-01-18 09:56:13 UTC
(In reply to weiwei jiang from comment #0)

> Description of problem:
> Checked with cluster capacity and found it always got "Failed to set default
> scheduler config: Error in opening default scheduler config file: open : no
> such file or directory" even when I gave an existing path for --default-config:
>
>     /bin/sh -ec /bin/cluster-capacity --default-config=/test-s/scheduler.json --podspec=/test-pod/pod.yaml --verbose
>
>     $ ls /test-*
>     /test-pod:
>     pod.yaml
>
>     /test-s:
>     scheduler.json
>
> Version-Release number of selected component (if applicable):
>
>     atomic-openshift-cluster-capacity-3.9.0-0.21.0.git.0.2a50d06.el7.x86_64
>     # openshift version
>     openshift v3.9.0-0.20.0
>     kubernetes v1.9.1+a0ce1bc657
>     etcd 3.2.8
>
> How reproducible:
> always
>
> Steps to Reproduce:
>
> 1. Create a podspec as a configmap:
>
>     oc create configmap cluster-capacity-configmap --from-file=pod.yaml=pod.yaml -n default
>
>     # cat pod.yaml
>     apiVersion: v1
>     kind: Pod
>     metadata:
>       creationTimestamp: null
>       name: cluster-capacity-stub-container
>       namespace: cluster-capacity
>     spec:
>       containers:
>       - image: gcr.io/google_containers/pause:2.0
>         imagePullPolicy: Always
>         name: cluster-capacity-stub-container
>         resources:
>           limits:
>             cpu: 200m
>             memory: 100Mi
>           requests:
>             cpu: 100m
>             memory: 80Mi
>       dnsPolicy: Default
>       nodeSelector:
>         load: high
>         region: hpc
>       restartPolicy: OnFailure
>       schedulerName: default-scheduler
>     status: {}
>
> 2. oc create secret generic sf --from-file=scheduler.json=/etc/origin/master/scheduler.json -n default

Why are you creating a secret from the scheduler config file?

> 3. Create an rc for cluster-capacity:
>
>     apiVersion: v1
>     kind: ReplicationController
>     metadata:
>       creationTimestamp: 2018-01-18T07:30:32Z
>       generation: 5
>       labels:
>         run: cluster-capacity
>       name: cluster-capacity
>       namespace: default
>       resourceVersion: "22043"
>       selfLink: /api/v1/namespaces/default/replicationcontrollers/cluster-capacity
>       uid: 778a86ed-fc21-11e7-80da-fa163ead188e
>     spec:
>       replicas: 1
>       selector:
>         run: cluster-capacity
>       template:
>         metadata:
>           creationTimestamp: null
>           labels:
>             run: cluster-capacity
>         spec:
>           containers:
>           - command:
>             - /bin/sh
>             - -ec
>             - |
>               /bin/cluster-capacity --default-config=/test-s/scheduler.json --podspec=/test-pod/pod.yaml --verbose; while true; do sleep 10; done
>             env:
>             - name: CC_INCLUSTER
>               value: "true"
>             image: brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/ose-cluster-capacity
>             imagePullPolicy: Always
>             name: cluster-capacity
>             resources: {}
>             terminationMessagePath: /dev/termination-log
>             terminationMessagePolicy: File
>             volumeMounts:
>             - mountPath: /test-pod
>               name: test-volume
>             - mountPath: /test-s
>               name: ss
>           dnsPolicy: ClusterFirst
>           restartPolicy: Always
>           schedulerName: default-scheduler
>           securityContext: {}
>           serviceAccount: cluster-capacity-sa
>           serviceAccountName: cluster-capacity-sa
>           terminationGracePeriodSeconds: 30
>           volumes:
>           - configMap:
>               defaultMode: 420
>               name: cluster-capacity-configmap
>             name: test-volume
>           - name: ss
>             secret:
>               defaultMode: 420
>               secretName: sf

Why are you mounting the scheduler.json config file as a secret?

> Actual results:
>
>     # oc logs -n default -f cluster-capacity-bgjh5
>     Failed to set default scheduler config: Error in opening default scheduler config file: open : no such file or directory
>
> Expected results:
> cluster-capacity should work well
>
> Additional info:

Also in general, you need to node provide scheduler.json unless your cluster us using some customize default scheduler.
Just want to correct my comment: Also, in general, you need NOT provide scheduler.json unless your cluster is using some customized default scheduler.

Also, can you show me your scheduler.json? I just tested --default-config locally and it seemed to work. I will try in a pod now.

I can reproduce the issue. I am working on a fix.

PR: https://github.com/openshift/origin/pull/18198

In case you want to test it before, you could use this image: docker.io/aveshagarwal/cluster-capacity

Checked with atomic-openshift-cluster-capacity-3.9.0-0.24.0.git.0.fc8ad63.el7.x86_64, and it works now.

    # oc logs -f cluster-capacity-g9xfc
    cluster-capacity-stub-container pod requirements:
            - CPU: 100m
            - Memory: 80Mi
            - NodeSelector: load=high,region=hpc

    The cluster can schedule 0 instance(s) of the pod cluster-capacity-stub-container.

    Termination reason: Unschedulable: 0/5 nodes are available: 1 NodeUnschedulable, 5 MatchNodeSelector.

According to https://bugzilla.redhat.com/show_bug.cgi?id=1535940#c6, moving to verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0489
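For readers skimming this bug, the Doc Text boils the fix down to a config-resolution rule: when cluster-capacity runs inside a pod (in-cluster), a missing kubeconfig is no longer treated as fatal, because the in-cluster configuration is used instead; outside a pod, a kubeconfig is still required. A minimal sketch of that rule follows. It is purely illustrative and hypothetical: the real fix lives in the cluster-capacity Go code (see the PR above), and the function name and return values here are invented for the example.

```python
import os

def resolve_client_config(kubeconfig_path: str, in_cluster: bool) -> str:
    """Illustrative sketch of the fixed behavior (not the actual Go code).

    in_cluster corresponds to running inside a pod (e.g. CC_INCLUSTER=true
    in the reproducer above).
    """
    if in_cluster:
        # Fixed behavior: ignore the (possibly absent) kubeconfig and use
        # the service-account-based in-cluster configuration.
        return "in-cluster"
    if not kubeconfig_path or not os.path.exists(kubeconfig_path):
        # Outside a pod a kubeconfig is still required; this mirrors the
        # pre-fix failure mode ("open : no such file or directory").
        raise FileNotFoundError(
            f"open {kubeconfig_path}: no such file or directory"
        )
    return kubeconfig_path
```

Before the fix, the in-cluster case fell through to the missing-file check and exited; after the fix, the in-cluster branch short-circuits and the pod runs normally.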