Red Hat Bugzilla – Bug 1469037
Sometimes daemonset DESIRED=0 even though a matching node exists
Last modified: 2017-08-16 15:51 EDT
Created attachment 1295764 [details]
Description of problem:
When installing service-catalog via openshift-ansible, the service-catalog components failed to run. After SSHing into the installation to debug, I found the daemonsets had DESIRED=0 even though a matching node actually exists. Restarting the master service fixes this.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. [root@ip-172-18-0-4 ~]# oc get ds -n kube-service-catalog
NAME                 DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE-SELECTOR               AGE
apiserver            0         0         0       0            0           openshift-infra=apiserver   27m
controller-manager   0         0         0       0            0           openshift-infra=apiserver   27m
[root@ip-172-18-0-4 ~]# oc get no --show-labels
NAME STATUS AGE VERSION LABELS
ip-172-18-0-4.ec2.internal Ready,SchedulingDisabled 43m v1.6.1+5115d708d7 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m3.medium,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-east-1,failure-domain.beta.kubernetes.io/zone=us-east-1d,kubernetes.io/hostname=ip-172-18-0-4.ec2.internal,openshift-infra=apiserver,role=node
ip-172-18-11-233.ec2.internal Ready 43m v1.6.1+5115d708d7 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m3.medium,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-east-1,failure-domain.beta.kubernetes.io/zone=us-east-1d,kubernetes.io/hostname=ip-172-18-11-233.ec2.internal,registry=enabled,role=node,router=enabled
Created attachment 1295776 [details]
Created attachment 1295777 [details]
Eric Wolinetz is attempting to reproduce this now. He says the node labels in the original comment look correct.
Could we get the logs from the controller manager? I reviewed the node logs and they looked uneventful.
Could we also get a yaml dump of the daemon sets that were created?
The controller-manager log is attached in file atomic-openshift-master.log
daemonset.yaml: http://pastebin.test.redhat.com/501739 (note: the daemonset at that link is working correctly because I restarted the master)
Created attachment 1296100 [details]
Reproduced again. Attaching some info about the daemonsets and nodes.
We debugged a customer issue similar to this one yesterday. Can we establish:
1. Are pods being created at all for the daemon set? If so, can we get yamls and describe output for them?
2. Is there a node selector associated with the namespace? Can we get a yaml for the namespace?
In the issue we debugged today, the default node selector for the project (and later the cluster) resulted in pods being created but not scheduled on certain nodes, due to conflicts between the pod's node selector and the node labels required by the project node selector.
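To make the failure mode above concrete, here is a minimal sketch (not the actual controller code) of how a project-level node selector merged into a daemon pod's own selector can drive DESIRED to 0. The node labels and daemonset selector mirror the ones in this bug; the merge helper and function names are simplifications for illustration only.

```python
def merge_selectors(pod_selector, project_selector):
    """Merge the pod's own nodeSelector with the project node selector.
    A conflict (same key, different values) makes the pod unschedulable
    on every node, so we return None."""
    merged = dict(pod_selector)
    for key, value in project_selector.items():
        if key in merged and merged[key] != value:
            return None  # conflicting requirements
        merged[key] = value
    return merged

def desired_count(nodes, pod_selector, project_selector):
    """Count nodes whose labels satisfy the merged selector (DESIRED)."""
    merged = merge_selectors(pod_selector, project_selector)
    if merged is None:
        return 0
    return sum(
        all(labels.get(k) == v for k, v in merged.items())
        for labels in nodes
    )

# Labels trimmed from the `oc get no --show-labels` output in this bug.
nodes = [
    {"kubernetes.io/hostname": "ip-172-18-0-4.ec2.internal",
     "openshift-infra": "apiserver", "role": "node"},
    {"kubernetes.io/hostname": "ip-172-18-11-233.ec2.internal",
     "role": "node", "registry": "enabled"},
]

ds_selector = {"openshift-infra": "apiserver"}

# With no project selector, one node matches:
print(desired_count(nodes, ds_selector, {}))                   # 1
# A project default selector no node carries drops DESIRED to 0:
print(desired_count(nodes, ds_selector, {"region": "infra"}))  # 0
```

This is why question 2 above matters: the daemonset selector alone matches ip-172-18-0-4, so a DESIRED of 0 points at something being layered on top of it, such as a namespace node selector.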
If it happens again, I'll check what you suggested. To be honest, it's really hard to reproduce.
These daemonsets weren't created manually; they were created by openshift-ansible when enabling service-catalog. They are the service-catalog apiserver and controller-manager in the kube-service-catalog project.
I spoke to Eric and he is not currently using a node selector on the namespace the installer creates for the catalog components. He is going to add one in this PR: https://github.com/openshift/openshift-ansible/pull/4781
That should address this issue; I don't think we have cause to believe something else is happening. I am going to reassign this bug to Eric, and he can move it to ON_QA once that PR is merged.
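For reference, a project-level default node selector is set via an annotation on the namespace. An illustrative fragment (the value shown is an assumption, not taken from the PR) for the installer-created namespace might look like:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: kube-service-catalog
  annotations:
    # Illustrative value only; an empty string overrides any cluster-wide
    # default project node selector, so daemon pods are constrained only
    # by their own nodeSelector.
    openshift.io/node-selector: ""
```

Setting this explicitly on the namespace avoids the cluster default selector being silently merged into the daemon pods' selectors.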
Verified on openshift-ansible-3.6.162-1.git.0.50e29bd.el7.noarch.rpm.
The error no longer occurs.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.