Description of problem:

The router pod should, like all CVO-managed components without a special exception, tolerate masters, but it does not.

Version-Release number of selected component (if applicable):

$ KUBECONFIG=wking/auth/kubeconfig oc adm release info --commits | grep ingress
  cluster-ingress-operator  https://github.com/openshift/cluster-ingress-operator  9478e28af89922fa4d54389b1ae8ae6fafb2662b

How reproducible:

Every time.

Steps to Reproduce:
1. Break your Machine API provider, e.g. by running libvirt with a non-standard volume pool before [1] lands.
2. Launch a cluster.
3. Wait for things to stabilize.

Then:

$ oc get pods --all-namespaces | grep Pending
openshift-ingress                      router-default-7688479d99-nbnj8        0/1  Pending  0  31m
openshift-monitoring                   prometheus-operator-647d84b5c6-rsplb   0/1  Pending  0  31m
openshift-operator-lifecycle-manager   olm-operators-sf5sm                    0/1  Pending  0  36m

$ oc get pod -o "jsonpath={.status.conditions}{'\n'}" -n openshift-ingress router-default-7688479d99-nbnj8
[map[type:PodScheduled status:False lastProbeTime:<nil> lastTransitionTime:2019-01-30T20:00:04Z reason:Unschedulable message:0/1 nodes are available: 1 node(s) didn't match node selector.]]

$ oc get -o yaml deployment -n openshift-ingress router-default
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  creationTimestamp: 2019-01-30T20:00:04Z
  generation: 1
  labels:
    app: router
  name: router-default
  namespace: openshift-ingress
  resourceVersion: "12646"
  selfLink: /apis/extensions/v1beta1/namespaces/openshift-ingress/deployments/router-default
  uid: a2a9a529-24c9-11e9-8d1a-52fdfc072182
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: router
      router: router-default
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: router
        router: router-default
    spec:
      containers:
      - env:
        - name: STATS_PORT
          value: "1936"
        - name: ROUTER_SERVICE_NAMESPACE
          value: openshift-ingress
        - name: DEFAULT_CERTIFICATE_DIR
          value: /etc/pki/tls/private
        - name: ROUTER_SERVICE_NAME
          value: default
        - name: ROUTER_CANONICAL_HOSTNAME
          value: apps.wking.installer.testing
        image: registry.svc.ci.openshift.org/openshift/origin-v4.0-2019-01-30-150036@sha256:6991fb24697317cb8a1b8a4cfd129d77d05a199f382a4c5ba7eae7ad55bb386b
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            host: localhost
            path: /healthz
            port: 1936
            scheme: HTTP
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        name: router
        ports:
        - containerPort: 80
          hostPort: 80
          name: http
          protocol: TCP
        - containerPort: 443
          hostPort: 443
          name: https
          protocol: TCP
        - containerPort: 1936
          hostPort: 1936
          name: stats
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            host: localhost
            path: /healthz
            port: 1936
            scheme: HTTP
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/pki/tls/private
          name: default-certificate
          readOnly: true
      dnsPolicy: ClusterFirst
      hostNetwork: true
      nodeSelector:
        node-role.kubernetes.io/worker: ""
      priorityClassName: system-cluster-critical
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: router
      serviceAccountName: router
      terminationGracePeriodSeconds: 30
      volumes:
      - name: default-certificate
        secret:
          defaultMode: 420
          secretName: router-certs-default
status:
  conditions:
  - lastTransitionTime: 2019-01-30T20:00:04Z
    lastUpdateTime: 2019-01-30T20:00:04Z
    message: Deployment does not have minimum availability.
    reason: MinimumReplicasUnavailable
    status: "False"
    type: Available
  - lastTransitionTime: 2019-01-30T20:10:05Z
    lastUpdateTime: 2019-01-30T20:10:05Z
    message: ReplicaSet "router-default-7688479d99" has timed out progressing.
    reason: ProgressDeadlineExceeded
    status: "False"
    type: Progressing
  observedGeneration: 1
  replicas: 1
  unavailableReplicas: 1
  updatedReplicas: 1

Actual results:

Pending pod with "0/1 nodes are available: 1 node(s) didn't match node selector."

Expected results:

A running pod.

Additional info:

"high" severity is based on Clayton's request [2].

[1]: https://github.com/openshift/cluster-api-provider-libvirt/pull/45
[2]: https://github.com/openshift/installer/pull/1146#issuecomment-459037176
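For reference, tolerating masters would mean changing two things in the deployment's pod spec: adding a toleration for the master taint, and relaxing the worker-only node selector. A sketch (not the operator's actual change; field names are standard Kubernetes scheduling fields):

```yaml
spec:
  template:
    spec:
      # Tolerate the master taint so the scheduler may place the pod on
      # control-plane machines.
      tolerations:
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      # The toleration alone is not enough: the current selector
      # (node-role.kubernetes.io/worker: "") would also have to be relaxed,
      # since masters do not carry the worker label.
```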
Kube explicitly prohibits masters from service load balancer target pools [1]. Given that, even if we allowed the routers to be scheduled on masters, no traffic would reach them through the provisioned ELB. On other, non-cloud platforms (e.g. libvirt), we don't use load balancer services; instead we use host-networked routers with no managed LB, something we refer to as 'user defined' high availability. Given all this, should our operator add a master toleration only when using 'user defined' cluster ingress high availability?

[1] https://github.com/kubernetes/kubernetes/issues/65618
> Given all this, should our operator add a master toleration only when using 'user defined' cluster ingress high availability?

Possibly? If only to set a more-specific ClusterOperator reason ("master can't run a usable router" vs. our current "ingress "default" not available"). Currently we have:

$ oc get clusteroperator -o yaml openshift-ingress-operator
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  creationTimestamp: 2019-01-30T20:00:05Z
  generation: 1
  name: openshift-ingress-operator
  resourceVersion: "7055"
  selfLink: /apis/config.openshift.io/v1/clusteroperators/openshift-ingress-operator
  uid: a346c6e0-24c9-11e9-8d1a-52fdfc072182
spec: {}
status:
  conditions:
  - lastTransitionTime: 2019-01-30T20:00:05Z
    status: "False"
    type: Failing
  - lastTransitionTime: 2019-01-30T20:00:05Z
    status: "False"
    type: Progressing
  - lastTransitionTime: 2019-01-30T20:00:05Z
    message: ingress "default" not available
    reason: IngressUnavailable
    status: "False"
    type: Available
  extension: null
  version: 0.0.1
Given that 4.0 is AWS only, I'm marking this a 4.1 bug.
> Given that 4.0 is AWS only, I'm marking this a 4.1 bug.

Is all-in-one (zero compute nodes) not a target for 4.0? I think resource constraints make a stronger case for that on libvirt, but folks trying to run AWS clusters on the cheap may also be interested in dropping compute nodes. And maybe the Kubernetes issue linked from comment 1 makes zero compute nodes infeasible in the short term anyway. So while punting to future targets may be appropriate, this is fundamentally an issue for all platforms.
I'm going to close this one, because:

1. Our default is consistent with upstream. When publishing an ingress controller with a LoadBalancer Service, masters are excluded from LB target pools by design in k8s. To change this assumption, I think we should take the discussion upstream.

2. Our defaults can be overridden. Admins can control ingress controller scheduling via .spec.nodePlacement. If someone wants to schedule ingress controllers on masters or non-linux hosts, they can. We just won't by default.

Please feel free to re-open if you feel closing this was a mistake!
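For anyone landing here, the .spec.nodePlacement override mentioned above looks roughly like this on the operator.openshift.io/v1 IngressController (a sketch; verify the exact fields against your cluster's API version):

```yaml
apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  name: default
  namespace: openshift-ingress-operator
spec:
  nodePlacement:
    # Target masters instead of the default worker selector.
    nodeSelector:
      matchLabels:
        node-role.kubernetes.io/master: ""
    # And tolerate the master taint so the pods can actually schedule there.
    tolerations:
    - key: node-role.kubernetes.io/master
      operator: Exists
      effect: NoSchedule
```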
Today we landed [1], which should allow ingress/routing on the control-plane machines if you have no compute nodes. I'm not entirely clear on what happens when you have a single compute node; are we still prohibiting colocation ([2], bug 1703943)? We might be stuck there without schedulable control-plane machines (because we have a compute node), but without enough compute nodes for the full ingress deployment.

[1]: https://github.com/openshift/installer/pull/2004
[2]: https://github.com/openshift/cluster-ingress-operator/pull/222
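If I understand the mechanism from [1] correctly, the knob it drives is the cluster-scoped Scheduler config; something like the following (a sketch assuming the mastersSchedulable field; check the config.openshift.io/v1 Scheduler type for your version):

```yaml
apiVersion: config.openshift.io/v1
kind: Scheduler
metadata:
  name: cluster
spec:
  # When true, masters are made schedulable for ordinary workloads such as
  # the ingress router; the installer would set this for zero-compute-node
  # topologies.
  mastersSchedulable: true
```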