Description of problem:

With a standard OCP 3.9 install onto a new RHEL VM using "atomic-openshift-installer install" with a combined master+node, create a new project and attempt to deploy a pod to that project. The pod gets stuck in Pending, and the master logs state that the pod cannot be scheduled.

Version-Release number of selected component (if applicable):

How reproducible:
Has occurred on my only install and also reported by Gary Lamperillo

Steps to Reproduce:
1. Install clean OCP 3.9
2. oc new-project demo
3. Deploy any image to the demo namespace

Actual results:
Pod gets stuck in Pending, and the master logs state that the pod cannot be scheduled.

Expected results:
Pod is scheduled and created.

Additional info:
Pods can be deployed to the default project but not to any project created post-install. Comparing the default project and a test project, the former contains the following annotation:

    openshift.io/node-selector: ""

The latter project does not. When I add the missing annotation to the new project via 'oc edit project test', the pods are scheduled and created as expected.

-----------------------------------------------------------
[root@ocp3x-master ~]# oc version
oc v3.9.14
kubernetes v1.9.1+a0ce1bc657
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ocp3x-master.example.com:8443
openshift v3.9.14
kubernetes v1.9.1+a0ce1bc657
-------------------------------------------------------------
[root@ocp3x-master ~]# yum info atomic-openshift-utils
Loaded plugins: langpacks, product-id, search-disabled-repos, subscription-manager
Installed Packages
Name        : atomic-openshift-utils
Arch        : noarch
Version     : 3.9.14
Release     : 1.git.3.c62bc34.el7
Size        : 151 k
Repo        : installed
From repo   : rhel-7-server-ose-3.9-rpms
Summary     : Atomic OpenShift Utilities
URL         : https://github.com/openshift/openshift-ansible
License     : ASL 2.0
Description : Atomic OpenShift Utilities includes
            : - atomic-openshift-installer
            : - other utilities
------------------------------------------------------------------
[root@ocp3x-master ~]# cat .config/openshift/installer.cfg.yml
ansible_callback_facts_yaml: /root/.config/openshift/.ansible/callback_facts.yaml
ansible_inventory_path: /root/.config/openshift/hosts
ansible_log_path: /tmp/ansible.log
deployment:
  ansible_ssh_user: root
  hosts:
  - connect_to: ocp3x-master.example.com
    hostname: ocp3x-master.example.com
    ip: 192.168.122.3
    node_labels: '{''region'': ''infra''}'
    public_hostname: ocp3x-master.example.com
    public_ip: 192.168.122.3
    roles:
    - master
    - etcd
    - node
    - storage
  master_routingconfig_subdomain: example.com
  openshift_disable_check: memory_availability,disk_availability,docker_storage
  openshift_enable_service_catalog: 'False'
  openshift_master_cluster_hostname: None
  openshift_master_cluster_public_hostname: None
  proxy_exclude_hosts: ''
  proxy_http: ''
  proxy_https: ''
  roles:
    etcd: {}
    master: {}
    node: {}
    storage: {}
variant: openshift-enterprise
variant_version: '3.9'
version: v2
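For reference, the manual 'oc edit project' change above can also be applied non-interactively. This is a sketch of the same workaround, assuming a new project named 'demo' as in the steps above; the empty project-level node selector overrides the cluster-wide default selector so the pod can land on the combined master+node:

    # Verify the annotation is missing on the new project
    oc get namespace demo -o yaml | grep node-selector

    # Add the empty node selector, equivalent to the manual edit described above
    oc annotate namespace demo openshift.io/node-selector=""

Once the annotation is in place, re-deploying should schedule the pod normally.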
This is happening because we only label non-master, non-infra nodes with node-role.kubernetes.io/compute=true and then set the default node selector to match that label. In the special case where all nodes are masters or all nodes are infra nodes, that's not going to work. I think we should special-case those and add 'node-role.kubernetes.io/compute=true' to the list of labels the installer sets.
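Until that installer change lands, a manual workaround consistent with this analysis (a sketch, assuming the single combined host from the report above) is to apply the compute label to the master/infra node yourself so the default node selector can match it:

    # Label the combined master+node so it satisfies the default
    # node selector (node-role.kubernetes.io/compute=true)
    oc label node ocp3x-master.example.com node-role.kubernetes.io/compute=true

Either this node label or the per-project empty node-selector annotation described in comment #0 should unblock scheduling.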
Just to be clear, this is only happening because of the default atomic-openshift-installer behavior. This is not something production environments would ever hit.
(In reply to Brenton Leanhardt from comment #2)
> Just to be clear, this is only happening because of the default
> atomic-openshift-installer behavior. This is not something production
> environments would ever hit.

Will it be fixed in OCP 3.9? Right now, a single Master/Node VM install is not an option.
Bug 1567028 has the same root cause; marking this as a dupe of that one, as that bug and its 3.10 counterpart already have a PR in progress.

*** This bug has been marked as a duplicate of bug 1567028 ***