Bug 1375946 - [quick install]docker registry and router can not be deployed for failedScheduling
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.3.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.3.1
Assignee: Samuel Munilla
QA Contact: liujia
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-09-14 10:09 UTC by liujia
Modified: 2016-10-27 16:13 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, the quick installer could label unschedulable nodes as infra nodes. This prevented the registry and router from deploying, because the nodes selected for them were unschedulable. The quick installer has been updated to assign the infra label only to schedulable nodes, ensuring that the registry and router are deployed properly.
Clone Of:
Environment:
Last Closed: 2016-10-27 16:13:29 UTC
Target Upstream Version:
Embargoed:



Links
System: Red Hat Product Errata
ID: RHBA-2016:2122
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: OpenShift Container Platform atomic-openshift-utils bug fix update
Last Updated: 2016-10-27 20:11:30 UTC

Description liujia 2016-09-14 10:09:03 UTC
Description of problem:
Triggering an installation with the quick installer labels the first two masters with 'region=infra', even though the masters are schedulable=false. As a result, after installation the docker registry and router pods cannot be deployed and fail with FailedScheduling.

Events:
  FirstSeen	LastSeen	Count	From			SubobjectPath	Type		Reason			Message
  ---------	--------	-----	----			-------------	--------	------			-------
  26m		1m		29	{default-scheduler }			Warning		FailedScheduling	pod (router-1-deploy) failed to fit in any node
fit failure on node (192.168.0.65): MatchNodeSelector

  26m	1m	63	{default-scheduler }		Warning	FailedScheduling	pod (router-1-deploy) failed to fit in any node
fit failure on node (192.168.0.65): CheckServiceAffinity


Two of the masters are labeled as infra, so they are selected for deploying docker-registry and the router, but masters are not schedulable by default.

Version-Release number of selected component (if applicable):
atomic-openshift-utils-3.3.22-1.git.0.6c888c2.el7.noarch

How reproducible:
always

Steps to Reproduce:
1. Run "atomic-openshift-installer install".
2. When the installation succeeds, check the pods' status:
docker-registry-2-deploy   0/1       Pending   0          23m
router-1-deploy            0/1       Pending   0          24m

[root@openshift-193 ~]# oc get nodes
NAME            STATUS                     AGE
192.168.0.189   Ready,SchedulingDisabled   53m
192.168.0.65    Ready                      53m
192.168.0.80    Ready,SchedulingDisabled   53m
192.168.0.97    Ready,SchedulingDisabled   53m
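
The node listing above can be checked mechanically. The following is a minimal sketch, assuming the two-column NAME/STATUS layout shown in this report; the helper name `schedulable_nodes` is hypothetical and not part of any OpenShift tooling.

```python
def schedulable_nodes(oc_output):
    """Return names of nodes that are Ready and not SchedulingDisabled.

    Assumes the plain-text layout of `oc get nodes` shown in this report:
    a header row, then NAME and a comma-separated STATUS column.
    """
    nodes = []
    for line in oc_output.strip().splitlines()[1:]:  # skip the header row
        name, status = line.split()[:2]
        conditions = status.split(",")
        if "Ready" in conditions and "SchedulingDisabled" not in conditions:
            nodes.append(name)
    return nodes


# Output copied from this report.
SAMPLE = """\
NAME            STATUS                     AGE
192.168.0.189   Ready,SchedulingDisabled   53m
192.168.0.65    Ready                      53m
192.168.0.80    Ready,SchedulingDisabled   53m
192.168.0.97    Ready,SchedulingDisabled   53m
"""

print(schedulable_nodes(SAMPLE))
```

For the listing in this report, only 192.168.0.65 is returned, and that node does not match the router's node selector, so the pods stay Pending.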


Actual results:
The docker registry and router cannot be deployed because of FailedScheduling.

Expected results:
The docker registry and router should be deployed successfully after installation. The installer should either label only schedulable nodes as infra or, better, let the user choose which nodes are labeled as infra.

Additional info:

Comment 1 Brenton Leanhardt 2016-09-14 12:37:43 UTC
Hi Liu Jia,

Would you mind attaching the quick install configuration that you used?

Comment 2 openshift-github-bot 2016-09-15 14:42:42 UTC
Commit pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/522a069d21b6557821bd85aa77ecfee43cf7c549
a-o-i: Don't set unschedulable nodes as infra

Make sure we don't set an unschedulable node as infra as that can cause problems.

Fixes: Bug 1375946
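
The idea behind the commit can be illustrated as follows. This is only a sketch of the selection rule and not the code from the commit: the `Host` class, the `assign_infra_labels` function, and the `max_infra=2` limit are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    # Hypothetical stand-in for an installer host record.
    name: str
    schedulable: bool
    labels: dict = field(default_factory=dict)


def assign_infra_labels(hosts, max_infra=2):
    """Label up to max_infra schedulable hosts with region=infra.

    Unschedulable hosts (for example masters) are skipped, so the
    registry and router always have a node they can be scheduled on.
    """
    labeled = []
    for host in hosts:
        if len(labeled) >= max_infra:
            break
        if not host.schedulable:
            continue  # never mark an unschedulable node as infra
        host.labels["region"] = "infra"
        labeled.append(host)
    return labeled


hosts = [
    Host("master-1", schedulable=False),
    Host("master-2", schedulable=False),
    Host("master-3", schedulable=False),
    Host("node-1", schedulable=True),
]
infra = assign_infra_labels(hosts)
print([h.name for h in infra])
```

With three unschedulable masters and one schedulable node, only the schedulable node receives the label.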

Comment 5 liujia 2016-10-09 03:03:00 UTC
Verification is blocked by bug 1382885 and bug 1383004.

Comment 6 liujia 2016-10-18 08:16:55 UTC
Worked around bug 1382885; verification is still blocked by bug 1383004.

Comment 7 liujia 2016-10-19 07:25:52 UTC
Version:
atomic-openshift-utils-3.3.37-1.git.0.10ff25b.el7.noarch
ansible-2.2.0.0-0.62.rc1.el7.noarch

Steps:
1. Trigger an installation with the quick installer in an HA environment:
run "atomic-openshift-installer install"
2. It generates an inventory file in which only the schedulable node is labeled, as follows:
[nodes]
openshift-x.x.x.x  openshift_public_ip=x.x.x.x openshift_ip=192.168.2.164 openshift_public_hostname=x.x.x.x openshift_hostname=192.168.2.164 connect_to=openshift-x.x.x.x openshift_schedulable=False

openshift-x.x.x.x  openshift_public_ip=x.x.x.x openshift_ip=192.168.2.183 openshift_public_hostname=x.x.x.x openshift_hostname=192.168.2.183 connect_to=x.x.x.x openshift_schedulable=False

openshift-x.x.x.x  openshift_public_ip=x.x.x.x openshift_ip=192.168.2.184 openshift_public_hostname=x.x.x.x openshift_hostname=192.168.2.184 connect_to=x.x.x.x openshift_schedulable=False

openshift-x.x.x.x  openshift_public_ip=x.x.x.x openshift_ip=192.168.2.185 openshift_public_hostname=x.x.x.x openshift_hostname=192.168.2.185 connect_to=x.x.x.x openshift_node_labels="{'region': 'infra'}" openshift_schedulable=True

Result:
After installation completes, only node 192.168.2.185 is schedulable, and it is labeled with "{'region': 'infra'}".
# oc get node
NAME            STATUS                     AGE
192.168.2.164   Ready,SchedulingDisabled   34m
192.168.2.183   Ready,SchedulingDisabled   34m
192.168.2.184   Ready,SchedulingDisabled   34m
192.168.2.185   Ready                      34m

The docker registry and router containers are scheduled to 192.168.2.185, which is labeled "{'region': 'infra'}".
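
The same check can be run against a generated inventory before installing. The sketch below assumes the `[nodes]` line format shown in this comment; the helper names and the sample host names (`node-a`, `node-d`) are hypothetical, and treating a missing `openshift_schedulable` as True is an assumption of this sketch, not documented installer behavior.

```python
import ast
import shlex


def parse_node_line(line):
    """Split one [nodes] inventory line into (host, {variable: value})."""
    tokens = shlex.split(line)  # keeps the quoted label dict as one token
    host, pairs = tokens[0], tokens[1:]
    return host, dict(pair.split("=", 1) for pair in pairs)


def unschedulable_infra_hosts(lines):
    """Return hosts labeled region=infra that are not schedulable."""
    bad = []
    for line in lines:
        if not line.strip():
            continue
        host, variables = parse_node_line(line)
        labels = ast.literal_eval(variables.get("openshift_node_labels", "{}"))
        # Assumption: a host without openshift_schedulable is schedulable.
        schedulable = variables.get("openshift_schedulable", "True") == "True"
        if labels.get("region") == "infra" and not schedulable:
            bad.append(host)
    return bad


# Shortened lines in the format of the inventory above.
fixed = [
    "node-a openshift_ip=192.168.2.164 openshift_schedulable=False",
    "node-d openshift_ip=192.168.2.185 "
    "openshift_node_labels=\"{'region': 'infra'}\" openshift_schedulable=True",
]
# The situation originally reported: infra label on an unschedulable host.
broken = [
    "node-a openshift_ip=192.168.2.164 "
    "openshift_node_labels=\"{'region': 'infra'}\" openshift_schedulable=False",
]

print(unschedulable_infra_hosts(fixed))
print(unschedulable_infra_hosts(broken))
```

An empty result means every infra-labeled host in the inventory is schedulable.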

Comment 9 errata-xmlrpc 2016-10-27 16:13:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:2122

