Bug 1303939

Summary: Router's replica is set to 0 by ansible installer.
Product: OpenShift Container Platform Reporter: Gan Huang <ghuang>
Component: Installer Assignee: Jason DeTiberus <jdetiber>
Status: CLOSED ERRATA QA Contact: Ma xiaoqiang <xiama>
Severity: medium Docs Contact:
Priority: medium    
Version: 3.1.0 CC: aos-bugs, bleanhar, haowang, jialiu, jokerman, mmccomas, xtian
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openshift-ansible-3.0.40-1.git.1.4385281.el7aos Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-02-29 12:57:40 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Gan Huang 2016-02-02 13:10:08 UTC
Description of problem:
After installation, the router's replica count is set to 0 by the ansible installer.

Looking through the ansible code, the following code has some problems.
1. The check "openshift.master.infra_nodes is defined" in playbooks/common/openshift-master/config.yml is incorrect.
<--snip-->
  - set_fact:
      openshift_infra_nodes: "{{ hostvars | oo_select_keys(groups['nodes'])
                                 | oo_nodes_with_label('region', 'infra')
                                 | oo_collect('inventory_hostname') }}"
    when: openshift_infra_nodes is not defined
<--snip-->
- name: Create services
  <--snip-->
  roles:
  - role: openshift_router
    when: openshift.master.infra_nodes is defined
  - role: openshift_registry
    when: openshift.master.infra_nodes is defined and attach_registry_volume | bool
<--snip-->
When the openshift_infra_nodes var is not defined in the user's hosts file, ansible tries to use "region=infra" to filter nodes. When no nodes match, openshift_infra_nodes is still defined, as an empty array, so the openshift_router role is still called. So "openshift.master.infra_nodes is defined" should be corrected to "openshift.master.infra_nodes' length is non-zero".

2. The router's replica count is decided by the number of nodes which have the "region=infra" label in roles/openshift_router/tasks/main.yml. This is a little improper, especially when the user did not install any infra node and specified openshift_router_selector='region=primary'.
In my scenario, I want to deploy the router to a 'region=primary' node, but the ansible installer still uses "region=infra" to match nodes, and when no matching nodes are found, it sets the replica count to 0. That would confuse customers. It would be more reasonable if the ansible installer used the openshift_router_selector var to filter nodes when setting the replica count.
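
The two suggestions above could be sketched roughly as follows. This is only an illustration and not taken from any actual patch: the fact name openshift_router_nodes is hypothetical, the selector is assumed to be a single "key=value" pair, and the fact is assumed to be visible to the play that applies the role. The oo_* filters are the ones already used in the snippet above.

```yaml
# Task inside an existing play: collect the nodes matching the router selector.
- set_fact:
    # Hypothetical fact name; assumes openshift_router_selector looks like
    # 'region=primary' (exactly one key=value pair).
    openshift_router_nodes: "{{ hostvars | oo_select_keys(groups['nodes'])
                                | oo_nodes_with_label(openshift_router_selector.split('=')[0],
                                                      openshift_router_selector.split('=')[1])
                                | oo_collect('inventory_hostname') }}"

# Separate play: only apply the role when at least one node matched.
- name: Create services
  roles:
  - role: openshift_router
    # An empty list is still "defined", so test the length instead.
    when: openshift_router_nodes | length > 0
```

With a list built this way, the replica count could also be taken from the number of matched nodes (openshift_router_nodes | length) rather than from the number of "region=infra" nodes.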



Version-Release number of selected component (if applicable):
https://github.com/openshift/openshift-ansible -b master

How reproducible:
Always

Steps to Reproduce:
1. Set up an env using the advanced ansible installation method.
[OSEv3:children]
masters
nodes

[OSEv3:vars]
ansible_ssh_user=root
deployment_type=openshift-enterprise
openshift_router_selector='region=primary'

[masters]
10.66.80.83  openshift_ip=10.66.80.83 openshift_public_ip=10.66.80.83 openshift_hostname=ghuang1.cluster.local openshift_public_hostname=ghuang1.cluster.local
openshift_router_selector='region=primary'

[nodes]
10.66.80.83  openshift_ip=10.66.80.83 openshift_public_ip=10.66.80.83 openshift_hostname=ghuang1.cluster.local openshift_public_hostname=ghuang1.cluster.local openshift_schedulable=False
10.66.81.80  openshift_ip=10.66.81.80 openshift_public_ip=10.66.81.80 openshift_hostname=ghuang2.cluster.local openshift_public_hostname=ghuang2.cluster.local openshift_node_labels="{'region': 'primary', 'zone': 'default'}"
2. After installation, check if router pod is created.



Actual results:
After installation, no router pod was created.
3. Check /var/log/messages; the ansible installer is setting the router's replica count to 0
<--snip-->
Feb  2 05:17:10 localhost ansible-command: Invoked with creates=None executable=None chdir=None args=oadm router --create --replicas=0 --service-account=router --selector='region=primary' --credentials=/etc/origin/master/openshift-router.kubeconfig --images='registry.access.redhat.com/openshift3/ose-${component}:${version}' removes=None NO_LOG=None shell=False warn=True
<--snip-->

Expected results:
1. When no infra nodes are found, the router should not be created.
2. It would be better for ansible to utilize openshift_router_selector to filter nodes when setting the router's replica count.

Additional info:

Comment 1 Jason DeTiberus 2016-02-02 21:49:37 UTC
I've submitted a PR to address this here: https://github.com/openshift/openshift-ansible/pull/1326

Comment 4 Gan Huang 2016-02-14 09:44:13 UTC
Verified with openshift-ansible-3.0.40-1.git.0.cfb19e2.el7aos.noarch
1) When no infra nodes were found, the router and dc were not created.
2) When infra nodes were configured and openshift_router_selector='region=infra', the router was created successfully.

Comment 6 errata-xmlrpc 2016-02-29 12:57:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:0311