Bug 1557516 - no ansible_service_broker_selector, asb lands on compute nodes
Summary: no ansible_service_broker_selector, asb lands on compute nodes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.9.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.9.z
Assignee: Fabian von Feilitzsch
QA Contact: Johnny Liu
URL:
Whiteboard:
Depends On:
Blocks: 1571385
 
Reported: 2018-03-16 18:38 UTC by Dan Yocum
Modified: 2018-08-09 22:14 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1571385 (view as bug list)
Environment:
Last Closed: 2018-08-09 22:13:46 UTC
Target Upstream Version:
Embargoed:


Attachments
installation log with inventory embeded (2.29 MB, text/plain)
2018-04-23 07:01 UTC, Johnny Liu


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:2335 0 None None None 2018-08-09 22:14:25 UTC

Description Dan Yocum 2018-03-16 18:38:22 UTC
Description of problem:

When the ansible service broker is deployed, it lands on the compute nodes. There is no way to force it to install on, say, a type=infra node.

Version-Release number of the following components:
rpm -q openshift-ansible

3.9.7

rpm -q ansible

2.4.3

How reproducible:

Steps to Reproduce:
1. Use openshift-ansible (o-a) to deploy the ansible service broker
2. oc adm manage-node --list-pods --selector=type=compute

Actual results:
Please include the entire output from the last TASK line through the end of output if an error is generated

[root@ded-int-gcp-master-esjli ~]# oc adm manage-node --list-pods --selector=type=compute

Listing matched pods on node: ded-int-gcp-node-compute-35rfm

NAMESPACE                          NAME                    READY     STATUS    RESTARTS   AGE
logging                            logging-fluentd-sq9xn   1/1       Running   0          23h
openshift-ansible-service-broker   asb-etcd-1-kggj9        1/1       Running   0          23h

Listing matched pods on node: ded-int-gcp-node-compute-dl7vw

NAMESPACE   NAME                    READY     STATUS    RESTARTS   AGE
logging     logging-fluentd-ff5fn   1/1       Running   0          23h

Listing matched pods on node: ded-int-gcp-node-compute-h4ssk

NAMESPACE   NAME                    READY     STATUS    RESTARTS   AGE
logging     logging-fluentd-xnpwn   1/1       Running   0          23h

Listing matched pods on node: ded-int-gcp-node-compute-v9dzh

NAMESPACE                          NAME                    READY     STATUS    RESTARTS   AGE
logging                            logging-fluentd-4gtfv   1/1       Running   0          23h
openshift-ansible-service-broker   asb-1-deploy            0/1       Error     0          23h



Expected results:

No asb pods on compute nodes!

Additional info:
Please attach logs from ansible-playbook with the -vvv flag

Comment 3 Fabian von Feilitzsch 2018-03-19 19:20:35 UTC
https://github.com/openshift/openshift-ansible/pull/7575

Comment 5 openshift-github-bot 2018-03-28 15:45:48 UTC
Commits pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/8b9250d37b4d7f8977ade6ba719c94516a53ea14
Bug 1557516- ASB now scheduled on infra nodes

https://github.com/openshift/openshift-ansible/commit/82887b1fbe475a550b3bde12d17aea1dc38afbee
Merge pull request #7575 from fabianvf/bz1557516

Bug 1557516- ASB now scheduled on infra nodes

Comment 6 Scott Dodson 2018-03-29 01:55:19 UTC
https://github.com/openshift/openshift-ansible/pull/7692 release-3.9 backport

Comment 10 Johnny Liu 2018-04-20 03:30:52 UTC
Retested this bug with openshift-ansible-3.9.24-1.git.0.d0289ea.el7.noarch: FAIL.

According to the PR, without an ansible_service_broker_node_selector setting the asb pod should land on a region=infra node by default; with the setting, the asb pod should land on the specified node.

But in my testing, whether or not I set ansible_service_broker_node_selector={"role": "node"}, the asb pod always lands on compute nodes.

# oc describe po/asb-1-k992z -n openshift-ansible-service-broker
<--snip-->
Node-Selectors:  node-role.kubernetes.io/compute=true
<--snip-->

It looks as if the PR was never merged, but I checked my running installer and the PR's changes are already there.

Comment 11 Fabian von Feilitzsch 2018-04-20 20:32:26 UTC
Can you post your inventory for the failed run?

Comment 12 Johnny Liu 2018-04-23 07:01:56 UTC
Created attachment 1425570 [details]
installation log with inventory embeded

Comment 13 Fabian von Feilitzsch 2018-04-23 15:52:28 UTC
It looks like you have

  ansible_service_broker_node_selector={"role": "node"}

set in the inventory. If you don't set ansible_service_broker_node_selector at all, I think you will get the behavior you are expecting. Does this work for you?

Comment 14 Fabian von Feilitzsch 2018-04-24 16:19:10 UTC
Never mind, I was confused, I think I see the issue. The node selector might need to be specified on the podspec rather than the dc spec.
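
One way to check where the selector actually ended up is to read it from the pod template inside the DeploymentConfig, since that is the spec the scheduler sees on each pod. The following is only a sketch: the function name is made up for illustration, and the DeploymentConfig name "asb" is an assumption inferred from the pod names in this report (asb-1-deploy, asb-etcd-1-deploy).

```shell
# Hypothetical helper: print the nodeSelector stored in the pod template
# (spec.template.spec) of a DeploymentConfig.
#   $1 = DeploymentConfig name (assumed here to be "asb" or "asb-etcd")
#   $2 = namespace
pod_template_selector() {
    oc get dc "$1" -n "$2" -o jsonpath='{.spec.template.spec.nodeSelector}'
}

# Example invocation against a live cluster (not run here):
#   pod_template_selector asb openshift-ansible-service-broker
```

If this prints nothing, the pod template carries no node selector of its own, and the pods are scheduled according to whatever default applies.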

Comment 15 Fabian von Feilitzsch 2018-04-24 16:28:59 UTC
https://github.com/openshift/openshift-ansible/pull/8117

Comment 17 Johnny Liu 2018-05-03 10:28:07 UTC
So far, the installer rpm version in the latest puddle is openshift-ansible-3.9.27-1.git.0.52e35b5.el7.noarch; waiting for a newer puddle.

Comment 18 Johnny Liu 2018-05-07 08:01:15 UTC
Verified this bug with openshift-ansible-3.9.28-1.git.0.4fc2ce4.el7.noarch, and PASS.

Because images for v3.9.28 are not built or are unavailable on the aws-reg registry, I used the openshift-ansible-3.9.28-1.git.0.4fc2ce4.el7.noarch installer with the 3.9/v3.9.27-1_2018-04-26.2 puddle for installation.


Scenario 1:
1. do not set ansible_service_broker_node_selector at all in the inventory file, trigger installation.
2. after installation, checking:
[root@qe-jialiu392-master-etcd-1 ~]# oc get po -n openshift-ansible-service-broker
NAME                READY     STATUS    RESTARTS   AGE
asb-1-deploy        0/1       Pending   0          3m
asb-etcd-1-deploy   0/1       Pending   0          3m

# oc describe po asb-etcd-1-deploy -n openshift-ansible-service-broker
Name:         asb-etcd-1-deploy
<--snip-->
Node-Selectors:  region=infra
Tolerations:     <none>
Events:
  Type     Reason            Age               From               Message
  ----     ------            ----              ----               -------
  Warning  FailedScheduling  2s (x18 over 4m)  default-scheduler  0/2 nodes are available: 2 CheckServiceAffinity, 2 MatchNodeSelector.

# oc get node -l region=infra
No resources found.

# oc get node
NAME                                  STATUS    ROLES     AGE       VERSION
qe-jialiu392-master-etcd-1            Ready     master    17m       v1.9.1+a0ce1bc657
qe-jialiu392-node-registry-router-1   Ready     compute   17m       v1.9.1+a0ce1bc657

The default "region=infra" node selector takes effect; because no node carries that label, the "Pending" status is expected.
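
The check behind this conclusion can be wrapped in a small function: if no node matches the selector, any pod that selects on it stays Pending. This is a sketch only; the function name is hypothetical, and it relies on nothing beyond the "oc get nodes -l" query already used above.

```shell
# Hypothetical helper: list the nodes that match a label selector.
# Returns non-zero when no node matches, which is the situation in
# which pods using that selector cannot be scheduled.
selector_has_nodes() {
    matches=$(oc get nodes -l "$1" -o name)
    if [ -n "$matches" ]; then
        printf '%s\n' "$matches"
    else
        echo "no nodes match selector: $1" >&2
        return 1
    fi
}

# Example invocation against a live cluster (not run here):
#   selector_has_nodes region=infra
```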


Scenario 2:
1. set ansible_service_broker_node_selector={"role": "node"} in inventory file, trigger installation.
2. after installation, checking:
# oc get po -n openshift-ansible-service-broker
NAME               READY     STATUS    RESTARTS   AGE
asb-1-rftf6        1/1       Running   1          58m
asb-etcd-1-vzlk6   1/1       Running   0          58m

# oc describe po asb-1-rftf6 -n openshift-ansible-service-broker
Name:           asb-1-rftf6
Namespace:      openshift-ansible-service-broker
Node:           qe-jialiu391-node-registry-router-1/10.240.0.22
<--snip-->
Node-Selectors:  role=node
<--snip-->

# oc get node -l role=node
NAME                                  STATUS    ROLES     AGE       VERSION
qe-jialiu391-master-etcd-1            Ready     master    1h        v1.9.1+a0ce1bc657
qe-jialiu391-node-registry-router-1   Ready     compute   1h        v1.9.1+a0ce1bc657
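
The manual "oc describe" inspection used in both scenarios can be reduced to a one-line filter. The sketch below assumes the describe output has the same layout as shown in this report; the function name is made up, and only the first selector line is extracted (continuation lines for additional selectors are ignored).

```shell
# Hypothetical helper: read `oc describe po` output on stdin and print
# the value of the Node-Selectors line.
node_selectors() {
    sed -n 's/^Node-Selectors:[[:space:]]*//p'
}

# Sample input copied from the Scenario 2 output in this report.
printf 'Name:           asb-1-rftf6\nNode-Selectors:  role=node\n' | node_selectors

# Against a live cluster (not run here):
#   oc describe po asb-1-rftf6 -n openshift-ansible-service-broker | node_selectors
```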

Because only openshift-ansible-3.9.27 is currently attached to advisory 33431, I am moving this bug to "MODIFIED"; once the 3.9.28 build is attached, I will verify this bug.

Comment 21 Johnny Liu 2018-07-30 08:22:41 UTC
The openshift-ansible build is not attached to the advisory yet; once it is attached, I will re-run the testing.

Comment 22 Johnny Liu 2018-08-03 06:38:13 UTC
Verified this bug with openshift-ansible-3.9.40-1.git.0.188c954.el7.noarch, and PASS.


Scenario 2:
1. set ansible_service_broker_node_selector={"role": "node"} in inventory file, trigger installation.
2. after installation, checking:
# oc get po -n openshift-ansible-service-broker
NAME               READY     STATUS    RESTARTS   AGE
asb-1-p25g2        1/1       Running   0          18h
asb-etcd-1-54sv4   1/1       Running   0          18h

# oc describe po asb-1-p25g2 -n openshift-ansible-service-broker
<--snip-->
Node-Selectors:  role=node
<--snip-->

Based on my verification and comment 18, moving this bug to VERIFIED.

Comment 24 errata-xmlrpc 2018-08-09 22:13:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2335

