Red Hat Bugzilla – Bug 1581766
Bad spec.host value in "hawkular-metrics" route
Last modified: 2018-10-02 03:45:16 EDT
Description of problem:

Upgrading from 3.7.23 to 3.9.27 using openshift-ansible 3.9.29.

When running roles/openshift_metrics/tasks/install_hawkular.yaml, the installed hawkular-metrics-route.yaml somehow has the spec.host value set to an https:// URL instead of just the hostname.

    # cat /tmp/openshift-metrics-ansible-s67o2V/templates/hawkular-metrics-route.yaml
    apiVersion: v1
    kind: Route
    metadata:
      name: hawkular-metrics
      annotations: {}
      labels:
        metrics-infra: hawkular-metrics
    spec:
      host: https://metrics.mbarnestest2.openshift.com/hawkular/metrics
      to:
        kind: Service
        name: hawkular-metrics
    ...

But the variable "openshift_metrics_hawkular_hostname" is defined as "metrics.mbarnestest2.openshift.com".

Version-Release number of the following components:

    openshift-ansible-3.9.29-1.git.0.051bc5c.el7.noarch
    ansible-2.4.3.0-1.el7ae.noarch

    $ ansible --version
    ansible 2.4.3.0
      config file = /etc/ansible/ansible.cfg
      configured module search path = [u'/home/mbarnes/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
      ansible python module location = /usr/lib/python2.7/site-packages/ansible
      executable location = /usr/bin/ansible
      python version = 2.7.5 (default, May 3 2017, 07:55:04) [GCC 4.8.5 20150623 (Red Hat 4.8.5-14)]

How reproducible: always

Actual results:

    TASK [openshift_metrics : Applying /tmp/openshift-metrics-ansible-s67o2V/templates/hawkular-metrics-route.yaml] ***
    Tuesday 22 May 2018 20:23:39 +0000 (0:00:00.485) 0:01:46.107 ***********
    fatal: [18.188.156.115]: FAILED! => {"changed": false, "cmd": ["oc", "--config=/tmp/openshift-metrics-ansible-s67o2V/admin.kubeconfig", "apply", "-f", "/tmp/openshift-metrics-ansible-s67o2V/templates/hawkular-metrics-route.yaml", "-n", "openshift-infra"], "delta": "0:00:00.373263", "end": "2018-05-22 20:23:40.256858", "failed_when_result": true, "msg": "non-zero return code", "rc": 1, "start": "2018-05-22 20:23:39.883595", "stderr": "The Route \"hawkular-metrics\" is invalid: spec.host: Invalid value: \"https://metrics.mbarnestest2.openshift.com/hawkular/metrics\": host must conform to DNS 952 subdomain conventions", "stderr_lines": ["The Route \"hawkular-metrics\" is invalid: spec.host: Invalid value: \"https://metrics.mbarnestest2.openshift.com/hawkular/metrics\": host must conform to DNS 952 subdomain conventions"], "stdout": "", "stdout_lines": []}

    PLAY RECAP *********************************************************************
    13.58.120.78    : ok=0   changed=0  unreachable=0 failed=0
    18.188.156.115  : ok=204 changed=31 unreachable=0 failed=1
    18.188.26.225   : ok=0   changed=0  unreachable=0 failed=0
    18.191.62.124   : ok=0   changed=0  unreachable=0 failed=0
    18.191.69.108   : ok=25  changed=3  unreachable=0 failed=0
    18.191.86.228   : ok=0   changed=0  unreachable=0 failed=0
    18.217.205.167  : ok=0   changed=0  unreachable=0 failed=0
    18.217.248.27   : ok=0   changed=0  unreachable=0 failed=0
    18.219.155.52   : ok=25  changed=3  unreachable=0 failed=0
    18.221.160.30   : ok=0   changed=0  unreachable=0 failed=0
    localhost       : ok=11  changed=0  unreachable=0 failed=0

    INSTALLER STATUS ***************************************************************
    Initialization  : Complete (0:00:26)
    [DEPRECATION WARNING]: The following are deprecated variables and will be no longer be used in the next minor release. Please update your inventory accordingly.
        openshift_hosted_logging_deploy
        openshift_hosted_logging_elasticsearch_cluster_size
        openshift_hosted_logging_deployer_version
        openshift_hosted_metrics_deploy
        openshift_hosted_metrics_storage_kind
        openshift_hosted_metrics_public_url
    Metrics Install : In Progress (0:01:21)
        This phase can be restarted by running: playbooks/openshift-metrics/config.yml

    Tuesday 22 May 2018 20:23:40 +0000 (0:00:00.590) 0:01:46.698 ***********
    ===============================================================================
    openshift_version : Get available atomic-openshift version -------------- 9.26s
    openshift_metrics : slurp ----------------------------------------------- 4.04s
    openshift_metrics : Set serviceaccounts for hawkular metrics/cassandra --- 2.68s
    openshift_metrics : generate hawkular-cassandra replication controllers --- 2.23s
    openshift_metrics : read files for the hawkular-metrics secret ---------- 1.85s
    Gather Cluster facts ---------------------------------------------------- 1.53s
    openshift_metrics : Generating serviceaccounts for hawkular metrics/cassandra --- 1.52s
    openshift_metrics : Generate services for cassandra --------------------- 1.48s
    openshift_metrics : command --------------------------------------------- 1.48s
    openshift_metrics : Create objects -------------------------------------- 1.48s
    openshift_metrics : copy local generated passwords to target ------------ 1.33s
    openshift_metrics : command --------------------------------------------- 1.32s
    openshift_metrics : Set hawkular cluster roles -------------------------- 1.24s
    Gathering Facts --------------------------------------------------------- 1.24s
    Initialize openshift.node.sdn_mtu --------------------------------------- 1.17s
    Gathering Facts --------------------------------------------------------- 0.97s
    Gathering Facts --------------------------------------------------------- 0.93s
    Gathering Facts --------------------------------------------------------- 0.89s
    openshift_metrics : generate cassandra secret template ------------------ 0.87s
    openshift_metrics : generate hawkular-metrics-account secret template --- 0.87s
    31.02user 11.51system 1:48.01elapsed 39%CPU (0avgtext+0avgdata 148104maxresident)k
    0inputs+3272outputs (0major+3627241minor)pagefaults 0swaps
    Killed background task: 104340
    Killed background task: 104343
    Killed background task: 104337

Additional info:

I'll work on getting more verbose logs.
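To illustrate why the route is rejected, here is a rough Python sketch of a DNS subdomain check. The regular expression is an approximation of the usual label rules (lowercase alphanumerics and hyphens, dot-separated, at most 253 characters overall); it is an assumption for illustration, not the validator that `oc` or OpenShift actually runs, and the function name is hypothetical.

```python
import re

# Approximate DNS subdomain pattern: dot-separated labels made of lowercase
# alphanumerics and hyphens, each starting and ending with an alphanumeric.
# This is an illustrative assumption, not OpenShift's own validation code.
_LABEL = r"[a-z0-9](?:[-a-z0-9]*[a-z0-9])?"
_SUBDOMAIN_RE = re.compile(r"%s(?:\.%s)*" % (_LABEL, _LABEL))


def looks_like_dns_subdomain(value):
    """Return True if value resembles a bare DNS hostname (hypothetical helper)."""
    return len(value) <= 253 and _SUBDOMAIN_RE.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in (
        "metrics.mbarnestest2.openshift.com",
        "https://metrics.mbarnestest2.openshift.com/hawkular/metrics",
    ):
        print(candidate, looks_like_dns_subdomain(candidate))
```

The bare hostname passes, while the https:// URL fails because ":" and "/" are not allowed in a hostname, which matches the spec.host error in the log.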
This might be due to the fact that we also define:

    openshift_hosted_metrics_public_url=https://metrics.mbarnestest2.openshift.com/hawkular/metrics

And I see the following in roles/openshift_sanitize_inventory/tasks/__deprecations_metrics.yml:

    - conditional_set_fact:
        facts: "{{ hostvars[inventory_hostname] }}"
        vars:
          ...
          openshift_metrics_hawkular_hostname: openshift_hosted_metrics_public_url
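If the deprecated variable is copied verbatim into openshift_metrics_hawkular_hostname, the full URL ends up in spec.host. As a sketch of the transformation that seems to be missing (not the actual openshift-ansible fix; `hostname_from_public_url` is a hypothetical helper), the hostname can be pulled out of the public URL like this:

```python
from urllib.parse import urlparse


def hostname_from_public_url(value):
    """Return only the host part of a URL-or-hostname string.

    Hypothetical helper for illustration; not part of openshift-ansible.
    """
    # urlparse only fills in the network location when the string carries a
    # scheme or starts with "//", so prefix bare hostnames accordingly.
    parsed = urlparse(value if "://" in value else "//" + value)
    return parsed.hostname or value


if __name__ == "__main__":
    print(hostname_from_public_url(
        "https://metrics.mbarnestest2.openshift.com/hawkular/metrics"))
    print(hostname_from_public_url("metrics.mbarnestest2.openshift.com"))
```

Both calls yield the bare hostname, which is the form the route's spec.host expects.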
Regression introduced in 3.9.24 by: https://github.com/openshift/openshift-ansible/commit/bdcc1aae09c4bb264b0595ad234522103cc01d70
Removing OpsBlocker tag since I was able to work around the problem by simply removing our "openshift_hosted_metrics_public_url" definition.
Is this really a bug for 3.9? According to https://docs.openshift.com/container-platform/3.9/install_config/cluster_metrics.html#metrics-ansible-variables, only openshift_metrics_hawkular_hostname is meant to be used to set the metrics route. If you set only this variable and do not include the deprecated variable openshift_hosted_metrics_public_url, things work as expected.