Description of problem:

Between z-streams of 3.11, openshift_hostname started being ignored. The
playbook doesn't fail immediately. I'm setting high severity and high
priority because the bug is identified and has a huge potential impact.

Version-Release number of the following components:

The customer reproduced this on 3.10.73-1, but 3.10.87-1 should also be
affected. openshift_hostname used to work on openshift-ansible-3.10.47-1.

How reproducible:

Always

Steps to Reproduce:
1. Specify a hostname using openshift_hostname instead of
   openshift_kubelet_name_override
2. Verify openshift_facts

Actual results:

openshift_hostname is ignored and the playbook keeps running with a wrong
value. This has a huge impact because it can break certificates and make
several components connect to unexpected endpoints.

Expected results:

openshift_hostname is either honored with a deprecation warning, or causes
the playbook to fail if it differs from openshift_kubelet_name_override.

Additional info:

$ git diff 95bc2d2 playbooks/init/validate_hostnames.yml
diff --git a/playbooks/init/validate_hostnames.yml b/playbooks/init/validate_hostnames.yml
index b37e6fec4..ca280684b 100644
--- a/playbooks/init/validate_hostnames.yml
+++ b/playbooks/init/validate_hostnames.yml
@@ -10,19 +10,20 @@
       changed_when: false
       failed_when: false

-  - name: Validate openshift_hostname when defined
+  - name: Validate openshift_kubelet_name_override when defined
     fail:
       msg: >
         The hostname {{ openshift.common.hostname }} for {{ ansible_nodename }}
         doesn't resolve to an IP address owned by this host. Please set
-        openshift_hostname variable to a hostname that when resolved on the host
+        openshift_kubelet_name_override variable to a hostname that when resolved on the host
         in question resolves to an IP address matching an interface on this host.
         This will ensure proper functionality of OpenShift networking features.
-        Inventory setting: openshift_hostname={{ openshift_hostname | default ('undefined') }}
+        Inventory setting: openshift_kubelet_name_override={{ openshift_kubelet_name_override | default ('undefined') }}
         This check can be overridden by setting openshift_hostname_check=false in
         the inventory.
         See https://docs.openshift.org/latest/install_config/install/advanced_install.html#configuring-host-variables
     when:
+    - openshift_kubelet_name_override is defined
     - lookupip.stdout != '127.0.0.1'
     - lookupip.stdout not in ansible_all_ipv4_addresses
     - openshift_hostname_check | default(true) | bool

$ git blame playbooks/init/validate_hostnames.yml | grep kubelet_name_override
5ce5800906 playbooks/init/validate_hostnames.yml (Michael Gugino 2018-10-05 10:22:35 -0400 13)   - name: Validate openshift_kubelet_name_override when defined
5ce5800906 playbooks/init/validate_hostnames.yml (Michael Gugino 2018-10-05 10:22:35 -0400 18)         openshift_kubelet_name_override variable to a hostname that when resolved on the host
5ce5800906 playbooks/init/validate_hostnames.yml (Michael Gugino 2018-10-05 10:22:35 -0400 21)         Inventory setting: openshift_kubelet_name_override={{ openshift_kubelet_name_override | default ('undefined') }}
5ce5800906 playbooks/init/validate_hostnames.yml (Michael Gugino 2018-10-05 10:22:35 -0400 26)     - openshift_kubelet_name_override is defined

$ git show 5ce5800906 --summary
commit 5ce5800906255a2a6bf940a17908be59d9861de7
Author: Michael Gugino <mgugino>
Date:   Fri Oct 5 10:22:35 2018 -0400

    Fail on openshift_hostname defined; add openshift_kubelet_name_override

    Adding openshift_kubelet_name_override as a stand-in for various places
    we will need to account for possible hostname overrides.

    (cherry picked from commit 1faee0942dec05b6f652669ad6cfced986a0cbc9)
    (cherry picked from commit 8d3509838c7ecc2bafafa7f7815b9964bf08cdda)
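For illustration, migrating an inventory away from the deprecated variable might look like the following. The host and hostname values here are made up for the example; only the variable rename itself comes from commit 5ce5800906.

```ini
[masters]
master1.example.com

[OSEv3:vars]
# Deprecated: silently ignored by the affected z-streams instead of
# failing the play:
# openshift_hostname=master1.internal.example.com

# Replacement variable introduced by commit 5ce5800906:
openshift_kubelet_name_override=master1.internal.example.com
```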
Can you please clarify what is meant by step #2? Which playbook is actually
executed that produces unexpected results?

> Steps to Reproduce:
> 1. Specify a hostname using openshift_hostname instead of
>    openshift_kubelet_name_override
> 2. Verify openshift_facts
Run the openshift_facts.yml playbook and check ansible_facts.openshift.common.hostname. It's not honored.
The prerequisites.yml and deploy_cluster.yml playbooks should treat that as a fatal condition. The purpose of the playbook you're running is to calculate the default values and it seems to be doing that as expected.
Scott, the problem isn't installation but upgrading. During an upgrade neither prerequisites.yml nor deploy_cluster.yml has to be executed. Therefore a user who is running 3.10.73-1 and decides to upgrade to the latest z-stream may hit this problem.

This is a breaking change in the middle of a z-stream; can we please reconsider whether that is the behavior we want?
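The fail-fast behavior the bug's expected results ask for could be sketched as a pre-flight task like the one below. This is only an illustration of the requested behavior, not the actual openshift-ansible implementation; the task name and placement are assumptions.

```yaml
# Sketch: abort early when the deprecated variable is set and disagrees
# with its replacement, instead of silently ignoring it mid-z-stream.
- name: Fail if deprecated openshift_hostname conflicts with its replacement
  fail:
    msg: >
      openshift_hostname is deprecated and no longer honored; set
      openshift_kubelet_name_override={{ openshift_hostname }} instead.
  when:
  - openshift_hostname is defined
  - openshift_hostname != (openshift_kubelet_name_override | default(''))
```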
(In reply to Juan Luis de Sousa-Valadas from comment #5)
> Scott, the problem isn't installation but upgrading. During an upgrade
> neither prerequisites.yml nor deploy_cluster.yml has to be executed.
> Therefore a user who is running 3.10.73-1 and decides to upgrade to the
> latest z-stream may have this problem.
>
> This is a breaking change in the middle of a z-stream, can we please
> reconsider if that is the behavior we want?

This would be handled by the sanity_checks module, which runs during both installs and upgrades. Please provide logs, inventory, and appropriate version information. I don't believe this is currently an issue.