Bug 1476890 - Running health check playbooks can have unexpected side effects
Summary: Running health check playbooks can have unexpected side effects
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.9.0
Assignee: Luke Meyer
QA Contact: Wenkai Shi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-07-31 17:53 UTC by Luke Meyer
Modified: 2018-06-18 18:26 UTC

Fixed In Version: openshift-ansible-3.9.0-0.20.0.git.0.dce44f0.el7
Doc Type: Bug Fix
Doc Text:
Cause: Health check dependencies performed Ansible actions as part of their operation.
Consequence: Running health checks could result in changes to the cluster hosts, e.g. Docker reconfigured and restarted, yum repos modified, firewall reconfigured.
Fix: With v3.9 the installer has been refactored such that the changes mentioned only happen in the prerequisites.yml playbook.
Result: Running the health checks no longer makes these changes. The related warning under https://docs.openshift.com/container-platform/3.7/admin_guide/diagnostics_tool.html#ansible-based-tooling-health-checks can be removed for 3.9.
Clone Of:
Environment:
Last Closed: 2018-06-13 14:51:07 UTC
Target Upstream Version:
Embargoed:



Description Luke Meyer 2017-07-31 17:53:10 UTC
Description of problem:
When running the health checks provided in openshift-ansible, users probably do not expect to make significant changes to the host systems, but these playbooks may. It is not their intent, but a side effect of the current architecture.

Version-Release number of selected component (if applicable):
openshift-ansible 3.6.0


Steps to Reproduce:
1. Deploy OpenShift with Ansible
2. Modify something on the host systems that Ansible manages. For instance:
  a. Add or remove something in INSECURE_REGISTRY in /etc/sysconfig/docker
  b. Disable the yum repo that provides OpenShift packages
3. Run a health check playbook (currently "playbooks/byo/openshift-checks/health.yml" is available).
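One way to make steps 2 and 3 observable is to hash the files of interest before and after the playbook run and report any that differ. The following is a minimal sketch, not part of openshift-ansible: the function names (snapshot, changed_paths, run_and_report) are hypothetical, and it assumes the watched files are readable on the machine where the script runs (for example, the managed host itself). It only detects file changes; a service restart that leaves files untouched would not show up.

```python
import hashlib
import subprocess
from pathlib import Path


def snapshot(paths):
    """Map each path to the SHA-256 of its contents, or None if it is not a file."""
    result = {}
    for p in paths:
        path = Path(p)
        result[str(p)] = (
            hashlib.sha256(path.read_bytes()).hexdigest() if path.is_file() else None
        )
    return result


def changed_paths(before, after):
    """Return the sorted paths whose digest differs between two snapshots."""
    return sorted(p for p in before if before[p] != after.get(p))


def run_and_report(paths, command):
    """Run command, then return the watched paths it modified, created, or removed."""
    before = snapshot(paths)
    # check=False: a failing health check is still worth diffing afterwards.
    subprocess.run(command, check=False)
    return changed_paths(before, snapshot(paths))
```

A call might look like run_and_report(["/etc/sysconfig/docker"], ["ansible-playbook", "-i", "inventory", "playbooks/byo/openshift-checks/health.yml"]), where "inventory" is a placeholder for the actual inventory file. A non-empty result means the run touched a watched file.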


Actual results:
Docker is reconfigured and restarted, the yum repo is re-enabled, and so on. This happens only for roles that are dependencies of openshift_health_checker (TODO: list these).


Expected results:
The only changes expected are the installation of package dependencies of openshift_health_checker, such as python-docker-py and skopeo.


Additional info:
The health checks themselves are not making these changes; the roles they depend on for system information (or those roles' own dependencies) are. In an installation scenario there is not much reason to avoid making changes, but for post-installation playbooks there needs to be a way to gather the information from these roles without performing the configuration tasks.

Comment 1 Luke Meyer 2017-08-15 14:01:37 UTC
The roles in question are docker, os_firewall, and openshift_repos. We are investigating ways to make these take no action under this usage. In fact, something seems to have changed since discovering this issue such that docker no longer re-configures/restarts before the health checks, so that part may already be solved.

Comment 2 Michael Gugino 2017-08-25 00:47:13 UTC
Some work to refactor the docker role has started: https://github.com/openshift/openshift-ansible/pull/5165

This refactor will remove docker from dependency chains of other roles.

Comment 3 Luke Meyer 2018-01-15 18:01:41 UTC
I believe that, with the changes made for 3.9 to pull out prerequisites.yml, this is no longer an issue. I would appreciate QE confirming that running the health.yml playbook no longer carries any apparent risk of reconfiguring hosts or restarting services. The only change expected at all might be installing RPM(s) to support the checks.

Comment 4 Wenkai Shi 2018-01-17 08:54:36 UTC
Verified with version openshift-ansible-3.9.0-0.20.0.git.0.dce44f0.el7: with the repo disabled, the health check failed. The health check does not perform configuration tasks, as expected.

