Bug 1479493

Summary: OCP upgrade and scale-up automatically upgrades RHEL when it should not
Product: OpenShift Container Platform
Reporter: Dan Yocum <dyocum>
Component: Installer
Assignee: Scott Dodson <sdodson>
Status: CLOSED NOTABUG
QA Contact: Johnny Liu <jialiu>
Severity: urgent
Priority: unspecified
Version: 3.5.1
CC: aos-bugs, dyocum, jkaur, jokerman, mmccomas
Keywords: OpsBlocker, UpcomingRelease
Target Release: 3.7.0
Hardware: All
OS: All
Doc Type: If docs needed, set a value
Last Closed: 2017-08-29 20:34:31 UTC
Type: Bug

Description Dan Yocum 2017-08-08 16:10:19 UTC
Description of problem:

Scenario #1: when upgrading OSD customers' clusters, it was observed that RHEL was upgraded (unexpectedly!) during the OCP upgrade.  This is unacceptable from an Operations point of view: Ops tests against specific versions of the OS, and if the OS is upgraded under the covers we lose the ability to create consistent environments for customers.

Scenario #2: when scaling up the nodes in a cluster, the new nodes are installed with the latest available version of RHEL rather than the version of RHEL installed on the existing master, infra, and compute nodes.

Version-Release number of the following components:
rpm -q openshift-ansible

3.5.91

rpm -q ansible

ansible-2.2.3.0-1.el7.noarch

ansible --version

ansible 2.2.3.0


How reproducible:

Always

Steps to Reproduce:
1. Install RHEL 7.3 and OCP v3.4/3.5/3.6
2. Upgrade OCP to the next minor version
3. Scale up the cluster by N nodes

Actual results:
No error is produced; the OS packages are silently upgraded to the latest available versions.

The role in openshift-ansible that is upgrading the OS is, not surprisingly, os_update_latest:

./roles/os_update_latest/tasks/main.yml
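
As Comment 9 below notes, the role contains a single task that updates every package. A minimal sketch of what such a task file looks like (illustrative, not the verbatim role source):

# roles/os_update_latest/tasks/main.yml (illustrative sketch)
# The role's single task: upgrade every installed package to the
# latest version available from the enabled repositories.
- name: Update all packages
  yum:
    name: '*'
    state: latest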

This role is called in these 2 playbooks:

./byo/rhel_subscribe.yml
./common/openshift-cluster/update_repos_and_packages.yml

And these are included in a plethora of other playbooks, the most important being the following:

./playbooks/common/openshift-cluster/update_repos_and_packages.yml
./playbooks/byo/rhel_subscribe.yml
./playbooks/aws/openshift-cluster/scaleup.yml
./playbooks/aws/openshift-cluster/update.yml
./playbooks/gce/openshift-cluster/update.yml
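
To illustrate why the update happens implicitly: any play that lists os_update_latest under roles runs the full package update as a side effect. A sketch of the pattern (play name and hosts are illustrative, not the verbatim playbook contents):

- name: Update repos and packages
  hosts: all
  roles:
    - os_update_latest   # upgrades every package on the host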

Diving farther down the rabbit hole, these playbooks are included in even more playbooks.

Expected results:

No upgrade of RHEL unless specified.

Additional info:
Please attach logs from ansible-playbook with the -vvv flag

Comment 1 Dan Yocum 2017-08-08 16:12:26 UTC
I meant to add this:

We require the ability to disable the os_update_latest role.
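
For example, the role's task could be gated behind an opt-in variable. A minimal sketch, where the variable name os_update_latest_enabled is hypothetical (the actual change is tracked in the PR linked in comment 9 below):

# Hypothetical guard: gate the package update behind an opt-in variable
# (os_update_latest_enabled is an illustrative name, not a real variable).
- name: Update all packages
  yum:
    name: '*'
    state: latest
  when: os_update_latest_enabled | default(false) | bool

With a gate like this, the update is skipped by default and an inventory can opt in explicitly when an OS upgrade is actually wanted.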

Comment 2 Scott Dodson 2017-08-08 17:05:37 UTC
Can you please provide the list of playbooks that you invoked? These playbooks aren't included in any of the documented scaleup workflows.

Comment 9 Scott Dodson 2017-08-29 20:34:31 UTC
Reviewing the ops playbooks, you're calling the os_update_latest role, which has only one task, and that task is to update all packages. This seems like a very deliberate action, and the role has been largely unchanged for over two years.

We'll work on merging https://github.com/openshift/openshift-ansible/pull/5075 to make the role more accommodating to your needs, but to me this is not a bug.