Bug 1479493 - OCP upgrade and scale-up automatically upgrades RHEL when it should not
Status: CLOSED NOTABUG
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.5.1
Hardware: All
OS: All
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 3.7.0
Assigned To: Scott Dodson
QA Contact: Johnny Liu
Keywords: OpsBlocker, UpcomingRelease
Depends On:
Blocks:
 
Reported: 2017-08-08 12:10 EDT by Dan Yocum
Modified: 2017-08-29 16:34 EDT (History)
5 users

Last Closed: 2017-08-29 16:34:31 EDT
Type: Bug

Attachments: None
Description Dan Yocum 2017-08-08 12:10:19 EDT
Description of problem:

Scenario #1: when upgrading OSD customer clusters, it was observed that RHEL was upgraded (unexpectedly!) during the OCP upgrade. This is unacceptable from an Operations point of view; Ops tests against specific versions of the OS, and if the OS is upgraded under the covers we lose the ability to create consistent environments for customers.

Scenario #2: when scaling up the nodes in a cluster, the new nodes are installed with the latest available version of RHEL rather than the version of RHEL installed on the existing master, infra, and compute nodes.

Version-Release number of the following components:
rpm -q openshift-ansible

3.5.91

rpm -q ansible

ansible-2.2.3.0-1.el7.noarch

ansible --version

ansible 2.2.3.0


How reproducible:

Always

Steps to Reproduce:
1. install RHEL 7.3 and OCP v3.4/v3.5/v3.6
2. upgrade OCP to version +1
3. scale up cluster by N nodes

Actual results:
No error produced.

The role in openshift-ansible that is upgrading the OS is, not surprisingly, os_update_latest:

./roles/os_update_latest/tasks/main.yml
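Per comment 9 below, the role contains a single task that updates every installed package. A minimal sketch of what such a task file might look like (the actual contents of main.yml may differ):

```yaml
# Hypothetical sketch of roles/os_update_latest/tasks/main.yml: a single
# task that upgrades every installed package to the latest available
# version, which is exactly the behavior observed during OCP upgrades.
- name: Update all packages
  package:
    name: '*'
    state: latest
```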

This role is called in these 2 playbooks:

./byo/rhel_subscribe.yml
./common/openshift-cluster/update_repos_and_packages.yml

And these are included in a plethora of other playbooks, the most important being the following:

./playbooks/common/openshift-cluster/update_repos_and_packages.yml
./playbooks/byo/rhel_subscribe.yml
./playbooks/aws/openshift-cluster/scaleup.yml
./playbooks/aws/openshift-cluster/update.yml
./playbooks/gce/openshift-cluster/update.yml

Diving farther into the rabbit hole, these playbooks are included in even more playbooks.
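The inclusion chain can be traced mechanically. A quick way to list every file that references the role by name, assuming you run it from the root of an openshift-ansible checkout:

```shell
# List every file under playbooks/ and roles/ that mentions the role by
# name (-r recurse into directories, -l print only matching file names).
grep -rl "os_update_latest" playbooks/ roles/
```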

Expected results:

No upgrade of RHEL unless specified.

Additional info:
Please attach logs from ansible-playbook with the -vvv flag
Comment 1 Dan Yocum 2017-08-08 12:12:26 EDT
I meant to add this:

We require the ability to disable the os_update_latest role.
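Absent a built-in switch, one workaround is to gate the role at its point of inclusion. A sketch, assuming a hypothetical opt-in variable (os_update is not an existing openshift-ansible variable):

```yaml
# Hypothetical playbook snippet: the role only runs when the operator
# explicitly opts in, e.g. with -e os_update=true on the command line.
- hosts: all
  roles:
    - role: os_update_latest
      when: os_update | default(false) | bool
```

Applying `when` to a role in the roles list attaches the condition to every task in the role, so the update is skipped unless explicitly requested.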
Comment 2 Scott Dodson 2017-08-08 13:05:37 EDT
Can you please provide the list of playbooks that you invoked? These playbooks aren't included in any of the documented scaleup workflows.
Comment 9 Scott Dodson 2017-08-29 16:34:31 EDT
Reviewing the ops playbooks, you're calling the os_update_latest role, which has only one task, and that task is to update all packages. This seems like a very deliberate action, and the role has been largely unchanged for over two years.

We'll work on merging https://github.com/openshift/openshift-ansible/pull/5075 to make the role more accommodating to your needs, but to me this is not a bug.
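A change along those lines typically wraps the role's update task in a default-on variable, preserving existing behavior while letting an inventory switch it off. A sketch (variable name hypothetical, not necessarily what PR 5075 uses):

```yaml
# Hypothetical guarded version of the role's task: updates still run by
# default, but an inventory or -e flag can set os_update=false to skip.
- name: Update all packages
  package:
    name: '*'
    state: latest
  when: os_update | default(true) | bool
```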
