Bug 1479493

Summary: OCP upgrade and scale-up automatically upgrades RHEL when it should not
Product: OpenShift Container Platform
Reporter: Dan Yocum <dyocum>
Component: Installer
Assignee: Scott Dodson <sdodson>
Status: CLOSED NOTABUG
QA Contact: Johnny Liu <jialiu>
Severity: urgent
Priority: unspecified
Version: 3.5.1
CC: aos-bugs, dyocum, jkaur, jokerman, mmccomas
Keywords: OpsBlocker, UpcomingRelease
Target Release: 3.7.0
Hardware: All
OS: All
Doc Type: If docs needed, set a value
Last Closed: 2017-08-29 20:34:31 UTC
Type: Bug

Description Dan Yocum 2017-08-08 16:10:19 UTC
Description of problem:

Scenario #1: when upgrading OSD customers' clusters, it was observed that RHEL was upgraded (unexpectedly!) during the OCP upgrade.  This is unacceptable from an Operations point of view: Ops tests against specific versions of the OS, and if the OS is upgraded under the covers we lose the ability to create consistent environments for customers.

Scenario #2: when scaling up the nodes in a cluster, the new nodes are installed with the latest available version of RHEL rather than the version of RHEL installed on the existing master, infra, and compute nodes.

Version-Release number of the following components:
rpm -q openshift-ansible

3.5.91

rpm -q ansible

ansible-2.2.3.0-1.el7.noarch

ansible --version

ansible 2.2.3.0


How reproducible:

Always

Steps to Reproduce:
1. Install RHEL 7.3 and OCP v3.4/3.5/3.6
2. Upgrade OCP to the next minor version
3. Scale up the cluster by N nodes

Actual results:
No error is produced; the OS packages are silently upgraded to the latest available versions.

The role in openshift-ansible that is upgrading the OS is, not surprisingly, os_update_latest:

./roles/os_update_latest/tasks/main.yml
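
As Comment 9 below notes, the role contains a single task that updates every package. A minimal sketch of what such a task file looks like (illustrative, not the verbatim role source):

# roles/os_update_latest/tasks/main.yml (illustrative sketch)
# The role's single task: upgrade every installed package to the
# latest version available from the enabled repositories.
- name: Update all packages
  yum:
    name: '*'
    state: latest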

This role is called in these 2 playbooks:

./byo/rhel_subscribe.yml
./common/openshift-cluster/update_repos_and_packages.yml

And these are included in a plethora of other playbooks, the most important being the following:

./playbooks/common/openshift-cluster/update_repos_and_packages.yml
./playbooks/byo/rhel_subscribe.yml
./playbooks/aws/openshift-cluster/scaleup.yml
./playbooks/aws/openshift-cluster/update.yml
./playbooks/gce/openshift-cluster/update.yml
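
To illustrate why the update happens implicitly: any play that lists os_update_latest under roles runs the full package update as a side effect. A sketch of the pattern (play name and hosts are illustrative, not the verbatim playbook contents):

- name: Update repos and packages
  hosts: all
  roles:
    - os_update_latest   # upgrades every package on the host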

Diving farther down the rabbit hole, these playbooks are included in even more playbooks.

Expected results:

No upgrade of RHEL unless specified.

Additional info:
Please attach logs from ansible-playbook with the -vvv flag

Comment 1 Dan Yocum 2017-08-08 16:12:26 UTC
I meant to add this:

We require the ability to disable the os_update_latest role.
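
For example, the role's task could be gated behind an opt-in variable. A minimal sketch, where the variable name os_update_latest_enabled is hypothetical (the actual change is tracked in the PR linked in comment 9 below):

# Hypothetical guard: gate the package update behind an opt-in variable
# (os_update_latest_enabled is an illustrative name, not a real variable).
- name: Update all packages
  yum:
    name: '*'
    state: latest
  when: os_update_latest_enabled | default(false) | bool

With a gate like this, the update is skipped by default and an inventory can opt in explicitly when an OS upgrade is actually wanted.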

Comment 2 Scott Dodson 2017-08-08 17:05:37 UTC
Can you please provide the list of playbooks that you invoked? These playbooks aren't included in any of the documented scaleup workflows.

Comment 9 Scott Dodson 2017-08-29 20:34:31 UTC
Reviewing the ops playbooks, you're calling the os_update_latest role, which has only one task, and that task is to update all packages. This seems like a very deliberate action, and the role has been largely unchanged for over two years.

We'll work on merging https://github.com/openshift/openshift-ansible/pull/5075 to make the role more accommodating to your needs, but to me this is not a bug.