Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1610496

Summary: Upgrade from 3.10 fails with message, "Changes to bootstrapped SCCs have been detected."
Product: OpenShift Container Platform
Component: Cluster Version Operator
Version: 3.11.0
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: Jason Montleon <jmontleo>
Assignee: Vadim Rutkovsky <vrutkovs>
QA Contact: liujia <jiajliu>
CC: aos-bugs, ckoep, gpei, jiajliu, jmatthew, jmontleo, jokerman, mkhan, mmccomas, sdodson, ssorce, vrutkovs, wmeng
Target Milestone: ---
Target Release: 3.11.0
Doc Type: If docs needed, set a value
Last Closed: 2018-10-11 07:23:00 UTC
Type: Bug

Description Jason Montleon 2018-07-31 18:40:20 UTC
Description of problem:
I installed 3.10 and tried to upgrade to 3.11 immediately afterward, at which point the upgrade failed with the error:

"Changes to bootstrapped SCCs have been detected. Please review the changes by running \"oc adm policy --config=/etc/origin/master/admin.kubeconfig reconcile-sccs --additive-only=true\""

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Install 3.10
2. Try to upgrade to 3.11

Actual results:
TASK [fail] ****************************************************************************************************
fatal: [192.168.121.233.nip.io]: FAILED! => {"changed": false, "msg": "Changes to bootstrapped SCCs have been detected. Please review the changes by running \"oc adm policy --config=/etc/origin/master/admin.kubeconfig reconcile-sccs --additive-only=true\" After reviewing the changes please apply those changes by adding the '--confirm' flag. Do not modify the default SCCs. Customizing the default SCCs will cause this check to fail when upgrading. If you require non standard SCCs please refer to https://docs.openshift.org/latest/admin_guide/manage_scc.html\n"}

Expected results:
Successful upgrade.

Additional info:
I have not manually modified any SCCs. On a subsequent attempt I ran the suggested command with the --confirm flag ("oc adm policy --config=/etc/origin/master/admin.kubeconfig reconcile-sccs --additive-only=true --confirm") before upgrading, but the upgrade still failed.

Comment 1 Scott Dodson 2018-07-31 18:46:08 UTC
Can you gather exact versions of 3.10 and 3.11 before and after? I know there were some problems with improperly reconciled SCCs in earlier 3.11 builds but I thought those had been resolved.

What's the output of the manual `oc adm...` ?

Moving to auth.

Comment 2 Jason Montleon 2018-07-31 18:53:54 UTC
Yes, I used the latest puddles from today:
openshift-ansible-3.10.27-1.git.0.d5723a3.el7
openshift-ansible-3.11.0-0.10.0.git.0.91bb588None

Comment 3 Jason Montleon 2018-07-31 19:56:20 UTC
Before updating oc it outputs nothing.

After adding the 3.11 repo and doing a yum update I get:
# oc adm policy --config=/etc/origin/master/admin.kubeconfig reconcile-sccs --additive-only=true --confirm
securitycontextconstraints/privileged

Comment 4 Jason Montleon 2018-08-01 16:57:18 UTC
For now, setting openshift_reconcile_sccs_reject_change: false lets me bypass the issue.

Comment 5 Standa Laznicka 2018-08-08 06:13:55 UTC
So the problem here is that a new field was added to SCCs, and for the privileged SCC this field holds a value different from the default (so the previous fix does not work here). This generates a change notification.
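
The reasoning in this comment can be illustrated with a toy model. This is only a sketch under stated assumptions, not the actual `oc adm policy reconcile-sccs` logic: the field name `newField`, its default, and the `changed_fields` helper are hypothetical, and treating a missing field as its default is an assumption about how an SCC stored before the field existed would be read.

```python
def changed_fields(stored, bootstrap, defaults):
    """List fields whose effective stored value differs from the bootstrap value.

    A field missing from `stored` is treated as holding its default
    (an assumption made for this illustration).
    """
    changed = []
    for name, wanted in bootstrap.items():
        current = stored.get(name, defaults.get(name))
        if current != wanted:
            changed.append(name)
    return changed


# Hypothetical field name and default value, for illustration only.
DEFAULTS = {"newField": False}

# An SCC whose bootstrap value for the new field equals the default.
stored_restricted = {"allowPrivilegedContainer": False}
bootstrap_restricted = {"allowPrivilegedContainer": False, "newField": False}

# The privileged SCC, whose bootstrap value for the new field is not the default.
stored_privileged = {"allowPrivilegedContainer": True}
bootstrap_privileged = {"allowPrivilegedContainer": True, "newField": True}

print(changed_fields(stored_restricted, bootstrap_restricted, DEFAULTS))  # []
print(changed_fields(stored_privileged, bootstrap_privileged, DEFAULTS))  # ['newField']
```

In this model only the privileged SCC reports a difference, which matches the single `securitycontextconstraints/privileged` line in comment 3.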

Comment 6 Standa Laznicka 2018-08-08 13:46:37 UTC
The recent fix to SCC reconciliation landed only in master, as did the new SCC field that possibly causes this breakage. This would mean that the `oc` binary run during the upgrade corresponds to the version of OpenShift being upgraded to. If that is true, it would be a bug in the upgrade, as it would prevent us from making changes to the bootstrapped SCCs in the future.

Scott, if the above is true and the Ansible playbook runs an `oc` binary whose version matches the target server version during the reconcile-sccs verification step, then this is a bug in the Upgrade component. Please confirm.

Comment 7 Standa Laznicka 2018-08-09 08:07:55 UTC
Addressed in https://github.com/openshift/openshift-ansible/pull/9494

Comment 8 Vadim Rutkovsky 2018-08-09 12:09:00 UTC
Prepared a better PR - https://github.com/openshift/openshift-ansible/pull/9500

Comment 9 liujia 2018-08-16 06:57:34 UTC
Still hit it at openshift-ansible-3.11.0-0.16.0.git.0.e82689aNone.noarch

Comment 10 Vadim Rutkovsky 2018-08-16 08:21:14 UTC
(In reply to liujia from comment #9)
> Still hit it at openshift-ansible-3.11.0-0.16.0.git.0.e82689aNone.noarch

Could you provide the logs for this run and get `oc version` output from the targets when the issue occurs?

Comment 11 Vadim Rutkovsky 2018-08-16 08:22:03 UTC
Fix is available in openshift-ansible-3.11.0-0.16.0, moving to ON_QA

Comment 12 liujia 2018-08-16 09:56:47 UTC
openshift-ansible-3.11.0-0.16.0.git.0.e82689aNone.noarch

[root@jliu-10-master-etcd-nfs-1 ~]# oc version
oc v3.11.0-0.16.0
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://jliu-10-master-etcd-nfs-1:8443
openshift v3.10.14
kubernetes v1.10.0+b81c8f8

hosts file and log in attachment
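
The `oc version` output above shows a 3.11 client talking to a 3.10 server, which is the mismatch comment 6 asks about. A minimal sketch for spotting that skew from captured output might look like the following; the helper names `major_minor` and `client_server_skew` are hypothetical, and the parsing assumes the line layout shown in this comment.

```python
import re

# Sample text copied from the `oc version` output in this comment.
SAMPLE = """\
oc v3.11.0-0.16.0
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://jliu-10-master-etcd-nfs-1:8443
openshift v3.10.14
kubernetes v1.10.0+b81c8f8
"""


def major_minor(text, prefix):
    """Return (major, minor) from the first line starting with '<prefix> v'."""
    pattern = rf"^{re.escape(prefix)} v(\d+)\.(\d+)"
    match = re.search(pattern, text, re.MULTILINE)
    if match is None:
        raise ValueError(f"no '{prefix} v<major>.<minor>' line found")
    return int(match.group(1)), int(match.group(2))


def client_server_skew(text):
    """Return the client and server (major, minor) tuples from `oc version` text."""
    return major_minor(text, "oc"), major_minor(text, "openshift")


if __name__ == "__main__":
    client, server = client_server_skew(SAMPLE)
    if client != server:
        print(f"version skew: client {client} vs server {server}")
```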

Comment 13 liujia 2018-08-16 09:57:19 UTC
Created attachment 1476395
hosts

Comment 16 Vadim Rutkovsky 2018-08-16 11:02:57 UTC
Reproduced this too; SCCs are being checked twice.

Created https://github.com/openshift/openshift-ansible/pull/9627 to remove the duplicated part.

Comment 17 Vadim Rutkovsky 2018-08-17 07:36:37 UTC
Fix is available in openshift-ansible-3.11.0-0.17.0

Comment 18 Vadim Rutkovsky 2018-08-17 10:09:20 UTC
The fix in #9627 was incorrect, reverted in https://github.com/openshift/openshift-ansible/pull/9640

Comment 19 liujia 2018-08-20 10:19:06 UTC
(In reply to Vadim Rutkovsky from comment #18)
> The fix in #9627 was incorrect, reverted in
> https://github.com/openshift/openshift-ansible/pull/9640

PR #9640 looks the same as the earlier PR #9500, but according to comment 12, that PR does not seem to fix the issue.

Adding TestBlocker: QE needs this bug fixed ASAP because it is currently blocking upgrade testing for RPM-installed OCP on RHEL.

Comment 21 Scott Dodson 2018-08-22 19:46:33 UTC
#9640 is in openshift-ansible-3.11.0-0.20.0

Vadim can you confirm this is ready to go?

Comment 22 Vadim Rutkovsky 2018-08-23 08:03:18 UTC
Yep, should be ready for testing in openshift-ansible-3.11.0-0.20.0

Comment 25 errata-xmlrpc 2018-10-11 07:23:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2652