Bug 1686590

Summary: [3.11] Upgrade from Openshift 3.9 to 3.10 fails at Remove Image Stream Tag if no Image Stream Tag is present.
Product: OpenShift Container Platform Reporter: Candace Sheremeta <cshereme>
Component: Cluster Version Operator Assignee: Scott Dodson <sdodson>
Status: CLOSED ERRATA QA Contact: liujia <jiajliu>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 3.10.0 CC: aos-bugs, bleanhar, cshereme, fshaikh, jaboyd, jiajliu, jialiu, jmalde, jokerman, maupadhy, mgugino, mhernon, mmccomas, mrobson, randym, sdodson, snalawad
Target Milestone: ---   
Target Release: 3.11.z   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
The upgrade playbooks ran several `oc` commands using resource aliases, which may not be available immediately after a restart or for other reasons. Those commands now use the fully qualified resource name, avoiding the potential failure.
Story Points: ---
Clone Of: 1624493
: 1688452 (view as bug list) Environment:
Last Closed: 2019-04-11 05:38:40 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1624493    
Bug Blocks: 1688452    

Comment 1 Candace Sheremeta 2019-03-07 19:04:59 UTC
Hi all,

I have cloned Bug #1624493 because that issue was supposed to be fixed in 3.10.66, but I have a customer who is hitting it while attempting to upgrade from 3.9.65 to 3.10.101.

I will attach ansible logs, inventory file, and master sosreport shortly.

Comment 10 Scott Dodson 2019-03-12 19:38:08 UTC
Under the assumption that this is related to API discovery not being available after restarting services, I've opened https://github.com/openshift/openshift-ansible/pull/11342

Comment 11 Matthew Robson 2019-03-12 19:56:07 UTC
Scott, some usage of the scc alias causes issues as well. That is from:

TASK [openshift_node_group : Ensure the service account can run privileged] *************************************

Ex:

    "msg": {
        "cmd": "/usr/bin/oc get scc privileged -o json -n openshift-node",
        "results": [
            {}
        ],
        "returncode": 1,
        "stderr": "error: the server doesn't have a resource type \"scc\"\n",
        "stdout": ""
    }
}
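
The failure above can be recognized from the stderr text alone. The following is a minimal sketch, not part of openshift-ansible: the function name `unknown_resource_type` and the regular expression are hypothetical, and the sample dict simply copies the fields from the output shown above.

```python
import re

# Hypothetical pattern matching the stderr text in the failure above.
_UNKNOWN_TYPE = re.compile(r'the server doesn\'t have a resource type "([^"]+)"')


def unknown_resource_type(result):
    """Return the resource name oc could not resolve, or None if the
    command succeeded or failed for a different reason."""
    if result.get("returncode", 0) == 0:
        return None
    match = _UNKNOWN_TYPE.search(result.get("stderr", ""))
    return match.group(1) if match else None


# Fields copied from the task output above.
sample = {
    "cmd": "/usr/bin/oc get scc privileged -o json -n openshift-node",
    "results": [{}],
    "returncode": 1,
    "stderr": "error: the server doesn't have a resource type \"scc\"\n",
    "stdout": "",
}

print(unknown_resource_type(sample))  # prints: scc
```

A check like this only identifies which alias failed to resolve; it does not say why discovery was unavailable.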

Comment 12 Scott Dodson 2019-03-12 19:57:51 UTC
Yeah, I've made some additional changes to that pull request, can you take another look?

Comment 13 Matthew Robson 2019-03-13 14:00:44 UTC
Looks good to me.

Comment 18 Scott Dodson 2019-03-13 18:03:05 UTC
If `oc --config=/etc/origin/master/admin.kubeconfig delete -n openshift-node imagestreamtags.image.openshift.io node:v3.10 --ignore-not-found` works but the istag alias doesn't, then this should be fixed by the PR in comment 10 for release-3.11, and I'll clone this for 3.10 too.

Comment 19 Scott Dodson 2019-03-13 18:03:57 UTC
Logs from comment 17 indicate that it is related to alias use.
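
To illustrate the alias issue, here is a rough sketch of replacing an alias with its fully qualified resource name before building the `oc` argument list. It is not the code from the pull request: the `FULLY_QUALIFIED` table and the `oc_delete_argv` helper are hypothetical. The istag entry comes from the command in comment 18; the scc entry is an assumption about the corresponding fully qualified name.

```python
# Hypothetical alias table. The istag mapping is taken from comment 18;
# the scc mapping is an assumption and is not confirmed in this bug.
FULLY_QUALIFIED = {
    "istag": "imagestreamtags.image.openshift.io",
    "scc": "securitycontextconstraints.security.openshift.io",
}


def oc_delete_argv(resource, name, namespace,
                   kubeconfig="/etc/origin/master/admin.kubeconfig"):
    """Build an `oc delete` argument list, swapping a known alias for its
    fully qualified resource name. Unknown names pass through unchanged."""
    resource = FULLY_QUALIFIED.get(resource, resource)
    return [
        "oc", "--config=" + kubeconfig,
        "delete", "-n", namespace,
        resource, name,
        "--ignore-not-found",
    ]


print(" ".join(oc_delete_argv("istag", "node:v3.10", "openshift-node")))
```

The sketch only builds the argument list and does not run `oc`, so it can be tried without a cluster.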

Comment 25 liujia 2019-03-27 06:08:20 UTC
According to bug 1624493, QE verified the bug as follows:
1) PR 11342 has been merged into openshift-ansible-3.11.98-1.git.0.3cfa7c3.el7.noarch
2) Upgrade from v3.10 to v3.11 with openshift-ansible-3.11.98-1.git.0.3cfa7c3.el7.noarch succeeded.

So changing the bug status.

Comment 27 errata-xmlrpc 2019-04-11 05:38:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0636