The upgrade was done on CI builds. After upgrading, the console is not accessible and the oc CLI is very slow. What should be done in such a situation? Is there a rollback if the upgrade fails?

Here are the steps:

# oc get clusterversion
NAME      VERSION                        AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.ci-2019-03-11-054346   True        False         2m31s   Cluster version is 4.0.0-0.ci-2019-03-11-054346

# oc patch clusterversion/version --patch '{"spec":{"upstream":"https://openshift-release.svc.ci.openshift.org/graph"}}' --type=merge

# oc get clusterversion version -o json | jq -r '.status.availableUpdates'
[
  {
    "image": "registry.svc.ci.openshift.org/ocp/release:4.0.0-0.ci-2019-03-11-063655",
    "version": "4.0.0-0.ci-2019-03-11-063655"
  },
  {
    "image": "registry.svc.ci.openshift.org/ocp/release:4.0.0-0.ci-2019-03-11-070013",
    "version": "4.0.0-0.ci-2019-03-11-070013"
  },
  {
    "image": "registry.svc.ci.openshift.org/ocp/release:4.0.0-0.ci-2019-03-11-063655",
    "version": "4.0.0-0.ci-2019-03-11-063655"
  },
  {
    "image": "registry.svc.ci.openshift.org/ocp/release:4.0.0-0.ci-2019-03-11-070013",
    "version": "4.0.0-0.ci-2019-03-11-070013"
  }
]

Perform the upgrade:

# oc adm upgrade --to-image=registry.svc.ci.openshift.org/ocp/release:4.0.0-0.ci-2019-03-11-063655
Updating to release image registry.svc.ci.openshift.org/ocp/release:4.0.0-0.ci-2019-03-11-063655

### The UI shows: UPDATE STATUS: Updating

### The desired version has been changed as well
# oc get clusterversion version -o json | jq .status.desired.version
"4.0.0-0.ci-2019-03-11-063655"

### Check the update progress
# oc get clusterversion
NAME      VERSION                        AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.ci-2019-03-11-063655   True        True          3m59s   Working towards 4.0.0-0.ci-2019-03-11-063655: 9% complete

# oc get clusterversion version
NAME      VERSION                        AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.ci-2019-03-11-063655   True        True          31m     Working towards 4.0.0-0.ci-2019-03-11-063655: 29% complete

# oc status
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get routes.route.openshift.io)
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get buildconfigs.build.openshift.io)
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get imagestreams.image.openshift.io)
Error from server (Timeout): the server was unable to return a response in the time allotted, but may still be processing the request (get builds.build.openshift.io)

# oc get clusterversion version
NAME      VERSION                        AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.ci-2019-03-11-063655   True        True          46m     Unable to apply 4.0.0-0.ci-2019-03-11-063655: the cluster operator machine-config has not yet successfully rolled out

# time oc get clusterversion version
NAME      VERSION                        AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.ci-2019-03-11-063655   True        True          87m     Unable to apply 4.0.0-0.ci-2019-03-11-063655: the cluster operator machine-config is failing

real    1m30.651s
user    0m0.170s
sys     0m0.039s

### After over 2 hours, the oc CLI seems fine, but the console is still down.
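(As an aside: when the ClusterVersion status names a specific operator such as machine-config, the per-operator status usually shows why it is stuck. A minimal sketch of commands one might run at this point to narrow it down; the operator name is taken from the message above, and the exact conditions shown will vary per cluster.)

# List all cluster operators and their current conditions
oc get clusteroperators

# Inspect the conditions of the operator named in the ClusterVersion message
oc describe clusteroperator machine-config

# Full ClusterVersion status, including the update history
oc get clusterversion version -o yaml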
# time oc get clusterversion version
NAME      VERSION                        AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.ci-2019-03-11-063655   True        True          141m    Unable to apply 4.0.0-0.ci-2019-03-11-063655: the cluster operator openshift-cloud-credential-operator is failing

real    0m0.717s
user    0m0.220s
sys     0m0.045s

# time oc status
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get routes.route.openshift.io)
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get buildconfigs.build.openshift.io)
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get builds.build.openshift.io)
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get imagestreams.image.openshift.io)
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get deploymentconfigs.apps.openshift.io)

real    0m0.757s
user    0m0.184s
sys     0m0.038s

I will upload the journal logs of the masters.
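(Side note: the ServiceUnavailable errors above all come from aggregated API groups, route.openshift.io, build.openshift.io, image.openshift.io, and apps.openshift.io, which points at the aggregated OpenShift API server rather than the core kube-apiserver. A minimal sketch of how that could be confirmed before collecting journal logs; the openshift-apiserver namespace is the standard 4.x one and is an assumption for this CI build.)

# APIServices that are not reporting Available=True
oc get apiservices | grep -v True

# Pods backing the aggregated OpenShift API server
oc get pods -n openshift-apiserver -o wide

# Recent events in that namespace, sorted by timestamp
oc get events -n openshift-apiserver --sort-by=.lastTimestamp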
This is all just coming together right now; please check again in one week's time. Please close this bug if it is resolved by then.
Will check in a week. I also heard from my team that they have had several successful updates.
The upgrade from 4.0.0-0.ci-2019-03-19-102101 to 4.0.0-0.ci-2019-03-19-122710 succeeded. The only issue is that the `% complete` progress value does not increase monotonically. Is that expected?

# oc adm upgrade --to-image=registry.svc.ci.openshift.org/ocp/release:4.0.0-0.ci-2019-03-19-122710
Updating to release image registry.svc.ci.openshift.org/ocp/release:4.0.0-0.ci-2019-03-19-122710
...
# oc get clusterversion version
NAME      VERSION                        AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.ci-2019-03-19-122710   True        True          11m     Working towards 4.0.0-0.ci-2019-03-19-122710: 24% complete

root@ip-172-31-31-218: ~ # oc get clusterversion version
NAME      VERSION                        AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.ci-2019-03-19-122710   True        True          14m     Working towards 4.0.0-0.ci-2019-03-19-122710: 33% complete

# oc get clusterversion version
NAME      VERSION                        AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.ci-2019-03-19-122710   True        True          16m     Working towards 4.0.0-0.ci-2019-03-19-122710: 37% complete

# oc get clusterversion version
NAME      VERSION                        AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.ci-2019-03-19-122710   True        True          18m     Working towards 4.0.0-0.ci-2019-03-19-122710: 2% complete

# oc get clusterversion version
NAME      VERSION                        AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.ci-2019-03-19-122710   True        True          18m     Working towards 4.0.0-0.ci-2019-03-19-122710: 16% complete
...
# oc get clusterversion version
NAME      VERSION                        AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.ci-2019-03-19-122710   True        False         18m     Cluster version is 4.0.0-0.ci-2019-03-19-122710
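(For anyone reproducing this: the "% complete" text in the STATUS column is the message of the Progressing condition on the ClusterVersion object, so it can be watched directly instead of re-running `oc get clusterversion`. A minimal sketch; the jsonpath expression is an assumption based on the standard ClusterVersion status layout.)

# Print the Progressing condition message every 30 seconds
while true; do
  oc get clusterversion version \
    -o jsonpath='{.status.conditions[?(@.type=="Progressing")].message}'
  echo
  sleep 30
done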
https://bugzilla.redhat.com/show_bug.cgi?id=1688454#c5 is not related to this bug.

> The upgrade was done on CI builds. After upgrading, the console is not accessible and the oc CLI is very slow.
> What should be done in such a situation? Is there a rollback if the upgrade fails?

Rollbacks are not supported and are not performed automatically.

And based on these logs:

> # time oc status
> Error from server (ServiceUnavailable): the server is currently unable to handle the request (get routes.route.openshift.io)
> Error from server (ServiceUnavailable): the server is currently unable to handle the request (get buildconfigs.build.openshift.io)
> Error from server (ServiceUnavailable): the server is currently unable to handle the request (get builds.build.openshift.io)
> Error from server (ServiceUnavailable): the server is currently unable to handle the request (get imagestreams.image.openshift.io)
> Error from server (ServiceUnavailable): the server is currently unable to handle the request (get deploymentconfigs.apps.openshift.io)

there was a failure to upgrade the openshift-apiserver.

Can you let us know whether the upgrade is still failing? If it is, please provide the output of `oc get co -oyaml` and `oc get clusterversion -oyaml` so that we can track which operator is failing.
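(A minimal sketch of collecting the requested output into files for attachment; the file names are arbitrary.)

# Snapshot all cluster operators and the ClusterVersion object
oc get co -o yaml > clusteroperators.yaml
oc get clusterversion -o yaml > clusterversion.yaml

# Quick summary of which operators are degraded or still progressing
oc get co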
Thanks, Abhinav. I already redid the test in comment 5. The upgrade was good there, except that the progress percentage does not increase monotonically. Let me know if you think I should run the update again.
(In reply to Hongkai Liu from comment #8)
> Thanks, Abhinav. I already redid the test in comment 5.
> The upgrade was good there, except that the progress percentage does not increase monotonically.

That is a separate issue, and we are already tracking it. :)

> Let me know if you think I should run the update again.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0758