Bug 1744153
| Summary: | oc adm upgrade from 4.1.11 to 4.2.0-0.nightly-2019-08-21-040043 on loaded cluster stuck on marketplace-operator | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Mike Fiedler <mifiedle> |
| Component: | OLM | Assignee: | Alexander Greene <agreene> |
| OLM sub component: | OLM | QA Contact: | Mike Fiedler <mifiedle> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | ||
| Priority: | unspecified | CC: | bandrade, chuo, dageoffr, jfan, jiazha, scolange |
| Version: | 4.2.0 | ||
| Target Milestone: | --- | ||
| Target Release: | 4.2.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2019-10-16 06:37:03 UTC | Type: | Bug |
Based on the information contained in the must-gather.zip, it appears that the 4.2 version of the Marketplace Operator was never deployed: the Marketplace Operator logs match those of a 4.1 version. It appears that during the upgrade, the `CatalogSourceConfig` CRD was upgraded to `v2`, which requires the `source` field. Version 4.1 of the Marketplace Operator does not work with `v2` of the `CatalogSourceConfig` CRD.

This may be related to [0], which was fixed on the 21st, the same day the 4.2.0-0.nightly-2019-08-21-040043 build was created. I will attempt an upgrade from a 4.1 cluster to 4.2.0-0.nightly-2019-08-21-040043 to confirm this suspicion.

[0] https://bugzilla.redhat.com/show_bug.cgi?id=1743699

Re-testing today with 4.1.13 -> registry.svc.ci.openshift.org/ocp/release:4.2.0-0.nightly-2019-08-28-083236

@Mike

(In reply to Mike Fiedler from comment #3)
> Re-testing today with 4.1.13 ->
> registry.svc.ci.openshift.org/ocp/release:4.2.0-0.nightly-2019-08-28-083236

Could you please update this BZ with the results?

Moving to ON_QA. The code was delivered previously, and this bug was reopened because of an unrelated failure in another component. Waiting on QA (Mike F) to verify that things are good here.

Sorry for the delay; this is all good now on 4.2.0-0.nightly-2019-09-03-102130.

Hit the same problem when upgrading the cluster from 4.1.14 to 4.2.0-0.nightly-2019-09-04-142146:
1. Output of `oc get clusteroperator`:
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE
authentication 4.2.0-0.nightly-2019-09-04-142146 True False False 6h44m
cloud-credential 4.2.0-0.nightly-2019-09-04-142146 True False False 7h
cluster-autoscaler 4.2.0-0.nightly-2019-09-04-142146 True False False 7h
console 4.2.0-0.nightly-2019-09-04-142146 True False False 40m
dns 4.1.14 True False False 6h59m
image-registry 4.2.0-0.nightly-2019-09-04-142146 True False False 138m
ingress 4.2.0-0.nightly-2019-09-04-142146 True False False 3h15m
insights 4.2.0-0.nightly-2019-09-04-142146 True False False 57m
kube-apiserver 4.2.0-0.nightly-2019-09-04-142146 True False False 6h58m
kube-controller-manager 4.2.0-0.nightly-2019-09-04-142146 True False False 6h57m
kube-scheduler 4.2.0-0.nightly-2019-09-04-142146 True False False 6h57m
machine-api 4.2.0-0.nightly-2019-09-04-142146 True False False 7h
machine-config 4.1.14 False True True 28m
marketplace 4.1.14 True False False 11m
monitoring 4.2.0-0.nightly-2019-09-04-142146 True False False 3h12m
network 4.2.0-0.nightly-2019-09-04-142146 True True False 7h
node-tuning 4.2.0-0.nightly-2019-09-04-142146 True False False 11m
openshift-apiserver 4.2.0-0.nightly-2019-09-04-142146 True False False 106s
openshift-controller-manager 4.2.0-0.nightly-2019-09-04-142146 True False False 6h58m
openshift-samples 4.2.0-0.nightly-2019-09-04-142146 True False False 44m
operator-lifecycle-manager 4.2.0-0.nightly-2019-09-04-142146 True False False 6h59m
operator-lifecycle-manager-catalog 4.2.0-0.nightly-2019-09-04-142146 True False False 6h59m
operator-lifecycle-manager-packageserver 4.2.0-0.nightly-2019-09-04-142146 True False False 10m
service-ca 4.2.0-0.nightly-2019-09-04-142146 True False False 7h
service-catalog-apiserver 4.2.0-0.nightly-2019-09-04-142146 True False False 100s
service-catalog-controller-manager 4.2.0-0.nightly-2019-09-04-142146 True False False 107m
storage 4.2.0-0.nightly-2019-09-04-142146 True False False 43m
2. Logs from the clusterversion operator:
`
E0905 07:42:48.002084 1 task.go:77] error running apply for clusteroperator "openshift-marketplace/marketplace" (337 of 415): Cluster operator marketplace is still updating
I0905 07:42:48.002292 1 sync_worker.go:736] Summarizing 1 errors
I0905 07:42:48.002298 1 sync_worker.go:740] Update error 337 of 415: ClusterOperatorNotAvailable Cluster operator marketplace is still updating (*errors.errorString: cluster operator marketplace is still updating)
E0905 07:50:04.122092 1 task.go:77] error running apply for clusteroperator "openshift-marketplace/marketplace" (337 of 415): Cluster operator marketplace is still updating
I0905 07:50:04.122213 1 sync_worker.go:736] Summarizing 1 errors
I0905 07:50:04.122223 1 sync_worker.go:740] Update error 337 of 415: ClusterOperatorNotAvailable Cluster operator marketplace is still updating (*errors.errorString: cluster operator marketplace is still updating)
`
3. Logs from the marketplace operator.
The marketplace image is still at the 4.1 commit after the upgrade:
marketplace commit: 6881ba35b74077c29e8791f26d04d2f7ec25e8de
`
time="2019-09-05T07:50:28Z" level=error msg="Unexpected error while creating CatalogSourceConfig: CatalogSourceConfig.operators.coreos.com \"certified-operators\" is invalid: []: Invalid value: map[string]interface {}{\"apiVersion\":\"operators.coreos.com/v1\", \"kind\":\"CatalogSourceConfig\", \"metadata\":map[string]interface {}{\"creationTimestamp\":\"2019-09-05T07:50:28Z\", \"generation\":1, \"labels\":map[string]interface {}{\"opsrc-datastore\":\"true\", \"opsrc-owner-name\":\"certified-operators\", \"opsrc-owner-namespace\":\"openshift-marketplace\", \"opsrc-provider\":\"certified\"}, \"name\":\"certified-operators\", \"namespace\":\"openshift-marketplace\", \"uid\":\"d44a8a02-cfb1-11e9-b840-0279035724fc\"}, \"spec\":map[string]interface {}{\"csDisplayName\":\"Certified Operators\", \"csPublisher\":\"Red Hat\", \"packages\":\"tidb-operator-certified,aqua-certified,twistlock-certified,cpx-cic-operator,robin-operator,insightedge-operator,instana-agent,portworx-certified,percona-xtradb-cluster-operator-certified,presto-operator,t8c-certified,oneagent-certified,storageos,orca,memql-certified,percona-server-mongodb-operator-certified,planetscale-certified,appdynamics-operator,anchore-engine,federatorai-certified,crunchy-postgres-operator,joget-openshift-operator,cic-operator,couchbase-enterprise-certified,openunison-ocp-certified,kong,sematext,hazelcast-enterprise-certified,nuodb-ce-certified,kubeturbo-certified,seldon-operator-certified,sysdig-certified,synopsys-certified,newrelic-infrastructure,mariadb,appsody-operator-certified,mongodb-enterprise\", \"targetNamespace\":\"openshift-marketplace\"}, \"status\":map[string]interface {}{\"currentPhase\":map[string]interface {}{\"lastTransitionTime\":interface {}(nil), \"lastUpdateTime\":interface {}(nil), \"phase\":map[string]interface {}{}}}}: validation failure list:\nspec.source in body is required" name=certified-operators namespace=openshift-marketplace type=OperatorSource
`
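As a rough aid for reading the `oc get clusteroperator` output in step 1, the following sketch lists the operators that have not yet reached the target version. It assumes the whitespace-separated column layout shown above (NAME first, VERSION second); the function name `lagging_operators` and the trimmed sample table are illustrative only and not part of any OpenShift tooling.

```python
from typing import List, Tuple


def lagging_operators(table: str, target: str) -> List[Tuple[str, str]]:
    """Return (name, version) pairs for operators not at the target version."""
    lagging = []
    for line in table.strip().splitlines():
        fields = line.split()
        # Skip the header row and anything too short to be a data row.
        if len(fields) < 2 or fields[0] == "NAME":
            continue
        name, version = fields[0], fields[1]
        if version != target:
            lagging.append((name, version))
    return lagging


# A few rows copied from the output in step 1 (trimmed for brevity).
sample = """\
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE
console 4.2.0-0.nightly-2019-09-04-142146 True False False 40m
dns 4.1.14 True False False 6h59m
machine-config 4.1.14 False True True 28m
marketplace 4.1.14 True False False 11m
"""

target_version = "4.2.0-0.nightly-2019-09-04-142146"
for name, version in lagging_operators(sample, target_version):
    print(f"{name} is still at {version}")
```

Run against the full table from step 1, this would flag dns, machine-config, and marketplace as the operators still reporting 4.1.14.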
Using another bug to track the new problem, since the two have different causes: https://bugzilla.redhat.com/show_bug.cgi?id=1749643

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922
Description of problem:

Created a cluster with 500 projects. Each project contains: 1 buildconfig, 15 builds, 10 imagestreams, 1 deployment with 0 replicas, 1 service, 20 secrets, 10 routes.

NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.1.11 True True 29m Unable to apply 4.2.0-0.nightly-2019-08-21-040043: the update could not be applied

NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE
authentication 4.2.0-0.nightly-2019-08-21-040043 True False False 5d15h
cloud-credential 4.2.0-0.nightly-2019-08-21-040043 True False False 5d15h
cluster-autoscaler 4.2.0-0.nightly-2019-08-21-040043 True False False 5d15h
console 4.2.0-0.nightly-2019-08-21-040043 True False False 18m
dns 4.1.11 True False False 5d15h
image-registry 4.1.11 True True False 5d15h
ingress 4.2.0-0.nightly-2019-08-21-040043 True False False 5d15h
insights 4.2.0-0.nightly-2019-08-21-040043 True False False 21m
kube-apiserver 4.2.0-0.nightly-2019-08-21-040043 True False False 5d15h
kube-controller-manager 4.2.0-0.nightly-2019-08-21-040043 True False False 5d15h
kube-scheduler 4.2.0-0.nightly-2019-08-21-040043 True False False 5d15h
machine-api 4.2.0-0.nightly-2019-08-21-040043 True False False 5d15h
machine-config 4.1.11 True False False 5d15h
marketplace 4.1.11 True False False 5d15h
monitoring 4.2.0-0.nightly-2019-08-21-040043 True False False 5d15h
network 4.1.11 True False False 5d15h
node-tuning 4.2.0-0.nightly-2019-08-21-040043 True False False 21m
openshift-apiserver 4.2.0-0.nightly-2019-08-21-040043 True False False 5d15h
openshift-controller-manager 4.2.0-0.nightly-2019-08-21-040043 True False False 5d15h
openshift-samples 4.2.0-0.nightly-2019-08-21-040043 True False False 11m
operator-lifecycle-manager 4.2.0-0.nightly-2019-08-21-040043 True False False 5d15h
operator-lifecycle-manager-catalog 4.1.11 True False False 5d15h
operator-lifecycle-manager-packageserver 4.2.0-0.nightly-2019-08-21-040043 True False False 20m
service-ca 4.2.0-0.nightly-2019-08-21-040043 True False False 5d15h
service-catalog-apiserver 4.2.0-0.nightly-2019-08-21-040043 True False False 5d15h
service-catalog-controller-manager 4.2.0-0.nightly-2019-08-21-040043 True False False 5d15h
storage 4.2.0-0.nightly-2019-08-21-040043 True False False 21m

The clusterversion operator logs show that the marketplace-operator is stuck.

From clusterversion operator logs:

E0821 12:55:36.078647 1 task.go:77] error running apply for deployment "openshift-marketplace/marketplace-operator" (333 of 412): timed out waiting for the condition
I0821 12:55:36.078694 1 task_graph.go:588] Result of work: [Cluster operator image-registry is still updating Cluster operator operator-lifecycle-manager-catalog is still updating Could not update deployment "openshift-marketplace/marketplace-operator" (333 of 412)]
I0821 12:55:36.078734 1 sync_worker.go:740] Update error 333 of 412: UpdatePayloadFailed Could not update deployment "openshift-marketplace/marketplace-operator" (333 of 412) (*errors.errorString: timed out waiting for the condition)
E0821 12:55:36.078753 1 sync_worker.go:311] unable to synchronize image (waiting 2m52.525702462s): Could not update deployment "openshift-marketplace/marketplace-operator" (333 of 412)

From marketplace-operator logs:

time="2019-08-21T12:56:05Z" level=info msg="Reconciling OperatorSource openshift-marketplace/redhat-operators\n"
time="2019-08-21T12:56:05Z" level=error msg="Unexpected error while creating CatalogSourceConfig: CatalogSourceConfig.operators.coreos.com \"redhat-operators\" is invalid: []: Invalid value: map[string]interface {}{\"apiVersion\":\"operators.coreos.com/v1\", \"kind\":\"CatalogSourceConfig\", \"metadata\":map[string]interface {}{\"creationTimestamp\":\"2019-08-21T12:56:05Z\", \"generation\":1, \"labels\":map[string]interface {}{\"opsrc-datastore\":\"true\", \"opsrc-owner-name\":\"redhat-operators\", \"opsrc-owner-namespace\":\"openshift-marketplace\", \"opsrc-provider\":\"redhat\"}, \"name\":\"redhat-operators\", \"namespace\":\"openshift-marketplace\", \"uid\":\"09cae082-c413-11e9-885d-068c43ae29a2\"}, \"spec\":map[string]interface {}{\"csDisplayName\":\"Red Hat Operators\", \"csPublisher\":\"Red Hat\", \"packages\":\"kubevirt-hyperconverged,amq-online,3scale-operator,codeready-workspaces,businessautomation-operator,cluster-logging,elasticsearch-operator,openshifttemplateservicebroker,amq-streams,openshiftansibleservicebroker,jaeger-product,amq7-cert-manager,amq7-interconnect-operator\", \"targetNamespace\":\"openshift-marketplace\"}, \"status\":map[string]interface {}{\"currentPhase\":map[string]interface {}{\"lastTransitionTime\":interface {}(nil), \"lastUpdateTime\":interface {}(nil), \"phase\":map[string]interface {}{}}}}: validation failure list:\nspec.source in body is required" name=redhat-operators namespace=openshift-marketplace type=OperatorSource

Version-Release number of selected component (if applicable):
upgrading from 4.1.11 to 4.2.0-0.nightly-2019-08-21-040043

Additional info:
Full oc adm must-gather info will be added shortly.
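The `spec.source in body is required` failure in the marketplace-operator logs can be illustrated with a small sketch. It assumes, based only on that error message, that the upgraded CRD schema marks `spec.source` as required while the 4.1 operator builds objects without it. The helper `validation_failures`, the `REQUIRED_SPEC_FIELDS` list, and the placeholder `source` value are hypothetical; this is not the API server's actual validation code.

```python
from typing import Dict, List

# Assumption drawn from the error message: the new schema requires spec.source.
REQUIRED_SPEC_FIELDS = ["source"]


def validation_failures(obj: Dict) -> List[str]:
    """Mimic a required-field check on the object's spec (illustrative only)."""
    spec = obj.get("spec", {})
    return [
        f"spec.{name} in body is required"
        for name in REQUIRED_SPEC_FIELDS
        if name not in spec
    ]


# Shape of the object the 4.1 operator tried to create, per the log above
# (package list shortened).
old_style = {
    "apiVersion": "operators.coreos.com/v1",
    "kind": "CatalogSourceConfig",
    "metadata": {"name": "redhat-operators", "namespace": "openshift-marketplace"},
    "spec": {
        "csDisplayName": "Red Hat Operators",
        "csPublisher": "Red Hat",
        "packages": "amq-online,cluster-logging",
        "targetNamespace": "openshift-marketplace",
    },
}

# Same object with a source field added; the value is a placeholder.
new_style = {**old_style, "spec": {**old_style["spec"], "source": "redhat-operators"}}

print(validation_failures(old_style))
print(validation_failures(new_style))
```

Under these assumptions the first object is rejected with the same message seen in the logs, and the second passes, which is consistent with the analysis that a 4.1 operator cannot create `CatalogSourceConfig` objects once the newer CRD is installed.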