Description of problem:
The default catalogsourceconfigs and the marketplace pod are deleted automatically, and the first reload of marketplace reports "clusteroperators.config.openshift.io" is forbidden.

Version-Release number of selected component (if applicable):
clusterversion: 4.0.0-0.nightly-2019-02-18-224151
marketplace image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:fa1bfe505ba77054fd42aa8d2af7094dbe3a19242639e18b6924b564f583799a
olm: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a8ca6bf86ff96fc7487ed4d80b7d8f9fa51c6a0fbc3b6ec95b3e73ea2f7fdf2a

How reproducible:
always

Steps to Reproduce:
1. Install the cluster.

Actual results:
1. The marketplace pod and the catalogsourceconfigs `certified-operators`, `community-operators`, `redhat-operators` disappear somehow, and the first reload of marketplace reports `"clusteroperators.config.openshift.io" is forbidden`. If you delete the pod, on the second reload the pod and the catalogsourceconfigs are reconciled successfully.

Expected results:
1. No delete actions.

Additional info:
1. The logs before the marketplace crash:
`time="2019-02-19T08:29:28Z" level=info msg="Out of sync, scheduling for reconciliation from 'Purging' phase" name=certified-operators namespace=openshift-marketplace type=OperatorSource
time="2019-02-19T08:29:28Z" level=info msg="Purging all resource(s)" name=certified-operators namespace=openshift-marketplace type=OperatorSource
time="2019-02-19T08:33:24Z" level=info msg="Purging all resource(s)" name=certified-operators namespace=openshift-marketplace type=OperatorSource
time="2019-02-19T08:33:24Z" level=info msg="Finalizer removed, now garbage collector will clean it up." name=community-operators targetNamespace=openshift-marketplace type=CatalogSourceConfig
time="2019-02-19T08:33:24Z" level=info msg="Reconciling CatalogSourceConfig openshift-marketplace/certified-operators\n"
time="2019-02-19T08:33:24Z" level=info msg="Reconciling CatalogSourceConfig openshift-marketplace/community-operators\n"
time="2019-02-19T08:33:24Z" level=info msg="Reconciling CatalogSourceConfig openshift-marketplace/redhat-operators\n"
time="2019-02-19T08:33:24Z" level=info msg="Finalizer removed, now garbage collector will clean it up." name=redhat-operators targetNamespace=openshift-marketplace type=CatalogSourceConfig
time="2019-02-19T08:33:24Z" level=info msg="Reconciling CatalogSourceConfig openshift-marketplace/redhat-operators\n"
time="2019-02-19T08:33:24Z" level=info msg="Finalizer removed, now garbage collector will clean it up." name=redhat-operators targetNamespace=openshift-marketplace type=CatalogSourceConfig
time="2019-02-19T08:33:24Z" level=info msg="Reconciling CatalogSourceConfig openshift-marketplace/redhat-operators\n"
2019/02/19 08:33:25 <nil>`

2. The logs for the first reload of marketplace:
`2019/02/18 07:23:47 Go Version: go1.10.8
2019/02/18 07:23:47 Go OS/Arch: linux/amd64
2019/02/18 07:23:47 operator-sdk Version: v0.3.0
time="2019-02-18T07:23:47Z" level=warning msg="ClusterOperator API not present: customresourcedefinitions.apiextensions.k8s.io \"clusteroperators.config.openshift.io\" is forbidden: User \"system:serviceaccount:openshift-marketplace:marketplace-operator\" cannot get resource \"customresourcedefinitions\" in API group \"apiextensions.k8s.io\" at the cluster scope"
2019/02/18 07:23:47 Registering Components.
2019/02/18 07:23:47 Starting the Cmd.
E0218 07:23:57.948887 1 reflector.go:134] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:196: Failed to list *v1.Secret: secrets is forbidden: User "system:serviceaccount:openshift-marketplace:marketplace-operator" cannot list resource "secrets" in API group "" in the namespace "openshift-marketplace"
time="2019-02-18T07:23:58Z" level=info msg="Reconciling CatalogSourceConfig openshift-marketplace/certified-operators\n"`

3. The logs for the second reload of marketplace:
`2019/02/19 10:17:21 Go Version: go1.10.8
2019/02/19 10:17:21 Go OS/Arch: linux/amd64
2019/02/19 10:17:21 operator-sdk Version: v0.3.0
2019/02/19 10:17:21 Registering Components.
2019/02/19 10:17:21 Starting the Cmd.
time="2019-02-19T10:17:21Z" level=info msg="[sync] Operator source sync loop will start after 10m0s"
time="2019-02-19T10:17:21Z" level=info msg="[sync] CatalogSourceConfig sync loop will start after 10m0s"
time="2019-02-19T10:17:21Z" level=info msg="Found existing ClusterOperator"
time="2019-02-19T10:17:21Z" level=info msg="Setting ClusterOperator condition: Available message: Operator running"
time="2019-02-19T10:17:34Z" level=info msg="Created Deployment certified-operators with registry command: [appregistry-server -s openshift-marketplace/certified-operators -o couchbase-enterprise,mongodb-enterprise]" name=certified-operators targetNamespace=openshift-marketplace type=CatalogSourceConfig
time="2019-02-19T10:17:34Z" level=info msg="Created Service certified-operators" name=certified-operators targetNamespace=openshift-marketplace type=CatalogSourceConfig
time="2019-02-19T10:17:34Z" level=info msg="Creating CatalogSource certified-operators" name=certified-operators targetNamespace=openshift-marketplace type=CatalogSourceConfig
time="2019-02-19T10:17:34Z" level=info msg="Created CatalogSource certified-operators" name=certified-operators targetNamespace=openshift-marketplace type=CatalogSourceConfig
time="2019-02-19T10:17:34Z" level=info msg="The object has been successfully reconciled" name=certified-operators targetNamespace=openshift-marketplace type=CatalogSourceConfig
time="2019-02-19T10:17:34Z" level=info msg="Reconciling CatalogSourceConfig openshift-marketplace/certified-operators\n"
time="2019-02-19T10:17:34Z" level=info msg="No action taken, the object has already been reconciled" name=certified-operators targetNamespace=openshift-marketplace type=CatalogSourceConfig`
(1. Logs before the marketplace crash)
The scenario here is that the marketplace-operator pod crashed. There is nothing in the logs to indicate why this crash happened.

(2. Logs for the first reload of marketplace)
Because the marketplace-operator pod is part of a Deployment, another instance is launched. The errors

"User \"system:serviceaccount:openshift-marketplace:marketplace-operator\" cannot get resource \"customresourcedefinitions\" in API group \"apiextensions.k8s.io\" at the cluster scope"

and

"Failed to list *v1.Secret: secrets is forbidden: User "system:serviceaccount:openshift-marketplace:marketplace-operator" cannot list resource "secrets" in API group "" in the namespace "openshift-marketplace""

indicate that the "ClusterRole" or "ClusterRoleBinding" for the "marketplace-operator" has disappeared. One theory I have is that whatever entity deleted the "ClusterRole" or "ClusterRoleBinding" also deleted the Deployment. The CVO then recreated the Deployment before recreating the "ClusterRole" or "ClusterRoleBinding".

(3. Logs for the second reload of marketplace)
During this reload it looks like the "ClusterRole" or "ClusterRoleBinding" for the "marketplace-operator" has been created again, allowing the operator to come up successfully and recreate the required resources.

So we need to figure out:
1. Why did the "marketplace-operator" crash in the first place?
2. Why did the "ClusterRole" or "ClusterRoleBinding" for the "marketplace-operator" disappear?

As a side note, please be aware that the `CatalogSourceConfigs` associated with "OperatorSources", and their child resources, are deleted and recreated to sync with Quay on every "marketplace-operator" restart. We plan to fix this bug soon, but it is not related to this issue.
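To pull the RBAC denials out of an operator log more systematically, here is a minimal Python sketch. It assumes the "forbidden" message wording seen in the logs in this report; `find_forbidden` and `FORBIDDEN_RE` are illustrative names, not part of any OpenShift tooling.

```python
import re

# Assumption: denials follow the wording seen in this report's logs:
#   User "<user>" cannot <verb> resource "<resource>" in API group "<group>"
#   followed by either 'at the cluster scope' or 'in the namespace "<ns>"'.
FORBIDDEN_RE = re.compile(
    r'User "(?P<user>[^"]+)" cannot (?P<verb>\w+) resource "(?P<resource>[^"]+)" '
    r'in API group "(?P<group>[^"]*)" '
    r'(?:at the cluster scope|in the namespace "(?P<namespace>[^"]+)")'
)


def find_forbidden(log_text):
    """Return one dict per RBAC denial found in log_text."""
    # logrus escapes quotes inside msg="..."; undo that so a single
    # pattern matches both the escaped and the unescaped form.
    text = log_text.replace('\\"', '"')
    denials = []
    for match in FORBIDDEN_RE.finditer(text):
        denial = match.groupdict()
        denial["scope"] = "namespace" if denial["namespace"] else "cluster"
        denials.append(denial)
    return denials


if __name__ == "__main__":
    # Two abbreviated lines in the style of the first-reload log.
    sample = (
        r'level=warning msg="... User \"system:serviceaccount:openshift-marketplace:marketplace-operator\" '
        r'cannot get resource \"customresourcedefinitions\" in API group \"apiextensions.k8s.io\" at the cluster scope"'
        "\n"
        'Failed to list *v1.Secret: secrets is forbidden: User "system:serviceaccount:openshift-marketplace:marketplace-operator" '
        'cannot list resource "secrets" in API group "" in the namespace "openshift-marketplace"'
    )
    for d in find_forbidden(sample):
        print(d["scope"], d["verb"], d["resource"], d["group"] or "(core)")
```

Listing the denied verb, resource and scope side by side shows which permissions were missing at startup, which can then be compared against the role and binding objects that are expected to exist for the service account.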
All the resources (OLM's packageserver, marketplace, all the pods) remain stable after the cluster-version-operator is stopped.
OLM shows the same behavior: it loses resources such as packageserver and serviceaccount (https://bugzilla.redhat.com/show_bug.cgi?id=1678606).
*** This bug has been marked as a duplicate of bug 1679309 ***