Bug 1768820 - Projects get stuck in Terminating status with "object *v1beta1.ServiceBindingList does not implement the protobuf marshalling interface and cannot be encoded to a protobuf message ..."
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Service Catalog
Version: 4.3.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ---
Target Release: 4.3.0
Assignee: Fabian von Feilitzsch
QA Contact: Fan Jia
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-11-05 10:15 UTC by Jian Zhang
Modified: 2020-01-23 11:11 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-01-23 11:10:52 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2020:0062 0 None None None 2020-01-23 11:11:05 UTC

Description Jian Zhang 2019-11-05 10:15:18 UTC
Description of problem:
Projects get stuck in Terminating status for a long time. For example:
openshift-operators-redhat                              Terminating   6h3m
...
qitang2                                                 Terminating   139m
qitang3                                                 Terminating   93m
test-operator                                           Terminating   66m
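
Affected projects can also be found from the JSON form of the namespace list rather than by reading the table. The following is a minimal Python sketch, assuming input shaped like the output of "oc get namespaces -o json". The function name stuck_namespaces, the 30-minute threshold, and the sample data are illustrative assumptions and not part of the original report; only the qitang3 deletionTimestamp is taken from this bug.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(ts):
    """Parse a Kubernetes timestamp such as 2019-11-05T07:32:36Z as UTC."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def stuck_namespaces(ns_list, now, min_age=timedelta(minutes=30)):
    """Return names of namespaces that have been Terminating for at least min_age."""
    stuck = []
    for item in ns_list.get("items", []):
        phase = item.get("status", {}).get("phase")
        deleted_at = item.get("metadata", {}).get("deletionTimestamp")
        if phase != "Terminating" or deleted_at is None:
            continue
        if now - parse_ts(deleted_at) >= min_age:
            stuck.append(item["metadata"]["name"])
    return stuck


# Hypothetical sample: a healthy namespace plus qitang3 as reported in this bug.
sample = {
    "items": [
        {"metadata": {"name": "default"}, "status": {"phase": "Active"}},
        {
            "metadata": {
                "name": "qitang3",
                "deletionTimestamp": "2019-11-05T07:32:36Z",
            },
            "status": {"phase": "Terminating"},
        },
    ]
}

# Evaluate at the time this bug was reported.
now = parse_ts("2019-11-05T10:15:18Z")
print(stuck_namespaces(sample, now))
```

Against a live cluster, the sample dictionary would be replaced by the parsed JSON output of the command named above, and now by the current UTC time.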

Version-Release number of selected component (if applicable):
Cluster version is 4.3.0-0.nightly-2019-11-02-092336 

How reproducible:
always

Steps to Reproduce:
1. Install OCP 4.3.
2. Create a namespace.
# oc create ns test-operator 
3. Delete it.
# oc delete ns test-operator 

Actual results:
The test-operator project stays in Terminating status indefinitely.

Expected results:
The namespace can be deleted successfully.

Additional info:
I also deleted all resources in it, but it is still in Terminating status.
clusterserviceversion.operators.coreos.com "elasticsearch-operator.4.3.0-201911041716" deleted
mac:~ jianzhang$ oc get csv -n test-operators
No resources found.
mac:~ jianzhang$ oc get sub -n test-operators
No resources found.
mac:~ jianzhang$ oc get catalogsource -n test-operators
No resources found.

The copied CSV is recreated after it is deleted:
mac:~ jianzhang$ oc get csv -n test-operators
NAME                                        DISPLAY                  VERSION              REPLACES   PHASE
elasticsearch-operator.4.3.0-201911041716   Elasticsearch Operator   4.3.0-201911041716              Succeeded


Related logs:
mac:~ jianzhang$ oc logs kube-apiserver-preserve-huirwang-110-mjnxx-control-plane-0  -c kube-apiserver-10  -n openshift-kube-apiserver  |grep test-operator
...
I1105 08:59:50.053553       1 trace.go:116] Trace[52881584]: "Delete" url:/api/v1/namespaces/test-operator/podtemplates (started: 2019-11-05 08:59:48.825887763 +0000 UTC m=+4913.704412558) (total time: 1.227650927s):
I1105 08:59:56.065361       1 trace.go:116] Trace[1701676062]: "List etcd3" key:/roles/test-operator,resourceVersion:,limit:0,continue: (started: 2019-11-05 08:59:51.106018174 +0000 UTC m=+4915.984542978) (total time: 4.959294172s):
I1105 08:59:56.065489       1 trace.go:116] Trace[1633355098]: "Delete" url:/apis/rbac.authorization.k8s.io/v1/namespaces/test-operator/roles (started: 2019-11-05 08:59:51.105912766 +0000 UTC m=+4915.984437561) (total time: 4.959543554s):
I1105 09:00:03.229646       1 trace.go:116] Trace[968228445]: "List etcd3" key:/osb.openshift.io/automationbrokers/test-operator,resourceVersion:,limit:0,continue: (started: 2019-11-05 08:59:56.167930343 +0000 UTC m=+4921.046455146) (total time: 7.061667739s):
I1105 09:00:03.229962       1 trace.go:116] Trace[1304112249]: "Delete" url:/apis/osb.openshift.io/v1/namespaces/test-operator/automationbrokers (started: 2019-11-05 08:59:56.167774266 +0000 UTC m=+4921.046299050) (total time: 7.062175285s):
I1105 09:08:22.933986       1 trace.go:116] Trace[247287010]: "List etcd3" key:/operators.coreos.com/subscriptions/test-operators,resourceVersion:,limit:0,continue: (started: 2019-11-05 09:08:22.028700734 +0000 UTC m=+5426.907225535) (total time: 905.265981ms):
I1105 09:08:22.934157       1 trace.go:116] Trace[2126415962]: "List" url:/apis/operators.coreos.com/v1alpha1/namespaces/test-operators/subscriptions (started: 2019-11-05 09:08:22.028680978 +0000 UTC m=+5426.907205760) (total time: 905.46437ms):
I1105 09:08:26.054225       1 trace.go:116] Trace[604097737]: "List etcd3" key:/operators.coreos.com/subscriptions/test-operators,resourceVersion:,limit:0,continue: (started: 2019-11-05 09:08:25.028746226 +0000 UTC m=+5429.907271030) (total time: 1.025450501s):
I1105 09:08:26.054441       1 trace.go:116] Trace[975003958]: "List" url:/apis/operators.coreos.com/v1alpha1/namespaces/test-operators/subscriptions (started: 2019-11-05 09:08:25.028720338 +0000 UTC m=+5429.907245120) (total time: 1.025707508s):
I1105 09:08:26.054749       1 trace.go:116] Trace[616813751]: "List etcd3" key:/operators.coreos.com/subscriptions/test-operators,resourceVersion:,limit:0,continue: (started: 2019-11-05 09:08:24.82873049 +0000 UTC m=+5429.707255286) (total time: 1.226004633s):
I1105 09:08:26.055122       1 trace.go:116] Trace[1100523733]: "List" url:/apis/operators.coreos.com/v1alpha1/namespaces/test-operators/subscriptions (started: 2019-11-05 09:08:24.828696203 +0000 UTC m=+5429.707220989) (total time: 1.226413248s):

Comment 1 Xingxing Xia 2019-11-05 10:56:02 UTC
Take the project "qitang3" for example:
oc get project qitang3 -o yaml # shows the ContentDeletionFailed message below about Service Catalog objects
apiVersion: project.openshift.io/v1
kind: Project
metadata:
  annotations:
    openshift.io/description: ""
    openshift.io/display-name: ""
    openshift.io/requester: system:admin
    openshift.io/sa.scc.mcs: s0:c29,c9
    openshift.io/sa.scc.supplemental-groups: 1000830000/10000
    openshift.io/sa.scc.uid-range: 1000830000/10000
  creationTimestamp: "2019-11-05T07:32:23Z"
  deletionTimestamp: "2019-11-05T07:32:36Z"
  name: qitang3
  resourceVersion: "582633"
  selfLink: /apis/project.openshift.io/v1/projects/qitang3
  uid: 0896cf23-332c-43a6-8bd5-cfedc2615e68
spec:
  finalizers:
  - kubernetes
status:
  conditions:
  - lastTransitionTime: "2019-11-05T07:38:12Z"
    message: All resources successfully discovered
    reason: ResourcesDiscovered
    status: "False"
    type: NamespaceDeletionDiscoveryFailure
  - lastTransitionTime: "2019-11-05T07:32:43Z"
    message: All legacy kube types successfully parsed
    reason: ParsedGroupVersions
    status: "False"
    type: NamespaceDeletionGroupVersionParsingFailure
  - lastTransitionTime: "2019-11-05T07:38:12Z"
    message: 'Failed to delete all resource types, 5 remaining: object *v1beta1.ServiceBindingList
      does not implement the protobuf marshalling interface and cannot be encoded
      to a protobuf message, object *v1beta1.ServiceBrokerList does not implement
      the protobuf marshalling interface and cannot be encoded to a protobuf message,
      object *v1beta1.ServiceClassList does not implement the protobuf marshalling
      interface and cannot be encoded to a protobuf message, object *v1beta1.ServiceInstanceList
      does not implement the protobuf marshalling interface and cannot be encoded
      to a protobuf message, object *v1beta1.ServicePlanList does not implement the
      protobuf marshalling interface and cannot be encoded to a protobuf message'
    reason: ContentDeletionFailed
    status: "True"
    type: NamespaceDeletionContentFailure
  - lastTransitionTime: "2019-11-05T07:32:43Z"
    message: All content successfully removed
    reason: ContentRemoved
    status: "False"
    type: NamespaceContentRemaining
  - lastTransitionTime: "2019-11-05T07:32:43Z"
    message: All content-preserving finalizers finished
    reason: ContentHasNoFinalizers
    status: "False"
    type: NamespaceFinalizersRemaining
  phase: Terminating
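
In the status above, each condition type names a failure mode, so a status of "True" appears to mark the stage that is blocking deletion, while "False" means that stage is clean. The following is a minimal Python sketch that picks out the blocking conditions, assuming input shaped like the output of "oc get namespace qitang3 -o json". The function name failing_conditions is hypothetical, and the sample is a trimmed copy of the conditions above with messages omitted.

```python
def failing_conditions(namespace):
    """Return (type, reason) for every namespace condition whose status is "True".

    Assumption: all deletion condition types are phrased as failure modes,
    as in the qitang3 status in this report, so "True" means "blocked here".
    """
    conditions = namespace.get("status", {}).get("conditions", [])
    return [
        (c.get("type"), c.get("reason"))
        for c in conditions
        if c.get("status") == "True"
    ]


# Trimmed copy of the qitang3 conditions from this report (messages omitted).
sample = {
    "status": {
        "phase": "Terminating",
        "conditions": [
            {"type": "NamespaceDeletionDiscoveryFailure",
             "reason": "ResourcesDiscovered", "status": "False"},
            {"type": "NamespaceDeletionGroupVersionParsingFailure",
             "reason": "ParsedGroupVersions", "status": "False"},
            {"type": "NamespaceDeletionContentFailure",
             "reason": "ContentDeletionFailed", "status": "True"},
            {"type": "NamespaceContentRemaining",
             "reason": "ContentRemoved", "status": "False"},
            {"type": "NamespaceFinalizersRemaining",
             "reason": "ContentHasNoFinalizers", "status": "False"},
        ],
    }
}

print(failing_conditions(sample))
```

For qitang3 this leaves only NamespaceDeletionContentFailure, which matches the protobuf marshalling errors reported for the five Service Catalog list types.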

Comment 2 Jesus M. Rodriguez 2019-11-06 22:30:38 UTC
Can you please give me a bit more information? The reproducer in the original description simply says to create a namespace and then delete it. Comment #1 seems to indicate that the Service Catalog was enabled explicitly, because it is not enabled by default. How was the environment deployed? Were these projects deployed with Service Catalog or not?

Comment 3 Jian Zhang 2019-11-07 09:25:51 UTC
Jesus,

Yes, Service Catalog was enabled manually, and the ASB was deployed.

Comment 4 Qin Ping 2019-11-14 02:44:48 UTC
I hit this issue too.

At the same time, I found that the service-catalog-apiserver clusteroperator is not Available:

$ oc describe co service-catalog-apiserver
Name:         service-catalog-apiserver
Namespace:    
Labels:       <none>
Annotations:  <none>
API Version:  config.openshift.io/v1
Kind:         ClusterOperator
Metadata:
  Creation Timestamp:  2019-11-13T02:28:44Z
  Generation:          1
  Resource Version:    221379
  Self Link:           /apis/config.openshift.io/v1/clusteroperators/service-catalog-apiserver
  UID:                 efb57472-70de-4e7f-bcc8-c10ddddf21e7
Spec:
Status:
  Conditions:
    Last Transition Time:  2019-11-13T02:28:46Z
    Reason:                AsExpected
    Status:                False
    Type:                  Degraded
    Last Transition Time:  2019-11-13T02:47:18Z
    Reason:                AsExpected
    Status:                False
    Type:                  Progressing
    Last Transition Time:  2019-11-13T11:40:03Z
    Message:               Available: v1beta1.servicecatalog.k8s.io is not ready: 503
    Reason:                Available
    Status:                False
    Type:                  Available
    Last Transition Time:  2019-11-13T02:46:42Z
    Reason:                AsExpected
    Status:                True
    Type:                  Upgradeable
  Extension:               <nil>
  Related Objects:
    Group:     
    Name:      openshift-config
    Resource:  namespaces
    Group:     
    Name:      openshift-config-managed
    Resource:  namespaces
    Group:     
    Name:      openshift-service-catalog-apiserver-operator
    Resource:  namespaces
    Group:     
    Name:      openshift-service-catalog-apiserver
    Resource:  namespaces
    Group:     apiregistration.k8s.io
    Name:      v1beta1.servicecatalog.k8s.io
    Resource:  apiservices
  Versions:
    Name:     operator
    Version:  4.3.0-0.nightly-2019-11-12-185229
    Name:     service-catalog-apiserver
    Version:  
Events:       <none>
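
The same check can be run across all cluster operators at once. The following is a minimal Python sketch, assuming input shaped like the output of "oc get clusteroperators -o json". The function name unavailable_operators and the healthy sample operator are hypothetical; the service-catalog-apiserver conditions are copied from the output above.

```python
def unavailable_operators(co_list):
    """Map operator name to its Available message for operators not Available=True.

    An operator with no Available condition at all is also reported,
    with an empty message.
    """
    result = {}
    for item in co_list.get("items", []):
        conditions = item.get("status", {}).get("conditions", [])
        available = next(
            (c for c in conditions if c.get("type") == "Available"), None
        )
        if available is None or available.get("status") != "True":
            name = item["metadata"]["name"]
            result[name] = (available or {}).get("message", "")
    return result


sample = {
    "items": [
        {
            # Conditions as shown in this report.
            "metadata": {"name": "service-catalog-apiserver"},
            "status": {
                "conditions": [
                    {"type": "Degraded", "status": "False", "reason": "AsExpected"},
                    {"type": "Progressing", "status": "False", "reason": "AsExpected"},
                    {
                        "type": "Available",
                        "status": "False",
                        "reason": "Available",
                        "message": "Available: v1beta1.servicecatalog.k8s.io is not ready: 503",
                    },
                    {"type": "Upgradeable", "status": "True", "reason": "AsExpected"},
                ]
            },
        },
        {
            # Hypothetical healthy operator, included only for contrast.
            "metadata": {"name": "example-healthy-operator"},
            "status": {
                "conditions": [
                    {"type": "Available", "status": "True", "reason": "AsExpected"}
                ]
            },
        },
    ]
}

print(unavailable_operators(sample))
```

Only service-catalog-apiserver is reported for this sample, together with the 503 message for the v1beta1.servicecatalog.k8s.io API service.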

Comment 5 Jesus M. Rodriguez 2019-11-20 15:40:26 UTC
Fixed by PR https://github.com/openshift/service-catalog/pull/59

Comment 7 Fan Jia 2019-11-22 07:30:19 UTC
The latest nightly build does not include the fix PR yet. I will test once a nightly build that includes it is ready.

Comment 9 Fan Jia 2019-11-25 07:12:23 UTC
test env:
cv:4.3.0-0.nightly-2019-11-24-183610

test result:
1. oc new-project kaka
2. enable service-catalog-apiserver & service-catalog-controller-manager
3. oc delete ns kaka
ns "kaka" is deleted successfully.

Comment 11 errata-xmlrpc 2020-01-23 11:10:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062

