Bug 1678285 - Couchbase-cluster unable to deploy because persistentvolume permissions are not created for couchbase-operator
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: OLM
Version: 3.11.0
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.11.0
Assignee: Evan Cordell
QA Contact: Fan Jia
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-02-18 12:40 UTC by Michal Gacek
Modified: 2023-09-14 05:23 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-09-03 14:51:42 UTC
Target Upstream Version:
Embargoed:



Description Michal Gacek 2019-02-18 12:40:06 UTC
Description of problem:
When subscribing to the certified Couchbase-operator, the InstallPlan creates only a Role and RoleBindings. Couchbase-operator requires get and watch permissions on persistentvolumes. Because persistentvolumes are cluster-scoped objects, a ClusterRole and ClusterRoleBinding should be created as well.

The Couchbase-operator CSV lists the persistentvolumes verbs under its 'permissions' section, which is wrong: the 'permissions' object in the CSV CR is responsible only for creating namespaced Roles and RoleBindings. Since persistentvolumes is a cluster-scoped object, the CSV should be patched with a "clusterPermissions" section that grants the get and watch verbs on persistentvolumes. Even when the catalog is patched manually with a corrected CSV, the installed version of the OLM operator does not seem to understand clusterPermissions, because InstallPlans still do not create the needed ClusterRole and ClusterRoleBinding.
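
As a rough sketch of the patch described above, the persistentvolumes rule would move out of 'permissions' and into 'clusterPermissions' in the CSV install strategy. The fragment below is illustrative only: the serviceAccountName value and the surrounding layout are assumptions based on the general CSV format, not copied from the actual Couchbase CSV, and all unrelated fields are omitted.

```yaml
# Illustrative CSV fragment, NOT the actual Couchbase CSV.
apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
spec:
  install:
    strategy: deployment
    spec:
      # Cluster-scoped rules. An OLM version that supports this field
      # is expected to create a ClusterRole and ClusterRoleBinding from it.
      clusterPermissions:
        - serviceAccountName: couchbase-operator  # assumed service account name
          rules:
            - apiGroups: [""]
              resources: ["persistentvolumes"]
              verbs: ["get", "watch"]
      # The existing namespaced rules stay under 'permissions'
      # (omitted here) and become a Role and RoleBinding.
```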

This prevents couchbase-operator from installing Couchbase with persistent volumes, which defeats the whole purpose of using couchbase-operator: who wants a database server without the ability to persist data?

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Subscribe to Couchbase-operator in namespace allowed for OLM operators
2. Apply a CouchbaseCluster CR with persistent volumes defined

Sample cb-pv CR taken from couchbase documentation:

```
apiVersion: couchbase.com/v1
kind: CouchbaseCluster
metadata:
  name: cb-cluster-pv
spec:
  baseImage: registry.connect.redhat.com/couchbase/server
  version: 5.5.3-3
  authSecret: cb-example-auth
  adminConsoleServices:
    - data
  cluster:
    dataServiceMemoryQuota: 512
    indexServiceMemoryQuota: 512
    searchServiceMemoryQuota: 512
    eventingServiceMemoryQuota: 512
    analyticsServiceMemoryQuota: 1024
    indexStorageSetting: memory_optimized
    autoFailoverTimeout: 120
    autoFailoverMaxCount: 3
    autoFailoverOnDataDiskIssues: true
    autoFailoverOnDataDiskIssuesTimePeriod: 120
    autoFailoverServerGroup: false
  buckets:
    - name: test1
      type: couchbase
      memoryQuota: 128
      replicas: 1
      ioPriority: high
      evictionPolicy: fullEviction
      conflictResolution: seqno
      enableFlush: true
      enableIndexReplica: false
  securityContext:
    fsGroup: 1001940000
  servers:
    - name: west-2a
      size: 1
      services:
        - data
        - index
      serverGroups:
        - us-west-2a
      pod:
        volumeMounts:
          default: us-west-2a
          data: us-west-2a
  volumeClaimTemplates:
    - metadata:
        name: us-west-2a
      spec:
        storageClassName: "gp2-us-west-2a"
        resources:
          requests:
            storage: 10Gi
```
Actual results:
Couchbase-operator is unable to start the Couchbase pod, with the following error in the logs:
time="2019-02-12T18:47:32Z" level=info msg="Creating a pod (cb-cluster-pv-0000) running Couchbase 5.5.3-3" cluster-name=cb-cluster-pv module=cluster
time="2019-02-12T18:47:39Z" level=info msg="deleted pod (cb-cluster-pv-0000)" cluster-name=cb-cluster-pv module=cluster
time="2019-02-12T18:47:39Z" level=error msg="Cluster setup failed: fail to create member's pod (cb-cluster-pv-0000): unknown (get persistentvolumes)" cluster-name=cb-cluster-pv module=cluster
time="2019-02-12T18:47:39Z" level=warning msg="Fail to handle event: ignore failed cluster (cb-cluster-pv). Please delete its CR"

Additionally, no ClusterRole or ClusterRoleBinding is created for the couchbase-operator serviceaccount, which causes the above error in the logs.

To mitigate the problem manually, a ClusterRole with get and watch permissions on persistentvolumes needs to be created and bound to the couchbase-operator serviceaccount. Only then is couchbase-operator able to start Couchbase pods backed by PVs.
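
A minimal sketch of that manual workaround might look like the following. The object name couchbase-operator-pv-reader and the namespace my-namespace are hypothetical placeholders; the service account name is assumed to be couchbase-operator, as referred to above.

```yaml
# Sketch of the manual workaround; names marked below are placeholders.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: couchbase-operator-pv-reader   # hypothetical name
rules:
  - apiGroups: [""]                    # core API group
    resources: ["persistentvolumes"]
    verbs: ["get", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: couchbase-operator-pv-reader   # hypothetical name
subjects:
  - kind: ServiceAccount
    name: couchbase-operator           # assumed service account name
    namespace: my-namespace            # placeholder: namespace of the Subscription
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: couchbase-operator-pv-reader
```

Saved to a file, this could be applied with `oc apply -f <file>` by a user with cluster-admin rights.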

Expected results:
Couchbase pods start normally when the CSV has proper clusterPermissions, because the InstallPlan created when subscribing to couchbase-operator also installs the proper ClusterRole and ClusterRoleBinding.

Additional info:
All of this is probably caused by two things: an old OLM that does not support clusterPermissions in the CSV CR, and a bad CSV CR that lists the permissions for persistentvolumes under permissions instead of under the clusterPermissions object.

OLM should be updated to 0.7.x and the CSV patched to list clusterPermissions.

Comment 1 Nick Hale 2019-02-27 18:05:11 UTC
This is not an OLM bug, but an issue with the CSV in community-operators. Is there another place we can file this?

Comment 3 Michal Gacek 2019-02-28 07:32:02 UTC
That's not true. One part is the broken CSV, which I suspect comes from community-operators; an issue has been filed there, and the team acknowledged that they will fix it. But even with a correct CSV, the OLM shipped with OCP 3.11 does not understand clusterPermissions, I suspect because it is too old. As for another place to file this: if it ships with OCP 3.11, it should be fixed in openshift-ansible as well.

So again, there are two problems:
1. The CSV is broken.
2. OLM does not understand the fixed CSV anyway (the OLM is too old to process the clusterPermissions object in the CSV).

Comment 11 Eric Rich 2019-03-21 19:28:33 UTC
Moving this back to assigned as we are looking at backporting OLM from 4.x to 3.11 to possibly resolve parts of this.

Comment 12 Jessica Forrester 2019-03-29 19:36:43 UTC
Modifying the target release because the remaining issue doesn't impact 4.1.

Comment 17 Dan Geoffroy 2019-09-03 14:51:42 UTC
OLM does not support clusterPermissions in 3.11. However, it does support them in 4.1. Please try again on a 4.1 cluster, and if you have issues with OLM at that stage, please open a new bug against that version. If there are any further issues with the operator itself, please open an issue on GitHub and track it there if it's with the community version, or reach out to Tony Campbell directly if it's an ISV-certified issue, so that he in turn can reach out to the vendor to get it fixed.

Comment 18 Red Hat Bugzilla 2023-09-14 05:23:52 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

