Bug 1855088 - [olm] can't install a CSV with duplicate roles on ocp 4.4.11
Summary: [olm] can't install a CSV with duplicate roles on ocp 4.4.11
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: OLM
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.6.0
Assignee: Evan Cordell
QA Contact: kuiwang
URL:
Whiteboard:
Duplicates: 1856413 1861077
Depends On:
Blocks: 1858482
 
Reported: 2020-07-08 23:01 UTC by Matt Dorn
Modified: 2023-12-15 18:25 UTC
CC: 12 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-27 16:13:10 UTC
Target Upstream Version:
Embargoed:




Links:
GitHub: operator-framework/operator-lifecycle-manager pull 1629 (closed): "Bug 1855088: generate unique (Cluster)RoleBinding names" (last updated 2021-01-21 21:24:57 UTC)
Red Hat Product Errata: RHBA-2020:4196 (last updated 2020-10-27 16:13:30 UTC)

Description Matt Dorn 2020-07-08 23:01:13 UTC
Description of problem:
Can't install a CSV that contains duplicate roles.

Version-Release number of selected component (if applicable):
Server Version: 4.4.11
Kubernetes Version: v1.17.1+166b070

OLM version: 0.14.2
git commit: 9f95e55cd49a63b85ab5ca5f087a9fa13af84a17


Steps to Reproduce:

1. Install an Operator whose CSV contains two identical roles assigned to different service accounts, e.g. the ArgoCD Operator (the relevant entries are excerpted below the link):

https://github.com/operator-framework/community-operators/blob/master/community-operators/argocd-operator/0.0.11/argocd-operator.v.0.0.11.clusterserviceversion.yaml#L704-L719
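
The relevant permissions entries, excerpted from that CSV (the same rule granted to two different service accounts; compare the excerpt in comment 10 below):

- rules:
  - apiGroups:
    - ""
    resources:
    - endpoints
    verbs:
    - get
  serviceAccountName: argocd-redis-ha
- rules:
  - apiGroups:
    - ""
    resources:
    - endpoints
    verbs:
    - get
  serviceAccountName: argocd-redis-ha-haproxy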

2. The Operator does not install; result:

$ oc get installplan
NAME            CSV                       APPROVAL    APPROVED
install-sl8qw   argocd-operator.v0.0.11   Automatic   true
$ oc get csv
NAME                      DISPLAY   VERSION   REPLACES   PHASE
argocd-operator.v0.0.11   Argo CD   0.0.11               Pending


...
- dependents:
    - group: rbac.authorization.k8s.io
      kind: PolicyRule
      message: namespaced rule:{"verbs":["get"],"apiGroups":[""],"resources":["endpoints"]}
      status: NotSatisfied
      version: v1beta1
    group: ""
    kind: ServiceAccount
    message: Policy rule not satisfied for service account
    name: argocd-redis-ha
    status: PresentNotSatisfied
    version: v1
...


The issue is specific to 4.4.11 (OLM version 0.14.2, git commit 9f95e55cd49a63b85ab5ca5f087a9fa13af84a17).

It does not appear on 4.4.10 (OLM version 0.14.2, git commit 759ff4592854c066797cac51bf8ee7f460b9a59a).

The issue seems specific to CSVs containing matching roles: a CSV with no matching roles installs successfully.

Comment 2 Oleg Matskiv 2020-07-09 17:30:37 UTC
This also appears to have been reported on GitHub:
https://github.com/operator-framework/operator-lifecycle-manager/issues/1625


My team bumped into this issue as well.
In the InstallPlan we noticed that two ClusterRoleBindings,
each for a different ServiceAccount, share the same name.

We found changes related to the CRB/RB names in `pkg/controller/registry/resolver/rbac.go`, where a recent commit[1] changed how the name is set:
...
- Name: generateName(fmt.Sprintf("%s-%s", role.GetName(), permission.ServiceAccountName)),
+ Name: generateName(role.GetName(), role),
...


[1] https://github.com/operator-framework/operator-lifecycle-manager/commit/a7659db1a4a2ed44c51fa075a4f03e8611e008a3#diff-64691a309491514cf415b817a09535c5
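
To illustrate the effect (a simplified sketch, not the exact objects OLM generates; the name placeholders are hypothetical): because the name suffix is now derived from the Role alone, the two RoleBindings produced for the duplicated role come out with identical names, roughly:

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: <role-name>-<hash-of-role>   # same suffix for both bindings
subjects:
- kind: ServiceAccount
  name: argocd-redis-ha
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: <role-name>
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: <role-name>-<hash-of-role>   # collides with the first binding
subjects:
- kind: ServiceAccount
  name: argocd-redis-ha-haproxy
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: <role-name>

Only one binding can exist under that name, so the other service account never receives the rule, which is consistent with the PresentNotSatisfied status in the description.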

Comment 4 Evan Cordell 2020-07-16 15:21:28 UTC
*** Bug 1856413 has been marked as a duplicate of this bug. ***

Comment 5 Keith Wall 2020-07-17 07:36:21 UTC
Is there an expected timescale for fixing this defect? Knowing this will help us (AMQ Online) decide whether to work around it or wait for the 4.4.x/4.5.x errata.

Comment 6 Oleg Matskiv 2020-07-17 13:12:02 UTC
(In reply to Keith Wall from comment #5)
> Is there an expected timescale for fixing this defect? Knowing this will
> help us (AMQ Online) decide whether to work around it or wait for the
> 4.4.x/4.5.x errata.

I created a PR with a fix; it has been reviewed and approved.
One automated test on the PR still needs to pass, though.

I am not very familiar with the release process, but I am sure the OLM team will help me with the next steps.
AFAIK there will be QE verification and a cherry-pick to the 4.4.z stream.
I am not sure whether this will make the next z-stream release, because code freeze is today.

Comment 10 kuiwang 2020-07-20 08:43:33 UTC
Verified on a 4.6 cluster. LGTM.

--
kuiwang@Kuis-MacBook-Pro 1855088 % oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-07-19-093912   True        False         3m48s   Cluster version is 4.6.0-0.nightly-2020-07-19-093912
kuiwang@Kuis-MacBook-Pro 1855088 % oc get pod -n openshift-operator-lifecycle-manager
NAME                               READY   STATUS    RESTARTS   AGE
catalog-operator-f966c476c-lnldf   1/1     Running   0          38m
olm-operator-b56cdfc54-qvz6z       1/1     Running   0          38m
packageserver-99bc676-ftwhb        1/1     Running   0          20m
packageserver-99bc676-jghn7        1/1     Running   0          20m
kuiwang@Kuis-MacBook-Pro 1855088 % oc exec olm-operator-b56cdfc54-qvz6z -n openshift-operator-lifecycle-manager -- olm --version
OLM version: 0.16.0
git commit: 171bd7e75f85f08092be14e892c57851a94cb9eb

kuiwang@Kuis-MacBook-Pro 1855088 % oc status   
In project default on server https://api.kuiwang20200720t154113.qe.devcluster.openshift.com:6443

svc/openshift - kubernetes.default.svc.cluster.local
svc/kubernetes - 172.30.0.1:443 -> 6443

View details with 'oc describe <resource>/<name>' or list everything with 'oc get all'.

kuiwang@Kuis-MacBook-Pro 1855088 % cat og.yaml 
kind: OperatorGroup
apiVersion: operators.coreos.com/v1
metadata:
  name: og-single
  namespace: default
spec:
  targetNamespaces:
  - default

kuiwang@Kuis-MacBook-Pro 1855088 % oc apply -f og.yaml 
operatorgroup.operators.coreos.com/og-single created

kuiwang@Kuis-MacBook-Pro 1855088 % cat sub.yaml         
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: argocd
  namespace: default
spec:
  channel: alpha
  installPlanApproval: Automatic
  name: argocd-operator
  source: community-operators
  sourceNamespace: openshift-marketplace
  startingCSV: argocd-operator.v0.0.11
kuiwang@Kuis-MacBook-Pro 1855088 % oc apply -f sub.yaml 
subscription.operators.coreos.com/argocd created

kuiwang@Kuis-MacBook-Pro 1855088 % oc get csv
NAME                      DISPLAY   VERSION   REPLACES   PHASE
argocd-operator.v0.0.11   Argo CD   0.0.11               Succeeded

kuiwang@Kuis-MacBook-Pro 1855088 % oc get csv argocd-operator.v0.0.11 -o yaml|grep -B 15 "serviceAccountName: argocd-redis-ha-haproxy"
      - rules:
        - apiGroups:
          - ""
          resources:
          - endpoints
          verbs:
          - get
        serviceAccountName: argocd-redis-ha
      - rules:
        - apiGroups:
          - ""
          resources:
          - endpoints
          verbs:
          - get
        serviceAccountName: argocd-redis-ha-haproxy
--

Comment 11 Nick Hale 2020-07-28 21:34:26 UTC
*** Bug 1861077 has been marked as a duplicate of this bug. ***

Comment 13 Fatima 2020-08-07 11:31:51 UTC
Hi team,

Any updates on this bug? Any tentative dates for this fix in OCP 4.4.x? 

Thanks

Comment 14 Fatima 2020-09-07 13:10:15 UTC
Hi team, 

Any updates on the progress of this bug? It's been a while.

Thanks.

Comment 16 errata-xmlrpc 2020-10-27 16:13:10 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

