Bug 1705117

Summary: Subscriptions to operator with multiple channels are not acted upon unless defaultChannel set for package
Product: OpenShift Container Platform
Reporter: David Zager <dzager>
Component: OLM
Assignee: Evan Cordell <ecordell>
OLM sub component: OLM
QA Contact: Salvatore Colangelo <scolange>
Status: CLOSED WONTFIX
Docs Contact:
Severity: low
Priority: medium
CC: bandrade, chezhang, chuo, dyan, ecordell, jfan, jiazha, scuppett, vlaad
Version: 4.1.0
Keywords: Reopened
Target Milestone: ---
Target Release: 4.1.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Rebase: Enhancements Only
Doc Text:
Story Points: ---
Clone Of:
: 1746193
Environment:
Last Closed: 2019-09-20 20:45:36 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1746193
Bug Blocks:

Description David Zager 2019-05-01 14:20:39 UTC
Note: Bugzilla is forcing me to select OperatorHub sub component in spite of the fact this issue has nothing to do with OperatorHub. It was not possible to leave the Sub Component blank.

Description of problem:

When a user subscribes to an operator with multiple channels, and a defaultChannel has not been set for the package, OLM does nothing with the Subscription.


Version-Release number of selected component (if applicable):
Client Version: version.Info{Major:"4", Minor:"0+", GitVersion:"v4.0.0-alpha.0+3bc8da4-1481-dirty", GitCommit:"3bc8da4", GitTreeState:"", BuildDate:"2019-03-05T02:25:42Z", GoVersion:"", Compiler:"", Platform:""}                      
Server Version: version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.4+6bdc5da", GitCommit:"6bdc5da", GitTreeState:"clean", BuildDate:"2019-05-01T03:59:59Z", GoVersion:"go1.11.8", Compiler:"gc", Platform:"linux/amd64"}

How reproducible: Always


Steps to Reproduce:

Recreation information can be found in https://github.com/djzager/olm-playground/tree/master/scenario3. I used both operator-registry and the ConfigMap method for defining a catalog.

1. (optional) Create namespace scenario3
apiVersion: v1
kind: Namespace
metadata:
  name: scenario3

2. (optional) Create Operator Group in scenario3 namespace
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: example-operator-a
  namespace: scenario3


3. Define ConfigMap with our test operators in openshift-operator-lifecycle-manager namespace

https://raw.githubusercontent.com/djzager/olm-playground/3fefe16d4bb97622c2c698f08836cf7d2731d353/scenario3/configmap/scenario3.configmap.yaml

The portion that is relevant is the packages section:
packages: |-
    - packageName: example-operator-a
      channels:
      - name: alpha
        currentCSV: example-operator-a.v0.0.1
      - name: stable
        currentCSV: example-operator-a.v1.0.0
    - packageName: example-operator-b
      channels:
      - name: alpha
        currentCSV: example-operator-b.v0.0.1
      - name: stable
        currentCSV: example-operator-b.v1.0.0

4. Create a CatalogSource in openshift-operator-lifecycle-manager namespace
https://raw.githubusercontent.com/djzager/olm-playground/3fefe16d4bb97622c2c698f08836cf7d2731d353/scenario3/configmap/scenario3.catalogsource.yaml

5. Create a Subscription to example-operator-a stable channel in scenario3 namespace.
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  generateName: example-operator-a-
  namespace: scenario3
spec:
  channel: stable
  name: example-operator-a
  source: scenario3
  sourceNamespace: openshift-operator-lifecycle-manager


Actual results: Nothing. I could find no evidence of any action taken on the created Subscription.


Expected results: example-operator-a to be installed (as well as example-operator-b to resolve the dependency).


Additional info:

It's worth mentioning that example-operator-a depends on the CRD provided by example-operator-b.

I looked for meaningful information in the status of the Subscription object and in the status/logs of the packageserver, catalog operator, and olm-operator, but was unable to find anything worthwhile.

Comment 1 David Zager 2019-05-01 14:24:56 UTC
I am confident that this issue hinges on whether the package specifies a defaultChannel, because I re-ran this scenario with the following changes to the ConfigMap and it succeeded.

```diff
--- a/scenario3/configmap/scenario3.configmap.yaml
+++ b/scenario3/configmap/scenario3.configmap.yaml
@@ -607,12 +607,14 @@ data:
         version: 1.0.0
   packages: |-
     - packageName: example-operator-a
+      defaultChannel: alpha
       channels:
       - name: alpha
         currentCSV: example-operator-a.v0.0.1
       - name: stable
         currentCSV: example-operator-a.v1.0.0
     - packageName: example-operator-b
+      defaultChannel: alpha
       channels:
       - name: alpha
         currentCSV: example-operator-b.v0.0.1
```

Comment 2 Evan Cordell 2019-05-01 18:33:55 UTC
Default channels are required, so this is expected behavior. When there is only one channel, it is assumed to be the default.

I am surprised there weren't errors on the registry pod; I thought we validated this and errored out. I will try to verify this behavior and update.
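
The rule stated in this comment can be sketched as follows. This is a minimal illustration only, not OLM or operator-registry code: the function name `resolve_default_channel` is hypothetical, the package entries are plain dictionaries mirroring the ConfigMap from the report, and treating a `defaultChannel` that names no existing channel as unresolved is an assumption.

```python
from typing import Optional


def resolve_default_channel(package: dict) -> Optional[str]:
    """Return the channel to treat as the package default, or None if unresolved.

    Hypothetical helper; it only illustrates the rule from the comment above.
    """
    channels = [c["name"] for c in package.get("channels", [])]
    declared = package.get("defaultChannel")
    if declared:
        # Assumption: an explicit default only counts if it names a real channel.
        return declared if declared in channels else None
    if len(channels) == 1:
        # A lone channel is assumed to be the default.
        return channels[0]
    # Several channels and no explicit default: nothing can be resolved.
    return None


# Package entries mirroring the ConfigMap from the report.
without_default = {
    "packageName": "example-operator-a",
    "channels": [
        {"name": "alpha", "currentCSV": "example-operator-a.v0.0.1"},
        {"name": "stable", "currentCSV": "example-operator-a.v1.0.0"},
    ],
}
with_default = {**without_default, "defaultChannel": "alpha"}
single_channel = {
    "packageName": "example-operator-a",
    "channels": [{"name": "stable", "currentCSV": "example-operator-a.v1.0.0"}],
}

print(resolve_default_channel(without_default))  # None
print(resolve_default_channel(with_default))     # alpha
print(resolve_default_channel(single_channel))   # stable
```

The first case corresponds to the package as originally reported: two channels, no defaultChannel, so no default can be determined.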

Comment 3 Stephen Cuppett 2019-05-01 19:15:57 UTC
Closing to get more complete repro data. Can you reopen if the Status is not set with the appropriate error (and include the full status)?

Evan will add the JIRA link when we have it.

Comment 4 Evan Cordell 2019-05-01 19:50:01 UTC
PR is here to error out if there is no default channel: https://github.com/operator-framework/operator-registry/pull/46

This won't be visible on the Catalog status until this issue is done for 4.2: https://jira.coreos.com/browse/OLM-929
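
As a rough illustration of what "erroring out" at catalog load time could look like, here is a small sketch. It is not the code from the linked PR: `CatalogValidationError`, `find_missing_defaults`, and `load_catalog` are hypothetical names, and the error message wording is invented for the example.

```python
class CatalogValidationError(ValueError):
    """Hypothetical error type for a catalog that should not be served."""


def find_missing_defaults(packages: list) -> list:
    """Return names of packages with several channels and no defaultChannel."""
    return [
        pkg["packageName"]
        for pkg in packages
        if len(pkg.get("channels", [])) > 1 and not pkg.get("defaultChannel")
    ]


def load_catalog(packages: list) -> list:
    """Refuse to load a catalog if any package lacks a usable default channel."""
    missing = find_missing_defaults(packages)
    if missing:
        # Fail loudly instead of silently ignoring Subscriptions later on.
        raise CatalogValidationError(
            "no default channel specified for: " + ", ".join(missing)
        )
    return packages


# One package without a default and one with, loosely based on the report.
packages = [
    {
        "packageName": "example-operator-a",
        "channels": [{"name": "alpha"}, {"name": "stable"}],
    },
    {
        "packageName": "example-operator-b",
        "defaultChannel": "alpha",
        "channels": [{"name": "alpha"}, {"name": "stable"}],
    },
]

try:
    load_catalog(packages)
except CatalogValidationError as err:
    print(f"catalog rejected: {err}")
```

Failing at load time surfaces the problem once, at the catalog, rather than leaving each affected Subscription without any feedback.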

Comment 5 David Zager 2019-05-01 20:26:55 UTC
But this still means that if a package happens to exist without a default channel specified, you will leave no indication on the Subscription object that something is wrong.

> Closing to get more complete repro data.

I'm not sure what more information I could provide for you. Maybe I missed something.

Comment 6 Evan Cordell 2019-05-01 20:29:45 UTC
David,

We have an epic for 4.2 around reporting better status for our subscription flow. The specific issue I linked in Jira will make the changes from the PR I linked visible on the Catalog status. Until then, the PR will make this problem as visible to the user as other catalog problems are (the registry pod crashing with a termination log).

Comment 7 David Zager 2019-05-01 21:09:09 UTC
Fair enough. It's probably better for the package maker to be deciding the default channel than OLM.

I don't know who we expect to be making packages and channels, but I wasn't able to find this requirement for a defaultChannel documented in any of https://github.com/operator-framework/operator-lifecycle-manager/tree/master/Documentation/design or in the operator-registry's README. Admittedly, I couldn't really find any documentation on how to properly write these packages, so it may be a good idea to document somewhere how to do this correctly.

Comment 8 Evan Cordell 2019-05-07 15:55:28 UTC
Moving to 4.1.z; we can't merge the PR until community-operators updates.

Comment 11 Evan Cordell 2019-07-25 14:14:27 UTC
This is fixed in 4.2 and master, but not 4.1 - is this something that we should backport?

Comment 15 Evan Cordell 2019-09-20 20:45:36 UTC
We are much better about verifying the content shipped via CVP now, including checking for default channels. I don't think this is worth backporting - none of the content coming from Red Hat will have this issue.