Bug 1889961

Summary: Install etcd Operator on OCP/Z failed without Multi-arch support
Product: OpenShift Developer Tools and Services
Reporter: XiaoMei Zheng <xmzheng>
Component: odo
Assignee: Dharmit Shah <dshah>
Status: CLOSED UPSTREAM
QA Contact: Amit <arout>
Severity: urgent
Docs Contact: Robert Krátký <rkratky>
Priority: unspecified
Version: 4.6
CC: brueckner, dshah, oskutka, shahdharmit, stamba, tkral, xmzheng
Target Milestone: ---
Target Release: 4.3
Hardware: s390x
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-09-07 04:29:56 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description XiaoMei Zheng 2020-10-21 05:31:52 UTC
Description of problem:

The project owner (a common user, such as developer) gets errors when installing the operator from the OperatorHub in a project.
The cluster-admin user, or a common user with the cluster-admin role, can install the operator service.
The error is:
Deploying service of type: EtcdCluster
 ✗  Deploying service [1ms]
 ✗  etcdclusters.etcd.database.coreos.com is forbidden: User "developer" cannot create resource "etcdclusters" in API group "etcd.database.coreos.com" in the namespace "myproject"

Version-Release number of selected component (if applicable):

Cluster version is OCP 4.5.0-0.nightly-s390x-2020-07-10-214321
operator-lifecycle-manager quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:23dc5d2cfffca61bb5df1c762a9da76ce64dcfc6224581c1d70cf92911d37ae2

How reproducible:
always

Steps to Reproduce:
1. Install OCP 4.5 on IBM Z.
2. Install odo 1.2.6 on OCP/Z.
3. Log in to the cluster as the kube:admin user and install the etcd-operator:
$ odo catalog list services
Operators available in the cluster
NAME                                    CRDs
etcdoperator.v0.9.4-clusterwide         EtcdCluster, EtcdBackup, EtcdRestore
service-binding-operator.v0.1.1-364     ServiceBindingRequest
4. Log in to the cluster as a common user (developer) and create a project called "myproject".
5. As the developer user, install the etcd-operator service in project "myproject" with odo:
$ odo service create etcdoperator.v0.9.4-clusterwide/EtcdCluster

Actual results:
Deploying service of type: EtcdCluster
 ✗  Deploying service [1ms]
 ✗  etcdclusters.etcd.database.coreos.com is forbidden: User "developer" cannot create resource "etcdclusters" in API group "etcd.database.coreos.com" in the namespace "myproject"

Expected results:
Deploying service of type: EtcdCluster
 ✓  Deploying service [7ms]
 ✓  Service '' was created

Progress of the provisioning will not be reported and might take a long time
You can see the current status by executing 'odo service list'
Optionally, link etcdoperator.v0.9.4-clusterwide to your component by running: 'odo link <component-name>'

./odo service list
NAME                    AGE
EtcdCluster/example     29s

Additional info:
1. Log in as a cluster-admin user (system:admin or kubeadmin).
2. ./odo service create etcdoperator.v0.9.4-clusterwide/EtcdCluster
Deploying service of type: EtcdCluster
 ✓  Deploying service [7ms]
 ✓  Service '' was created

Progress of the provisioning will not be reported and might take a long time
You can see the current status by executing 'odo service list'
Optionally, link etcdoperator.v0.9.4-clusterwide to your component by running: 'odo link <component-name>'

./odo service list
NAME                    AGE
EtcdCluster/example     29s
3. However, the operator status is Failed because of an exec format error:

oc get csv -n openshift-operators
NAME                                     DISPLAY                           VERSION             REPLACES                              PHASE
etcdoperator.v0.9.4-clusterwide          etcd                              0.9.4-clusterwide                                         Failed
percona-server-mongodb-operator.v1.4.0   Percona Server MongoDB Operator   1.4.0                                                     Failed
service-binding-operator.v0.1.1-352      Service Binding Operator          0.1.1-352                                                 Failed
service-binding-operator.v0.1.1-354      Service Binding Operator          0.1.1-354  

It seems there is no support for operators on IBM Z, even when the service is created by the kubeadmin user. That is a separate question. We first need to resolve the error that prevents a common user from installing the operator, and then add IBM Z support for the target Operator.

I found that this issue also existed on the x86 platform and was fixed here: https://bugzilla.redhat.com/show_bug.cgi?id=1690848
I am confused about why I still get this error on IBM Z.

Comment 1 Evan Cordell 2020-10-21 13:17:53 UTC
The basic workflow when installing operators is:

- an admin installs the operator
- a user installs services that the operator provides

In this case, odo is attempting to create an EtcdCluster before the etcd-operator has been installed, so it fails.

The other part of this is that not all operators work on Z. The ones that do have an annotation on them that causes them to display in the operatorhub when running on Z. odo, as far as I can tell, doesn't know about these (or really about operator installation at all; it appears to assume the operators are already installed).
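
The architecture marker described above can be illustrated with a short sketch. It assumes the marker takes the form of `operatorframework.io/arch.<arch>: supported` labels on the ClusterServiceVersion, and that a CSV without any such label is treated as amd64-only. The comment above calls it an annotation and does not name the key, so both details are assumptions to check against the OLM documentation. The function names and sample CSVs are hypothetical.

```python
# Assumed label convention; verify against the OLM documentation.
ARCH_LABEL_PREFIX = "operatorframework.io/arch."


def supported_arches(csv):
    """Return the set of architectures a CSV (as a dict) declares support for."""
    labels = csv.get("metadata", {}).get("labels") or {}
    arches = {
        key[len(ARCH_LABEL_PREFIX):]
        for key, value in labels.items()
        if key.startswith(ARCH_LABEL_PREFIX) and value == "supported"
    }
    # Assumption: a CSV without any architecture label targets amd64 only.
    return arches or {"amd64"}


def runs_on(csv, arch):
    """True if the CSV declares support for the given architecture."""
    return arch in supported_arches(csv)


# Hypothetical sample data, not taken from a real cluster.
unlabeled_csv = {"metadata": {"name": "example-operator.v1.0.0"}}
multiarch_csv = {
    "metadata": {
        "name": "example-multiarch-operator.v1.0.0",
        "labels": {
            "operatorframework.io/arch.amd64": "supported",
            "operatorframework.io/arch.s390x": "supported",
        },
    }
}

print(runs_on(unlabeled_csv, "s390x"))
print(runs_on(multiarch_csv, "s390x"))
```

A client such as odo could apply a check like this before offering an operator on a Z or Power cluster, but whether odo does so is not stated in this report.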

Neither of these appears to be a bug in OLM. It also seems like odo is working as expected - though I will move this bug to that component for further triage. We may want some additional docs / etc - I had to look through odo code to tell what should / shouldn't be happening - the docs only mention OSB/service catalog for the `service` subcommand.

Comment 2 Dharmit Shah 2020-10-23 07:25:14 UTC
How did you try to install the Operators listed in the output of `oc get csv -n openshift-operators`? If it was using the script `scripts/setup-operators.sh` on the odo repo, you might want to try it again because the MongoDB Operator has been removed from the script (https://github.com/openshift/odo/pull/4127) since it failed to get installed on the cluster and other Operators didn't get installed after that either.

The problem here is that the etcd Operator hasn't succeeded in installing. We likely need to modify `odo catalog list services` to list only those Operators for which the phase is "Succeeded". So far, odo simply lists every CSV in a namespace.
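
The proposed filtering can be sketched as follows. This is only an illustration of the idea, not odo's actual code (odo is written in Go). It assumes each CSV is available as a dict shaped like the object's JSON, with the phase under `status.phase`. The two Failed entries use the names reported in this bug; the third entry is hypothetical.

```python
def succeeded_only(csvs):
    """Keep only the CSVs whose status.phase is "Succeeded"."""
    return [
        csv for csv in csvs
        if csv.get("status", {}).get("phase") == "Succeeded"
    ]


csvs = [
    {"metadata": {"name": "etcdoperator.v0.9.4-clusterwide"},
     "status": {"phase": "Failed"}},
    {"metadata": {"name": "service-binding-operator.v0.1.1-364"},
     "status": {"phase": "Failed"}},
    # Hypothetical operator that installed successfully.
    {"metadata": {"name": "hypothetical-operator.v1.0.0"},
     "status": {"phase": "Succeeded"}},
]

for csv in succeeded_only(csvs):
    print(csv["metadata"]["name"])
```

With a filter like this, the two Failed operators from this report would no longer be offered as deployable services.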

Comment 3 XiaoMei Zheng 2020-10-26 07:15:41 UTC
@shahdharmit I tried the script 'scripts/setup-operators.sh' again, and the installed operators are:
odo catalog list services
Operators available in the cluster
NAME                                    CRDs
etcdoperator.v0.9.4-clusterwide         EtcdCluster, EtcdBackup, EtcdRestore
service-binding-operator.v0.1.1-364     ServiceBindingRequest

The installation status is Failed for both etcdoperator and service-binding-operator, with the reason 'standard_init_linux.go:210: exec user process caused "exec format error"'.
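
An "exec format error" typically means the kernel was asked to run a binary built for a different architecture, for example an x86_64 binary on an s390x node. A rough way to confirm this is to read the `e_machine` field of the binary's ELF header. The sketch below does that on raw header bytes. The machine numbers come from the ELF specification; the helper names and the synthetic headers are hypothetical, and nothing here inspects the actual operator images.

```python
import struct

# e_machine values from the ELF specification.
ELF_MACHINES = {21: "ppc64", 22: "s390", 62: "x86_64", 183: "aarch64"}


def elf_machine(header):
    """Return the target machine named in the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF binary")
    # EI_DATA (byte 5) gives the byte order: 1 = little-endian, 2 = big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18, after e_ident and e_type.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return ELF_MACHINES.get(machine, "unknown ({})".format(machine))


def fake_header(machine, big_endian):
    """Build a synthetic 64-bit ELF header prefix for demonstration only."""
    ident = b"\x7fELF" + bytes([2, 2 if big_endian else 1, 1, 0]) + bytes(8)
    fmt = (">" if big_endian else "<") + "HH"
    return ident + struct.pack(fmt, 2, machine)  # e_type = 2 (executable)


print(elf_machine(fake_header(62, big_endian=False)))
print(elf_machine(fake_header(22, big_endian=True)))
```

In practice the same function could be given the first 20 bytes of the operator's entrypoint binary, read with `open(path, "rb").read(20)`, and the result compared with the node architecture.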

So two things need to be done:

1. Modify `odo catalog list services` to list only those Operators for which the phase is "Succeeded".
2. Confirm that the Operators are supported on IBM Z, not only on the x86 platform.

@ecordell I am not sure OLM has an official status on IBM Z operators - it's up to operator authors to support multiple architectures. OLM just surfaces that content in the cluster. Could you please forward this to the authors of the target operators? Thanks.

Comment 4 XiaoMei Zheng 2020-10-26 08:05:02 UTC
@ecordell Hi, I re-verified the process for installing operators:
1. Install OCP 4.5 on IBM Z.
2. Install odo 1.2.6 on OCP/Z.
3. Log in to the cluster as the kube:admin user and install the etcd-operator:
$ odo catalog list services
Operators available in the cluster
NAME                                    CRDs
etcdoperator.v0.9.4-clusterwide         EtcdCluster, EtcdBackup, EtcdRestore
service-binding-operator.v0.1.1-364     ServiceBindingRequest
4. Log in to the cluster as a common user (developer) and create a project called "myproject".
5. As the developer user, install the etcd-operator service in project "myproject" with odo:
$ odo service create etcdoperator.v0.9.4-clusterwide/EtcdCluster

Even though these steps follow the basic workflow:
- an admin installs the operator
- a user installs services that the operator provides

The error still occurs:
Deploying service of type: EtcdCluster
 ✗  Deploying service [1ms]
 ✗  etcdclusters.etcd.database.coreos.com is forbidden: User "developer" cannot create resource "etcdclusters" in API group "etcd.database.coreos.com" in the namespace "myproject"

To get information about the installed operators:
$ oc get csv -n openshift-operators
Error from server (Forbidden): clusterserviceversions.operators.coreos.com is forbidden: User "developer" cannot list resource "clusterserviceversions" in API group "operators.coreos.com" in the namespace "openshift-operators"
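
Both "forbidden" messages in this comment come from RBAC: a request is denied unless some rule bound to the user matches its API group, resource, and verb. The sketch below is a deliberately simplified model of that matching. It ignores resourceNames, subresources, and how roles are bound per namespace, and the developer's rule set shown here is hypothetical, since the report does not list the user's actual roles.

```python
def rule_allows(rule, api_group, resource, verb):
    """True if a single RBAC-style rule covers the request."""
    def matches(allowed, wanted):
        return "*" in allowed or wanted in allowed

    return (
        matches(rule.get("apiGroups", []), api_group)
        and matches(rule.get("resources", []), resource)
        and matches(rule.get("verbs", []), verb)
    )


def is_allowed(rules, api_group, resource, verb):
    """RBAC is additive: the request passes if any rule allows it."""
    return any(rule_allows(rule, api_group, resource, verb) for rule in rules)


# Hypothetical rules for the "developer" user in "myproject".
developer_rules = [
    {"apiGroups": [""], "resources": ["pods", "services"],
     "verbs": ["get", "list", "create"]},
]

# Hypothetical extra rule that would permit creating EtcdCluster objects.
etcd_rule = {
    "apiGroups": ["etcd.database.coreos.com"],
    "resources": ["etcdclusters"],
    "verbs": ["get", "list", "create"],
}

print(is_allowed(developer_rules, "etcd.database.coreos.com", "etcdclusters", "create"))
print(is_allowed(developer_rules + [etcd_rule], "etcd.database.coreos.com", "etcdclusters", "create"))
```

On a real cluster, `oc auth can-i create etcdclusters.etcd.database.coreos.com -n myproject` run as the developer user answers the same question against the live rules.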

Comment 5 Sarvesh Tamba 2020-10-27 11:13:18 UTC
We will also need OperatorHub support for Power to run the `make test-operator-hub` test suites.

[root@odo-addon-bastion odo]# oc get csv -n openshift-operators
No resources found in openshift-operators namespace.
[root@odo-addon-bastion odo]# ./odo catalog list services
 ✗  no deployable services/operators found
[root@odo-addon-bastion odo]# ./odo service create etcdoperator.v0.9.4-clusterwide/EtcdCluster
 ✗  Could not find specified operator
[root@odo-addon-bastion odo]# ./odo service list
 ✗  No operator backed services found in namespace: default
[root@odo-addon-bastion odo]#

Comment 6 Sarvesh Tamba 2020-10-29 14:02:24 UTC
Installed 'etcdoperator.v0.9.4' using the support script 'scripts/setup-operators.sh' provided by odo, as the kube:admin user.

Issues:
1.) Was able to deploy a service of type EtcdCluster as the kube:admin user. However, the operator fails because there is no support for Power.

    [root@ghatwala-odo-1-abef-bastion odo]# oc whoami
    kube:admin
    [root@ghatwala-odo-1-abef-bastion odo]# ./odo service create etcdoperator.v0.9.4-clusterwide/EtcdCluster
    Deploying service of type: EtcdCluster
     ✓  Deploying service [7ms]
     ✓  Service '' was created

    Progress of the provisioning will not be reported and might take a long time
    You can see the current status by executing 'odo service list'
    Optionally, link etcdoperator.v0.9.4-clusterwide to your component by running: 'odo link <component-name>'
    [root@ghatwala-odo-1-abef-bastion odo]# ./odo service list
    NAME                    AGE
    EtcdCluster/example     7s
    [root@ghatwala-odo-1-abef-bastion odo]# oc get csv -n openshift-operators
    NAME                                  DISPLAY                    VERSION             REPLACES   PHASE
    etcdoperator.v0.9.4-clusterwide       etcd                       0.9.4-clusterwide              Failed
    service-binding-operator.v0.1.1-364   Service Binding Operator   0.1.1-364                      Failed
2.) Not able to deploy a service of type EtcdCluster as the regular/common user 'developer'.

    [root@ghatwala-odo-1-abef-bastion odo]# ./odo service create etcdoperator.v0.9.4-clusterwide/EtcdCluster
    Deploying service of type: EtcdCluster
     ✗  Deploying service [1ms]
     ✗  etcdclusters.etcd.database.coreos.com is forbidden: User "developer" cannot create resource "etcdclusters" in API group "etcd.database.coreos.com" in the namespace "myproject"

Comment 7 XiaoMei Zheng 2021-01-22 09:09:28 UTC
We need to install an etcd Operator for application developers, but the etcd Operator does not support Z and Power yet. It needs multi-arch support.

Comment 10 Dharmit Shah 2021-09-07 04:29:56 UTC
The main issue has been resolved in https://github.com/openshift/odo/pull/4202. As far as making Operators available on multi-arch is concerned, I'm not sure who the right team/person is.

I'm marking this BZ closed because we have fixed the problem of Operators that failed to install showing up in the output of `odo catalog list services`.

Comment 11 Red Hat Bugzilla 2023-09-15 00:50:01 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days