Bug 1867735 - [manila-csi-driver-operator] manila csi driver pod fails to start in OSP env with self signed certs
Summary: [manila-csi-driver-operator] manila csi driver pod fails to start in OSP env ...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.6.0
Assignee: Jan Safranek
QA Contact: Qin Ping
URL:
Whiteboard:
Depends On:
Blocks: 1867534
 
Reported: 2020-08-10 15:50 UTC by Jon Uriarte
Modified: 2021-01-20 15:04 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-27 16:27:04 UTC
Target Upstream Version:
Embargoed:


Attachments
openshift-cluster-csi-drivers namespace logs from must-gather (30.03 KB, application/gzip)
2020-08-10 15:50 UTC, Jon Uriarte
manila pod logs (15.35 KB, text/plain)
2020-08-13 08:53 UTC, Jon Uriarte


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-storage-operator pull 74 | 0 | None | closed | Bug 1867735: Sync RBAC rules from Manila operator | 2021-01-19 14:49:44 UTC
Github | openshift cluster-storage-operator pull 75 | 0 | None | closed | Bug 1867735: Fix syncing of Manila CA certificate | 2021-01-19 14:49:38 UTC
Github | openshift csi-driver-manila-operator pull 49 | 0 | None | closed | Bug 1867735: Fix sync of openshift-config/cloud-provider-config ConfigMap | 2021-01-19 14:49:38 UTC
Red Hat Product Errata | RHBA-2020:4196 | 0 | None | None | None | 2020-10-27 16:27:08 UTC

Description Jon Uriarte 2020-08-10 15:50:15 UTC
Created attachment 1710984
openshift-cluster-csi-drivers namespace logs from must-gather

Description of problem:

OCP 4.6 installation cannot be completed on top of OSP when the public endpoints use self-signed certificates.
The Kuryr network type was used during this test.

Version-Release number of selected component (if applicable):
OCP: 4.6.0-0.nightly-2020-08-09-151434
OSP: RHOS-16.1-RHEL-8-20200804.n.0 (16.1.1 GA)

Because of the open issue https://bugzilla.redhat.com/show_bug.cgi?id=1866738, the 46.82.202008080704-0 RHCOS image was used.

How reproducible: always

Steps to Reproduce:
1. Deploy OSP with self-signed certs (the cacert is provided in clouds.yaml)
2. Install OCP 4.6 - "openshift-install create cluster" fails

Actual results:

ERROR Cluster operator storage Degraded is True with ManilaCSIDriverOperatorCR_ManilaController_SyncError: ManilaCSIDriverOperatorCRDegraded: ManilaControllerDegraded: Get "https://10.46.22.140:13000/": x509: certificate signed by unknown authority 
INFO Cluster operator storage Progressing is True with ManilaCSIDriverOperatorCR_WaitForOperator: ManilaCSIDriverOperatorCRProgressing: Waiting for Manila operator to report status 
INFO Cluster operator storage Available is False with ManilaCSIDriverOperatorCR_WaitForOperator: ManilaCSIDriverOperatorCRAvailable: Waiting for Manila operator to report status 
FATAL failed to initialize the cluster: Working towards 4.6.0-0.nightly-2020-08-09-151434: 99% complete 
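
For triage, the installer's "Cluster operator ..." status lines shown above can be split into operator, condition, status, reason and message. The following is only a minimal sketch: the names `OperatorCondition`, `parse_condition` and `is_failing` are hypothetical helpers, not part of openshift-install, and the regular expression assumes the line layout seen in this report.

```python
import re
from typing import NamedTuple, Optional


class OperatorCondition(NamedTuple):
    operator: str
    condition: str
    status: bool
    reason: str
    message: str


# Assumed layout, taken from the installer output in this report:
#   Cluster operator <name> <condition> is <True|False> with <reason>: <message>
_CONDITION_RE = re.compile(
    r"Cluster operator (?P<operator>\S+) (?P<condition>\S+) "
    r"is (?P<status>True|False) with (?P<reason>[^:]*): ?(?P<message>.*)"
)


def parse_condition(line: str) -> Optional[OperatorCondition]:
    """Return the parsed condition, or None if the line is not a status line."""
    match = _CONDITION_RE.search(line)
    if match is None:
        return None
    return OperatorCondition(
        operator=match["operator"],
        condition=match["condition"],
        status=match["status"] == "True",
        reason=match["reason"],
        message=match["message"].strip(),
    )


def is_failing(cond: OperatorCondition) -> bool:
    """Treat Degraded=True or Available=False as a failing condition."""
    if cond.condition == "Degraded":
        return cond.status
    if cond.condition == "Available":
        return not cond.status
    return False
```

Running `parse_condition` over each line of the installer log and keeping the results for which `is_failing` is true narrows the output down to the storage operator conditions quoted above.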


Expected results: successful installation

Error messages in failed manila pod:

E0810 12:40:12.580733       1 base_controller.go:220] "ManilaController" controller failed to sync "key", err: Get "https://10.46.22.140:13000/": x509: certificate signed by unknown authority
E0810 12:40:13.046397       1 reflector.go:127] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.ConfigMap: failed to list *v1.ConfigMap: configmaps is forbidden: User "system:serviceaccount:openshift-cluster-csi-drivers:manila-csi-driver-operator" cannot list resource "configmaps" in API group "" in the namespace "openshift-manila-csi-driver"
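
The "configmaps is forbidden" message names the user, verb, resource and namespace, which is enough to re-check the permission with `oc auth can-i` using impersonation. The sketch below only builds that command line from the message and does not run it; `can_i_command` is a hypothetical helper, and the regular expression assumes the message wording seen in the log above.

```python
import re
import shlex
from typing import List, Optional

# Assumed wording, copied from the reflector error in this report.
_FORBIDDEN_RE = re.compile(
    r'User "(?P<user>[^"]+)" cannot (?P<verb>\w+) resource "(?P<resource>[^"]+)" '
    r'in API group "(?P<group>[^"]*)" in the namespace "(?P<namespace>[^"]+)"'
)


def can_i_command(message: str) -> Optional[List[str]]:
    """Build an `oc auth can-i` argv from an RBAC 'forbidden' message.

    Returns None when the message does not match the assumed wording.
    """
    match = _FORBIDDEN_RE.search(message)
    if match is None:
        return None
    resource = match["resource"]
    if match["group"]:
        # Non-core API groups are written as RESOURCE.GROUP.
        resource = f"{resource}.{match['group']}"
    return [
        "oc", "auth", "can-i", match["verb"], resource,
        "--namespace", match["namespace"],
        "--as", match["user"],
    ]


if __name__ == "__main__":
    msg = (
        'configmaps is forbidden: User "system:serviceaccount:'
        'openshift-cluster-csi-drivers:manila-csi-driver-operator" cannot list '
        'resource "configmaps" in API group "" in the namespace '
        '"openshift-manila-csi-driver"'
    )
    cmd = can_i_command(msg)
    if cmd is not None:
        print(shlex.join(cmd))
```

Running the printed command as a cluster admin should answer "no" while the RBAC rules are missing and "yes" once they are in place.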


(shiftstack) [stack@undercloud-0 ~]$ oc get pv -A
No resources found

(shiftstack) [stack@undercloud-0 ~]$ oc get pvc  -A
No resources found


Additional info: attached must-gather

(shiftstack) [stack@undercloud-0 ~]$ oc get nodes
NAME                        STATUS   ROLES    AGE     VERSION
ostest-ps9x4-master-0       Ready    master   3h43m   v1.19.0-rc.2+5241b27-dirty
ostest-ps9x4-master-1       Ready    master   3h43m   v1.19.0-rc.2+5241b27-dirty
ostest-ps9x4-master-2       Ready    master   3h43m   v1.19.0-rc.2+5241b27-dirty
ostest-ps9x4-worker-cnzlt   Ready    worker   3h11m   v1.19.0-rc.2+5241b27-dirty
ostest-ps9x4-worker-drrp8   Ready    worker   3h10m   v1.19.0-rc.2+5241b27-dirty
ostest-ps9x4-worker-njn5b   Ready    worker   3h11m   v1.19.0-rc.2+5241b27-dirty

(shiftstack) [stack@undercloud-0 ~]$ openstack server list
+--------------------------------------+---------------------------+--------+-------------------------------------+----------+--------+
| ID                                   | Name                      | Status | Networks                            | Image    | Flavor |
+--------------------------------------+---------------------------+--------+-------------------------------------+----------+--------+
| fadd89af-def9-41f6-a599-2a063641311d | ostest-ps9x4-worker-drrp8 | ACTIVE | ostest-ps9x4-openshift=10.196.3.225 | my-rhcos |        |
| 07e35039-0f77-43fa-a300-de98e75aeef3 | ostest-ps9x4-worker-njn5b | ACTIVE | ostest-ps9x4-openshift=10.196.2.118 | my-rhcos |        |
| efcaa3b2-d3b3-4e27-97ef-5689d1d7ac90 | ostest-ps9x4-worker-cnzlt | ACTIVE | ostest-ps9x4-openshift=10.196.2.18  | my-rhcos |        |
| 1da119e9-39b0-4167-9a9f-09784bd34e39 | ostest-ps9x4-master-2     | ACTIVE | ostest-ps9x4-openshift=10.196.1.226 | my-rhcos |        |
| f64465a7-55dc-4dfd-b113-6e92f69eb693 | ostest-ps9x4-master-0     | ACTIVE | ostest-ps9x4-openshift=10.196.0.152 | my-rhcos |        |
| d4cfd5fd-fed4-4334-b968-1a23499ca7b8 | ostest-ps9x4-master-1     | ACTIVE | ostest-ps9x4-openshift=10.196.2.69  | my-rhcos |        |
+--------------------------------------+---------------------------+--------+-------------------------------------+----------+--------+

(overcloud) [stack@undercloud-0 ~]$ openstack endpoint list
+----------------------------------+-----------+--------------+----------------+---------+-----------+--------------------------------------------------+
| ID                               | Region    | Service Name | Service Type   | Enabled | Interface | URL                                              |
+----------------------------------+-----------+--------------+----------------+---------+-----------+--------------------------------------------------+
| 004971fbc9c24bc886b625c2957b8489 | regionOne | neutron      | network        | True    | admin     | http://172.17.1.96:9696                          |
| 0cbe76add0c6419e901187ceae2a688e | regionOne | neutron      | network        | True    | internal  | http://172.17.1.96:9696                          |
| 0d937a7898d94d1a8459441e4ae62e6c | regionOne | glance       | image          | True    | internal  | http://172.17.1.96:9292                          |
| 22249b6a10e24fbd878e993c4205757d | regionOne | glance       | image          | True    | admin     | http://172.17.1.96:9292                          |
| 23367b0bdebf4e8bbc7d22f1e91ee237 | regionOne | heat-cfn     | cloudformation | True    | internal  | http://172.17.1.96:8000/v1                       |
| 30962be59db14e9abc6e1d145b534cec | regionOne | glance       | image          | True    | public    | https://10.46.22.140:13292                       |
| 41e7d30dc4c249da9ea72db9ed202bc3 | regionOne | cinderv2     | volumev2       | True    | public    | https://10.46.22.140:13776/v2/%(tenant_id)s      |
| 44ed760541204bb599bed00e2514c90f | regionOne | keystone     | identity       | True    | internal  | http://172.17.1.96:5000                          |
| 45e64391c45240118c73b5a307fdc5da | regionOne | octavia      | load-balancer  | True    | public    | https://10.46.22.140:13876                       |
| 4ec89098700948589d7cc58bbbdd8a88 | regionOne | heat         | orchestration  | True    | admin     | http://172.17.1.96:8004/v1/%(tenant_id)s         |
| 595a32702491491b88690f05a2e0002d | regionOne | cinderv2     | volumev2       | True    | internal  | http://172.17.1.96:8776/v2/%(tenant_id)s         |
| 665b6307f11045d8a3f12189077d6055 | regionOne | placement    | placement      | True    | internal  | http://172.17.1.96:8778/placement                |
| 67d338a5157d47b0aa447aade56a665e | regionOne | nova         | compute        | True    | public    | https://10.46.22.140:13774/v2.1                  |
| 6b04926f91544fecbdf3b2f0956baef0 | regionOne | cinderv3     | volumev3       | True    | public    | https://10.46.22.140:13776/v3/%(tenant_id)s      |
| 6e405308f35342e08c3b8ac3f85a1ab6 | regionOne | octavia      | load-balancer  | True    | internal  | http://172.17.1.96:9876                          |
| 6fabfa9c5b604e7d9319be6b668b7085 | regionOne | cinderv3     | volumev3       | True    | internal  | http://172.17.1.96:8776/v3/%(tenant_id)s         |
| 73e35f093cb2486c8b2eab52059f28c6 | regionOne | swift        | object-store   | True    | internal  | http://172.17.3.160:8080/v1/AUTH_%(tenant_id)s   |
| 771535b48ca948e19672099e37df7cef | regionOne | nova         | compute        | True    | admin     | http://172.17.1.96:8774/v2.1                     |
| 83025640c68c4709b96a8a7662226e75 | regionOne | placement    | placement      | True    | public    | https://10.46.22.140:13778/placement             |
| 94d3fb83cdde4c3d8b7e4189946896fd | regionOne | swift        | object-store   | True    | public    | https://10.46.22.140:13808/v1/AUTH_%(tenant_id)s |
| 9ad3ab0201214c038ee94bd111d4517d | regionOne | cinderv2     | volumev2       | True    | admin     | http://172.17.1.96:8776/v2/%(tenant_id)s         |
| a4c0065ea55840d5bfab50acc868142f | regionOne | octavia      | load-balancer  | True    | admin     | http://172.17.1.96:9876                          |
| ae00af173d8f4cb29fe46b7d786626ea | regionOne | keystone     | identity       | True    | admin     | http://192.168.24.30:35357                       |
| b0141b3eeefe4ccb98f05601b6757c77 | regionOne | heat-cfn     | cloudformation | True    | public    | https://10.46.22.140:13005/v1                    |
| b13ec5b4b89d429292b06b6b4583d6b2 | regionOne | swift        | object-store   | True    | admin     | http://172.17.3.160:8080                         |
| bec33172f6fd44acb041866b580c3115 | regionOne | keystone     | identity       | True    | public    | https://10.46.22.140:13000                       |
| bee42c17b40b4abab21369b9f87ab01a | regionOne | nova         | compute        | True    | internal  | http://172.17.1.96:8774/v2.1                     |
| bfce4e6719e7452287cdf97989205c8c | regionOne | heat         | orchestration  | True    | internal  | http://172.17.1.96:8004/v1/%(tenant_id)s         |
| c35fd7fa304948d38d1f6532a84c3342 | regionOne | cinderv3     | volumev3       | True    | admin     | http://172.17.1.96:8776/v3/%(tenant_id)s         |
| c667fa5ca14d4431a299c73e92f9b0e6 | regionOne | placement    | placement      | True    | admin     | http://172.17.1.96:8778/placement                |
| e64b5925221a45ebbd5315cad180f40b | regionOne | neutron      | network        | True    | public    | https://10.46.22.140:13696                       |
| eaee3a4b0cf0486b8bbbdf6a653d6665 | regionOne | heat-cfn     | cloudformation | True    | admin     | http://172.17.1.96:8000/v1                       |
| eb473782621f4a3e87a7eee7a8391beb | regionOne | heat         | orchestration  | True    | public    | https://10.46.22.140:13004/v1/%(tenant_id)s      |
+----------------------------------+-----------+--------------+----------------+---------+-----------+--------------------------------------------------+

Comment 4 Jon Uriarte 2020-08-13 08:53:38 UTC
Created attachment 1711287
manila pod logs

Comment 5 Jon Uriarte 2020-08-13 08:54:29 UTC
I've tried with 4.6.0-0.nightly-2020-08-12-155346 and found a similar error.

The installer fails:

time="2020-08-13T04:32:27-04:00" level=debug msg="Still waiting for the cluster to initialize: Cluster operator storage is reporting a failure: ManilaCSIDriverOperatorCRDegraded: ManilaControllerDegraded: Get \"https://10.46.22.140:13000/\": x509: certificate signed by unknown authority"
time="2020-08-13T04:38:39-04:00" level=info msg="Cluster operator insights Disabled is False with AsExpected: "
time="2020-08-13T04:38:39-04:00" level=info msg="Cluster operator kube-apiserver Progressing is True with NodeInstaller: NodeInstallerProgressing: 1 nodes are at revision 6; 2 nodes are at revision 7"
time="2020-08-13T04:38:39-04:00" level=error msg="Cluster operator storage Degraded is True with ManilaCSIDriverOperatorCR_ManilaController_SyncError: ManilaCSIDriverOperatorCRDegraded: ManilaControllerDegraded: Get \"https://10.46.22.140:13000/\": x509: certificate signed by unknown authority"
time="2020-08-13T04:38:39-04:00" level=info msg="Cluster operator storage Progressing is True with ManilaCSIDriverOperatorCR_WaitForOperator: ManilaCSIDriverOperatorCRProgressing: Waiting for Manila operator to report status"
time="2020-08-13T04:38:39-04:00" level=info msg="Cluster operator storage Available is False with ManilaCSIDriverOperatorCR_WaitForOperator: ManilaCSIDriverOperatorCRAvailable: Waiting for Manila operator to report status"
time="2020-08-13T04:38:39-04:00" level=fatal msg="failed to initialize the cluster: Cluster operator storage is reporting a failure: ManilaCSIDriverOperatorCRDegraded: ManilaControllerDegraded: Get \"https://10.46.22.140:13000/\": x509: certificate signed by unknown authority"

The manila pod log no longer shows the following message:
Failed to watch *v1.ConfigMap: failed to list *v1.ConfigMap: configmaps is forbidden: User "system:serviceaccount:openshift-cluster-csi-drivers:manila-csi-driver-operator" cannot list resource "configmaps" in API group "" in the namespace "openshift-manila-csi-driver"

but it keeps showing:
E0813 08:24:40.987220       1 base_controller.go:220] "ManilaController" controller failed to sync "key", err: Get "https://10.46.22.140:13000/": x509: certificate signed by unknown authority
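
The x509 error means the client's trust store does not contain the CA that signed the endpoint certificate. The operator is a Go program, so the Python sketch below is not its code; it only illustrates, under that assumption, how to check from any host whether an endpoint such as the one in the log verifies against a given CA bundle. `build_context` and `endpoint_is_trusted` are hypothetical helper names, and the bundle path is whatever file holds the cacert from clouds.yaml.

```python
import socket
import ssl
from typing import Optional


def build_context(ca_bundle_path: Optional[str] = None) -> ssl.SSLContext:
    """Create a verifying TLS context, optionally trusting an extra CA bundle."""
    ctx = ssl.create_default_context()  # system CAs, verification enabled
    if ca_bundle_path is not None:
        # Add the self-signed CA (PEM file) on top of the system defaults.
        ctx.load_verify_locations(cafile=ca_bundle_path)
    return ctx


def endpoint_is_trusted(
    host: str,
    port: int,
    ca_bundle_path: Optional[str] = None,
    timeout: float = 5.0,
) -> bool:
    """Return True if the TLS handshake verifies, False on a certificate error.

    Network errors (refused connection, timeout) are not caught and propagate.
    """
    ctx = build_context(ca_bundle_path)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host):
                return True
    except ssl.SSLCertVerificationError:
        return False
```

Calling `endpoint_is_trusted("10.46.22.140", 13000)` without a bundle would be expected to return False in this environment, and True once the path to the correct CA file is passed as `ca_bundle_path`.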

Find attached manila pod logs.

Comment 7 Jon Uriarte 2020-08-18 15:21:45 UTC
Verified in 4.6.0-0.nightly-2020-08-18-085914 on top of OSP 16.1 RHOS-16.1-RHEL-8-20200804.n.0 compose.

The cluster is installed successfully:

DEBUG Time elapsed per stage:                      
DEBUG     Infrastructure: 1m52s                    
DEBUG Bootstrap Complete: 19m51s                   
DEBUG                API: 4m56s                    
DEBUG  Bootstrap Destroy: 41s                      
DEBUG  Cluster Operators: 25m55s                   
INFO Time elapsed: 49m5s

The manila-csi-driver-operator pod is running:
openshift-cluster-csi-drivers                      manila-csi-driver-operator-6b8678cc88-9t596               1/1     Running     0          72m

Comment 9 errata-xmlrpc 2020-10-27 16:27:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

