Bug 2002554 - Cluster becomes degraded if it can't talk to Manila
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.8.z
Assignee: Eric Duen
QA Contact: rlobillo
URL:
Whiteboard:
Depends On: 2001958
Blocks: 2002555
Reported: 2021-09-09 08:25 UTC by Martin André
Modified: 2021-09-27 19:53 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 2001958
Clones: 2002555
Environment:
Last Closed: 2021-09-27 19:53:12 UTC
Target Upstream Version:


Links
System ID Private Priority Status Summary Last Updated
Github openshift csi-driver-manila-operator pull 125 0 None Merged [release-4.8] Bug 2002554: Do not degrade cluster on failure to reach Manila 2021-09-15 14:56:27 UTC
Red Hat Product Errata RHBA-2021:3632 0 None None None 2021-09-27 19:53:29 UTC

Comment 2 rlobillo 2021-09-15 11:37:18 UTC
Pre-verified on 4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest (cluster-bot build of openshift/csi-driver-manila-operator#125) on top of OSP16.1 (RHOS-16.1-RHEL-8-20210818.n.0) with the openshiftSDN network type.


The UPI installation, performed on a restricted network behind a proxy, finished successfully while the SG rules on the proxy instance were blocking egress traffic to the OSP Manila endpoint:

$ openstack catalog show manila | grep public
|           |   public: https://10.46.44.10:13786/v1/1ebc41dabb5e4e9bae86a22bb4ffcb40 |


# Egress rules on the instance where the proxy is running:
$ openstack security group rule list --egress installer_host-sg
+--------------------------------------+-------------+-----------+-----------+-------------+-----------------------+
| ID                                   | IP Protocol | Ethertype | IP Range  | Port Range  | Remote Security Group |
+--------------------------------------+-------------+-----------+-----------+-------------+-----------------------+
| 016e5030-bca6-402d-8cfa-e4b7271ba9ec | None        | IPv6      | ::/0      |             | None                  |
| 1d4be39b-8236-4968-8624-4458a82da619 | tcp         | IPv4      | 0.0.0.0/0 | 13787:65000 | None                  |
| 9b8dbd27-299f-420e-82f2-f46e35d938be | udp         | IPv4      | 0.0.0.0/0 |             | None                  |
| dceae5ee-38fb-44b0-824b-9f4975c2ce05 | tcp         | IPv4      | 0.0.0.0/0 | 1:13785     | None                  |
+--------------------------------------+-------------+-----------+-----------+-------------+-----------------------+
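The two TCP rules in the list above cover ports 1 to 13785 and 13787 to 65000, so the only TCP port left without an egress rule is 13786, the Manila endpoint. A minimal sketch of that check follows; `port_allowed` is a hypothetical helper written for this illustration, and it only considers the TCP/IPv4 ranges, which are the ones relevant for HTTPS traffic to 10.46.44.10.

```shell
# Hypothetical helper: succeeds if port $1 falls inside any of the
# "min:max" ranges passed as the remaining arguments.
port_allowed() {
  pa_port=$1
  shift
  for pa_range in "$@"; do
    pa_min=${pa_range%%:*}
    pa_max=${pa_range##*:}
    if [ "$pa_port" -ge "$pa_min" ] && [ "$pa_port" -le "$pa_max" ]; then
      return 0
    fi
  done
  return 1
}

# TCP/IPv4 egress ranges copied from the rule list above.
# 13000 is the Keystone port, 13786 is the Manila port.
for port in 13000 13786; do
  if port_allowed "$port" 1:13785 13787:65000; then
    echo "port $port: egress allowed"
  else
    echo "port $port: egress blocked"
  fi
done
```

With the ranges above this reports port 13000 as allowed and port 13786 as blocked.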

As a result, the manila-csi-driver-operator times out when it tries to reach the Manila API, while the other OpenStack endpoints remain reachable (tested with Keystone):

$ oc rsh -n openshift-cluster-csi-drivers $(oc get pods -n openshift-cluster-csi-drivers -l name=manila-csi-driver-operator -o name)
sh-4.4$ curl --connect-timeout 5 --proxy-cacert /etc/openstack-ca/ca-bundle.pem --cacert /etc/openstack-ca/ca-bundle.pem https://10.46.44.10:13786/v1/1ebc41dabb5e4e9bae86a22bb4ffcb40
curl: (28) Operation timed out after 5001 milliseconds with 0 out of 0 bytes received
sh-4.4$ curl --connect-timeout 5 --proxy-cacert /etc/openstack-ca/ca-bundle.pem --cacert /etc/openstack-ca/ca-bundle.pem https://10.46.44.10:13000                                    
{"versions": {"values": [{"id": "v3.13", "status": "stable", "updated": "2019-07-19T00:00:00Z", "links": [{"rel": "self", "href": "https://10.46.44.10:13000/v3/"}], "media-types": [{"base": 
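The two probes above can be wrapped in a small helper to repeat the check against several endpoints. This is only a sketch: `classify_curl_exit` and `probe_endpoint` are hypothetical names, and the CA bundle path and the 5 second connect timeout are simply taken from the commands above. Exit status 28 is the curl timeout code seen in the Manila probe.

```shell
# Hypothetical helper: maps a curl exit status to a short label.
classify_curl_exit() {
  case $1 in
    0)  echo "reachable" ;;
    28) echo "timeout" ;;
    *)  echo "error ($1)" ;;
  esac
}

# Hypothetical wrapper around the probe used above. It discards the
# response body and only reports how the connection attempt ended.
probe_endpoint() {
  pe_status=0
  curl --silent --output /dev/null --connect-timeout 5 \
    --proxy-cacert /etc/openstack-ca/ca-bundle.pem \
    --cacert /etc/openstack-ca/ca-bundle.pem "$1" || pe_status=$?
  classify_curl_exit "$pe_status"
}

# Usage, from inside the operator pod:
#   probe_endpoint https://10.46.44.10:13786/v1/1ebc41dabb5e4e9bae86a22bb4ffcb40
#   probe_endpoint https://10.46.44.10:13000
```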

Under these circumstances, the UPI installation works fine and all cluster operators are available:

$ oc get clusteroperators
NAME                                       VERSION                                                  AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      29s
baremetal                                  4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      45m
cloud-controller-manager                   4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      53m
cloud-credential                           4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      71m
cluster-autoscaler                         4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      44m
config-operator                            4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      53m
console                                    4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      7m24s
csi-snapshot-controller                    4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      45m
dns                                        4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      47m
etcd                                       4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      46m
image-registry                             4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      12m
ingress                                    4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      11m
insights                                   4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      48m
kube-apiserver                             4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        True          False      45m
kube-controller-manager                    4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      41m
kube-scheduler                             4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      48m
kube-storage-version-migrator              4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      48m
machine-api                                4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      39m
machine-approver                           4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      45m
machine-config                             4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      44m
marketplace                                4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      44m
monitoring                                 4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      6m38s
network                                    4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      47m
node-tuning                                4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      44m
openshift-apiserver                        4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      43m
openshift-controller-manager               4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      41m
openshift-samples                          4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      37m
operator-lifecycle-manager                 4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      46m
operator-lifecycle-manager-catalog         4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      46m
operator-lifecycle-manager-packageserver   4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      42m
service-ca                                 4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      53m
storage                                    4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         False      41m
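Rather than reading the table by eye, the listing can be filtered for operators that are not Available or that are Degraded. A minimal sketch, assuming the six-column layout shown above (NAME, VERSION, AVAILABLE, PROGRESSING, DEGRADED, SINCE) with a non-empty VERSION on every row; `unhealthy_operators` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `oc get clusteroperators` output on stdin and
# prints the name of every operator whose AVAILABLE column is not "True"
# or whose DEGRADED column is not "False". The header row is skipped.
unhealthy_operators() {
  awk 'NR > 1 && ($3 != "True" || $5 != "False") { print $1 }'
}

# Usage:
#   oc get clusteroperators | unhealthy_operators
```

For the listing above this prints nothing: kube-apiserver is still Progressing, but PROGRESSING is deliberately not treated as a failure here.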

Manila is not deployed, as reported by the storage cluster operator:

$ oc get clusteroperator storage -o json | jq '.status.conditions[] | select(.type=="Available")'
{
  "lastTransitionTime": "2021-09-15T10:51:49Z",
  "message": "OpenStackCinderCSIDriverOperatorCRAvailable: All is well\nManilaCSIDriverOperatorCRAvailable: CSI driver for Manila is disabled: Unable to retrieve Manila share types: cannot list available share types: Get \"https://10.46.44.10:13786/v2/1ebc41dabb5e4e9bae86a22bb4ffcb40/types\": Service Unavailable",
  "reason": "AsExpected",
  "status": "True",
  "type": "Available"
}


$ oc get sc
NAME                 PROVISIONER                RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
standard (default)   kubernetes.io/cinder       Delete          WaitForFirstConsumer   true                   28m
standard-csi         cinder.csi.openstack.org   Delete          WaitForFirstConsumer   true                   26m

$ oc get pods -A | grep -i manila
openshift-cluster-csi-drivers                      manila-csi-driver-operator-7b69f6db85-bfdfz               1/1     Running     2          28m

$ oc get clusterversion
NAME      VERSION                                                  AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest   True        False         43s     Cluster version is 4.8.0-0.ci.test-2021-09-15-081039-ci-ln-m1nvxyt-latest

Comment 7 errata-xmlrpc 2021-09-27 19:53:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.8.13 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3632

