Bug 1918564

Summary: [manila-csi-driver-operator] does not detect csi driver work status
Product: OpenShift Container Platform Reporter: Wei Duan <wduan>
Component: Storage Assignee: Fabio Bertinatto <fbertina>
Storage sub component: OpenStack CSI Drivers QA Contact: Wei Duan <wduan>
Status: CLOSED DUPLICATE Docs Contact:
Severity: medium    
Priority: low CC: aos-bugs, juriarte, mfedosin, pprinett
Version: 4.7 Keywords: Triaged
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-08-26 17:47:52 UTC Type: Bug

Description Wei Duan 2021-01-21 04:01:45 UTC
Description of problem:
It is hard to tell when the CSI driver is not working.
Due to bug 1918140, openstack-manila-csi-controllerplugin was not installed on OSP, but the status of clustercsidrivers/manila.csi.openstack.org gives no indication that the CSI driver is not working: no condition reports Degraded or unavailable, and no condition type for "ManilaDriverControllerServiceController" is present.
$ oc get clustercsidrivers manila.csi.openstack.org -o json | jq .status
{
  "conditions": [
    {
      "lastTransitionTime": "2021-01-20T09:49:58Z",
      "reason": "AsExpected",
      "status": "False",
      "type": "ManilaControllerDegraded"
    },
    {
      "lastTransitionTime": "2021-01-20T09:49:58Z",
      "status": "False",
      "type": "ManagementStateDegraded"
    },
    {
      "lastTransitionTime": "2021-01-20T23:05:24Z",
      "status": "False",
      "type": "ResourceSyncControllerDegraded"
    },
    {
      "lastTransitionTime": "2021-01-20T09:49:58Z",
      "reason": "AsExpected",
      "status": "False",
      "type": "SecretSyncDegraded"
    },
    {
      "lastTransitionTime": "2021-01-20T09:50:06Z",
      "status": "True",
      "type": "ManilaDriverNodeServiceControllerAvailable"
    },
    {
      "lastTransitionTime": "2021-01-20T11:27:07Z",
      "status": "False",
      "type": "ManilaDriverNodeServiceControllerProgressing"
    },
    {
      "lastTransitionTime": "2021-01-20T16:23:31Z",
      "reason": "AsExpected",
      "status": "False",
      "type": "ManilaDriverNodeServiceControllerDegraded"
    },
    {
      "lastTransitionTime": "2021-01-20T09:50:05Z",
      "status": "True",
      "type": "NFSDriverNodeServiceControllerAvailable"
    },
    {
      "lastTransitionTime": "2021-01-20T11:27:07Z",
      "status": "False",
      "type": "NFSDriverNodeServiceControllerProgressing"
    },
    {
      "lastTransitionTime": "2021-01-20T09:49:58Z",
      "reason": "AsExpected",
      "status": "False",
      "type": "NFSDriverNodeServiceControllerDegraded"
    },
    {
      "lastTransitionTime": "2021-01-20T09:50:01Z",
      "reason": "AsExpected",
      "status": "False",
      "type": "ManilaDriverStaticResourcesDegraded"
    }
  ],

The manila-csi-driver-operator log contains no explicit error or warning about it either:
$ oc -n openshift-cluster-csi-drivers logs manila-csi-driver-operator-5d644fbf58-tz9lz | grep "^E" 
$ oc -n openshift-cluster-csi-drivers logs manila-csi-driver-operator-5d644fbf58-tz9lz | grep "^W"
W0121 03:15:22.036441       1 cmd.go:204] Using insecure, self-signed certificates
W0121 03:15:23.025681       1 secure_serving.go:69] Use of insecure cipher 'TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256' detected.
W0121 03:15:23.025701       1 secure_serving.go:69] Use of insecure cipher 'TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256' detected.  
 
When checking the CSO (cluster storage operator), it is in normal status most of the time; it can only be seen degraded for a very short moment with "oc get co storage -w", so it is hard to find the issue early.
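
For illustration only, here is a minimal sketch of how the missing controller conditions could be spotted from the saved status JSON. It is not part of the operator or of any fix. It assumes that every controller reports three conditions named "<Controller>Available", "<Controller>Progressing" and "<Controller>Degraded", as the node service controllers in the output above do. The function names are hypothetical.

```python
# Assumption: each controller publishes conditions with these three suffixes,
# mirroring the ManilaDriverNodeServiceController* entries in the status above.
SUFFIXES = ("Available", "Progressing", "Degraded")


def missing_conditions(status, controller):
    """Return the condition types expected for `controller` that are absent."""
    present = {c.get("type") for c in status.get("conditions", [])}
    return [controller + s for s in SUFFIXES if controller + s not in present]


def unhealthy_conditions(status):
    """Return condition types that are Degraded=True or Available=False."""
    bad = []
    for cond in status.get("conditions", []):
        ctype = cond.get("type", "")
        cstatus = cond.get("status")
        if ctype.endswith("Degraded") and cstatus == "True":
            bad.append(ctype)
        elif ctype.endswith("Available") and cstatus == "False":
            bad.append(ctype)
    return bad
```

With the complete output of "oc get clustercsidrivers manila.csi.openstack.org -o json | jq .status" loaded through json.load(), unhealthy_conditions() would return an empty list for the status shown in this bug, while missing_conditions(status, "ManilaDriverControllerServiceController") would list all three expected types. That is the gap described here: nothing is reported as unhealthy, yet the controller conditions are simply absent.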
 
Version-Release number of selected component (if applicable):
4.7.0-0.nightly-2021-01-19-095812

How reproducible:
On condition

Steps to Reproduce:
See Description

Actual results:

Expected results:

Comment 9 Fabio Bertinatto 2021-08-26 17:47:52 UTC
Closing as a duplicate of 1918562 because the fix (which is just logging an error message) is shared between all CSI operators, so there's no need to test one-by-one.

*** This bug has been marked as a duplicate of bug 1918562 ***

Comment 10 Red Hat Bugzilla 2023-09-15 00:58:44 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days