Bug 1858751

Summary: [UPI][Baremetal] Installation failed because the storage cluster operator is in Unknown status
Product: OpenShift Container Platform Reporter: David Sanz <dsanzmor>
Component: Storage Assignee: Jan Safranek <jsafrane>
Storage sub component: Operators QA Contact: Qin Ping <piqin>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: high CC: aos-bugs, jsafrane
Version: 4.6 Keywords: TestBlocker
Target Milestone: ---   
Target Release: 4.6.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-10-27 16:15:58 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description David Sanz 2020-07-20 09:47:39 UTC
Description of problem:

The UPI installation on bare metal failed because the storage cluster operator never becomes Available; its Available and Progressing conditions stay Unknown.

[morenod@morenod-laptop ~]$ oc get nodes
NAME                                                        STATUS   ROLES           AGE   VERSION
master-00.mrnd-packet-46-5a06.qe.devcluster.openshift.com   Ready    master,worker   28m   v1.18.3+a34fde4
master-01.mrnd-packet-46-5a06.qe.devcluster.openshift.com   Ready    master,worker   28m   v1.18.3+a34fde4
master-02.mrnd-packet-46-5a06.qe.devcluster.openshift.com   Ready    master,worker   28m   v1.18.3+a34fde4
[morenod@morenod-laptop ~]$ oc get co
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.6.0-0.nightly-2020-07-19-093912   True        False         False      9m44s
cloud-credential                           4.6.0-0.nightly-2020-07-19-093912   True        False         False      33m
cluster-autoscaler                         4.6.0-0.nightly-2020-07-19-093912   True        False         False      10m
config-operator                            4.6.0-0.nightly-2020-07-19-093912   True        False         False      27m
console                                    4.6.0-0.nightly-2020-07-19-093912   True        False         False      3m29s
csi-snapshot-controller                    4.6.0-0.nightly-2020-07-19-093912   True        False         False      11m
dns                                        4.6.0-0.nightly-2020-07-19-093912   True        False         False      25m
etcd                                       4.6.0-0.nightly-2020-07-19-093912   True        False         False      25m
image-registry                             4.6.0-0.nightly-2020-07-19-093912   True        False         False      11m
ingress                                    4.6.0-0.nightly-2020-07-19-093912   True        False         False      10m
insights                                   4.6.0-0.nightly-2020-07-19-093912   True        False         False      11m
kube-apiserver                             4.6.0-0.nightly-2020-07-19-093912   True        False         False      13m
kube-controller-manager                    4.6.0-0.nightly-2020-07-19-093912   True        False         False      23m
kube-scheduler                             4.6.0-0.nightly-2020-07-19-093912   True        False         False      25m
kube-storage-version-migrator              4.6.0-0.nightly-2020-07-19-093912   True        False         False      26m
machine-api                                4.6.0-0.nightly-2020-07-19-093912   True        False         False      11m
machine-approver                           4.6.0-0.nightly-2020-07-19-093912   True        False         False      23m
machine-config                             4.6.0-0.nightly-2020-07-19-093912   True        False         False      26m
marketplace                                4.6.0-0.nightly-2020-07-19-093912   True        False         False      10m
monitoring                                 4.6.0-0.nightly-2020-07-19-093912   True        False         False      8m5s
network                                    4.6.0-0.nightly-2020-07-19-093912   True        False         False      27m
node-tuning                                4.6.0-0.nightly-2020-07-19-093912   True        False         False      27m
openshift-apiserver                        4.6.0-0.nightly-2020-07-19-093912   True        False         False      12m
openshift-controller-manager               4.6.0-0.nightly-2020-07-19-093912   True        False         False      11m
openshift-samples                          4.6.0-0.nightly-2020-07-19-093912   True        False         False      9m7s
operator-lifecycle-manager                 4.6.0-0.nightly-2020-07-19-093912   True        False         False      26m
operator-lifecycle-manager-catalog         4.6.0-0.nightly-2020-07-19-093912   True        False         False      26m
operator-lifecycle-manager-packageserver   4.6.0-0.nightly-2020-07-19-093912   True        False         False      13m
service-ca                                 4.6.0-0.nightly-2020-07-19-093912   True        False         False      27m
storage                                    4.6.0-0.nightly-2020-07-19-093912   Unknown     Unknown       False      11m


$ oc describe co storage
Name:         storage
Namespace:    
Labels:       <none>
Annotations:  <none>
API Version:  config.openshift.io/v1
Kind:         ClusterOperator
Metadata:
  Creation Timestamp:  2020-07-20T09:11:13Z
  Generation:          1
  Managed Fields:
    API Version:  config.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
      f:status:
        .:
        f:extension:
    Manager:      cluster-version-operator
    Operation:    Update
    Time:         2020-07-20T09:11:13Z
    API Version:  config.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        f:conditions:
        f:relatedObjects:
        f:versions:
    Manager:         cluster-storage-operator
    Operation:       Update
    Time:            2020-07-20T09:34:10Z
  Resource Version:  16316
  Self Link:         /apis/config.openshift.io/v1/clusteroperators/storage
  UID:               fe035716-3d12-4d18-a492-46f485e172b7
Spec:
Status:
  Conditions:
    Last Transition Time:  2020-07-20T09:34:10Z
    Reason:                AsExpected
    Status:                False
    Type:                  Degraded
    Last Transition Time:  2020-07-20T09:34:10Z
    Reason:                NoData
    Status:                Unknown
    Type:                  Progressing
    Last Transition Time:  2020-07-20T09:34:10Z
    Reason:                NoData
    Status:                Unknown
    Type:                  Available
    Last Transition Time:  2020-07-20T09:34:10Z
    Reason:                AsExpected
    Status:                True
    Type:                  Upgradeable
  Extension:               <nil>
  Related Objects:
    Group:     
    Name:      openshift-cluster-storage-operator
    Resource:  namespaces
    Group:     operator.openshift.io
    Name:      cluster
    Resource:  storages
  Versions:
    Name:     operator
    Version:  4.6.0-0.nightly-2020-07-19-093912
Events:       <none>


[morenod@morenod-laptop ~]$ oc logs cluster-storage-operator-76bd5d84-9dwkx
W0720 09:34:09.088762       1 cmd.go:199] Using insecure, self-signed certificates
I0720 09:34:09.521112       1 observer_polling.go:159] Starting file observer
I0720 09:34:09.538252       1 builder.go:223] cluster-storage-operator version 81aca48-81aca489d92ff980bfb10a5e10be234a331691b1
I0720 09:34:09.921857       1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0720 09:34:09.921880       1 shared_informer.go:223] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0720 09:34:09.921881       1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0720 09:34:09.921893       1 shared_informer.go:223] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0720 09:34:09.922107       1 dynamic_serving_content.go:130] Starting serving-cert::/tmp/serving-cert-962134490/tls.crt::/tmp/serving-cert-962134490/tls.key
I0720 09:34:09.922108       1 leaderelection.go:242] attempting to acquire leader lease  openshift-cluster-storage-operator/cluster-storage-operator-lock...
I0720 09:34:09.922256       1 secure_serving.go:178] Serving securely on [::]:8443
I0720 09:34:09.922284       1 tlsconfig.go:240] Starting DynamicServingCertificateController
I0720 09:34:09.926528       1 leaderelection.go:252] successfully acquired lease openshift-cluster-storage-operator/cluster-storage-operator-lock
I0720 09:34:09.926622       1 event.go:278] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"openshift-cluster-storage-operator", Name:"cluster-storage-operator-lock", UID:"7e338e26-ae0f-465f-8c27-a7392f6cf8ad", APIVersion:"v1", ResourceVersion:"15974", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' ae21d47b-d0c5-4b8f-8c6f-6f49d773feb9 became leader
I0720 09:34:09.928036       1 starter.go:116] Starting the Informers.
I0720 09:34:09.928056       1 starter.go:128] Starting the controllers
I0720 09:34:09.928076       1 shared_informer.go:223] Waiting for caches to sync for SnapshotCRDController
I0720 09:34:09.928139       1 shared_informer.go:223] Waiting for caches to sync for ManagementStateController
I0720 09:34:09.928207       1 shared_informer.go:223] Waiting for caches to sync for LoggingSyncer
I0720 09:34:09.928226       1 shared_informer.go:223] Waiting for caches to sync for DefaultStorageClassController
I0720 09:34:09.928265       1 shared_informer.go:223] Waiting for caches to sync for StatusSyncer_storage
I0720 09:34:10.022710       1 shared_informer.go:230] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file 
I0720 09:34:10.028320       1 shared_informer.go:230] Caches are synced for ManagementStateController 
I0720 09:34:10.028336       1 base_controller.go:54] Starting #1 worker of ManagementStateController controller ...
I0720 09:34:10.028364       1 shared_informer.go:230] Caches are synced for LoggingSyncer 
I0720 09:34:10.028368       1 base_controller.go:54] Starting #1 worker of LoggingSyncer controller ...
I0720 09:34:10.028425       1 shared_informer.go:230] Caches are synced for DefaultStorageClassController 
I0720 09:34:10.028455       1 base_controller.go:54] Starting #1 worker of DefaultStorageClassController controller ...
I0720 09:34:10.028471       1 shared_informer.go:230] Caches are synced for StatusSyncer_storage 
I0720 09:34:10.028482       1 base_controller.go:54] Starting #1 worker of StatusSyncer_storage controller ...
I0720 09:34:10.028687       1 event.go:278] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-cluster-storage-operator", Name:"cluster-storage-operator", UID:"3971fcdc-7e4c-4e9f-9729-6637466a4c93", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorVersionChanged' clusteroperator/storage version "operator" changed from "" to "4.6.0-0.nightly-2020-07-19-093912"
I0720 09:34:10.028996       1 status_controller.go:172] clusteroperator/storage diff {"status":{"conditions":[{"lastTransitionTime":"2020-07-20T09:34:10Z","reason":"NoData","status":"Unknown","type":"Degraded"},{"lastTransitionTime":"2020-07-20T09:34:10Z","reason":"NoData","status":"Unknown","type":"Progressing"},{"lastTransitionTime":"2020-07-20T09:34:10Z","reason":"NoData","status":"Unknown","type":"Available"},{"lastTransitionTime":"2020-07-20T09:34:10Z","reason":"NoData","status":"Unknown","type":"Upgradeable"}],"relatedObjects":[{"group":"","name":"openshift-cluster-storage-operator","resource":"namespaces"},{"group":"operator.openshift.io","name":"cluster","resource":"storages"}],"versions":[{"name":"operator","version":"4.6.0-0.nightly-2020-07-19-093912"}]}}
I0720 09:34:10.030980       1 shared_informer.go:230] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file 
I0720 09:34:10.031594       1 tlsconfig.go:178] loaded client CA [0/"client-ca::kube-system::extension-apiserver-authentication::client-ca-file,client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"]: "admin-kubeconfig-signer" [] issuer="<self>" (2020-07-20 08:50:01 +0000 UTC to 2030-07-18 08:50:01 +0000 UTC (now=2020-07-20 09:34:10.031580242 +0000 UTC))
I0720 09:34:10.031611       1 tlsconfig.go:178] loaded client CA [1/"client-ca::kube-system::extension-apiserver-authentication::client-ca-file,client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"]: "kubelet-signer" [] issuer="<self>" (2020-07-20 08:50:04 +0000 UTC to 2020-07-21 08:50:04 +0000 UTC (now=2020-07-20 09:34:10.031604398 +0000 UTC))
I0720 09:34:10.031626       1 tlsconfig.go:178] loaded client CA [2/"client-ca::kube-system::extension-apiserver-authentication::client-ca-file,client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"]: "kube-control-plane-signer" [] issuer="<self>" (2020-07-20 08:50:04 +0000 UTC to 2021-07-20 08:50:04 +0000 UTC (now=2020-07-20 09:34:10.031619287 +0000 UTC))
I0720 09:34:10.031642       1 tlsconfig.go:178] loaded client CA [3/"client-ca::kube-system::extension-apiserver-authentication::client-ca-file,client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"]: "kube-apiserver-to-kubelet-signer" [] issuer="<self>" (2020-07-20 08:50:05 +0000 UTC to 2021-07-20 08:50:05 +0000 UTC (now=2020-07-20 09:34:10.031635087 +0000 UTC))
I0720 09:34:10.031656       1 tlsconfig.go:178] loaded client CA [4/"client-ca::kube-system::extension-apiserver-authentication::client-ca-file,client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"]: "kubelet-bootstrap-kubeconfig-signer" [] issuer="<self>" (2020-07-20 08:50:02 +0000 UTC to 2030-07-18 08:50:02 +0000 UTC (now=2020-07-20 09:34:10.031649495 +0000 UTC))
I0720 09:34:10.031671       1 tlsconfig.go:178] loaded client CA [5/"client-ca::kube-system::extension-apiserver-authentication::client-ca-file,client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"]: "kube-csr-signer_@1595236683" [] issuer="kubelet-signer" (2020-07-20 09:18:03 +0000 UTC to 2020-07-21 08:50:04 +0000 UTC (now=2020-07-20 09:34:10.031664194 +0000 UTC))
I0720 09:34:10.031683       1 tlsconfig.go:178] loaded client CA [6/"client-ca::kube-system::extension-apiserver-authentication::client-ca-file,client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"]: "aggregator-signer" [] issuer="<self>" (2020-07-20 08:50:03 +0000 UTC to 2020-07-21 08:50:03 +0000 UTC (now=2020-07-20 09:34:10.031676988 +0000 UTC))
I0720 09:34:10.031846       1 tlsconfig.go:200] loaded serving cert ["serving-cert::/tmp/serving-cert-962134490/tls.crt::/tmp/serving-cert-962134490/tls.key"]: "localhost" [serving] validServingFor=[localhost] issuer="cluster-storage-operator-signer@1595237649" (2020-07-20 09:34:08 +0000 UTC to 2020-08-19 09:34:09 +0000 UTC (now=2020-07-20 09:34:10.031839031 +0000 UTC))
I0720 09:34:10.031976       1 named_certificates.go:53] loaded SNI cert [0/"self-signed loopback"]: "apiserver-loopback-client@1595237649" [serving] validServingFor=[apiserver-loopback-client] issuer="apiserver-loopback-client-ca@1595237649" (2020-07-20 08:34:09 +0000 UTC to 2021-07-20 08:34:09 +0000 UTC (now=2020-07-20 09:34:10.03196843 +0000 UTC))
I0720 09:34:10.035385       1 event.go:278] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-cluster-storage-operator", Name:"cluster-storage-operator", UID:"3971fcdc-7e4c-4e9f-9729-6637466a4c93", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/storage changed: Degraded set to Unknown (""),Progressing set to Unknown (""),Available set to Unknown (""),Upgradeable set to Unknown (""),status.relatedObjects changed from [] to [{"" "namespaces" "" "openshift-cluster-storage-operator"} {"operator.openshift.io" "storages" "" "cluster"}]
I0720 09:34:10.052902       1 status_controller.go:172] clusteroperator/storage diff {"status":{"conditions":[{"lastTransitionTime":"2020-07-20T09:34:10Z","reason":"AsExpected","status":"False","type":"Degraded"},{"lastTransitionTime":"2020-07-20T09:34:10Z","reason":"NoData","status":"Unknown","type":"Progressing"},{"lastTransitionTime":"2020-07-20T09:34:10Z","reason":"NoData","status":"Unknown","type":"Available"},{"lastTransitionTime":"2020-07-20T09:34:10Z","reason":"NoData","status":"Unknown","type":"Upgradeable"}]}}
I0720 09:34:10.064723       1 event.go:278] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-cluster-storage-operator", Name:"cluster-storage-operator", UID:"3971fcdc-7e4c-4e9f-9729-6637466a4c93", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/storage changed: Degraded changed from Unknown to False ("")
I0720 09:34:10.128180       1 shared_informer.go:230] Caches are synced for SnapshotCRDController 
I0720 09:34:10.128199       1 base_controller.go:54] Starting #1 worker of SnapshotCRDController controller ...
I0720 09:34:10.135029       1 status_controller.go:172] clusteroperator/storage diff {"status":{"conditions":[{"lastTransitionTime":"2020-07-20T09:34:10Z","reason":"AsExpected","status":"False","type":"Degraded"},{"lastTransitionTime":"2020-07-20T09:34:10Z","reason":"NoData","status":"Unknown","type":"Progressing"},{"lastTransitionTime":"2020-07-20T09:34:10Z","reason":"NoData","status":"Unknown","type":"Available"},{"lastTransitionTime":"2020-07-20T09:34:10Z","reason":"AsExpected","status":"True","type":"Upgradeable"}]}}
I0720 09:34:10.140116       1 event.go:278] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-cluster-storage-operator", Name:"cluster-storage-operator", UID:"3971fcdc-7e4c-4e9f-9729-6637466a4c93", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/storage changed: Upgradeable changed from Unknown to True ("")



Version-Release number of selected component (if applicable):
4.6.0-0.nightly-2020-07-19-093912


How reproducible:

Steps to Reproduce:
1. Install UPI on OSP
2. Check storage cluster operator status
3.
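
For step 2, a minimal sketch of one way to pick the failing operator out of the `oc get co` table shown above. The helper name `not_available` is hypothetical (it is not part of `oc`), and the sketch assumes every row has a non-empty VERSION column, as in this report, so that AVAILABLE is the third whitespace-separated field.

```shell
# Hypothetical helper (not part of oc): reads `oc get co` table output on
# stdin and prints the NAME of every operator whose AVAILABLE column is not
# "True". Assumes VERSION is non-empty, so AVAILABLE is field 3 of 6.
not_available() {
  # NR > 1 skips the header row; NF >= 6 skips blank or truncated lines.
  awk 'NR > 1 && NF >= 6 && $3 != "True" { print $1 }'
}
```

Usage: `oc get co | not_available`. Against the table in this report it would print only `storage`.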

Actual results:


Expected results:

Master Log:

Node Log (of failed PODs):

PV Dump:

PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:

Comment 1 Jan Safranek 2020-07-21 07:23:56 UTC
Sorry about that; a fix was merged yesterday.

Comment 6 errata-xmlrpc 2020-10-27 16:15:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196