Created attachment 1834737 [details]
must-gather

Description of problem:
Installation of UPI ASH fails because of the storage operator.

Version-Release number of selected component (if applicable):
openshift-install-linux-4.10.16-173656

How reproducible:
Always

Steps to Reproduce:
1. Install 4.10 UPI ASH following: https://deploy-preview-36950--osdocs.netlify.app/openshift-enterprise/latest/installing/installing_azure_stack_hub/installing-azure-stack-hub-user-infra.html#installation-creating-azure-dns_installing-azure-stack-hub-user-infra

Actual result:
Installation fails with:

ERROR Cluster operator storage Degraded is True with AzureFileCSIDriverOperatorCR_AzureFileDriverControllerServiceController_SyncError::AzureFileDriverNodeServiceController_SyncError::AzureFileDriverStaticResourcesController_SyncError: AzureFileCSIDriverOperatorCRDegraded: AzureFileDriverControllerServiceControllerDegraded: Deployment.apps "azure-file-csi-driver-controller" is invalid: spec.template.spec.initContainers[0].image: Required value
ERROR AzureFileCSIDriverOperatorCRDegraded: AzureFileDriverNodeServiceControllerDegraded: DaemonSet.apps "azure-file-csi-driver-node" is invalid: spec.template.spec.initContainers[0].image: Required value
ERROR AzureFileCSIDriverOperatorCRDegraded: AzureFileDriverStaticResourcesControllerDegraded: "rbac/csi_driver_role.yaml" (string): clusterroles.rbac.authorization.k8s.io "azure-file-csi-driver-role" is forbidden: user "system:serviceaccount:openshift-cluster-csi-drivers:azure-file-csi-driver-operator" (groups=["system:serviceaccounts" "system:serviceaccounts:openshift-cluster-csi-drivers" "system:authenticated"]) is attempting to grant RBAC permissions not currently held:
ERROR AzureFileCSIDriverOperatorCRDegraded: AzureFileDriverStaticResourcesControllerDegraded: {APIGroups:[""], Resources:["secrets"], Verbs:["create" "update" "delete" "patch"]}
ERROR AzureFileCSIDriverOperatorCRDegraded: AzureFileDriverStaticResourcesControllerDegraded: "rbac/csi_driver_binding.yaml" (string): clusterroles.rbac.authorization.k8s.io "azure-file-csi-driver-role" not found
ERROR AzureFileCSIDriverOperatorCRDegraded: AzureFileDriverStaticResourcesControllerDegraded:
INFO Cluster operator storage Progressing is True with AzureFileCSIDriverOperatorCR_WaitForOperator: AzureFileCSIDriverOperatorCRProgressing: Waiting for AzureFile operator to report status
INFO Cluster operator storage Available is False with AzureFileCSIDriverOperatorCR_WaitForOperator: AzureFileCSIDriverOperatorCRAvailable: Waiting for AzureFile operator to report status
ERROR Cluster initialization failed because one or more operators are not functioning properly.
ERROR The cluster should be accessible for troubleshooting as detailed in the documentation linked below,
ERROR https://docs.openshift.com/container-platform/latest/support/troubleshooting/troubleshooting-installations.html
ERROR The 'wait-for install-complete' subcommand can then be used to continue the installation
FATAL failed to initialize the cluster: Cluster operator storage is not available

All operators are available except:
storage 4.10.0-0.nightly-2021-10-16-173656 False True True 100m AzureFileCSIDriverOperatorCRAvailable: Waiting for AzureFile operator to report status

Must gather reports:
When opening a support case, bugzilla, or issue please include the following summary data along with any other requested information.
ClusterID: 1b19050b-36d9-48e4-b668-c03c29d693b3
ClusterVersion: Installing "4.10.0-0.nightly-2021-10-16-173656" for 2 hours: Unable to apply 4.10.0-0.nightly-2021-10-16-173656: the cluster operator storage has not yet successfully rolled out
ClusterOperators:
clusteroperator/storage is not available (AzureFileCSIDriverOperatorCRAvailable: Waiting for AzureFile operator to report status) because AzureFileCSIDriverOperatorCRDegraded: AzureFileDriverControllerServiceControllerDegraded: Deployment.apps "azure-file-csi-driver-controller" is invalid: spec.template.spec.initContainers[0].image: Required value
AzureFileCSIDriverOperatorCRDegraded: AzureFileDriverNodeServiceControllerDegraded: DaemonSet.apps "azure-file-csi-driver-node" is invalid: spec.template.spec.initContainers[0].image: Required value
AzureFileCSIDriverOperatorCRDegraded: AzureFileDriverStaticResourcesControllerDegraded: "rbac/csi_driver_role.yaml" (string): clusterroles.rbac.authorization.k8s.io "azure-file-csi-driver-role" is forbidden: user "system:serviceaccount:openshift-cluster-csi-drivers:azure-file-csi-driver-operator" (groups=["system:serviceaccounts" "system:serviceaccounts:openshift-cluster-csi-drivers" "system:authenticated"]) is attempting to grant RBAC permissions not currently held:
AzureFileCSIDriverOperatorCRDegraded: AzureFileDriverStaticResourcesControllerDegraded: {APIGroups:[""], Resources:["secrets"], Verbs:["create" "update" "delete" "patch"]}
AzureFileCSIDriverOperatorCRDegraded: AzureFileDriverStaticResourcesControllerDegraded: "rbac/csi_driver_binding.yaml" (string): clusterroles.rbac.authorization.k8s.io "azure-file-csi-driver-role" not found
AzureFileCSIDriverOperatorCRDegraded: AzureFileDriverStaticResourcesControllerDegraded:

Expected results:
Installation completes (same process completes with 4.9.0)

Master Log:

Node Log (of failed PODs):

PV Dump:

PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:
Happens on dev CI as well.
AzureFile CSI driver operator gets degraded. ClusterCSIDriver file.csi.azure.com.yaml from the must-gather:

  - lastTransitionTime: "2021-10-19T16:02:36Z"
    message: 'DaemonSet.apps "azure-file-csi-driver-node" is invalid: spec.template.spec.initContainers[0].image: Required value'
    reason: SyncError
    status: "True"
    type: AzureFileDriverNodeServiceControllerDegraded
  - lastTransitionTime: "2021-10-19T16:02:36Z"
    message: 'Deployment.apps "azure-file-csi-driver-controller" is invalid: spec.template.spec.initContainers[0].image: Required value'
    reason: SyncError
    status: "True"
    type: AzureFileDriverControllerServiceControllerDegraded
  - lastTransitionTime: "2021-10-19T16:02:36Z"
    status: "False"
    type: ConfigObservationDegraded
  - lastTransitionTime: "2021-10-19T16:02:41Z"
    message: |
      "rbac/csi_driver_role.yaml" (string): clusterroles.rbac.authorization.k8s.io "azure-file-csi-driver-role" is forbidden: user "system:serviceaccount:openshift-cluster-csi-drivers:azure-file-csi-driver-operator" (groups=["system:serviceaccounts" "system:serviceaccounts:openshift-cluster-csi-drivers" "system:authenticated"]) is attempting to grant RBAC permissions not currently held:
      {APIGroups:[""], Resources:["secrets"], Verbs:["create" "update" "delete" "patch"]}
      "rbac/csi_driver_binding.yaml" (string): clusterroles.rbac.authorization.k8s.io "azure-file-csi-driver-role" not found
    reason: SyncError
    status: "True"
    type: AzureFileDriverStaticResourcesControllerDegraded
  readyReplicas: 0
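
For anyone triaging a similar must-gather, the failing conditions can be picked out mechanically. The following is only a minimal sketch: degraded_conditions is a hypothetical helper, not part of any OpenShift tooling, and the input is a hand-copied subset of the conditions quoted above rather than a parsed YAML file.

```python
def degraded_conditions(conditions):
    """Return (type, reason, message) for each *Degraded condition with status "True"."""
    return [
        (c["type"], c.get("reason", ""), c.get("message", "").strip())
        for c in conditions
        if c["type"].endswith("Degraded") and c.get("status") == "True"
    ]


# Subset of the conditions from the ClusterCSIDriver excerpt, copied by hand.
conditions = [
    {
        "type": "AzureFileDriverNodeServiceControllerDegraded",
        "status": "True",
        "reason": "SyncError",
        "message": 'DaemonSet.apps "azure-file-csi-driver-node" is invalid: '
                   "spec.template.spec.initContainers[0].image: Required value",
    },
    {
        "type": "AzureFileDriverControllerServiceControllerDegraded",
        "status": "True",
        "reason": "SyncError",
        "message": 'Deployment.apps "azure-file-csi-driver-controller" is invalid: '
                   "spec.template.spec.initContainers[0].image: Required value",
    },
    # This condition is healthy and should be filtered out.
    {"type": "ConfigObservationDegraded", "status": "False"},
]

for cond_type, reason, message in degraded_conditions(conditions):
    print(f"{cond_type}: {reason}: {message}")
```

With this subset, only the two SyncError conditions are printed; ConfigObservationDegraded is skipped because its status is "False".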
There is definitely a bug in cluster-storage-operator: it starts the AzureFile CSI driver operator even when the TechPreviewNoUpgrade FeatureSet is not enabled. AzureFile is in development right now and not ready for CI.
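
The intended behaviour can be illustrated with a small sketch. This does not reproduce the actual cluster-storage-operator code; should_start_csi_operator and its parameters are hypothetical names, and the only assumption is that a driver still in development must start only when the cluster feature set is TechPreviewNoUpgrade.

```python
TECH_PREVIEW_NO_UPGRADE = "TechPreviewNoUpgrade"


def should_start_csi_operator(requires_tech_preview, cluster_feature_set):
    """Decide whether a CSI driver operator may be started.

    requires_tech_preview: True for drivers still in development (hypothetical flag).
    cluster_feature_set:   the cluster's feature set name; "" stands for the default.
    """
    if not requires_tech_preview:
        # Generally available drivers start regardless of the feature set.
        return True
    # Tech-preview drivers start only when the cluster opted in.
    return cluster_feature_set == TECH_PREVIEW_NO_UPGRADE


# AzureFile is modelled here as a tech-preview driver.
print(should_start_csi_operator(True, ""))                       # False
print(should_start_csi_operator(True, TECH_PREVIEW_NO_UPGRADE))  # True
```

The failure in this report corresponds to the first call returning True instead of False, that is, the operator being started on a cluster with the default feature set.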
I am able to install an ASH cluster successfully with 4.10.0-0.nightly-2021-10-22-061826.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056