Bug 1952826
| Summary: | [azure disk csi operator] CSO is not available for 'AzureDiskCSIDriverOperatorDeploymentAvailable: Waiting for a Deployment pod to start' | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Qin Ping <piqin> |
| Component: | Storage | Assignee: | Jan Safranek <jsafrane> |
| Storage sub component: | Operators | QA Contact: | Wei Duan <wduan> |
| Status: | CLOSED NOTABUG | Docs Contact: | |
| Severity: | low | | |
| Priority: | unspecified | CC: | aos-bugs, jsafrane |
| Version: | 4.8 | | |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-05-13 14:07:20 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description Qin Ping 2021-04-23 09:44:48 UTC
We don't support un-installation of the CSI driver. Once TechPreviewNoUpgrade is set, it should be sticky and it should not be possible to disable it. That's what we support as tech preview. It's also possible to enable the driver with CustomNoUpgrade, but we do not support that. I am lowering the severity, as CustomNoUpgrade is not supported, even as tech preview (at least I hope so). We'll see if we can do something better than Available=false, but I still do not want to support un-installation of the driver. It's hard and messy and can break apps that use PVs provisioned by the driver.

I have a theory about what may be wrong:

1. Removing TechPreviewNoUpgrade removes CSI migration, therefore all nodes are drained and restarted.
2. The CSI driver controller pod is drained from its node; the CSI driver operator marks ClusterCSIDriver as AzureDiskCSIDriverOperatorDeploymentAvailable=false with "Waiting for a Deployment pod to start".
3. cluster-storage-operator copies the condition to the Storage CR conditions.
4. cluster-storage-operator is drained from its node.
5. A new cluster-storage-operator starts and, since it does not see CSIDriverAzureDisk / TechPreviewNoUpgrade, it does not start a controller to watch ClusterCSIDriver, so the Azure conditions on the Storage CR are never cleared.

I need to test it.

We discussed this issue and decided not to fix it. We don't want to support removal of the driver, either via the TechPreviewNoUpgrade or CustomNoUpgrade FeatureSets. While the feature can be disabled via CustomNoUpgrade, the status of the storage ClusterOperator is then unpredictable; it's up to the user to clean up its conditions. The presence of these conditions also suggests to the user that an operator and CSI driver are still running and need to be cleaned up too.
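The five-step theory above can be sketched as a toy simulation (a minimal sketch; the function and variable names are illustrative, not taken from cluster-storage-operator's actual code): a sync loop that only reconciles conditions while the feature gate is present leaves whatever it last wrote stuck in place once the gate is removed.

```python
# Toy model of the stale-condition theory described above.
# All names here are hypothetical, for illustration only.

def sync_storage_conditions(storage_conditions, driver_conditions, gate_enabled):
    """Copy CSI-driver conditions onto the Storage CR, but only while
    the feature gate is set; with the gate removed, the controller is
    never started and nothing touches the old conditions."""
    if not gate_enabled:
        # Step 5: operator does not watch ClusterCSIDriver anymore,
        # so previously copied conditions remain as-is.
        return storage_conditions
    merged = dict(storage_conditions)
    merged.update(driver_conditions)
    return merged

# Steps 1-3: gate still enabled, driver pod drained, Available=false
# is copied onto the Storage CR.
storage = {}
driver = {"AzureDiskCSIDriverOperatorDeploymentAvailable": "False"}
storage = sync_storage_conditions(storage, driver, gate_enabled=True)

# Steps 4-5: new CSO starts without the gate; even if the driver
# deployment later recovers, the stale condition is never cleared.
driver = {"AzureDiskCSIDriverOperatorDeploymentAvailable": "True"}
storage = sync_storage_conditions(storage, driver, gate_enabled=False)
print(storage["AzureDiskCSIDriverOperatorDeploymentAvailable"])  # prints "False"
```

The sketch only captures the control-flow point: once the gate is gone, no controller exists to reconcile the conditions away.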
Exactly the same thing may happen when a user downgrades from an OCP version that has the Azure CSI driver installed by default (say 4.10) to a version where the driver is optional (say 4.9): the 4.9 CSO does not know that a 4.10 version of the driver and operator is still running, and it will not remove them. Again, it's up to the user to clean up the mess. Ping, what do you think?

Hi, Jan. Makes sense to me. Only one question: do we need to document this?

We do not document CustomNoUpgrade and its values. If you think we should, please raise a BZ against docs.
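For reference, the unsupported CustomNoUpgrade path discussed in the comments above would be set via the cluster FeatureGate CR, roughly like this (a sketch; the gate name CSIDriverAzureDisk is taken from comment 5 above, and removing it again triggers the stale-condition behavior this bug describes):

```yaml
apiVersion: config.openshift.io/v1
kind: FeatureGate
metadata:
  name: cluster
spec:
  featureSet: CustomNoUpgrade
  customNoUpgrade:
    enabled:
    - CSIDriverAzureDisk
```

Unlike TechPreviewNoUpgrade, this list can later be edited to drop the gate, which is exactly the un-installation path the team decided not to support.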