Bug 1943719
Summary: | storage-operator/vsphere-problem-detector causing upgrades to fail that would have succeeded in past versions | |||
---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Luke Stanton <lstanton> | |
Component: | Storage | Assignee: | Jan Safranek <jsafrane> | |
Storage sub component: | Operators | QA Contact: | Wei Duan <wduan> | |
Status: | CLOSED ERRATA | Docs Contact: | ||
Severity: | high | |||
Priority: | unspecified | CC: | aos-bugs, dmoessne, hekumar, jcallen, jsafrane, lmohanty, nchoudhu, palshure, pmoses, vrutkovs, WilliamC.Elliott, wking | |
Version: | 4.7 | Keywords: | Upgrades | |
Target Milestone: | --- | |||
Target Release: | 4.8.0 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | No Doc Update | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1959546 | Environment: | |
Last Closed: | 2021-07-27 22:56:00 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1959546 |
Description
Luke Stanton
2021-03-26 21:47:32 UTC
*** Bug 1955260 has been marked as a duplicate of this bug. ***

Additional info: This also takes place if network segmentation blocks access back to the diesore host:port. Upgrades were able to complete by switching the operator between unmanaged and managed at several points of the upgrade; however, after completing the upgrade, the operator continues to show as degraded.

I found an issue where the message on the Available condition is sometimes cleared.

Verified with 4.8.0-0.nightly-2021-05-18-033553.

After changing to an invalid password with:
$ oc -n kube-system edit secret vsphere-creds

Then check that the storage clusteroperator is AVAILABLE and not DEGRADED:
$ oc get co storage
NAME      VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
storage   4.8.0-0.nightly-2021-05-18-033553   True        False         False      92m

Message from the clusteroperator:
$ oc get clusteroperator storage -o jsonpath='{.status.conditions[?(@.type=="Available")].message}'
VSphereProblemDetectorControllerAvailable: failed to connect to vcenter.sddc-44-236-21-251.vmwarevmc.com: ServerFaultCode: Cannot complete login due to an incorrect user name or password.
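The clusteroperator check above could also be scripted against the JSON form of the same object. The following is a minimal sketch, not part of the fix: the function names are hypothetical, and the embedded sample is an illustrative stand-in for `oc get clusteroperator storage -o json`, reduced to the fields used here with values copied from the verification output.

```python
import json

# Illustrative sample shaped like `oc get clusteroperator storage -o json`.
# Only the fields read below are included; values come from the verification
# output in this report.
SAMPLE = json.dumps({
    "status": {
        "conditions": [
            {
                "type": "Available",
                "status": "True",
                "message": (
                    "VSphereProblemDetectorControllerAvailable: failed to connect to "
                    "vcenter.sddc-44-236-21-251.vmwarevmc.com: ServerFaultCode: "
                    "Cannot complete login due to an incorrect user name or password."
                ),
            },
            {"type": "Progressing", "status": "False"},
            {"type": "Degraded", "status": "False"},
        ]
    }
})


def summarize_conditions(raw):
    """Map each condition type to a (status, message) pair (hypothetical helper)."""
    conditions = json.loads(raw)["status"]["conditions"]
    return {c["type"]: (c["status"], c.get("message", "")) for c in conditions}


def available_with_vcenter_warning(summary):
    """True when the operator is Available, not Degraded, yet reports a vCenter error."""
    available, message = summary["Available"]
    degraded, _ = summary["Degraded"]
    return (
        available == "True"
        and degraded == "False"
        and "failed to connect" in message
    )


summary = summarize_conditions(SAMPLE)
print(available_with_vcenter_warning(summary))
```

Matching on the substring "failed to connect" is an assumption based on the single message shown in this report; other failure modes may be worded differently.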
Check the vsphere_sync_errors metric and the alert raised:

{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "vsphere_sync_errors",
          "container": "vsphere-problem-detector-operator",
          "endpoint": "vsphere-metrics",
          "instance": "10.128.0.44:8444",
          "job": "vsphere-problem-detector-metrics",
          "namespace": "openshift-cluster-storage-operator",
          "pod": "vsphere-problem-detector-operator-958d9f68c-w74tb",
          "service": "vsphere-problem-detector-metrics"
        },
        "value": [
          1621335304.464,
          "1"
        ]
      }
    ]
  }
}

"alerts": [
  {
    "labels": {
      "alertname": "VSphereOpenshiftConnectionFailure",
      "container": "vsphere-problem-detector-operator",
      "endpoint": "vsphere-metrics",
      "instance": "10.128.0.44:8444",
      "job": "vsphere-problem-detector-metrics",
      "namespace": "openshift-cluster-storage-operator",
      "pod": "vsphere-problem-detector-operator-958d9f68c-w74tb",
      "service": "vsphere-problem-detector-metrics",
      "severity": "warning"
    },
    "annotations": {
      "description": "vsphere-problem-detector cannot access vCenter. As consequence, other OCP components,\nsuch as storage or machine API, may not be able to access vCenter too and provide\ntheir services. Detailed error message can be found in Available condition of\nClusterOperator \"storage\", either in console\n(Administration -> Cluster settings -> Cluster operators tab -> storage) or on\ncommand line: oc get clusteroperator storage -o jsonpath='{.status.conditions[?(@.type==\"Available\")].message}'\n",
      "summary": "vsphere-problem-detector is unable to connect to vSphere vCenter."
    },
    "state": "firing",
    "activeAt": "2021-05-18T10:08:52.396347327Z",
    "value": "1e+00"
  },

Marked as VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2021:2438
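The vsphere_sync_errors check from the verification comment can be reproduced offline by parsing the query response shown there. This is a rough sketch under stated assumptions: the helper name is hypothetical, the response is trimmed to the labels used here, and treating any value above zero as a failure is an assumption, since the alert rule expression itself does not appear in this report.

```python
import json

# Query response copied from the verification comment, trimmed to the
# labels this sketch reads.
RESPONSE = """
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "vsphere_sync_errors",
          "namespace": "openshift-cluster-storage-operator",
          "pod": "vsphere-problem-detector-operator-958d9f68c-w74tb"
        },
        "value": [1621335304.464, "1"]
      }
    ]
  }
}
"""


def sync_error_series(raw):
    """Return (pod, value) pairs from an instant-vector response (hypothetical helper)."""
    body = json.loads(raw)
    if body["status"] != "success":
        raise ValueError("query did not succeed: %r" % body["status"])
    series = []
    for item in body["data"]["result"]:
        _timestamp, value = item["value"]  # sample values arrive as strings
        series.append((item["metric"]["pod"], float(value)))
    return series


# Assumption: a value above zero means the detector could not reach vCenter.
failing = [(pod, value) for pod, value in sync_error_series(RESPONSE) if value > 0]
print(failing)
```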