Bug 2063301
| Summary: | Rook can fail to deploy due to startup probe failures on mon canary pods | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Blaine Gardner <brgardne> |
| Component: | rook | Assignee: | Travis Nielsen <tnielsen> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Vijay Avuthu <vavuthu> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.10 | CC: | madam, muagarwa, nberry, ocs-bugs, odf-bz-bot, rperiyas, tnielsen |
| Target Milestone: | --- | ||
| Target Release: | ODF 4.10.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | 4.10.0-210 | Doc Type: | No Doc Update |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-04-21 09:12:53 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Blaine Gardner
2022-03-11 18:05:51 UTC
Unfortunately, we don't have a reproducer, even upstream, but users have reported the issue and we have identified the root cause. The best way to "verify" the fix is to get the details of the `rook-ceph-mon-X-canary` pods when they come online and ensure that none of the containers has a startup probe. It's probably also good to make sure that none of them has a readiness or liveness probe.

Travis, please backport it to 4.10.

Verified with ocs-registry:4.10.0-217.
Job: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/11489/console
No readiness or liveness probes appeared in the mon logs, and no mon deployment issues occurred during pipeline executions.

There are no consistent steps to reproduce this issue, so I am not in favor of automating it.
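The verification described in this bug (inspect the `rook-ceph-mon-X-canary` pods and confirm that no container defines a startup, readiness, or liveness probe) could be scripted roughly as follows. This is only a sketch, not part of the actual fix or the QE pipeline: the function name `find_canary_probes` is hypothetical, and the pod-name pattern is an assumption based on the `rook-ceph-mon-X-canary` naming mentioned above. The input is expected to be a parsed Kubernetes pod list, such as the JSON produced by `oc get pods -o json`.

```python
import re

# The three probe fields a Kubernetes container spec may define.
PROBE_FIELDS = ("startupProbe", "readinessProbe", "livenessProbe")

# Assumption: canary pod names start with rook-ceph-mon-<id>-canary,
# following the `rook-ceph-mon-X-canary` naming used in this report.
CANARY_NAME = re.compile(r"^rook-ceph-mon-[a-z]+-canary")


def find_canary_probes(pod_list):
    """Return (pod, container, probe field) for every probe on a mon canary pod.

    `pod_list` is a dict shaped like a Kubernetes PodList. An empty result
    means no canary container defines any of the three probe types.
    """
    findings = []
    for pod in pod_list.get("items", []):
        name = pod.get("metadata", {}).get("name", "")
        if not CANARY_NAME.match(name):
            continue  # not a mon canary pod
        spec = pod.get("spec", {})
        containers = (spec.get("initContainers") or []) + (spec.get("containers") or [])
        for container in containers:
            for field in PROBE_FIELDS:
                if field in container:
                    findings.append((name, container.get("name", ""), field))
    return findings
```

To use it, one could save the output of `oc get pods -n openshift-storage -o json` while the canary pods are running (the namespace is an assumption, since it depends on the deployment), load it with `json.load`, and pass the result to `find_canary_probes`. The canary pods are short-lived, so the pod list has to be captured while they are still online.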