Bug 1926617
| Summary: | osds are in Init:CrashLoopBackOff with rgw in CrashLoopBackOff on KMS enabled cluster | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Container Storage | Reporter: | Persona non grata <nobody+410372> |
| Component: | rook | Assignee: | Sébastien Han <shan> |
| Status: | CLOSED ERRATA | QA Contact: | Persona non grata <nobody+410372> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.7 | CC: | branto, ebenahar, hnallurv, jijoy, jthottan, madam, muagarwa, ocs-bugs, shan, sostapov, vavuthu |
| Target Milestone: | --- | Keywords: | AutomationTriaged |
| Target Release: | OCS 4.7.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | 4.7.0-731.ci | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-05-19 09:19:00 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Persona non grata
2021-02-09 08:29:53 UTC

logs?

(In reply to Sébastien Han from comment #2)
> logs?

OCP logs: http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/bz1926617.zip

Must-gather is taking a lot of time while collecting the OCS logs; I will update once I have collected them.

Thanks,
Shreekar

Vijay, the error is different:

Liveness probe failed: admin_socket: exception getting command descriptions: [Errno 2] No such file or directory

Please open a different BZ.

(In reply to Sébastien Han from comment #5)
> Vijay, the error is different: Liveness probe failed: admin_socket:
> exception getting command descriptions: [Errno 2] No such file or directory
> Please open a different BZ.

Thanks for the confirmation. Raised a new BZ for the issue: https://bugzilla.redhat.com/show_bug.cgi?id=1927262

Tested on ocs-operator.v4.7.0-263.ci. Performed flow-based operations such as add capacity and node restart with running I/O. The RGW pods are up and running, but an existing OSD went into Init:CrashLoopBackOff:

rook-ceph-osd-2-696d8df8d4-5hcpf 0/2 Init:CrashLoopBackOff 87 7h7m

Moving to Assigned.

Tested on ocs-operator.v4.7.0-731.ci with OpenShift version 4.7.0-0.nightly-2021-02-18-110409. All OSDs are up and running and add capacity worked; after adding capacity, no issues were seen on the existing OSDs.

Moving bug to Verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2041

Removing AutomationBacklog keyword. This will be covered by the installation of a KMS-enabled cluster. A specific test case is not needed.