Bug 2131703
| Summary: | Ceph is in HEALTH_WARN right after deployment with size 12 | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Filip Balák <fbalak> |
| Component: | odf-managed-service | Assignee: | Leela Venkaiah Gangavarapu <lgangava> |
| Status: | CLOSED NOTABUG | QA Contact: | Filip Balák <fbalak> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | | |
| Version: | 4.10 | CC: | aeyal, ebenahar, lgangava, nberry, ocs-bugs, odf-bz-bot, owasserm |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-02-06 10:10:54 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Hi,

- this seems to be a legitimate issue and needs changes to the resource calculations as well
- for the time being, I'm assigning the bug to myself

@fbalak, does this affect the IO/management ops directly?

Thanks, Leela.

- Still awaiting to hear back about any repercussions caused by this bug
- Orit is also looking into it; will await an update

No IO was tested with the cluster. This was the state right after installation, without any operation having been performed.

Please note the above workaround has to be applied after each upscale.

Bug is resolved; the dependent Jira issue is fixed on the OCM side.

Moving to 4.12.z, as the verification would be done against the ODF MS rollout that would be based on ODF 4.12.

Moving to VERIFIED based on regression testing. We will clone this bug for the sake of verifying the scenario as part of ODF MS testing over ODF 4.12 or with the provider-consumer layout.

Size 12 is not going to be supported now. --> CLOSED NOTABUG
Description of problem:
Right after deployment of the ODF Managed Service addon with size 12, Ceph is in an unhealthy state:

HEALTH_WARN 1 slow ops, oldest one blocked for 9728 sec, mon.c has slow ops

Version-Release number of selected component (if applicable):
ocs-osd-deployer.v2.0.7

How reproducible:
1/1

Steps to Reproduce:
1. Deploy a service with the dev addon:
   rosa create service --type ocs-provider-dev --name fbalak-pr --machine-cidr 10.0.0.0/16 --size 12 --onboarding-validation-key <key> --subnet-ids <subnets> --region us-east-1
2. Check the health status:
   oc rsh -n openshift-storage $(oc get pods -n openshift-storage | grep tool | awk '{print $1}') ceph health

Actual results:
HEALTH_WARN 1 slow ops, oldest one blocked for 9728 sec, mon.c has slow ops

Expected results:
HEALTH_OK

Additional info:
The cluster was deployed with the dev addon that contains changes for epic ODFMS-55.
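The health check in step 2 can be wrapped in a small helper so the expected/actual results are machine-checkable. This is only a sketch: the `check_health` function and the `TOOLS_POD` variable are hypothetical names, not part of the report; the `oc`/`ceph` commands are the ones from the reproduction steps above.

```shell
#!/bin/sh
# Hypothetical helper around step 2: classify `ceph health` output.
# Returns 0 for HEALTH_OK, non-zero for HEALTH_WARN / HEALTH_ERR.
check_health() {
  case "$1" in
    HEALTH_OK*) return 0 ;;
    *)          return 1 ;;
  esac
}

# On a live cluster this would be fed from the rook-ceph tools pod, e.g.:
#   TOOLS_POD=$(oc get pods -n openshift-storage | grep tool | awk '{print $1}')
#   check_health "$(oc rsh -n openshift-storage "$TOOLS_POD" ceph health)"

# Local demonstration against the outputs quoted in this report:
check_health "HEALTH_OK" && echo "healthy"
check_health "HEALTH_WARN 1 slow ops, oldest one blocked for 9728 sec, mon.c has slow ops" \
  || echo "unhealthy"
```

Matching on the `HEALTH_OK` prefix rather than the full string keeps the check stable across Ceph versions, since the detail text after the status keyword varies.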