Bug 2023268
| Summary: | [Managed Service Tracker] OSDs are not evenly distributed | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Filip Balák <fbalak> |
| Component: | odf-managed-service | Assignee: | Ohad <omitrani> |
| Status: | CLOSED WORKSFORME | QA Contact: | Neha Berry <nberry> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 4.8 | CC: | aeyal, dbindra, ebenahar, mbukatov, mmuench, nibalach, ocs-bugs, odf-bz-bot, owasserm, rperiyas |
| Target Milestone: | --- | Keywords: | Tracking |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 2100713 (view as bug list) | Environment: | |
| Last Closed: | 2023-01-20 09:45:26 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 2004801 | | |
| Bug Blocks: | | | |
Description
Filip Balák
2021-11-15 10:43:45 UTC
2/3 OSDs on the same node is not expected. Is this a product bug?

Can you attach the StorageCluster CR? Please attach the CephCluster CR as well.

The possible reason is that during the ODF MS addon installation one of the nodes went down, which led OSDs to be scheduled on the two available nodes using TSC. Currently TSC doesn't have any mechanism to check for the minimum number of nodes before scheduling, so StorageCluster creation still proceeds with 3 replicas on a cluster with fewer than 3 nodes.

Jose, this looks like a regression from introducing TopologySpreadConstraints. Can we solve it in the product?

Observed this problem in scale tests too, here is the bz https://bugzilla.redhat.com/show_bug.cgi?id=2004801

I'm not 100% sure if this is a regression, but it's certainly a problem we should resolve. Since the requirements on the managed service(s) are changing frequently, I'll leave it to you guys to prioritize this BZ. As long as there is a fully ACKed OCS/ODF BZ it can go into any release of ocs-operator.

(In reply to Red Hat Bugzilla from comment #13)
> remove performed by PnT Account Manager <pnt-expunge>

In a single-zone AWS deployment, flexible scaling should be enabled and we would not have added rack labels. In addition, 2 OSDs on the same node means there will be no OSD available to store the third replica, i.e. the cluster is always in degraded mode. This makes this a regression.

The tracking bug is fixed in the product, and this needs to be verified and closed.

I am turning this back to NEW. BZ 2100713 was closed, but the reason is that it is a duplicate of BZ 2004801, which is still in NEW state.
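For reference, a minimal sketch of how the uneven OSD placement described above can be checked from the CLI, and how the requested CRs can be collected. The `openshift-storage` namespace, the `app=rook-ceph-osd` label, and the `rook-ceph-tools` toolbox deployment are assumptions based on a default ODF installation and may differ in the managed-service environment.

```shell
# List the OSD pods together with the node each one landed on,
# to confirm whether two of the three OSDs share a node.
oc get pods -n openshift-storage -l app=rook-ceph-osd -o wide

# Cross-check the placement from Ceph's CRUSH view, assuming the
# rook-ceph-tools toolbox has been enabled in the cluster.
oc rsh -n openshift-storage deploy/rook-ceph-tools ceph osd tree

# Collect the CRs requested above so the placement / TSC settings
# and failure-domain (rack) labels can be reviewed.
oc get storagecluster -n openshift-storage -o yaml
oc get cephcluster -n openshift-storage -o yaml
```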