Bug 2305874
Summary: | [GSS][ODF 4.13 backport] Legacy LVM-based OSDs are in crashloop state | ||
---|---|---|---|
Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Paul Gozart <pgozart> |
Component: | rook | Assignee: | Travis Nielsen <tnielsen> |
Status: | NEW --- | QA Contact: | Neha Berry <nberry> |
Severity: | urgent | Docs Contact: | |
Priority: | unspecified | ||
Version: | 4.13 | CC: | odf-bz-bot, sheggodu, tnielsen |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | Type: | --- | |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Paul Gozart
2024-08-19 18:42:17 UTC
Removing this from the 4.13.z proposal while still investigating.

The original issue that this bug was cloned from does not apply to 4.13. In that issue, legacy lvm-based OSDs were failing to expand. Hitting that code path requires that c.getExpandPVCInitContainer() be called for the lvm-based OSDs. However, in the 4.13 code [1], the call on line 556 already applies only to raw-based OSDs:

    initContainers = append(initContainers, c.getExpandPVCInitContainer(osdProps, osdID))

This call is never reached for lvm-based OSDs, since it already sits in the "else" block for "raw" OSDs [2].

So the failing OSDs must not be legacy lvm-based OSDs as in the original issue. Some other issue must be causing these raw-based OSDs to fail during the resize call. If that is the case, these OSDs will likely continue to fail each time the cluster is upgraded and the OSDs are reconciled, because reconciliation adds the resize container back.

If we can't find the root cause, these OSDs may need to be serially wiped and replaced to avoid future issues.

[1] https://github.com/red-hat-storage/rook/blob/release-4.13/pkg/operator/ceph/cluster/osd/spec.go#L529-L557
[2] https://github.com/red-hat-storage/rook/commit/b489d7ae47a628497be8695ffb70606d246a578c
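The branching described in this comment can be modeled with a small Go sketch. It is a simplified illustration, not the rook source: the osdInfo type, the initContainerNames and hasExpandStep helpers, and the container name strings are all hypothetical. Only the lvm/raw distinction and the fact that the expand step is added in the "else" (raw) branch are taken from the comment.

```go
package main

import "fmt"

// osdInfo is a hypothetical, pared-down stand-in for the OSD details
// the operator inspects when building the pod spec.
type osdInfo struct {
	ID     int
	OnPVC  bool
	CVMode string // "lvm" or "raw"
}

// initContainerNames returns the names of the init containers this
// simplified model would add. The names are placeholders, not rook's.
func initContainerNames(osd osdInfo) []string {
	var names []string
	if !osd.OnPVC {
		return names
	}
	if osd.CVMode == "lvm" {
		// Legacy lvm-based OSDs: no expand step is added here.
		names = append(names, "activate-lvm")
	} else {
		// Raw-based OSDs: the expand step is added only in this branch.
		names = append(names, "activate-raw", "expand-pvc")
	}
	return names
}

// hasExpandStep reports whether the expand placeholder is in the list.
func hasExpandStep(names []string) bool {
	for _, n := range names {
		if n == "expand-pvc" {
			return true
		}
	}
	return false
}

func main() {
	osds := []osdInfo{
		{ID: 0, OnPVC: true, CVMode: "lvm"},
		{ID: 1, OnPVC: true, CVMode: "raw"},
	}
	for _, osd := range osds {
		names := initContainerNames(osd)
		fmt.Printf("osd.%d mode=%s expand=%t containers=%v\n",
			osd.ID, osd.CVMode, hasExpandStep(names), names)
	}
}
```

Under this model, an OSD that crashes in the expand step must have taken the raw branch, which is the reasoning behind concluding that the affected OSDs are raw-based.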