Bug 1914159
Summary: | When OCS was deployed using arbiter mode mon's are going into CLBO state, ceph version = 14.2.11-95 | |||
---|---|---|---|---|
Product: | [Red Hat Storage] Red Hat OpenShift Container Storage | Reporter: | Pratik Surve <prsurve> | |
Component: | ceph | Assignee: | Greg Farnum <gfarnum> | |
Status: | CLOSED ERRATA | QA Contact: | Pratik Surve <prsurve> | |
Severity: | urgent | Docs Contact: | ||
Priority: | unspecified | |||
Version: | 4.7 | CC: | akrai, bniver, branto, gfarnum, madam, mbukatov, muagarwa, nberry, ocs-bugs, owasserm, tnielsen | |
Target Milestone: | --- | Keywords: | AutomationBackLog, TestBlocker | |
Target Release: | OCS 4.7.0 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | ocs-registry:4.7.0-247.ci | Doc Type: | No Doc Update | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1917374 (view as bug list) | Environment: | ||
Last Closed: | 2021-05-19 09:17:47 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | 1917374 | |||
Bug Blocks: |
Description
Pratik Surve
2021-01-08 08:52:21 UTC
Looks like the stretch cluster was fully set up, but then the mons started crashing. Greg, can you take a look at the stack?

Hmm, I'd have thought the mustgather output would contain actual ceph daemon logs from the ceph-mon crashes, but all I'm seeing is backtraces? Anyway, based on those I think this is due to an inadvertent compatibility mismatch between the monitors and clients. The fix is posted upstream but got delayed getting run through QA while I tried to add on another feature bit; I'm pushing it through now and will have the fix in downstream later today. https://github.com/ceph/ceph/pull/38531

Ah, this is the same client error? I overlooked that the base image for Rook wasn't updated. @Boris Is the base image for Rook not always the same RHCS image we use for OCS? If not, can you update the Rook base image as well with RHCS 4.2?

No it’s broken in all the codebases. Give it a few more hours for tests to run in the lab and then I’ll push patches around.

Pushed a patch to ceph-4.2-rhel-patches that should resolve this.

(In reply to Travis Nielsen from comment #9)
> Ah, this is the same client error? I overlooked that the base image for Rook
> wasn't updated.
>
> @Boris Is the base image for Rook not always the same RHCS image we use for
> OCS? If not, can you update the Rook base image as well with RHCS 4.2?

We use the same base image for Rook as we use in OCS; the build pipeline updates it whenever we switch the RHCS image in OCS.

Boris, are we picking the latest RHCS 4.2 build with OCS 4.7?

Yes, we use the latest RHCS 4.2 (GA) image. The image hasn't changed in a while though. @Greg Farnum: Did the patch make it into RHCS 4.2 or is it planned for 4.2z1?

I pushed a commit to ceph-4.2-rhel-patches on Sunday (https://gitlab.cee.redhat.com/ceph/ceph/-/commit/6b378160f6949011d7232a2102e15e668adf4d6b). I thought I saw an automated email suggesting it had been built, but I can't find that now and the steps after I push to the patches repo are pretty much invisible to me. Not sure what happens after that in the pipeline with all the build changes that have been going on, but that's usually all we have to do for things to appear.

Fixing direction of BZ dependency.

The latest container build (4.2 GA) is from Dec 18, so it doesn't contain that fix. We will have to include 4.2z1 in OCS 4.7 to fix this.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2041