Bug 2007377
| Field | Value |
|---|---|
| Summary | CephObjectStore does not update the RGW configuration period if 'period --commit' fails in the first reconcile |
| Product | [Red Hat Storage] Red Hat OpenShift Data Foundation |
| Reporter | Blaine Gardner <brgardne> |
| Component | rook |
| Assignee | Blaine Gardner <brgardne> |
| Status | CLOSED ERRATA |
| QA Contact | Filip Balák <fbalak> |
| Severity | urgent |
| Docs Contact | |
| Priority | unspecified |
| Version | 4.9 |
| CC | amaredia, bniver, brgardne, cbodley, ceph-eng-bugs, ebenahar, etamir, jijoy, jthottan, kbader, madam, mbenjamin, muagarwa, ocs-bugs, odf-bz-bot, pbalogh, tchandra, tnielsen |
| Target Milestone | --- |
| Keywords | Automation, Regression |
| Target Release | ODF 4.9.0 |
| Hardware | Unspecified |
| OS | Unspecified |
| Whiteboard | |
| Fixed In Version | |
| Doc Type | No Doc Update |
| Doc Text | |
| Story Points | --- |
| Clone Of | 2002220 |
| | 2013326 (view as bug list) |
| Environment | |
| Last Closed | 2021-12-13 17:46:30 UTC |
| Type | --- |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| CRM | |
| Verified Versions | |
| Category | --- |
| oVirt Team | --- |
| RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- |
| Target Upstream Version | |
| Embargoed | |
Comment 3
Blaine Gardner
2021-09-24 19:09:19 UTC
Adding a link to the upstream Rook PR to always update the period: https://github.com/rook/rook/pull/8828 Once this is merged, we should run another test.

Adding a link to the downstream PR that updates the RGW period when the `radosgw-admin period update --commit` command failed on a previous reconcile, as happened in this bug. I doubt this alone will fix the issue, since the RGW segfault is likely to still occur, but it will be good to run another test with these changes to verify that they fixed the ODF side of things. Linked: https://github.com/red-hat-storage/rook/pull/29

We discovered that the recent fix doesn't fully address the ODF portion of this bug. There is a new upstream PR to fix the issue in full: https://github.com/rook/rook/pull/8911

Blaine, because the fix is in Rook, should we change the component back to rook?

There is a related fix in Rook, but I don't think the Rook fix will make the RGW stop segfaulting. These are two separate issues.

Though this is the original BZ, it tracks the Rook changes, so I am moving it to the rook component. To track the Ceph changes I have created BZ #2013326.

We still need to wait for https://bugzilla.redhat.com/show_bug.cgi?id=2002220 to be fixed in Ceph, right? Only then can we try to verify?

(In reply to Petr Balogh from comment #14)
> We still need to wait for:
> https://bugzilla.redhat.com/show_bug.cgi?id=2002220 to be fixed in CEPH
> right? Only then we can try to verify?

Yes, IMO.

Mudit opened BZ #2013326 to track the Ceph RGW fix for ODF; see https://bugzilla.redhat.com/show_bug.cgi?id=2007377#c13. This BZ (BZ #2007377) is now a tracker for the tangential issue I found in Rook. The fix for this BZ, which now relates to Rook updating the RGW config period, can be verified independently of BZ #2013326 and BZ #2002220. Removing the "depends on" relationship since it may be confusing.
Adding a link to the backport PR to the Rook release-4.9 branch: https://github.com/red-hat-storage/rook/pull/300

I accidentally moved this from ON_QA to MODIFIED. Moving it back. Sorry.

Verified with build quay.io/rhceph-dev/ocs-registry:4.9.0-214.ci using production jobs with FIPS from our pipeline. These were vSphere FIPS-enabled runs and both passed:
https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-trigger-vsphere-upi-fips-1az-rhcos-vsan-3m-3w-tier1/103/
https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-trigger-vsphere-upi-fips-1az-rhcos-vsan-3m-6w-tier4a/99/

Also verified with build quay.io/rhceph-dev/ocs-registry:4.9.0-233.ci, which passed: https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/7620/consoleFull --> VERIFIED

The Ceph part of the problem is verified here: https://bugzilla.redhat.com/show_bug.cgi?id=2013326#c4

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat OpenShift Data Foundation 4.9.0 enhancement, security, and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:5086