Bug 1926984
| Summary: | Reports that have specified a retention period should not be requeued in the sync handler | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | tflannag |
| Component: | Metering Operator | Assignee: | tflannag |
| Status: | CLOSED ERRATA | QA Contact: | Peter Ruan <pruan> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 4.7 | CC: | aos-bugs, sd-operator-metering |
| Target Milestone: | --- | ||
| Target Release: | 4.8.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: |
Cause:
The reporting operator incorrectly handles Report custom resources that contain a user-provided retention period when reconciling events.
Consequence:
An "expired" Report custom resource will lead to the reporting operator hotlooping, as the affected custom resources are requeued indefinitely.
Fix:
Avoid requeueing expired Report custom resources that have specified a retention period.
Result:
The reporting operator correctly handles events for expired Report custom resources.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-07-27 22:19:28 UTC | Type: | Bug |
| Bug Blocks: | 1929042 | ||

I'm going to mark this as not a blocker, despite it being reported by ENG. While this is problematic, it's not a regression as that functionality hasn't been changed since it landed in 4.6.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.8.2 extras update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2435

Description of problem:
After the introduction of the user-provided Report expiration, it looks like Report objects are being improperly managed, which results in the following reporting-operator container logs:

```
time="2021-02-09T18:22:05Z" level=info msg="successfully synced Report \"metering-hdfs-reportstaticinputdata/subreport-cpu-usage-run-immediately-expiration-report\"" Report=metering-hdfs-reportstaticinputdata/subreport-cpu-usage-run-immediately-expiration-report app=metering component=reportWorker logID=qn9ITURy4Z
time="2021-02-09T18:22:05Z" level=info msg="syncing Report metering-hdfs-reportstaticinputdata/subreport-cpu-usage-run-immediately-expiration-report" app=metering component=reportWorker logID=BYiMIPO8J9
time="2021-02-09T18:22:05Z" level=info msg="ReportQuery, subreport-cpu-usage-run-immediately-expiration-report exists that uses report subreport-cpu-usage-run-immediately-expiration-report as input, will not delete though retention period has expired" app=metering component=reportWorker expiration=30s logID=BYiMIPO8J9 namespace=metering-hdfs-reportstaticinputdata report=subreport-cpu-usage-run-immediately-expiration-report
time="2021-02-09T18:22:05Z" level=warning msg="report: subreport-cpu-usage-run-immediately-expiration-report, would be deleted because expired, but is depended on" app=metering component=reportWorker expiration=30s logID=BYiMIPO8J9 namespace=metering-hdfs-reportstaticinputdata report=subreport-cpu-usage-run-immediately-expiration-report
time="2021-02-09T18:22:05Z" level=info msg="requeueing report that has reached its expiration date during the op.runReport method" app=metering component=reportWorker expiration=30s logID=BYiMIPO8J9 namespace=metering-hdfs-reportstaticinputdata report=subreport-cpu-usage-run-immediately-expiration-report
time="2021-02-09T18:22:05Z" level=info msg="successfully synced Report \"metering-hdfs-reportstaticinputdata/subreport-cpu-usage-run-immediately-expiration-report\"" Report=metering-hdfs-reportstaticinputdata/subreport-cpu-usage-run-immediately-expiration-report app=metering component=reportWorker logID=BYiMIPO8J9
time="2021-02-09T18:22:05Z" level=info msg="syncing Report metering-hdfs-reportstaticinputdata/subreport-cpu-usage-run-immediately-expiration-report" app=metering component=reportWorker logID=xK1BdBIgPg
time="2021-02-09T18:22:05Z" level=info msg="Event(v1.ObjectReference{Kind:\"Report\", Namespace:\"metering-hdfs-reportstaticinputdata\", Name:\"subreport-cpu-usage-run-immediately-expiration-report\", UID:\"0eac71a2-9b7d-4269-a05a-ce5c1affbc2d\", APIVersion:\"metering.openshift.io/v1\", ResourceVersion:\"44151\", FieldPath:\"\"}): type: 'Warning' reason: 'ExpiredReportHasDependencies' Skipping the deletion of the subreport-cpu-usage-run-immediately-expiration-report Report as other resources are dependent on it, despite reaching the desired expiration date." app=metering
time="2021-02-09T18:22:05Z" level=info msg="ReportQuery, subreport-cpu-usage-run-immediately-expiration-report exists that uses report subreport-cpu-usage-run-immediately-expiration-report as input, will not delete though retention period has expired" app=metering component=reportWorker expiration=30s logID=xK1BdBIgPg namespace=metering-hdfs-reportstaticinputdata report=subreport-cpu-usage-run-immediately-expiration-report
time="2021-02-09T18:22:05Z" level=warning msg="report: subreport-cpu-usage-run-immediately-expiration-report, would be deleted because expired, but is depended on" app=metering component=reportWorker expiration=30s logID=xK1BdBIgPg namespace=metering-hdfs-reportstaticinputdata report=subreport-cpu-usage-run-immediately-expiration-report
time="2021-02-09T18:22:05Z" level=info msg="requeueing report that has reached its expiration date during the op.runReport method" app=metering component=reportWorker expiration=30s logID=xK1BdBIgPg namespace=metering-hdfs-reportstaticinputdata report=subreport-cpu-usage-run-immediately-expiration-report
time="2021-02-09T18:22:05Z" level=info msg="successfully synced Report \"metering-hdfs-reportstaticinputdata/subreport-cpu-usage-run-immediately-expiration-report\"" Report=metering-hdfs-reportstaticinputdata/subreport-cpu-usage-run-immediately-expiration-report app=metering component=reportWorker logID=xK1BdBIgPg
time="2021-02-09T18:22:05Z" level=info msg="syncing Report metering-hdfs-reportstaticinputdata/subreport-cpu-usage-run-immediately-expiration-report" app=metering component=reportWorker logID=IXaQiUwK62
time="2021-02-09T18:22:05Z" level=info msg="Event(v1.ObjectReference{Kind:\"Report\", Namespace:\"metering-hdfs-reportstaticinputdata\", Name:\"subreport-cpu-usage-run-immediately-expiration-report\", UID:\"0eac71a2-9b7d-4269-a05a-ce5c1affbc2d\", APIVersion:\"metering.openshift.io/v1\", ResourceVersion:\"44151\", FieldPath:\"\"}): type: 'Warning' reason: 'ExpiredReportHasDependencies' Skipping the deletion of the subreport-cpu-usage-run-immediately-expiration-report Report as other resources are dependent on it, despite reaching the desired expiration date." app=metering
time="2021-02-09T18:22:05Z" level=info msg="ReportQuery, subreport-cpu-usage-run-immediately-expiration-report exists that uses report subreport-cpu-usage-run-immediately-expiration-report as input, will not delete though retention period has expired" app=metering component=reportWorker expiration=30s logID=IXaQiUwK62 namespace=metering-hdfs-reportstaticinputdata report=subreport-cpu-usage-run-immediately-expiration-report
time="2021-02-09T18:22:05Z" level=warning msg="report: subreport-cpu-usage-run-immediately-expiration-report, would be deleted because expired, but is depended on" app=metering component=reportWorker expiration=30s logID=IXaQiUwK62 namespace=metering-hdfs-reportstaticinputdata report=subreport-cpu-usage-run-immediately-expiration-report
time="2021-02-09T18:22:05Z" level=info msg="requeueing report that has reached its expiration date during the op.runReport method" app=metering component=reportWorker expiration=30s logID=IXaQiUwK62 namespace=metering-hdfs-reportstaticinputdata report=subreport-cpu-usage-run-immediately-expiration-report
time="2021-02-09T18:22:05Z" level=info msg="successfully synced Report \"metering-hdfs-reportstaticinputdata/subreport-cpu-usage-run-immediately-expiration-report\"" Report=metering-hdfs-reportstaticinputdata/subreport-cpu-usage-run-immediately-expiration-report app=metering component=reportWorker logID=IXaQiUwK62
time="2021-02-09T18:22:05Z" level=info msg="syncing Report metering-hdfs-reportstaticinputdata/subreport-cpu-usage-run-immediately-expiration-report" app=metering component=reportWorker logID=ncOzf5iCaC
time="2021-02-09T18:22:05Z" level=info msg="Event(v1.ObjectReference{Kind:\"Report\", Namespace:\"metering-hdfs-reportstaticinputdata\", Name:\"subreport-cpu-usage-run-immediately-expiration-report\", UID:\"0eac71a2-9b7d-4269-a05a-ce5c1affbc2d\", APIVersion:\"metering.openshift.io/v1\", ResourceVersion:\"44151\", FieldPath:\"\"}): type: 'Warning' reason: 'ExpiredReportHasDependencies' Skipping the deletion of the subreport-cpu-usage-run-immediately-expiration-report Report as other resources are dependent on it, despite reaching the desired expiration date." app=metering
time="2021-02-09T18:22:05Z" level=info msg="ReportQuery, subreport-cpu-usage-run-immediately-expiration-report exists that uses report subreport-cpu-usage-run-immediately-expiration-report as input, will not delete though retention period has expired" app=metering component=reportWorker expiration=30s logID=ncOzf5iCaC namespace=metering-hdfs-reportstaticinputdata report=subreport-cpu-usage-run-immediately-expiration-report
time="2021-02-09T18:22:05Z" level=warning msg="report: subreport-cpu-usage-run-immediately-expiration-report, would be deleted because expired, but is depended on" app=metering component=reportWorker expiration=30s logID=ncOzf5iCaC namespace=metering-hdfs-reportstaticinputdata report=subreport-cpu-usage-run-immediately-expiration-report
time="2021-02-09T18:22:05Z" level=info msg="requeueing report that has reached its expiration date during the op.runReport method" app=metering component=reportWorker expiration=30s logID=ncOzf5iCaC namespace=metering-hdfs-reportstaticinputdata report=subreport-cpu-usage-run-immediately-expiration-report
time="2021-02-09T18:22:05Z" level=info msg="successfully synced Report \"metering-hdfs-reportstaticinputdata/subreport-cpu-usage-run-immediately-expiration-report\"" Report=metering-hdfs-reportstaticinputdata/subreport-cpu-usage-run-immediately-expiration-report app=metering component=reportWorker logID=ncOzf5iCaC
time="2021-02-09T18:22:05Z" level=info msg="syncing Report metering-hdfs-reportstaticinputdata/subreport-cpu-usage-run-immediately-expiration-report" app=metering component=reportWorker logID=ksKfwvve2H
time="2021-02-09T18:22:05Z" level=info msg="Event(v1.ObjectReference{Kind:\"Report\", Namespace:\"metering-hdfs-reportstaticinputdata\", Name:\"subreport-cpu-usage-run-immediately-expiration-report\", UID:\"0eac71a2-9b7d-4269-a05a-ce5c1affbc2d\", APIVersion:\"metering.openshift.io/v1\", ResourceVersion:\"44151\", FieldPath:\"\"}): type: 'Warning' reason: 'ExpiredReportHasDependencies' Skipping the deletion of the subreport-cpu-usage-run-immediately-expiration-report Report as other resources are dependent on it, despite reaching the desired expiration date." app=metering
time="2021-02-09T18:22:05Z" level=info msg="ReportQuery, subreport-cpu-usage-run-immediately-expiration-report exists that uses report subreport-cpu-usage-run-immediately-expiration-report as input, will not delete though retention period has expired" app=metering component=reportWorker expiration=30s logID=ksKfwvve2H namespace=metering-hdfs-reportstaticinputdata report=subreport-cpu-usage-run-immediately-expiration-report
time="2021-02-09T18:22:05Z" level=warning msg="report: subreport-cpu-usage-run-immediately-expiration-report, would be deleted because expired, but is depended on" app=metering component=reportWorker expiration=30s logID=ksKfwvve2H namespace=metering-hdfs-reportstaticinputdata report=subreport-cpu-usage-run-immediately-expiration-report
time="2021-02-09T18:22:05Z" level=info msg="requeueing report that has reached its expiration date during the op.runReport method" app=metering component=reportWorker expiration=30s logID=ksKfwvve2H namespace=metering-hdfs-reportstaticinputdata report=subreport-cpu-usage-run-immediately-expiration-report
time="2021-02-09T18:22:05Z" level=info msg="successfully synced Report \"metering-hdfs-reportstaticinputdata/subreport-cpu-usage-run-immediately-expiration-report\"" Report=metering-hdfs-reportstaticinputdata/subreport-cpu-usage-run-immediately-expiration-report app=metering component=reportWorker logID=ksKfwvve2H
time="2021-02-09T18:22:05Z" level=info msg="syncing Report metering-hdfs-reportstaticinputdata/subreport-cpu-usage-run-immediately-expiration-report" app=metering component=reportWorker logID=8OONIMwITk
time="2021-02-09T18:22:05Z" level=info msg="Event(v1.ObjectReference{Kind:\"Report\", Namespace:\"metering-hdfs-reportstaticinputdata\", Name:\"subreport-cpu-usage-run-immediately-expiration-report\", UID:\"0eac71a2-9b7d-4269-a05a-ce5c1affbc2d\", APIVersion:\"metering.openshift.io/v1\", ResourceVersion:\"44151\", FieldPath:\"\"}): type: 'Warning' reason: 'ExpiredReportHasDependencies' Skipping the deletion of the subreport-cpu-usage-run-immediately-expiration-report Report as other resources are dependent on it, despite reaching the desired expiration date." app=metering
time="2021-02-09T18:22:05Z" level=info msg="ReportQuery, subreport-cpu-usage-run-immediately-expiration-report exists that uses report subreport-cpu-usage-run-immediately-expiration-report as input, will not delete though retention period has expired" app=metering component=reportWorker expiration=30s logID=8OONIMwITk namespace=metering-hdfs-reportstaticinputdata report=subreport-cpu-usage-run-immediately-expiration-report
time="2021-02-09T18:22:05Z" level=warning msg="report: subreport-cpu-usage-run-immediately-expiration-report, would be deleted because expired, but is depended on" app=metering component=reportWorker expiration=30s logID=8OONIMwITk namespace=metering-hdfs-reportstaticinputdata report=subreport-cpu-usage-run-immediately-expiration-report
```

Above, we can see that the problematic Report keeps getting re-queued and the sync handler is incorrectly managing those roll-up Report scenarios.

Version-Release number of selected component (if applicable):
4.6+

How reproducible:
Always

Steps to Reproduce:
1. Create a run-once Report
2. Create a ReportQuery that references that run-once Report
3. Create a roll-up Report that references that custom ReportQuery

Actual results:

Expected results:

Additional info: