Bug 1754118

Summary: "CronJob /logging-curator is taking more than 1h to complete." reported in Prometheus
Product: OpenShift Container Platform
Reporter: Greg Rodriguez II <grodrigu>
Component: Monitoring
Assignee: Pawel Krupa <pkrupa>
Status: CLOSED DUPLICATE
QA Contact: Junqi Zhao <juzhao>
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.11.0
CC: alegrand, anpicker, calfonso, erooth, kakkoyun, lcosic, mloibl, pkrupa, surbania
Target Milestone: ---
Target Release: 3.11.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-10-07 11:24:43 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Greg Rodriguez II 2019-09-20 20:54:17 UTC
Description of problem:
"CronJob /logging-curator is taking more than 1h to complete." is reported daily in Prometheus. However, Curator is working as expected: the Curator pod moves from "Running" to "Completed" status within a few minutes, and indices older than the specified number of days are deleted without issue.

Version-Release number of selected component (if applicable):
OCP 3.11.98

How reproducible:
Multiple customers are reporting this issue on the same OCP version

Steps to Reproduce:
1. Set the Curator cronjob to run once a day and allow it to run
2. Check AlertManager and find the run-length warning
3. Check the list of indices and verify that Curator ran correctly (old indices deleted)
4. Verify that no errors are reported in the Curator pod log
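
The check in steps 3 and 4 can be cross-checked against the timestamps of the Job that the cronjob spawned. The following is only a minimal sketch: it assumes the start and completion times have already been read from the Job status (`startTime` and `completionTime`, which Kubernetes reports as RFC 3339 UTC strings). The function names, the threshold constant, and the sample timestamps are hypothetical and are not taken from the alert rule or from any customer environment.

```python
from datetime import datetime, timedelta, timezone

# Threshold mirrors the alert text ("more than 1h"); the constant name is hypothetical.
ALERT_THRESHOLD = timedelta(hours=1)


def parse_rfc3339_utc(ts: str) -> datetime:
    """Parse a UTC timestamp of the form '2019-09-20T03:30:00Z'."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def job_duration(start_time: str, completion_time: str) -> timedelta:
    """Return how long a Job ran, given its start and completion timestamps."""
    return parse_rfc3339_utc(completion_time) - parse_rfc3339_utc(start_time)


def exceeds_threshold(start_time: str, completion_time: str,
                      threshold: timedelta = ALERT_THRESHOLD) -> bool:
    """True only if the Job really ran longer than the threshold."""
    return job_duration(start_time, completion_time) > threshold


if __name__ == "__main__":
    # Illustrative timestamps only, not data from the report.
    start, end = "2019-09-20T03:30:00Z", "2019-09-20T03:32:10Z"
    print("duration:", job_duration(start, end))
    print("longer than 1h:", exceeds_threshold(start, end))
```

If a check like this shows the Job finishing well inside the threshold while the alert still fires, that is consistent with the behaviour described in this report: the warning does not reflect the real run time.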

Actual results:
"CronJob /logging-curator is taking more than 1h to complete." appears in AlertManager even though Curator runs correctly

Expected results:
No alert should be raised for the Curator cronjob when it runs correctly

Additional info: