Bug 1559162 - Curator should not run to delete indices at mid-night, avoiding conflicts when the initial set of new indices is created
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.7.1
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.7.z
Assignee: Josef Karasek
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On: 1575820
Blocks:
Reported: 2018-03-21 20:46 UTC by Peter Portante
Modified: 2018-05-18 03:55 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The DELETE operation was invoked on the Elasticsearch cluster at the same time as the CREATE operation, which generated a very high computational load on Elasticsearch.
Consequence: Many requests from the log collector were refused.
Fix: The DELETE operation now runs at 3:30 UTC by default. This is configurable via installer variables.
Result: The computational load is better spread between events.
Clone Of:
Environment:
Last Closed: 2018-05-18 03:54:45 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:1576 0 None None None 2018-05-18 03:55:23 UTC

Description Peter Portante 2018-03-21 20:46:11 UTC
Curator should not run to delete indices at midnight, so that it avoids conflicts with the creation of the initial set of new indices.

We have experimented on large clusters with moving the run to 3:29 UTC, which gives the new indices time to be created.

If this is not fixed, the conflict can cripple logging as clusters scale up the number of indices they track.

Comment 1 Jeff Cantrill 2018-03-22 02:13:10 UTC

*** This bug has been marked as a duplicate of bug 1559168 ***

Comment 2 Jeff Cantrill 2018-05-02 18:13:58 UTC
Reopening for 3.7 changes

Comment 3 Jeff Cantrill 2018-05-02 18:14:52 UTC
3.7 cherrypick https://github.com/openshift/origin-aggregated-logging/pull/1038

Comment 5 Anping Li 2018-05-09 10:38:17 UTC
The bug has been fixed in openshift-ansible v3.7.46.

# oc get dc logging-curator -o json |grep -A 1 CURATOR_RUN_
                                "name": "CURATOR_RUN_HOUR",
                                "value": "3"
--
                                "name": "CURATOR_RUN_MINUTE",
                                "value": "30"
--
                                "name": "CURATOR_RUN_TIMEZONE",
                                "value": "UTC"

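For readers who want to reason about the schedule shown above, the following is a minimal sketch (not Curator's actual scheduling code) of how the three CURATOR_RUN_* values could be turned into the next run time. The function name next_curator_run and the dict-based input are hypothetical, and only the "UTC" timezone from the output above is handled.

```python
from datetime import datetime, timedelta, timezone


def next_curator_run(now, env):
    """Return the next run time strictly after `now` (hypothetical helper).

    `env` mirrors the CURATOR_RUN_* variables from the deployment config.
    Assumption: only CURATOR_RUN_TIMEZONE == "UTC" is supported here.
    """
    if now.tzinfo is None:
        raise ValueError("now must be a timezone-aware datetime")
    if env.get("CURATOR_RUN_TIMEZONE", "UTC") != "UTC":
        raise ValueError("this sketch only handles UTC")

    hour = int(env["CURATOR_RUN_HOUR"])
    minute = int(env["CURATOR_RUN_MINUTE"])

    # Today's scheduled time in UTC; roll over to tomorrow if already past.
    now_utc = now.astimezone(timezone.utc)
    candidate = now_utc.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now_utc:
        candidate += timedelta(days=1)
    return candidate


# Values as reported by `oc get dc logging-curator` in this comment.
env = {
    "CURATOR_RUN_HOUR": "3",
    "CURATOR_RUN_MINUTE": "30",
    "CURATOR_RUN_TIMEZONE": "UTC",
}
midnight = datetime(2018, 5, 10, 0, 0, tzinfo=timezone.utc)
run_at = next_curator_run(midnight, env)
# Gap between midnight and the delete run, in minutes.
gap_minutes = int((run_at - midnight).total_seconds() // 60)
```

With the values above, the delete run falls three and a half hours after midnight UTC, which is the separation the fix is meant to provide.
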
Comment 8 errata-xmlrpc 2018-05-18 03:54:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1576

