Bug 1866963

Summary: Cluster Logging 4.5 Delete/Rollover Cronjobs Fail Occasionally
Product: OpenShift Container Platform Reporter: gcollege
Component: Logging Assignee: Jeff Cantrill <jcantril>
Status: CLOSED DUPLICATE QA Contact: Anping Li <anli>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.5 CC: aos-bugs, mallmen, tcort
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2020-08-07 19:09:07 UTC Type: Bug

Description gcollege 2020-08-07 01:38:18 UTC
Description of problem:
With the deployment of Cluster Logging 4.5 there are now 6 new cronjobs that delete or roll over logs in Elasticsearch. Both the delete and the rollover cronjobs fail occasionally, for all log types (app, infra, audit). Whenever a job fails, it reports the same error message as other failed jobs of the same kind. For the rollover cronjobs, the error is:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "security_exception",
        "reason" : "Unexpected exception indices:admin/rollover"
      }
    ],
    "type" : "security_exception",
    "reason" : "Unexpected exception indices:admin/rollover"
  },
  "status" : 500
}

For the delete jobs, the logs look like this:

Traceback (most recent call last):
  File "<string>", line 3, in <module>
TypeError: 'int' object has no attribute '__getitem__'
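
One plausible reading of this traceback, offered only as an assumption, is that the delete job's inline Python parses the Elasticsearch response and indexes into every top-level value as if it were per-index settings. An error body such as the rollover one above carries an integer "status" field, and indexing into that integer would raise this TypeError under Python 2. The job's real script is not quoted in this report, so the function names and response shapes below are hypothetical; this is a minimal sketch of the failure mode and of a guard against it, not the actual job code.

```python
import json

# Hypothetical error body, modeled on the rollover failure quoted above.
ERROR_BODY = '{"error": {"type": "security_exception"}, "status": 500}'

# Hypothetical success body: index name -> settings (the shape is an assumption).
OK_BODY = '{"app-000001": {"settings": {"index": {"creation_date": "1596700000000"}}}}'


def naive_creation_dates(body):
    """Treat every top-level value as index settings, with no error check."""
    response = json.loads(body)
    # On an error body, "status" maps to the integer 500. Indexing into it
    # raises TypeError (Python 2 words it as "'int' object has no attribute
    # '__getitem__'"; Python 3 as "'int' object is not subscriptable").
    # If the "error" dict is visited first, a KeyError is raised instead.
    return {name: entry["settings"]["index"]["creation_date"]
            for name, entry in response.items()}


def creation_dates(body):
    """Return {index name: creation date in ms}; reject error bodies explicitly."""
    response = json.loads(body)
    if "error" in response:
        raise ValueError("Elasticsearch returned an error: %s"
                         % json.dumps(response["error"]))
    return {name: int(entry["settings"]["index"]["creation_date"])
            for name, entry in response.items()}
```

With a guard like this, a failed request would surface the underlying Elasticsearch error rather than an unrelated TypeError from the parsing step.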


Version-Release number of selected component (if applicable):
OCP 4.5
Elasticsearch 4.5
Cluster Logging 4.5


How reproducible:
Very. At least 4 others have seen this issue.


Steps to Reproduce:
1. Upgrade Elasticsearch from 4.4 to 4.5 according to the cluster-logging upgrade guide
2. Upgrade ClusterLogging from 4.4 to 4.5 according to the cluster-logging upgrade guide

Actual results:

The delete and rollover jobs fail occasionally with the error messages shown above.

Expected results:

No failures

Comment 1 Jeff Cantrill 2020-08-07 19:09:07 UTC

*** This bug has been marked as a duplicate of bug 1866019 ***