1327028 – aggregated logging / elasticsearch maintenance (https://github.com/openshift/origin-aggregated-logging/issues/42)

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	RFE
Sub Component:
Version:	3.1.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	Luke Meyer
QA Contact:	chunchen
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	1337633
Depends On:
Blocks:	1267746

Reported:	2016-04-14 06:39 UTC by Miheer Salunke
Modified:	2019-12-16 05:39 UTC
CC List:	14 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2016-09-27 09:37:51 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Knowledge Base (Solution)	2317351	0	None	None	None	2016-06-27 06:07:56 UTC
Red Hat Product Errata	RHBA-2016:1933	0	normal	SHIPPED_LIVE	Red Hat OpenShift Container Platform 3.3 Release Advisory	2016-09-27 13:24:36 UTC

Description Miheer Salunke 2016-04-14 06:39:27 UTC

Description of problem:

We've installed the aggregated-logging Stack from https://github.com/openshift/origin-aggregated-logging/tree/master/deployment
Collecting and displaying logs works fine, but the collected data keeps growing, and we can't invoke Elasticsearch's REST API to run cleanup/maintenance jobs. Even when I connect to the Elasticsearch pod and call
curl -X GET http://127.0.0.1:9200
the response is always
curl: (52) Empty reply from server

How do you maintain the Elasticsearch data? Is there a special token / secret that must be used to connect to Elasticsearch?

Version-Release number of selected component (if applicable):


How reproducible:
3.1

Steps to Reproduce:
1. Mentioned in description
2.
3.

Actual results:


Expected results:


Additional info:
https://github.com/openshift/origin-aggregated-logging/issues/42
Fix - https://github.com/openshift/origin-aggregated-logging/pull/57

Comment 3 Luke Meyer 2016-05-05 14:23:13 UTC

With 3.1.1 and 3.2.0 images (coming soon) we will be providing an update that also creates an administrative cert and ACL to allow manual maintenance of Elasticsearch. You can rsh into an ES pod and use the admin key/cert/ca from the logging-elasticsearch secret to run curl -X DELETE against indices and any other operation, manually. More explicit documentation on this is forthcoming.

This is a stopgap until the Curator solution is delivered post-3.2.0, which should happen as soon as we can get it through QE and out the door along with all the other major changes that are in Origin but missed 3.2.0.

Comment 4 Eric Jones 2016-05-19 16:50:26 UTC

*** Bug 1337633 has been marked as a duplicate of this bug. ***

Comment 5 Eric Jones 2016-05-19 16:56:50 UTC

@Luke, are those 3.1 images released yet? And if so, do we have that more explicit documentation yet?

Comment 6 Luke Meyer 2016-05-19 17:18:17 UTC

(In reply to Eric Jones from comment #5)
> @Luke, are those 3.1 images released yet?

Yes; if you redeploy with the latest 3.1.1 images, the deployer creates an admin cert in the elasticsearch secret.

> And if so, do we have that more
> explicit documentation yet?

Alas, no. Perhaps a project for me today.

The admin cert can be used either within the ES container or, when extracted, from anywhere that can reach the pod SDN.
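
As a rough sketch of the extraction mentioned above, the following shell functions pull the three admin entries out of the logging-elasticsearch secret into local files. The entry names (admin-key, admin-cert, admin-ca) come from this bug; the function names, the destination directory, and the choice of `oc get -o go-template` plus GNU `base64 -d` are assumptions and may need adjusting for your client version. It is assumed that `oc` is already logged in and switched to the logging project.

```shell
# Decode one base64-encoded secret value from stdin into the file named by $1.
# The subshell keeps the restrictive umask from leaking into the caller.
decode_entry() {
    ( umask 077 && base64 -d > "$1" )
}

# Hypothetical helper: write admin-key, admin-cert and admin-ca into directory $1.
# Only the decode step's exit status is checked; a failing `oc` call that
# prints nothing would leave an empty file behind.
extract_admin_certs() {
    dest_dir=$1
    mkdir -p "$dest_dir" || return 1
    for entry in admin-key admin-cert admin-ca; do
        oc get secret logging-elasticsearch \
            -o go-template --template="{{index .data \"$entry\"}}" \
            | decode_entry "$dest_dir/$entry" || return 1
    done
}
```

Calling `extract_admin_certs ./es-admin` would leave three files in ./es-admin that can be passed to curl's --key, --cert and --cacert options.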

Comment 7 Eric Jones 2016-05-19 17:23:20 UTC

Thanks for the information Luke.

I have another question in the same vein: how bad would it be for the logging pods if we simply deleted the contents of the logging PVC?

And if it would be catastrophic, is the best bet (to clean up the log storage) just to wait on how to use these new images?

Comment 8 Luke Meyer 2016-05-19 18:01:07 UTC

If you scale down the ES deployment(s) and delete the PVC contents, everything should be fine when you scale them back up, aside from (obviously) not having any data (including losing any customizations to Kibana profiles). Whether to wait depends on the severity of the situation and how much you care about keeping old logs...
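
The order of operations described above could be sketched as follows. This helper only prints the commands rather than running them, so the plan can be reviewed first. The function name and the deployment config names passed to it are hypothetical, the replica count of 1 is an assumption, and the step that empties the PVC is left as a comment because it depends on the storage backend.

```shell
# Print (not run) the commands for wiping ES storage, one per line.
# Arguments: names of the ES deployment configs (hypothetical in the example).
print_es_wipe_plan() {
    for dc in "$@"; do
        echo "oc scale dc/$dc --replicas=0"
    done
    echo "# now delete the contents of the logging PVC(s); how depends on the storage backend"
    for dc in "$@"; do
        echo "oc scale dc/$dc --replicas=1"
    done
}
```

For example, `print_es_wipe_plan logging-es-example` prints a scale-down line, the placeholder comment, and a scale-up line for that one deployment config.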

Comment 9 Eric Jones 2016-05-19 18:30:27 UTC

Thanks again Luke, that is exactly the answer I was hoping for.

Comment 11 Xia Zhao 2016-06-12 03:16:10 UTC

Set to verified since the logging-curator image is now available on Docker Hub for use.

Comment 13 ewolinet 2016-06-15 15:13:20 UTC

As of 3.2 there is an admin cert that a customer can use to manually delete old indices from Elasticsearch.

I will open up a docs PR to document how to do this.

1. Check that the logging-elasticsearch secret contains the entries "admin-key", "admin-cert", and "admin-ca":
$ oc get secret/logging-elasticsearch -o yaml

1 a. If these entries are missing, you will need to rerun the logging-deployer at version 3.2.0 or later so that the deployer can generate these certificates and attach them to the secret. That process is described in the OpenShift docs[1]

2. Connect to an Elasticsearch pod in the cluster you want to clean up:
Find a pod
$ oc get pods -l component=es
$ oc get pods -l component=es-ops

Connect to pod
$ oc rsh {your_es_pod}

2 a. From within an Elasticsearch pod, you can issue the following to delete an index of your choice:
$ curl --key /etc/elasticsearch/keys/admin-key --cert /etc/elasticsearch/keys/admin-cert --cacert /etc/elasticsearch/keys/admin-ca -XDELETE "https://localhost:9200/{your_index}"

Further documentation is available on the ES Delete[2] and Delete by query[3] API pages.

[1] https://docs.openshift.org/latest/install_config/upgrading/manual_upgrades.html#manual-upgrading-efk-logging-stack
[2] https://www.elastic.co/guide/en/elasticsearch/reference/1.5/docs-delete.html
[3] https://www.elastic.co/guide/en/elasticsearch/reference/1.5/docs-delete-by-query.html
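
The curl invocation from step 2 a can be wrapped in a small helper so the certificate options are not retyped for every call. This is only a sketch: the function name es_admin and the ES_KEYS, ES_URL and ES_DRY_RUN variables are made up for illustration, while the key paths and URL are the ones given in the steps above. Setting ES_DRY_RUN to any non-empty value prints the command instead of running it.

```shell
# Defaults match the paths and URL used inside an ES pod in the steps above.
ES_KEYS=${ES_KEYS:-/etc/elasticsearch/keys}
ES_URL=${ES_URL:-https://localhost:9200}

# es_admin METHOD PATH: send one request to Elasticsearch with the admin cert.
# With ES_DRY_RUN set, the curl command line is echoed instead of executed.
es_admin() {
    method=$1
    path=$2
    ${ES_DRY_RUN:+echo} curl -s \
        --key "$ES_KEYS/admin-key" \
        --cert "$ES_KEYS/admin-cert" \
        --cacert "$ES_KEYS/admin-ca" \
        -X "$method" "$ES_URL/$path"
}
```

Inside the pod, `es_admin GET '_cat/indices?v'` should list the existing indices, and `es_admin DELETE '{your_index}'` corresponds to the delete call in step 2 a.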

Comment 14 ewolinet 2016-06-15 19:03:08 UTC

Docs PR opened here:
https://github.com/openshift/openshift-docs/pull/2291

Comment 15 Jaspreet Kaur 2016-06-16 11:16:17 UTC

Thanks for the workaround. This is definitely helpful.

Additional info :

With the delete API [1], cleanup is simpler if your indices are tagged with a date:
we can delete data whose index name starts with 2016- or 2016-05-. The first deletes all data from 2016, the second all data from May 2016. Prefix matching needs a trailing * wildcard on the index name.

e.g.:
curl -XDELETE 'http://localhost:9200/logstash-2016-*'
curl -XDELETE 'http://localhost:9200/logstash-2016-05-*'

[1] https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete.html
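
When a whole-month prefix is too coarse, date-stamped index names can also be filtered against a cutoff before deleting. The sketch below assumes index names end in a YYYY-MM-DD date, as in the logstash-2016-05- examples in this comment; actual index names in a given deployment may differ, and the function name is hypothetical. It only selects names and deletes nothing itself.

```shell
# Read index names from stdin (one per line) and print those whose trailing
# YYYY-MM-DD date sorts strictly before the cutoff given as $1.
# ISO dates compare correctly as plain strings, so no date parsing is needed.
indices_older_than() {
    awk -v cutoff="$1" '{
        d = substr($0, length($0) - 9)   # last 10 characters of the name
        if (d ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/ && (d "") < (cutoff "")) print
    }'
}
```

Each name it prints could then be passed, one at a time, to a DELETE request like the ones shown above.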

Comment 22 Luke Meyer 2016-07-15 19:45:45 UTC

https://github.com/openshift/openshift-docs/pull/2475 adds doc to describe using Curator for ES index maintenance. This is available in Origin and OSE 3.2.1. I expect the docs will be updated next week.

Comment 27 errata-xmlrpc 2016-09-27 09:37:51 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1933