Bug 1327028
| Summary: | aggregated logging / elasticsearch maintenance (https://github.com/openshift/origin-aggregated-logging/issues/42) | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Miheer Salunke <misalunk> | 
| Component: | RFE | Assignee: | Luke Meyer <lmeyer> | 
| Status: | CLOSED ERRATA | QA Contact: | chunchen <chunchen> | 
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 3.1.0 | CC: | aos-bugs, erich, erjones, ewolinet, jialiu, jkaur, jokerman, lmeyer, misalunk, mmccomas, pep, vigoyal, wsun, xiazhao | 
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-09-27 09:37:51 UTC | Type: | Bug | 
| Regression: | --- | Mount Type: | --- | 
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1267746 | ||

Description    Miheer Salunke    2016-04-14 06:39:27 UTC

With 3.1.1 and 3.2.0 images (coming soon) we will be providing an update that also creates an administrative cert and ACL to allow manual maintenance of Elasticsearch. You can rsh into an ES pod and use the admin key/cert/ca from the logging-elasticsearch secret to run curl -X DELETE against indices and any other operation, manually. More explicit documentation on this is forthcoming. This is a stopgap until the Curator solution is delivered post-3.2.0, which should happen as soon as we can get it through QE and out the door along with all the other major changes that are in Origin but missed 3.2.0.

*** Bug 1337633 has been marked as a duplicate of this bug. ***

@Luke, are those 3.1 images released yet? And if so, do we have that more explicit documentation yet?

(In reply to Eric Jones from comment #5)
> @Luke, are those 3.1 images released yet?

Yes; if you redeploy with the latest 3.1.1 images, the deployer creates an admin cert in the elasticsearch secret.

> And if so, do we have that more explicit documentation yet?

Alas, no. Perhaps a project for me today. The admin cert can be used either within the ES container or, when extracted, from anywhere that can reach the pod SDN.

Thanks for the information, Luke. I have another question in the same vein: how bad for the logging pods would it be to simply delete the contents of the logging PVC? And if it would be catastrophic, is the best bet (to clean up the log storage) just to wait on how to use these new images?

If you scale down the ES deployment(s) and delete the PVC contents, everything should be fine when you scale them back up, aside from (obviously) not having any data (including losing any customizations to kibana profiles). Whether to wait depends on the severity of the situation and how much you care about keeping old logs...

Thanks again Luke, that is exactly the answer I was hoping for.

Set to verified since the logging-curator image is now available in Dockerhub for use.

There is the admin cert that is available as of 3.2 that a customer can use to manually delete old indices from Elasticsearch.
I will open up a docs PR to document how to do this.
1. Check that you have the secret entries "admin-key", "admin-cert", and "admin-ca" in the logging-elasticsearch secret:
    $ oc get secret/logging-elasticsearch -o yaml
1a. If these entries are not present, you will need to rerun the logging-deployer at version 3.2.0 or later so that it can generate these certificates and attach them to the secret. That process is described in the OpenShift docs[1].
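  As a quick check (a hedged convenience, not part of the original instructions), you can filter the secret output for the admin entries; all three should appear:
    $ oc get secret/logging-elasticsearch -o yaml | grep -E 'admin-(key|cert|ca):'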
2. Connect to an Elasticsearch pod in the cluster you are attempting to clean up:
  Find a pod:
    $ oc get pods -l component=es
    $ oc get pods -l component=es-ops
  Connect to the pod:
    $ oc rsh {your_es_pod}
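  If your oc client supports jsonpath output (an assumption for this convenience one-liner, which is not part of the original steps), the two commands can be combined to open a shell in the first matching ES pod:
    $ oc rsh $(oc get pods -l component=es -o jsonpath='{.items[0].metadata.name}')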
2a. From within the Elasticsearch pod, you can issue the following to delete an index of your choice:
    $ curl --key /etc/elasticsearch/keys/admin-key --cert /etc/elasticsearch/keys/admin-cert --cacert /etc/elasticsearch/keys/admin-ca -XDELETE "https://localhost:9200/{your_index}"
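  Before deleting anything, it can help to list the existing indices with the same admin credentials so you know the exact index names (a hedged addition; _cat/indices is a standard Elasticsearch API, not something called out in the original steps):
    $ curl --key /etc/elasticsearch/keys/admin-key --cert /etc/elasticsearch/keys/admin-cert --cacert /etc/elasticsearch/keys/admin-ca "https://localhost:9200/_cat/indices?v"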
Further documentation is available on the ES Delete[2] and Delete by Query[3] API pages.
[1] https://docs.openshift.org/latest/install_config/upgrading/manual_upgrades.html#manual-upgrading-efk-logging-stack
[2] https://www.elastic.co/guide/en/elasticsearch/reference/1.5/docs-delete.html
[3] https://www.elastic.co/guide/en/elasticsearch/reference/1.5/docs-delete-by-query.html
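As noted earlier in this bug, the admin cert can also be extracted and used from anywhere that can reach the pod SDN. A minimal sketch of that, assuming the secret entry names above and a logging-es service hostname reachable from wherever you run curl (both assumptions for illustration, not from the original steps):
    $ oc get secret/logging-elasticsearch -o yaml | awk '/^  admin-key:/ {print $2}' | base64 -d > admin-key
    $ oc get secret/logging-elasticsearch -o yaml | awk '/^  admin-cert:/ {print $2}' | base64 -d > admin-cert
    $ oc get secret/logging-elasticsearch -o yaml | awk '/^  admin-ca:/ {print $2}' | base64 -d > admin-ca
    $ curl --key admin-key --cert admin-cert --cacert admin-ca "https://logging-es:9200/_cat/indices?v"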

Docs PR opened here: https://github.com/openshift/openshift-docs/pull/2291

Thanks for the workaround. This is definitely helpful.

Additional info: with the Delete API [1], if your indices are tagged with the date, it is even simpler; you can delete everything that starts with, for example, 2016- or 2016-05-. The first deletes all data from 2016, the second all data from May 2016, e.g.:

    curl -XDELETE 'http://localhost:9200/logstash-2016-*'
    curl -XDELETE 'http://localhost:9200/logstash-2016-05-*'

[1] https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete.html

https://github.com/openshift/openshift-docs/pull/2475 adds doc to describe using Curator for ES index maintenance. This is available in Origin and OSE 3.2.1. I expect the docs will be updated next week.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1933