Description of problem:
The curator pod is always in Error status, and its logs contain the error messages shown below.

$ oc get pod
NAME                                            READY   STATUS    RESTARTS   AGE
cluster-logging-operator-54f56bccd-7ffg7        1/1     Running   0          82m
curator-1595833200-bm8xw                        0/1     Error     0          6m55s
elasticsearch-cdm-ge8f518d-1-774877997b-xkcbx   2/2     Running   0          76m
elasticsearch-cdm-ge8f518d-2-5d6d4cc4c5-h4s7k   2/2     Running   0          76m
elasticsearch-cdm-ge8f518d-3-885c875f-l4jsz     2/2     Running   0          76m
fluentd-9bhv4                                   1/1     Running   1          76m
fluentd-dtlmh                                   1/1     Running   0          76m
fluentd-f4h97                                   1/1     Running   0          76m
fluentd-kqn9g                                   1/1     Running   0          76m
fluentd-tmvfq                                   1/1     Running   0          76m
fluentd-vgwhz                                   1/1     Running   0          76m
kibana-5469f5f7d8-qsvrl                         2/2     Running   0          76m

$ oc logs curator-1595833200-bm8xw
Traceback (most recent call last):
  File "/opt/app-root/src/lib/oalconverter/convert.py", line 18, in <module>
    from ruamel import yaml
ModuleNotFoundError: No module named 'ruamel'
Usage: curator [OPTIONS] ACTION_FILE

Error: Invalid value for "action_file": Path "/opt/app-root/src/actions.yaml" does not exist.

$ oc get cm curator -oyaml
apiVersion: v1
data:
  actions.yaml: |
    # ---
    # Remember, leave a key empty if there is no value. None will be a string,
    # not a Python "NoneType"
    #
    # Also remember that all examples have 'disable_action' set to True. If you
    # want to use this action as a template, be sure to set this to False after
    # copying it.
    # actions:
    #   1:
    #     action: delete_indices
    #     description: >-
    #       Delete .operations indices older than 30 days.
    #       Ignore the error if the filter does not
    #       result in an actionable list of indices (ignore_empty_list).
    #       See https://www.elastic.co/guide/en/elasticsearch/client/curator/5.2/ex_delete_indices.html
    #     options:
    #       # Swallow curator.exception.NoIndices exception
    #       ignore_empty_list: True
    #       # In seconds, default is 300
    #       timeout_override: ${CURATOR_TIMEOUT}
    #       # Don't swallow any other exceptions
    #       continue_if_exception: False
    #       # Optionally disable action, useful for debugging
    #       disable_action: False
    #     # All filters are bound by logical AND
    #     filters:
    #     - filtertype: pattern
    #       kind: regex
    #       value: '^\.operations\..*$'
    #       exclude: False
    #     - filtertype: age
    #       # Parse timestamp from index name
    #       source: name
    #       direction: older
    #       timestring: '%Y.%m.%d'
    #       unit: days
    #       unit_count: 30
    #       exclude: False
  config.yaml: |
    # Logging example curator config file

    # uncomment and use this to override the defaults from env vars
    #.defaults:
    #  delete:
    #    days: 30

    # to keep ops logs for a different duration:
    #.operations:
    #  delete:
    #    weeks: 8

    # example for a normal project
    #myapp:
    #  delete:
    #    weeks: 1
  curator5.yaml: "---\nclient:\n hosts:\n - ${ES_HOST}\n port: ${ES_PORT}\n use_ssl: True\n certificate: ${ES_CA}\n client_cert: ${ES_CLIENT_CERT}\n client_key: ${ES_CLIENT_KEY}\n ssl_no_validate: False\n timeout: ${CURATOR_TIMEOUT}\n master_only: False\nlogging:\n loglevel: ${CURATOR_LOG_LEVEL}\n logformat: default\n blacklist: ['elasticsearch', 'urllib3']\n \n"
kind: ConfigMap
metadata:
  creationTimestamp: "2020-07-27T05:50:13Z"
  name: curator
  namespace: openshift-logging
  ownerReferences:
  - apiVersion: logging.openshift.io/v1
    controller: true
    kind: ClusterLogging
    name: instance
    uid: b62cf505-01c4-4b7c-a42d-6965c1541e83
  resourceVersion: "64551"
  selfLink: /api/v1/namespaces/openshift-logging/configmaps/curator
  uid: 5f0f826b-e156-4124-8513-d639320e4630

Version-Release number of selected component (if applicable):
ose-logging-curator5-v4.4.0-202007240028.p0

How reproducible:
Always

Steps to Reproduce:
1. Deploy logging
2. Check pod status
3.

Actual results:

Expected results:

Additional info:
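The traceback above points at the `ruamel` package being absent from the image's Python environment. As a rough sketch, assuming one can run a Python interpreter inside the curator image, the following checks whether a module can be located by the import system. The helper name `module_available` is hypothetical and is not part of the curator image or of convert.py.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the import system can locate `name` (hypothetical helper)."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # find_spec() imports parent packages first, so a dotted name such as
        # "ruamel.yaml" raises here when the parent "ruamel" is missing.
        return False


if __name__ == "__main__":
    # "json" is in the standard library and serves as a sanity check.
    for mod in ("ruamel.yaml", "json"):
        status = "found" if module_available(mod) else "MISSING"
        print(f"{mod}: {status}")
```

If `ruamel.yaml` is reported as missing, that would be consistent with the log: the conversion script stops at its import, /opt/app-root/src/actions.yaml is presumably never written, and curator then fails on the nonexistent action file.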
Moving to UpcomingSprint
No module named 'ruamel' in logging-curator5:v4.4.0-202007300614.p0 (clusterlogging.4.4.0-202007312002.p0):

$ oc logs curator-1596607200-6rhmf
Traceback (most recent call last):
  File "/opt/app-root/src/lib/oalconverter/convert.py", line 18, in <module>
    from ruamel import yaml
ModuleNotFoundError: No module named 'ruamel'
Usage: curator [OPTIONS] ACTION_FILE

Error: Invalid value for "action_file": Path "/opt/app-root/src/actions.yaml" does not exist.
No such issue in 4.5 when I used the latest 4.5 images. CSV:clusterlogging.4.5.0-202007311600.p0, image: openshift-ose-logging-curator5-v4.5.0-202007281732.p0
Hello,

Do we have a workaround for this issue? Would it be safe to downgrade to a previous version?

I'm asking because if the curator is not working, old indices are not deleted, and customers could then run into storage issues.

Regards,
Oscar
This issue is impacting 5 OpenShift Dedicated customers so far. We have similar concerns that not running curation will cause side effects with ES logging storage and overall ES health.
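To gauge which indices pile up while curation is not running, the commented-out filter in the ConfigMap from the description (regex `^\.operations\..*$`, age taken from the index name with timestring `%Y.%m.%d`, older than 30 days) can be approximated in a few lines. This is only a sketch of that selection logic and not Curator's implementation: the function name `indices_older_than` and the sample index names are hypothetical, and the age comparison is simplified to whole calendar days.

```python
import re
from datetime import date, datetime, timedelta

# Same regex as the 'pattern' filter in the ConfigMap.
OPERATIONS_PATTERN = re.compile(r"^\.operations\..*$")
# Matches a date written in the '%Y.%m.%d' timestring, e.g. 2020.07.27.
DATE_IN_NAME = re.compile(r"\d{4}\.\d{2}\.\d{2}")


def indices_older_than(names, today, days=30):
    """Return names matching both filters (logical AND), oldest-first order not guaranteed."""
    cutoff = today - timedelta(days=days)
    stale = []
    for name in names:
        if not OPERATIONS_PATTERN.match(name):
            continue  # fails the pattern filter
        found = DATE_IN_NAME.search(name)
        if found is None:
            continue  # no timestamp in the index name
        try:
            index_day = datetime.strptime(found.group(0), "%Y.%m.%d").date()
        except ValueError:
            continue  # digits matched but do not form a valid date
        if index_day < cutoff:
            stale.append(name)
    return stale


if __name__ == "__main__":
    # Hypothetical index names, for illustration only.
    sample = [
        ".operations.2020.06.01",
        ".operations.2020.07.20",
        "project.myapp.2020.06.01",
    ]
    print(indices_older_than(sample, today=date(2020, 7, 27)))
```

Feeding it the real index list from the cluster would show roughly what a 30-day `.operations` retention would have removed had the curator job succeeded.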
Verified on clusterlogging.4.4.0-202008051553.p0, openshift/ose-logging-curator5:v4.4.0-202008051553.p0
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.4.16 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:3237