Description of problem:
Recent performance measurements on an OCP 4.6.1 / s390x cluster running a cluster logging instance show increased CPU consumption of about:
- 5 CPU cores for Fluentd across 6 nodes (3 masters + 3 workers)
- 1 CPU core for Elasticsearch across 3 worker nodes

Version-Release number of selected component (if applicable):
# oc version
Client Version: 4.6.0-rc.4
Server Version: 4.6.1
Kubernetes Version: v1.19.0+d59ce34

How reproducible:
Every time

Steps to Reproduce:
1. Install an OCP 4.6.1 cluster on s390x.
2. Install the Elasticsearch, Cluster Logging, and Local Storage operators from the console.
3. Make local PVs available using LSO.
4. Deploy the cluster logging instance to the cluster (definition below).
5. Measure the CPU consumption of the Fluentd and Elasticsearch processes using the top command on each of the cluster nodes.

The logging instance definition is as follows:

    apiVersion: "logging.openshift.io/v1"
    kind: "ClusterLogging"
    metadata:
      name: "instance"
      namespace: "openshift-logging"
    spec:
      managementState: "Managed"
      logStore:
        type: "elasticsearch"
        elasticsearch:
          nodeCount: 3
          storage:
            storageClassName: "local-sc"
            size: 7043Mi
          redundancyPolicy: "ZeroRedundancy"
          resources:
            request:
              memory: 2Gi
      visualization:
        type: "kibana"
        kibana:
          replicas: 1
      curation:
        type: "curator"
        curator:
          schedule: "30 3 * * *"
      collection:
        logs:
          type: "fluentd"
          fluentd: {}

Actual results:
Increased CPU consumption as described above.

Expected results:
- What would be the reason for the increased CPU consumption by the Fluentd pods? Is there a way to reduce it?
- What would be the recommended CPU requests/limits for the Elasticsearch and Fluentd pods?
- Are there any performance reports or profiling results available for the cluster logging components (Fluentd, Elasticsearch)?

Additional info:
The cluster's resource spec is:
- master nodes: 4 CPU / 16G
- worker nodes 01, 02: 10 CPU / 32G (memory increased as needed for the ES pods)
- worker node 03: 4 CPU / 16G
In the logging instance definition, the Fluentd and Elasticsearch pods do not have any requests or limits specified. Because the performance measurements were taken with an internally developed tool, the results are not publicly visible. Please let me know which other logs would be of interest here, and I can provide them.
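For reference, requests and limits for both components could be set in the ClusterLogging instance roughly as sketched here. This is only an illustration of where the fields go: every CPU and memory value is a placeholder chosen for the example, not a measured or recommended setting, and the sketch uses the plural `requests` key that Kubernetes resource specs expect.

```yaml
# Sketch only: all CPU/memory values below are placeholders, not recommendations.
apiVersion: "logging.openshift.io/v1"
kind: "ClusterLogging"
metadata:
  name: "instance"
  namespace: "openshift-logging"
spec:
  managementState: "Managed"
  logStore:
    type: "elasticsearch"
    elasticsearch:
      nodeCount: 3
      resources:
        requests:
          cpu: "1"        # placeholder
          memory: 2Gi     # same memory figure as in the definition above
        limits:
          memory: 2Gi     # placeholder
  collection:
    logs:
      type: "fluentd"
      fluentd:
        resources:
          requests:
            cpu: 200m     # placeholder
            memory: 736Mi # placeholder
          limits:
            memory: 736Mi # placeholder
```

Storage, visualization, and curation settings are left out of the sketch for brevity and would stay as in the definition above.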
We expect this behavior to occur on x86 as well. Could someone please check this on x86? Thank you.
The fix for bug https://bugzilla.redhat.com/show_bug.cgi?id=1895385 was followed up and verified on OCP 4.6.0-0.nightly-s390x-2021-01-18-070324; closing this bug accordingly.