Description of problem:
1. deploy EO and CLO
2. create CL instance, request more resources than the cluster has (e.g. memory; a sketch of the CL spec used is shown right after this list)
3. ES pods are `Pending`, check the ES instance, it doesn't show why the pods are pending:

managementState: Managed
nodeSpec:
  proxyResources:
    limits:
      memory: 64Mi
    requests:
      cpu: 100m
      memory: 64Mi
  resources:
    requests:
      memory: 16Gi
nodes:
- genUUID: 5bx1e4fy
  nodeCount: 3
  proxyResources: {}
  resources: {}
  roles:
  - client
  - data
  - master
  storage:
    size: 20Gi
    storageClassName: gp2
redundancyPolicy: SingleRedundancy
status:
  cluster:
    activePrimaryShards: 0
    activeShards: 0
    initializingShards: 0
    numDataNodes: 0
    numNodes: 0
    pendingTasks: 0
    relocatingShards: 0
    status: cluster health unknown
    unassignedShards: 0
  nodes:
  - deploymentName: elasticsearch-cdm-5bx1e4fy-1
    upgradeStatus: {}
  - deploymentName: elasticsearch-cdm-5bx1e4fy-2
    upgradeStatus: {}
  - deploymentName: elasticsearch-cdm-5bx1e4fy-3
    upgradeStatus: {}
  pods:
    client:
      failed: []
      notReady:
      - elasticsearch-cdm-5bx1e4fy-3-6f7d45fd7-fmwlf
      - elasticsearch-cdm-5bx1e4fy-1-79f4cddd4f-8dc7b
      - elasticsearch-cdm-5bx1e4fy-2-8c55476bd-dj56j
      ready: []
    data:
      failed: []
      notReady:
      - elasticsearch-cdm-5bx1e4fy-3-6f7d45fd7-fmwlf
      - elasticsearch-cdm-5bx1e4fy-1-79f4cddd4f-8dc7b
      - elasticsearch-cdm-5bx1e4fy-2-8c55476bd-dj56j
      ready: []
    master:
      failed: []
      notReady:
      - elasticsearch-cdm-5bx1e4fy-1-79f4cddd4f-8dc7b
      - elasticsearch-cdm-5bx1e4fy-2-8c55476bd-dj56j
      - elasticsearch-cdm-5bx1e4fy-3-6f7d45fd7-fmwlf
      ready: []
  shardAllocationEnabled: shard allocation unknown

$ oc get pod
NAME                                            READY   STATUS    RESTARTS   AGE
cluster-logging-operator-5bf4bc5d44-4tpqm       1/1     Running   0          11m
elasticsearch-cdm-5bx1e4fy-1-79f4cddd4f-8dc7b   0/2     Pending   0          10m
elasticsearch-cdm-5bx1e4fy-2-8c55476bd-dj56j    0/2     Pending   0          10m
elasticsearch-cdm-5bx1e4fy-3-6f7d45fd7-fmwlf    0/2     Pending   0          10m

4. adjust the request memory in the CL instance to make the ES pods schedulable
5. wait a few minutes, check the ES pods: all the pods are still `Pending`. The request memory is changed in the ES CR instance, but the request memory in the ES deployments isn't.
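For context on how the memory request in steps 2 and 4 was set: it goes under spec.logStore.elasticsearch.resources in the ClusterLogging instance. A minimal sketch, assuming the usual `instance` CR in openshift-logging and trimming the collection/visualization stanzas (values taken from the ES CR dumps in this report):

apiVersion: logging.openshift.io/v1
kind: ClusterLogging
metadata:
  name: instance
  namespace: openshift-logging
spec:
  managementState: Managed
  logStore:
    type: elasticsearch
    elasticsearch:
      nodeCount: 3
      redundancyPolicy: SingleRedundancy
      storage:
        storageClassName: gp2
        size: 20Gi
      resources:
        requests:
          memory: 16Gi   # step 2: more than any node can offer; lowered to 2Gi in step 4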
The ES CR after adjusting the memory request:

managementState: Managed
nodeSpec:
  proxyResources:
    limits:
      memory: 64Mi
    requests:
      cpu: 100m
      memory: 64Mi
  resources:
    requests:
      memory: 2Gi
nodes:
- genUUID: 5bx1e4fy
  nodeCount: 3
  proxyResources: {}
  resources: {}
  roles:
  - client
  - data
  - master
  storage:
    size: 20Gi
    storageClassName: gp2
redundancyPolicy: SingleRedundancy
status:
  cluster:
    activePrimaryShards: 0
    activeShards: 0
    initializingShards: 0
    numDataNodes: 0
    numNodes: 0
    pendingTasks: 0
    relocatingShards: 0
    status: cluster health unknown
    unassignedShards: 0
  nodes:
  - deploymentName: elasticsearch-cdm-5bx1e4fy-1
    upgradeStatus: {}
  - deploymentName: elasticsearch-cdm-5bx1e4fy-2
    upgradeStatus: {}
  - deploymentName: elasticsearch-cdm-5bx1e4fy-3
    upgradeStatus: {}
  pods:
    client:
      failed: []
      notReady:
      - elasticsearch-cdm-5bx1e4fy-3-6f7d45fd7-fmwlf
      - elasticsearch-cdm-5bx1e4fy-1-79f4cddd4f-8dc7b
      - elasticsearch-cdm-5bx1e4fy-2-8c55476bd-dj56j
      ready: []
    data:
      failed: []
      notReady:
      - elasticsearch-cdm-5bx1e4fy-3-6f7d45fd7-fmwlf
      - elasticsearch-cdm-5bx1e4fy-1-79f4cddd4f-8dc7b
      - elasticsearch-cdm-5bx1e4fy-2-8c55476bd-dj56j
      ready: []
    master:
      failed: []
      notReady:
      - elasticsearch-cdm-5bx1e4fy-1-79f4cddd4f-8dc7b
      - elasticsearch-cdm-5bx1e4fy-2-8c55476bd-dj56j
      - elasticsearch-cdm-5bx1e4fy-3-6f7d45fd7-fmwlf
      ready: []
  shardAllocationEnabled: shard allocation unknown

EO log:
{"level":"error","ts":1602489488.6414049,"logger":"elasticsearch-operator","caller":"k8shandler/reconciler.go:65","msg":"failed to get LowestClusterVersion","cluster":"elasticsearch","namespace":"openshift-logging","error":"Get \"https://elasticsearch.openshift-logging.svc:9200/_cluster/stats/nodes/_all\": dial tcp 172.30.58.115:9200: connect: connection refused"}

Version-Release number of selected component (if applicable):
elasticsearch-operator.4.6.0-202010030042.p0

How reproducible:
Always

Steps to Reproduce:
1. Deploy EO and CLO.
2. Create a CL instance whose logStore requests more memory than any node has; the ES pods stay Pending and the ES CR status does not say why.
3. Lower the memory request in the CL instance; the ES CR picks up the new value, but the ES deployments do not, so the pods stay Pending.

Actual results:
The ES CR status does not surface the scheduling failure, and after lowering the request the generated ES deployments keep the old memory request, so the pods remain Pending.

Expected results:
The ES CR status reports why the pods cannot be scheduled, and the deployments are updated with the new resource request so the pods can be scheduled.

Additional info:
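A quick way to confirm the step 5 mismatch and the EO connection error, as a sketch (the `elasticsearch` container name inside the generated deployment is an assumption here; substitute the deployment names from the status above):

# new request as recorded in the ES CR (2Gi per the dump above)
$ oc -n openshift-logging get elasticsearch elasticsearch \
    -o jsonpath='{.spec.nodeSpec.resources.requests.memory}'

# request actually set on a generated deployment; per step 5 this still shows the old value
$ oc -n openshift-logging get deployment elasticsearch-cdm-5bx1e4fy-1 \
    -o jsonpath='{.spec.template.spec.containers[?(@.name=="elasticsearch")].resources.requests.memory}'

# the "connection refused" in the EO log is consistent with the service having no ready endpoints
$ oc -n openshift-logging get endpoints elasticsearch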
@qitang Was the reason the pods were pending initially due to the memory request being too large?
(In reply to ewolinet from comment #1)
> @qitang
>
> Was the reason the pods were pending initially due to the memory request
> being too large?

Yes, when I described the ES pods, the output was:

Events:
  Type     Reason            Age   From   Message
  ----     ------            ----  ----   -------
  Warning  FailedScheduling  26s          0/6 nodes are available: 6 Insufficient memory.
  Warning  FailedScheduling  26s          0/6 nodes are available: 6 Insufficient memory.
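For the record, the events above come from describing one of the pending pods, along the lines of (pod name taken from the listing in the description):

$ oc -n openshift-logging describe pod elasticsearch-cdm-5bx1e4fy-1-79f4cddd4f-8dc7b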
Tested with quay.io/openshift/origin-elasticsearch-operator@sha256:c59349755eeefe446a5c39a2caf9dce1320a462530e8ac7b9f73fa38bc10a468: the status could be updated and the ES pods could be redeployed.

nodes:
- conditions:
  - lastTransitionTime: "2020-10-14T00:41:23Z"
    message: '0/6 nodes are available: 6 Insufficient memory.'
    reason: Unschedulable
    status: "True"
    type: Unschedulable
  deploymentName: elasticsearch-cdm-n4txturr-1
  upgradeStatus:
    scheduledUpgrade: "True"
    underUpgrade: "True"
    upgradePhase: nodeRestarting
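With that image, the new Unschedulable condition can be read straight off the ES CR status; a sketch of one way to list it (jsonpath limited to the fields shown above):

$ oc -n openshift-logging get elasticsearch elasticsearch \
    -o jsonpath='{range .status.nodes[*]}{.deploymentName}{"\t"}{.conditions[*].type}{"\n"}{end}'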
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Errata Advisory for Openshift Logging 5.0.0), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0652