Bug 1578605
| Summary: | [free-int] timeout waiting for elastic search pods to be !red | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Justin Pierce <jupierce> |
| Component: | Logging | Assignee: | ewolinet |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Anping Li <anli> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.10.0 | CC: | aos-bugs, jcantril, pruan, rmeggins, xtian |
| Target Milestone: | --- | | |
| Target Release: | 3.10.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | Cause: As part of a change in how our handlers restart clusters, the handler now always checks that the pods are running. <br> Consequence: A cluster that requires more than one node never becomes ready during a scale-up, because each new pod is waiting for the other members to join. <br> Fix: When scaling up, we no longer wait for each individual pod to become ready, so the cluster can reach the correct number of members. <br> Result: The pods are able to become ready. | | |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-12-20 21:12:46 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1581058 | | |
| Bug Blocks: | | | |
| Attachments: | 1445323: The ansible logs for logging upgrade | | |
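The Doc Text above describes why waiting on each pod individually deadlocks during a scale-up and how the restart handler was changed. The sketch below is only a loose illustration of that logic under assumed names; it is not the actual openshift-ansible handler code, and every function in it is a placeholder stub.

```python
# Illustrative sketch of the behavior change described in the Doc Text above.
# All function names and the restart flow are assumptions for illustration.
import time

def pod_is_ready(node: str) -> bool:
    # Stub: in practice this would query the pod's readiness via the API server.
    return True

def cluster_node_count() -> int:
    # Stub: in practice this would read number_of_nodes from _cluster/health.
    return 3

def restart_node(node: str) -> None:
    # Stub: in practice this would trigger a rollout of one ES deployment.
    print(f"restarting {node}")

def rolling_restart(nodes: list, scaling_up: bool, expected_nodes: int) -> None:
    for node in nodes:
        restart_node(node)
        if not scaling_up:
            # Ordinary rolling restart: wait for each pod before touching the next.
            while not pod_is_ready(node):
                time.sleep(5)
        # During a scale-up, a single new pod stays not-ready until the other
        # members have joined, so a per-pod wait would never complete.
    # Instead, readiness is checked at the cluster level once all pods exist.
    while cluster_node_count() < expected_nodes:
        time.sleep(5)

rolling_restart(["logging-es-0", "logging-es-1", "logging-es-2"],
                scaling_up=True, expected_nodes=3)
```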
Description
Justin Pierce
2018-05-16 00:45:51 UTC
@ewolinet, Thanks. The ES status is red! Did the ES pod restart change one ES node's shard allocation status to enabled while the other nodes still have it disabled?

@Anping, One of the changes we are making is to disable shard allocation before the rollout of a node and re-enable it after the node is available, but prior to waiting for the cluster to return to 'green'. The issue we are seeing is that when a new index is created while shard allocation is set to 'none', none of the shards for that index can be placed, which automatically puts the cluster into a 'red' state. This change should allow the cluster to return to 'green' between restarts.

Shall we use a persistent setting? I think the transient setting may be cleared during an ES restart.

Upgraded from v3.9 to v3.10 via openshift-ansible-3.10.0-0.53.0. The ES cluster was not restarted. The playbook reports: "Cluster logging-es was not in an optimal state and will not be automatically restarted. Please see documentation regarding doing a rolling cluster restart."

Created attachment 1445323 [details]
The ansible logs for logging upgrade
The cluster_pods.stdout_lines value is 1; it should be 3. All ansible logs are attached.
RUNNING HANDLER [openshift_logging_elasticsearch : debug] *********************************************************************************************************************************************************
ok: [qe-anli310master-etcd-1.0529-l0l.qe.rhcloud.com] => {
"msg": "Cluster logging-es was not in an optimal state and will not be automatically restarted. Please see documentation regarding doing a rolling cluster restart."
}
RUNNING HANDLER [openshift_logging_elasticsearch : debug] *********************************************************************************************************************************************************
ok: [qe-anli310master-etcd-1.0529-l0l.qe.rhcloud.com] => {
"msg": "pod status is green, number_of_nodes is 3, cluster_pods.stdout_lines is 1"
}
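For reference, the shard-allocation toggle and the transient-versus-persistent question discussed above map onto the Elasticsearch cluster-settings API. The sketch below is a minimal illustration only: the endpoint URL, TLS handling, and the helper name are placeholders and do not reflect how the playbook actually calls Elasticsearch.

```python
# Minimal sketch of toggling shard allocation via the ES cluster-settings API.
# ES_URL is a placeholder; authentication/TLS details are omitted.
import json
import urllib.request

ES_URL = "https://logging-es:9200"  # placeholder endpoint

def set_shard_allocation(value: str, persistent: bool = False) -> None:
    # "transient" settings do not survive a full cluster restart (the concern
    # raised above); "persistent" settings do.
    scope = "persistent" if persistent else "transient"
    body = json.dumps({scope: {"cluster.routing.allocation.enable": value}}).encode()
    req = urllib.request.Request(
        f"{ES_URL}/_cluster/settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    urllib.request.urlopen(req)

# Before rolling a node: stop shard movement.
# set_shard_allocation("none")
# After the node is back: re-enable allocation so new indices can place their
# shards and the cluster can return to green.
# set_shard_allocation("all")
```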
I saw red indices during v3.9 testing today. While I was redeploying logging, an automation script was creating and deleting projects; some project indices became red, and the .operations and .orphaned indices became red as well. Not sure if that is the same issue, so I am just leaving a note here.

The upgrade works well with 3.10.0-0.60.0.