Bug 1738758
| Summary: | the logging-es-ops nodeSelector are changed after upgrade | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Anping Li <anli> |
| Component: | Logging | Assignee: | Noriko Hosoi <nhosoi> |
| Status: | CLOSED DEFERRED | QA Contact: | Anping Li <anli> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.11.0 | CC: | aos-bugs, jcantril, nhosoi, rmeggins |
| Target Milestone: | --- | | |
| Target Release: | 3.11.z | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-02-02 01:32:09 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Created attachment 1601703 [details]
DeploymentConfigs before upgrade
Created attachment 1601704 [details]
DeploymentConfigs after upgrade
There is a time span between the deployment configuration being changed and the logging-es-ops DeploymentConfig rollout. Workaround: correct the nodeSelector in the logging-es-ops DeploymentConfigs before the rollout; you can correct them at the point when logging-es is restarting (one way to do this is sketched below).
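A minimal sketch of that workaround, assuming the DC names and the original es-ops selector values shown in the description below; run it in the logging project before the changed es-ops DCs roll out:

```sh
# Sketch only: put the original es-ops selectors back before the changed DCs are rolled out.
# DC names and label values are taken from this report; adjust them for your cluster.
oc patch dc/logging-es-ops-data-master-0fr84k1a \
  -p '{"spec":{"template":{"spec":{"nodeSelector":{"logging-es-ops-node":"2"}}}}}'
oc patch dc/logging-es-ops-data-master-9961o92h \
  -p '{"spec":{"template":{"spec":{"nodeSelector":{"logging-es-ops-node":"0"}}}}}'
oc patch dc/logging-es-ops-data-master-o7nhcbo4 \
  -p '{"spec":{"template":{"spec":{"nodeSelector":{"logging-es-ops-node":"1"}}}}}'
```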
Hi Anping,
Sorry for my ignorance, but I'd like to learn a couple more things...
1) Could you share these outputs? (One way to collect the ES log and the events is sketched after this list.)
- the ansible log from the upgrade
- ES log when it fails to start
- oc get events | grep Warning
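A possible way to gather the second and third items; the container and project names are assumptions based on a default 3.11 logging install:

```sh
# Sketch: collect the ES log and warning events from the logging project.
oc project openshift-logging                                      # or "logging", depending on the install
oc logs dc/logging-es-ops-data-master-0fr84k1a -c elasticsearch   # repeat for each failing DC
oc get events | grep Warning
```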
2) If you label the nodes and set the nodeSelector like this from the beginning, do the logging-es pods fail to start?
The logging-es-ops pods couldn't be started, because the nodeSelector was changed after the upgrade:
cat elasticsearch-dc-after.json | jq '.items[].metadata.name, .items[].spec.template.spec.nodeSelector'
"logging-es-data-master-ajbqhp8h"
"logging-es-data-master-telafmeq"
"logging-es-ops-data-master-0fr84k1a"
"logging-es-ops-data-master-9961o92h"
"logging-es-ops-data-master-o7nhcbo4"
{
"logging-es-node": "1"
}
{
"logging-es-node": "0"
}
{
"logging-es-node": "2"
}
{
"logging-es-node": "0"
}
{
"logging-es-node": "1"
}
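To make the before/after comparison easier to read, the DC names and their selectors can be printed side by side in one jq pass (a sketch using the same file as above):

```sh
# Sketch: pair each DC name with its nodeSelector on a single line.
jq -r '.items[] | "\(.metadata.name)\t\(.spec.template.spec.nodeSelector)"' elasticsearch-dc-after.json
```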
Sorry for the delay. I will provide the logs in the next round of 3.11 testing.

Closing DEFERRED. Please reopen if the problem persists and there are open customer cases.
Description of problem:
The nodeSelector values in the logging-es-ops DeploymentConfigs are changed after the upgrade, so the logging-es-ops pods couldn't be started.

1) nodeSelector before upgrade:

cat elasticsearch-dc-before-upgrade.json | jq '.items[].metadata.name, .items[].spec.template.spec.nodeSelector'
"logging-es-data-master-ajbqhp8h"
"logging-es-data-master-telafmeq"
"logging-es-ops-data-master-0fr84k1a"
"logging-es-ops-data-master-9961o92h"
"logging-es-ops-data-master-o7nhcbo4"
{
  "logging-es-node": "1"
}
{
  "logging-es-node": "0"
}
{
  "logging-es-ops-node": "2"
}
{
  "logging-es-ops-node": "0"
}
{
  "logging-es-ops-node": "1"
}

2) Logging inventory used for the upgrade:

openshift_logging_install_logging=true
openshift_logging_es_cluster_size=2
openshift_logging_es_number_of_replicas=1
openshift_logging_es_number_of_shards=1
openshift_logging_es_memory_limit=2Gi
openshift_logging_es_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_use_ops=true
openshift_logging_es_ops_cluster_size=3
openshift_logging_es_ops_number_of_replicas=1
openshift_logging_es_ops_number_of_shards=1
openshift_logging_es_ops_memory_limit=2Gi
openshift_logging_es_ops_nodeselector={"node-role.kubernetes.io/compute": "true"}
openshift_logging_elasticsearch_storage_type=hostmount

3) nodeSelector after upgrade:

cat elasticsearch-dc-after.json | jq '.items[].metadata.name, .items[].spec.template.spec.nodeSelector'
"logging-es-data-master-ajbqhp8h"
"logging-es-data-master-telafmeq"
"logging-es-ops-data-master-0fr84k1a"
"logging-es-ops-data-master-9961o92h"
"logging-es-ops-data-master-o7nhcbo4"
{
  "logging-es-node": "1"
}
{
  "logging-es-node": "0"
}
{
  "logging-es-node": "2"
}
{
  "logging-es-node": "0"
}
{
  "logging-es-node": "1"
}

Version-Release number of selected component (if applicable):
openshift3/ose-ansible:v3.11.135

How reproducible:
Always

Steps to Reproduce:
1. Deploy logging using:
openshift_logging_install_logging=true
openshift_logging_es_cluster_size=2
openshift_logging_es_number_of_replicas=1
openshift_logging_es_number_of_shards=1
openshift_logging_es_memory_limit=2Gi
openshift_logging_es_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_use_ops=true
openshift_logging_es_ops_cluster_size=3
openshift_logging_es_ops_number_of_replicas=1
openshift_logging_es_ops_number_of_shards=1
openshift_logging_es_ops_memory_limit=2Gi
openshift_logging_es_ops_nodeselector={"node-role.kubernetes.io/compute": "true"}

2. Add a hostPath volume and a nodeSelector to the ES and ES-Ops DeploymentConfigs (see the sketch at the end of this report), so that they look like:
"logging-es-data-master-ajbqhp8h"
"logging-es-data-master-telafmeq"
"logging-es-ops-data-master-0fr84k1a"
"logging-es-ops-data-master-9961o92h"
"logging-es-ops-data-master-o7nhcbo4"
{
  "logging-es-node": "1"
}
{
  "logging-es-node": "0"
}
{
  "logging-es-ops-node": "2"
}
{
  "logging-es-ops-node": "0"
}
{
  "logging-es-ops-node": "1"
}

3. Upgrade to the latest version using openshift3/ose-ansible:v3.11.135.

Actual results:
The logging-es-ops pods couldn't be started, because the nodeSelector was changed after the upgrade:

cat elasticsearch-dc-after.json | jq '.items[].metadata.name, .items[].spec.template.spec.nodeSelector'
"logging-es-data-master-ajbqhp8h"
"logging-es-data-master-telafmeq"
"logging-es-ops-data-master-0fr84k1a"
"logging-es-ops-data-master-9961o92h"
"logging-es-ops-data-master-o7nhcbo4"
{
  "logging-es-node": "1"
}
{
  "logging-es-node": "0"
}
{
  "logging-es-node": "2"
}
{
  "logging-es-node": "0"
}
{
  "logging-es-node": "1"
}
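For step 2 of the reproduction above, a minimal sketch of how the hostPath volume and nodeSelector might be added to one of the es-ops DCs. The node name, host path, and volume name are assumptions (the volume name is the default used by the 3.11 logging ES DCs); repeat with the appropriate label value for each DC and run it in the logging project:

```sh
# Sketch only: label a node and pin one es-ops DC to it (values are illustrative).
oc label node infra-node-1.example.com logging-es-ops-node=0        # hypothetical node name
oc set volume dc/logging-es-ops-data-master-9961o92h \
  --add --overwrite --name=elasticsearch-storage \
  --type=hostPath --path=/var/lib/elasticsearch                      # hypothetical host path; volume name assumed
oc patch dc/logging-es-ops-data-master-9961o92h \
  -p '{"spec":{"template":{"spec":{"nodeSelector":{"logging-es-ops-node":"0"}}}}}'
```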