Description of problem:
The nodeSelector values in the logging-es-ops deploymentconfigs are changed after upgrade, so the logging-es-ops pods cannot be started.

1) nodeSelector before upgrade:
cat elasticsearch-dc-before-upgrade.json | jq '.items[].metadata.name, .items[].spec.template.spec.nodeSelector'
"logging-es-data-master-ajbqhp8h"
"logging-es-data-master-telafmeq"
"logging-es-ops-data-master-0fr84k1a"
"logging-es-ops-data-master-9961o92h"
"logging-es-ops-data-master-o7nhcbo4"
{
  "logging-es-node": "1"
}
{
  "logging-es-node": "0"
}
{
  "logging-es-ops-node": "2"
}
{
  "logging-es-ops-node": "0"
}
{
  "logging-es-ops-node": "1"
}

2) Logging inventory used for the upgrade:
openshift_logging_install_logging=true
openshift_logging_es_cluster_size=2
openshift_logging_es_number_of_replicas=1
openshift_logging_es_number_of_shards=1
openshift_logging_es_memory_limit=2Gi
openshift_logging_es_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_use_ops=true
openshift_logging_es_ops_cluster_size=3
openshift_logging_es_ops_number_of_replicas=1
openshift_logging_es_ops_number_of_shards=1
openshift_logging_es_ops_memory_limit=2Gi
openshift_logging_es_ops_nodeselector={"node-role.kubernetes.io/compute": "true"}
openshift_logging_elasticsearch_storage_type=hostmount

3) nodeSelector after upgrade:
cat elasticsearch-dc-after.json | jq '.items[].metadata.name, .items[].spec.template.spec.nodeSelector'
"logging-es-data-master-ajbqhp8h"
"logging-es-data-master-telafmeq"
"logging-es-ops-data-master-0fr84k1a"
"logging-es-ops-data-master-9961o92h"
"logging-es-ops-data-master-o7nhcbo4"
{
  "logging-es-node": "1"
}
{
  "logging-es-node": "0"
}
{
  "logging-es-node": "2"
}
{
  "logging-es-node": "0"
}
{
  "logging-es-node": "1"
}

Version-Release number of selected component (if applicable):
openshift3/ose-ansible:v3.11.135

How reproducible:
Always

Steps to Reproduce:
1.
Deploy logging with:
openshift_logging_install_logging=true
openshift_logging_es_cluster_size=2
openshift_logging_es_number_of_replicas=1
openshift_logging_es_number_of_shards=1
openshift_logging_es_memory_limit=2Gi
openshift_logging_es_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_use_ops=true
openshift_logging_es_ops_cluster_size=3
openshift_logging_es_ops_number_of_replicas=1
openshift_logging_es_ops_number_of_shards=1
openshift_logging_es_ops_memory_limit=2Gi
openshift_logging_es_ops_nodeselector={"node-role.kubernetes.io/compute": "true"}
2. Add a hostpath volume and a nodeSelector to each ES and ES-Ops deploymentconfig:
"logging-es-data-master-ajbqhp8h"
"logging-es-data-master-telafmeq"
"logging-es-ops-data-master-0fr84k1a"
"logging-es-ops-data-master-9961o92h"
"logging-es-ops-data-master-o7nhcbo4"
{
  "logging-es-node": "1"
}
{
  "logging-es-node": "0"
}
{
  "logging-es-ops-node": "2"
}
{
  "logging-es-ops-node": "0"
}
{
  "logging-es-ops-node": "1"
}
3. Upgrade to the latest version using openshift3/ose-ansible:v3.11.135.

Actual results:
The logging-es-ops pods cannot be started, because the nodeSelector values are changed after the upgrade:
cat elasticsearch-dc-after.json | jq '.items[].metadata.name, .items[].spec.template.spec.nodeSelector'
"logging-es-data-master-ajbqhp8h"
"logging-es-data-master-telafmeq"
"logging-es-ops-data-master-0fr84k1a"
"logging-es-ops-data-master-9961o92h"
"logging-es-ops-data-master-o7nhcbo4"
{
  "logging-es-node": "1"
}
{
  "logging-es-node": "0"
}
{
  "logging-es-node": "2"
}
{
  "logging-es-node": "0"
}
{
  "logging-es-node": "1"
}
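The broken state above can be spotted mechanically: after the upgrade, every ops deploymentconfig has lost its "logging-es-ops-node" selector key. A minimal jq sketch (the inline sample JSON is illustrative, standing in for the real elasticsearch-dc-after.json dump) lists every logging-es-ops DC whose nodeSelector no longer carries that key:

```shell
# Sample dump: one ops DC with the wrong (plain "logging-es-node") key,
# one non-ops DC that is correct as-is.
cat > /tmp/es-dcs.json <<'EOF'
{"items":[
  {"metadata":{"name":"logging-es-data-master-ajbqhp8h"},
   "spec":{"template":{"spec":{"nodeSelector":{"logging-es-node":"1"}}}}},
  {"metadata":{"name":"logging-es-ops-data-master-0fr84k1a"},
   "spec":{"template":{"spec":{"nodeSelector":{"logging-es-node":"2"}}}}}
]}
EOF
# Print the name of each ops DC missing the "logging-es-ops-node" key.
jq -r '.items[]
       | select(.metadata.name | startswith("logging-es-ops"))
       | select(.spec.template.spec.nodeSelector
                | has("logging-es-ops-node") | not)
       | .metadata.name' /tmp/es-dcs.json
```

Run against the real dump, this should print the three logging-es-ops-data-master-* DCs shown in the "after upgrade" output.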
Created attachment 1601703 [details] Deploymentconfig before upgrade
Created attachment 1601704 [details] Deploymentconfig after upgrade
There is a time window between when the deploymentconfigs are changed and when the logging-es-ops deploymentconfigs are rolled out. Workaround: correct the nodeSelector in the logging-es-ops deploymentconfigs before the rollout; you can correct them at the point logging-es is restarting.
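The workaround above could be scripted roughly as follows. This is a sketch, not a verified fix: the DC names and node numbers come from this report, the "openshift-logging" project name is an assumption (adjust to your logging namespace), and the leading "echo" is left in so the loop only prints the oc commands; remove it to actually apply the patches.

```shell
# For each ops DC, restore the correct "logging-es-ops-node" key and
# drop the wrong "logging-es-node" key (a null value in a JSON merge
# patch deletes that key).
for dc in logging-es-ops-data-master-0fr84k1a:2 \
          logging-es-ops-data-master-9961o92h:0 \
          logging-es-ops-data-master-o7nhcbo4:1; do
  name=${dc%%:*}   # DC name before the colon
  node=${dc##*:}   # node number after the colon
  echo oc -n openshift-logging patch dc/"$name" --type=merge \
    -p "{\"spec\":{\"template\":{\"spec\":{\"nodeSelector\":{\"logging-es-node\":null,\"logging-es-ops-node\":\"$node\"}}}}}"
done
```

The patches have to land before the rollout, so in practice this would be run during the window while logging-es is restarting, as described above.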
Hi Anping,

Sorry for my ignorance, but I'd like to learn a couple more things...

1) Could you share these outputs?
- the ansible log from the upgrade
- the ES log when it fails to start
- oc get events | grep Warning

2) If you label the nodes and set the nodeSelector like this from the beginning, do the logging-es pods fail to start?

> The logging-es-ops pod couldn't be started, as the nodeSelector are changed after upgrade
> cat elasticsearch-dc-after.json | jq '.items[].metadata.name, .items[].spec.template.spec.nodeSelector'
> "logging-es-data-master-ajbqhp8h"
> "logging-es-data-master-telafmeq"
> "logging-es-ops-data-master-0fr84k1a"
> "logging-es-ops-data-master-9961o92h"
> "logging-es-ops-data-master-o7nhcbo4"
> { "logging-es-node": "1" }
> { "logging-es-node": "0" }
> { "logging-es-node": "2" }
> { "logging-es-node": "0" }
> { "logging-es-node": "1" }
Sorry for the delay. I will provide the logs during the next round of 3.11 testing.
Closing DEFERRED. Please reopen if the problem persists and there are open customer cases.