Created attachment 1290419 [details]
fluentd log

Description of problem:
Deploy logging on GCE with 10 nodes; 5 fluentd pods are stuck in MatchNodeSelector status and cannot start up, and "Pod Predicate MatchNodeSelector failed" appears in the fluentd pod log.

# oc get node
NAME                             STATUS                     AGE
upg0620-master-etcd-1            Ready,SchedulingDisabled   1d
upg0620-master-etcd-2            Ready,SchedulingDisabled   1d
upg0620-master-etcd-3            Ready,SchedulingDisabled   1d
upg0620-node-primary-1           Ready                      1d
upg0620-node-primary-2           Ready                      1d
upg0620-node-primary-3           Ready                      1d
upg0620-node-primary-4           Ready                      1d
upg0620-node-primary-5           Ready                      1d
upg0620-node-registry-router-1   Ready                      1d
upg0620-node-registry-router-2   Ready                      1d

# oc get po -n logging -o wide
NAME                              READY     STATUS              RESTARTS   AGE       IP           NODE
logging-curator-1-t0lgf           1/1       Running             0          26m       10.2.18.40   upg0620-node-primary-4
logging-curator-ops-1-c70lk       1/1       Running             0          26m       10.2.16.29   upg0620-node-primary-5
logging-es-euacbkmo-1-7jfbl       1/1       Running             0          26m       10.2.10.54   upg0620-node-primary-1
logging-es-ops-ouquca0p-1-vpqq5   1/1       Running             0          26m       10.2.8.31    upg0620-node-primary-3
logging-fluentd-0j05t             0/1       MatchNodeSelector   0          27m       <none>       upg0620-master-etcd-1
logging-fluentd-1l12k             1/1       Running             0          27m       10.2.10.53   upg0620-node-primary-1
logging-fluentd-1vjrp             0/1       MatchNodeSelector   0          26m       <none>       upg0620-node-registry-router-2
logging-fluentd-28vk4             0/1       MatchNodeSelector   0          26m       <none>       upg0620-node-registry-router-1
logging-fluentd-3vn58             0/1       MatchNodeSelector   0          27m       <none>       upg0620-master-etcd-3
logging-fluentd-dmz9b             0/1       MatchNodeSelector   0          27m       <none>       upg0620-master-etcd-2
logging-fluentd-nfz79             1/1       Running             0          26m       10.2.18.38   upg0620-node-primary-4
logging-fluentd-qx3k5             1/1       Running             0          27m       10.2.12.38   upg0620-node-primary-2
logging-fluentd-scslf             1/1       Running             0          26m       10.2.16.28   upg0620-node-primary-5
logging-fluentd-tlx7v             1/1       Running             0          27m       10.2.8.30    upg0620-node-primary-3
logging-kibana-1-45t4b            2/2       Running             0          26m       10.2.18.39   upg0620-node-primary-4
logging-kibana-ops-1-c0djx        2/2       Running             0          26m       10.2.10.55   upg0620-node-primary-1

Version-Release number of selected component (if applicable):
# oc version
oc v3.5.5.27
kubernetes v1.5.2+43a9be4
features: Basic-Auth GSSAPI Kerberos SPNEGO

Images from ops mirror:
# docker images | grep logging
logging-kibana          3.5.0   e0974f3393e2   10 hours ago   343.1 MB
logging-fluentd         3.5.0   63a1d8086c64   10 hours ago   232.8 MB
logging-curator         3.5.0   c14f234e4210   10 hours ago   211.3 MB
logging-auth-proxy      3.5.0   90d8b97402af   10 hours ago   215.3 MB
logging-elasticsearch   3.5.0   14766cbe8b39   10 hours ago   399.5 MB

How reproducible:
Always

Steps to Reproduce:
1. Deploy logging on GCE with 10 nodes

Actual results:
5 fluentd pods are in MatchNodeSelector status and cannot start up

Expected results:
All pods should be in Running status

Additional info:
Attached inventory file and fluentd log
Created attachment 1290420 [details] ansible inventory file
The system works as expected from the scheduler's point of view. Why are 10 fluentd pods needed?
Those 5 pods are not running due to MatchNodeSelector, and the reported reason is correct: those 5 nodes do not satisfy all of the requirements in the fluentd pods' nodeSelector:

nodeSelector:
  logging-infra-fluentd: "true"
  region: primary
  role: node
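The MatchNodeSelector predicate is a plain label-subset check: a node is eligible only when every key/value pair in the pod's nodeSelector is present, with the same value, among the node's labels. A minimal sketch of that check in Python; the node label sets below are hypothetical illustrations, not labels read from the cluster above:

```python
def matches_node_selector(node_labels: dict, node_selector: dict) -> bool:
    """Return True only if every selector key/value pair appears
    verbatim in the node's labels (subset match)."""
    return all(node_labels.get(k) == v for k, v in node_selector.items())

# The fluentd nodeSelector from this report.
selector = {"logging-infra-fluentd": "true", "region": "primary", "role": "node"}

# Hypothetical label sets for a primary node and a master node.
primary_node = {"logging-infra-fluentd": "true", "region": "primary", "role": "node"}
master_node = {"logging-infra-fluentd": "true", "region": "infra", "role": "node"}

print(matches_node_selector(primary_node, selector))  # True  -> pod can schedule
print(matches_node_selector(master_node, selector))   # False -> MatchNodeSelector fails
```

In this report the 5 failing nodes (masters and registry/router nodes) lack `region: primary`, so only the 5 primary nodes can run the pods.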
Wrong configuration; closing as WORKSFORME.