Description of problem:

We are seeing intermittent pod-to-pod communication failures when new pods come up (scale), when pods move between nodes, or when jobs run that connect to other existing pods. Sometimes there is no issue, sometimes the issue resolves on its own, and sometimes the flows are never updated without some form of intervention. Intervention means either a) restarting the sdn pods on the nodes with impacted pods, or b) scaling the impacted pods down and back up again.

# oc get networkpolicy --all-namespaces | wc -l
3691

Here is one example we pulled:

NAME         NETID     EGRESS IPS
8ad0ea-dev   2075391

NETID in hex: 0x1faaff

The indexer crashloops because it cannot connect to the DB:

# oc -n 8ad0ea-dev get pods -l app=Offline-Indexing-oli,env=dev,role=offline-indexer -o wide
NAME                          READY   STATUS             RESTARTS   AGE    IP             NODE         NOMINATED NODE   READINESS GATES
offline-indexer-oli-3-5hh8x   0/1     CrashLoopBackOff   31         160m   10.97.14.152   app-07.dmz   <none>           <none>

# oc -n 8ad0ea-dev get pods -l app=Offline-Indexing-oli,role=db,env=dev -o wide
NAME             READY   STATUS    RESTARTS   AGE    IP              NODE         NOMINATED NODE   READINESS GATES
db-oli-2-ktpl9   1/1     Running   0          162m   10.97.106.200   app-27.dmz   <none>           <none>

The NetworkPolicy for this one is `db-oli`.

Looking at the flows, there is no table 80 entry for 10.97.14.152 (app-07) -> 10.97.106.200 (app-27) on port 5432.

app-07 cookie=0x0, duration=10169.271s, table=20, n_packets=203, n_bytes=8526, priority=100,arp,in_port=12460,arp_spa=10.97.14.152,arp_sha=00:00:0a:61:0e:98/00:00:ff:ff:ff:ff actions=load:0x1faaff->NXM_NX_REG0[],goto_table:21 cookie=0x0, duration=10169.271s, table=20, n_packets=426, n_bytes=27744, priority=100,ip,in_port=12460,nw_src=10.97.14.152 actions=load:0x1faaff->NXM_NX_REG0[],goto_table:21 cookie=0x0, duration=10169.271s, table=25, n_packets=234, n_bytes=17316, priority=100,ip,nw_src=10.97.14.152 actions=load:0x1faaff->NXM_NX_REG0[],goto_table:30 cookie=0x0, duration=10169.272s, table=70, n_packets=189, n_bytes=13986, priority=100,ip,nw_dst=10.97.14.152 actions=load:0x1faaff->NXM_NX_REG1[],load:0x30ac->NXM_NX_REG2[],goto_table:80 cookie=0x0, duration=5.639s, table=80, n_packets=0, n_bytes=0, priority=150,ip,reg0=0,reg1=0x1faaff,nw_dst=10.97.89.180 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.639s, table=80, n_packets=0, n_bytes=0, priority=150,ip,reg0=0xcecde4,reg1=0x1faaff,nw_dst=10.97.89.180 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.639s, table=80, n_packets=0, n_bytes=0, priority=150,ip,reg0=0,reg1=0x1faaff,nw_dst=10.97.49.66 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.639s, table=80, n_packets=0, n_bytes=0, priority=150,ip,reg0=0xcecde4,reg1=0x1faaff,nw_dst=10.97.49.66 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.639s, table=80, n_packets=0, n_bytes=0, priority=150,ip,reg0=0,reg1=0x1faaff,nw_dst=10.97.88.53 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.639s, table=80, n_packets=0, n_bytes=0, priority=150,ip,reg0=0xcecde4,reg1=0x1faaff,nw_dst=10.97.88.53 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.639s, table=80, n_packets=0, n_bytes=0, priority=150,ip,reg0=0,reg1=0x1faaff,nw_dst=10.97.68.27 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.639s, table=80, n_packets=0, n_bytes=0, priority=150,ip,reg0=0xcecde4,reg1=0x1faaff,nw_dst=10.97.68.27 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.639s, table=80, 
n_packets=0, n_bytes=0, priority=150,ip,reg0=0,reg1=0x1faaff,nw_dst=10.97.70.228 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.639s, table=80, n_packets=0, n_bytes=0, priority=150,ip,reg0=0xcecde4,reg1=0x1faaff,nw_dst=10.97.70.228 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.639s, table=80, n_packets=0, n_bytes=0, priority=150,ip,reg0=0,reg1=0x1faaff,nw_dst=10.97.77.10 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.639s, table=80, n_packets=0, n_bytes=0, priority=150,ip,reg0=0xcecde4,reg1=0x1faaff,nw_dst=10.97.77.10 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.49.66,nw_dst=10.97.102.45,tp_dst=8080 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.68.27,nw_dst=10.97.102.45,tp_dst=8080 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.70.228,nw_dst=10.97.102.45,tp_dst=8080 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.77.10,nw_dst=10.97.102.45,tp_dst=8080 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.88.53,nw_dst=10.97.102.45,tp_dst=8080 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.49.66,nw_dst=10.97.117.206,tp_dst=8080 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.68.27,nw_dst=10.97.117.206,tp_dst=8080 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, 
priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.70.228,nw_dst=10.97.117.206,tp_dst=8080 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.77.10,nw_dst=10.97.117.206,tp_dst=8080 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.88.53,nw_dst=10.97.117.206,tp_dst=8080 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.102.45,nw_dst=10.97.89.180,tp_dst=5672 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.105.210,nw_dst=10.97.89.180,tp_dst=5672 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.117.206,nw_dst=10.97.89.180,tp_dst=5672 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.23.4,nw_dst=10.97.89.180,tp_dst=5672 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.102.45,nw_dst=10.97.88.225,tp_dst=8983 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.105.210,nw_dst=10.97.88.225,tp_dst=8983 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.117.206,nw_dst=10.97.88.225,tp_dst=8983 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.23.4,nw_dst=10.97.88.225,tp_dst=8983 
actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.102.45,nw_dst=10.97.63.127,tp_dst=5432 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.105.210,nw_dst=10.97.63.127,tp_dst=5432 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.110.150,nw_dst=10.97.63.127,tp_dst=5432 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.117.206,nw_dst=10.97.63.127,tp_dst=5432 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.23.4,nw_dst=10.97.63.127,tp_dst=5432 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.110.150,nw_dst=10.97.71.211,tp_dst=5432 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.49.66,nw_dst=10.97.71.211,tp_dst=5432 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.88.53,nw_dst=10.97.71.211,tp_dst=5432 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.102.45,nw_dst=10.97.49.66,tp_dst=8024 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.105.210,nw_dst=10.97.49.66,tp_dst=8024 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, 
priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.117.206,nw_dst=10.97.49.66,tp_dst=8024 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.23.4,nw_dst=10.97.49.66,tp_dst=8024 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.102.45,nw_dst=10.97.88.53,tp_dst=8024 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.105.210,nw_dst=10.97.88.53,tp_dst=8024 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.117.206,nw_dst=10.97.88.53,tp_dst=8024 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.641s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.23.4,nw_dst=10.97.88.53,tp_dst=8024 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.544s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1d64be,nw_src=10.97.102.45,nw_dst=10.97.110.53,tp_dst=8024 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.544s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1d64be,nw_src=10.97.105.210,nw_dst=10.97.110.53,tp_dst=8024 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.544s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1d64be,nw_src=10.97.117.206,nw_dst=10.97.110.53,tp_dst=8024 actions=output:NXM_NX_REG2[] cookie=0x0, duration=5.544s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1d64be,nw_src=10.97.23.4,nw_dst=10.97.110.53,tp_dst=8024 actions=output:NXM_NX_REG2[] app-27: cookie=0x0, duration=10225.072s, table=20, n_packets=106, n_bytes=4452, priority=100,arp,in_port=6143,arp_spa=10.97.106.200,arp_sha=00:00:0a:61:6a:c8/00:00:ff:ff:ff:ff 
actions=load:0x1faaff->NXM_NX_REG0[],goto_table:21 cookie=0x0, duration=10225.074s, table=20, n_packets=3054, n_bytes=209708, priority=100,ip,in_port=6143,nw_src=10.97.106.200 actions=load:0x1faaff->NXM_NX_REG0[],goto_table:21 cookie=0x0, duration=10225.075s, table=25, n_packets=0, n_bytes=0, priority=100,ip,nw_src=10.97.106.200 actions=load:0x1faaff->NXM_NX_REG0[],goto_table:30 cookie=0x0, duration=10225.080s, table=70, n_packets=4378, n_bytes=299540, priority=100,ip,nw_dst=10.97.106.200 actions=load:0x1faaff->NXM_NX_REG1[],load:0x17ff->NXM_NX_REG2[],goto_table:80 cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.49.66,nw_dst=10.97.102.45,tp_dst=8080 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.68.27,nw_dst=10.97.102.45,tp_dst=8080 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.70.228,nw_dst=10.97.102.45,tp_dst=8080 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.77.10,nw_dst=10.97.102.45,tp_dst=8080 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.88.53,nw_dst=10.97.102.45,tp_dst=8080 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.49.66,nw_dst=10.97.117.206,tp_dst=8080 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.68.27,nw_dst=10.97.117.206,tp_dst=8080 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, 
priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.70.228,nw_dst=10.97.117.206,tp_dst=8080 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.77.10,nw_dst=10.97.117.206,tp_dst=8080 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.88.53,nw_dst=10.97.117.206,tp_dst=8080 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.102.45,nw_dst=10.97.63.127,tp_dst=5432 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.105.210,nw_dst=10.97.63.127,tp_dst=5432 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.110.150,nw_dst=10.97.63.127,tp_dst=5432 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.117.206,nw_dst=10.97.63.127,tp_dst=5432 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.23.4,nw_dst=10.97.63.127,tp_dst=5432 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.102.45,nw_dst=10.97.88.225,tp_dst=8983 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.105.210,nw_dst=10.97.88.225,tp_dst=8983 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.117.206,nw_dst=10.97.88.225,tp_dst=8983 
actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.23.4,nw_dst=10.97.88.225,tp_dst=8983 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.110.150,nw_dst=10.97.71.211,tp_dst=5432 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.49.66,nw_dst=10.97.71.211,tp_dst=5432 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.88.53,nw_dst=10.97.71.211,tp_dst=5432 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.102.45,nw_dst=10.97.49.66,tp_dst=8024 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.105.210,nw_dst=10.97.49.66,tp_dst=8024 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.117.206,nw_dst=10.97.49.66,tp_dst=8024 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.23.4,nw_dst=10.97.49.66,tp_dst=8024 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.102.45,nw_dst=10.97.88.53,tp_dst=8024 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.105.210,nw_dst=10.97.88.53,tp_dst=8024 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.740s, table=80, n_packets=0, n_bytes=0, 
priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.117.206,nw_dst=10.97.88.53,tp_dst=8024 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.740s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.23.4,nw_dst=10.97.88.53,tp_dst=8024 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.740s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.102.45,nw_dst=10.97.89.180,tp_dst=5672 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.740s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.105.210,nw_dst=10.97.89.180,tp_dst=5672 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.740s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.117.206,nw_dst=10.97.89.180,tp_dst=5672 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.740s, table=80, n_packets=0, n_bytes=0, priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.23.4,nw_dst=10.97.89.180,tp_dst=5672 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.761s, table=80, n_packets=0, n_bytes=0, priority=150,ip,reg0=0,reg1=0x1faaff,nw_dst=10.97.49.66 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.761s, table=80, n_packets=0, n_bytes=0, priority=150,ip,reg0=0xcecde4,reg1=0x1faaff,nw_dst=10.97.49.66 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.761s, table=80, n_packets=0, n_bytes=0, priority=150,ip,reg0=0,reg1=0x1faaff,nw_dst=10.97.88.53 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.761s, table=80, n_packets=0, n_bytes=0, priority=150,ip,reg0=0xcecde4,reg1=0x1faaff,nw_dst=10.97.88.53 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.761s, table=80, n_packets=0, n_bytes=0, priority=150,ip,reg0=0,reg1=0x1faaff,nw_dst=10.97.68.27 actions=output:NXM_NX_REG2[] cookie=0x0, duration=3.761s, table=80, n_packets=0, n_bytes=0, priority=150,ip,reg0=0xcecde4,reg1=0x1faaff,nw_dst=10.97.68.27 actions=output:NXM_NX_REG2[] cookie=0x0, 
duration=3.761s, table=80, n_packets=0, n_bytes=0, priority=150,ip,reg0=0,reg1=0x1faaff,nw_dst=10.97.70.228 actions=output:NXM_NX_REG2[]
cookie=0x0, duration=3.761s, table=80, n_packets=0, n_bytes=0, priority=150,ip,reg0=0xcecde4,reg1=0x1faaff,nw_dst=10.97.70.228 actions=output:NXM_NX_REG2[]
cookie=0x0, duration=3.761s, table=80, n_packets=0, n_bytes=0, priority=150,ip,reg0=0,reg1=0x1faaff,nw_dst=10.97.77.10 actions=output:NXM_NX_REG2[]
cookie=0x0, duration=3.761s, table=80, n_packets=0, n_bytes=0, priority=150,ip,reg0=0xcecde4,reg1=0x1faaff,nw_dst=10.97.77.10 actions=output:NXM_NX_REG2[]
cookie=0x0, duration=3.761s, table=80, n_packets=0, n_bytes=0, priority=150,ip,reg0=0,reg1=0x1faaff,nw_dst=10.97.89.180 actions=output:NXM_NX_REG2[]
cookie=0x0, duration=3.761s, table=80, n_packets=0, n_bytes=0, priority=150,ip,reg0=0xcecde4,reg1=0x1faaff,nw_dst=10.97.89.180 actions=output:NXM_NX_REG2[]

Fails to connect:

# oc -n 8ad0ea-dev rsh offline-indexer-oli-3-5hh8x
$ timeout 5 bash -c '< /dev/tcp/10.97.106.200/5432'; echo $?
124
$

# oc -n openshift-sdn get pods -o wide | grep -E "mcs-silver-app-07.dmz|mcs-silver-app-27.dmz"
ovs-2bcrq   1/1   Running   0   39d   142.34.151.188   app-27.dmz   <none>   <none>
ovs-9dtth   1/1   Running   0   79m   142.34.151.147   app-07.dmz   <none>   <none>
sdn-g9dx8   2/2   Running   0   78m   142.34.151.147   app-07.dmz   <none>   <none>
sdn-xf75g   2/2   Running   0   76m   142.34.151.188   app-27.dmz   <none>   <none>

We flipped the sdn pods into debug mode which triggered a restart and, as expected, resolved the communication issue.

In the debug logs you can now see the flow being added:

2021-06-04T21:58:27.064896598Z flow add table=80, priority=150, reg1=2075391, ip, nw_dst=10.97.106.200, reg0=2075391, ip, nw_src=10.97.14.152, tcp, tp_dst=5432, actions=output:NXM_NX_REG2[]

Parsing the NetworkPolicy shows it as unchanged:

2021-06-04T21:58:26.659861311Z I0604 21:58:26.659816 1280823 networkpolicy.go:596] Parsed NetworkPolicy: &node.npPolicy{policy:v1.NetworkPolicy{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"db-oli", GenerateName:"", Namespace:"8ad0ea-dev", SelfLink:"/apis/networking.k8s.io/v1/namespaces/8ad0ea-dev/networkpolicies/db-oli", UID:"c8f08526-1f5e-407a-9942-10471871b536", ResourceVersion:"1131519476", Generation:1, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63758414551, loc:(*time.Location)(0x2ce4220)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string{"app":"Offline-Indexing-oli", "app-group":"offline-indexing", "env":"dev", "name":"db-oli"}, Annotations:map[string]string{"kubectl.kubernetes.io/last-applied-configuration":"{\"apiVersion\":\"networking.k8s.io/v1\",\"kind\":\"NetworkPolicy\",\"metadata\":{\"annotations\":{},\"labels\":{\"app\":\"Offline-Indexing-oli\",\"app-group\":\"offline-indexing\",\"env\":\"dev\",\"name\":\"db-oli\"},\"name\":\"db-oli\",\"namespace\":\"8ad0ea-dev\"},\"spec\":{\"description\":\"Allow the api, msg queue worker, backup container, and schema spy to access the 
database.\\n\",\"ingress\":[{\"from\":[{\"namespaceSelector\":{\"matchLabels\":{\"environment\":\"dev\",\"name\":\"8ad0ea\"}},\"podSelector\":{\"matchLabels\":{\"app\":\"Offline-Indexing-oli\",\"env\":\"dev\",\"role\":\"api\"}}},{\"namespaceSelector\":{\"matchLabels\":{\"environment\":\"dev\",\"name\":\"8ad0ea\"}},\"podSelector\":{\"matchLabels\":{\"app\":\"Offline-Indexing-oli\",\"env\":\"dev\",\"role\":\"msg-queue-worker\"}}},{\"namespaceSelector\":{\"matchLabels\":{\"environment\":\"dev\",\"name\":\"8ad0ea\"}},\"podSelector\":{\"matchLabels\":{\"app\":\"Backup-oli\",\"env\":\"dev\",\"role\":\"backup\"}}},{\"namespaceSelector\":{\"matchLabels\":{\"environment\":\"dev\",\"name\":\"8ad0ea\"}},\"podSelector\":{\"matchLabels\":{\"app\":\"Offline-Indexing-oli\",\"env\":\"dev\",\"role\":\"schema-spy\"}}},{\"namespaceSelector\":{\"matchLabels\":{\"environment\":\"dev\",\"name\":\"8ad0ea\"}},\"podSelector\":{\"matchLabels\":{\"app\":\"Offline-Indexing-oli\",\"env\":\"dev\",\"role\":\"offline-indexer\"}}}],\"ports\":[{\"port\":5432,\"protocol\":\"TCP\"}]}],\"podSelector\":{\"matchLabels\":{\"app\":\"Offline-Indexing-oli\",\"env\":\"dev\",\"role\":\"db\"}}}}\n"}, OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry{v1.ManagedFieldsEntry{Manager:"oc.exe", Operation:"Update", APIVersion:"networking.k8s.io/v1", Time:(*v1.Time)(0xc0028419c0), FieldsType:"FieldsV1", FieldsV1:(*v1.FieldsV1)(0xc0028419e0)}}}, Spec:v1.NetworkPolicySpec{PodSelector:v1.LabelSelector{MatchLabels:map[string]string{"app":"Offline-Indexing-oli", "env":"dev", "role":"db"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}, Ingress:[]v1.NetworkPolicyIngressRule{v1.NetworkPolicyIngressRule{Ports:[]v1.NetworkPolicyPort{v1.NetworkPolicyPort{Protocol:(*v1.Protocol)(0xc0021bd680), Port:(*intstr.IntOrString)(0xc002841a00)}}, From:[]v1.NetworkPolicyPeer{v1.NetworkPolicyPeer{PodSelector:(*v1.LabelSelector)(0xc002841a40), 
NamespaceSelector:(*v1.LabelSelector)(0xc002841a60), IPBlock:(*v1.IPBlock)(nil)}, v1.NetworkPolicyPeer{PodSelector:(*v1.LabelSelector)(0xc002841a80), NamespaceSelector:(*v1.LabelSelector)(0xc002841aa0), IPBlock:(*v1.IPBlock)(nil)}, v1.NetworkPolicyPeer{PodSelector:(*v1.LabelSelector)(0xc002841ac0), NamespaceSelector:(*v1.LabelSelector)(0xc002841ae0), IPBlock:(*v1.IPBlock)(nil)}, v1.NetworkPolicyPeer{PodSelector:(*v1.LabelSelector)(0xc002841b00), NamespaceSelector:(*v1.LabelSelector)(0xc002841b40), IPBlock:(*v1.IPBlock)(nil)}, v1.NetworkPolicyPeer{PodSelector:(*v1.LabelSelector)(0xc002841b80), NamespaceSelector:(*v1.LabelSelector)(0xc002841ba0), IPBlock:(*v1.IPBlock)(nil)}}}}, Egress:[]v1.NetworkPolicyEgressRule(nil), PolicyTypes:[]v1.PolicyType{"Ingress"}}}, watchesNamespaces:true, watchesAllPods:true, watchesOwnPods:true, flows:[]string{"ip, nw_dst=10.97.106.200, reg0=2075391, ip, nw_src=10.97.14.152, tcp, tp_dst=5432, "}, selectedIPs:[]string{"10.97.106.200"}}
2021-06-04T21:58:26.659875067Z I0604 21:58:26.659856 1280823 networkpolicy.go:622] NetworkPolicy 8ad0ea-dev/db-oli is unchanged

Version-Release number of selected component (if applicable):
4.6.25

How reproducible:
Random but occurs quite often across different projects

Steps to Reproduce:
1.
2.
3.

Actual results:
Communication between pods is blocked.

Expected results:

Additional info:
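
For anyone triaging a similar report, the manual check described above (is there a table 80 flow for source pod -> destination pod on the policy port?) can be scripted. The following is only a rough sketch, not part of openshift-sdn: the helper names `parse_flow` and `has_allow_flow` are made up for illustration, and it assumes the flow text has the same shape as the dumps pasted in this description, where the register values appear in hex (the debug log prints them in decimal, so the NETID is converted with `hex()`). It only looks for the per-source `tcp` flow form and does not cover the `ip,reg0=0` style entries.

```python
import re


def parse_flow(line):
    """Split the match part of one flow line into a dict of fields.

    Bare keywords such as "tcp" or "ip" are stored with the value True.
    """
    match_part = line.split(" actions=")[0]
    fields = {}
    for token in re.split(r"[,\s]+", match_part.strip()):
        if not token:
            continue
        key, sep, value = token.partition("=")
        fields[key] = value if sep else True
    return fields


def has_allow_flow(flow_lines, netid, src_ip, dst_ip, port):
    """Return True if a table 80 tcp flow allows src_ip -> dst_ip:port
    within the namespace identified by netid (decimal NETID)."""
    vnid = hex(netid)  # 2075391 -> "0x1faaff", the form used in the dumps
    for line in flow_lines:
        f = parse_flow(line)
        if (
            f.get("table") == "80"
            and f.get("tcp") is True
            and f.get("reg0") == vnid
            and f.get("reg1") == vnid
            and f.get("nw_src") == src_ip
            and f.get("nw_dst") == dst_ip
            and f.get("tp_dst") == str(port)
        ):
            return True
    return False


# One flow copied from the app-27 dump in this report.
SAMPLE = [
    "cookie=0x0, duration=3.739s, table=80, n_packets=0, n_bytes=0, "
    "priority=150,tcp,reg0=0x1faaff,reg1=0x1faaff,nw_src=10.97.102.45,"
    "nw_dst=10.97.63.127,tp_dst=5432 actions=output:NXM_NX_REG2[]",
]

if __name__ == "__main__":
    # A flow that is present in the dump.
    print(has_allow_flow(SAMPLE, 2075391, "10.97.102.45", "10.97.63.127", 5432))
    # The indexer -> db flow that was missing before the sdn restart.
    print(has_allow_flow(SAMPLE, 2075391, "10.97.14.152", "10.97.106.200", 5432))
```

Feeding it the full table 80 dump from the node hosting the destination pod (app-27 in this case) would be the natural use; a False result for an allowed pod pair matches the symptom reported here.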
Created attachment 1789629
sdn pods memory graph
Created attachment 1789630
sdn pods cpu graph
Created attachment 1789633
sdn pod memory sorted
Created attachment 1789634
sdn pod cpu sorted
(In reply to Matthew Robson from comment #26)
> While there may be more we can look into with respect to why it ends up
> holding so much memory, can we at least look to add some logging to help
> warn or debug this issue in openshift-sdn?

Yup. Already filed a Jira issue about this last week: https://issues.redhat.com/browse/SDN-1960
OK, so:

1. The customer problem was worked around by removing NetworkPolicies that behaved pathologically under openshift-sdn
2. There is a Jira issue about providing better feedback to the user when this happens in the future
3. The support case has been closed so there's nothing more to do here...

I guess I'll call this "DEFERRED" since we created a Jira to address part of it