In https://github.com/openshift/origin/pull/22833/commits/bcf30988efa3a23189802b6992bada21affb8997 we disabled two service network tests; these should be re-enabled.
The test `should be rejected when no endpoints exist` seems pretty important - we need to look into that and decide. We'll have an update tomorrow. The other is no big deal and can be fixed whenever.
I have been debugging this. This is what I followed:

1. Create a service without a selector, so no corresponding endpoint is created
2. Create a dummy pod (e.g. gcr.io/hello-minikube-zero-install/hello-node)
3. Log into the dummy pod and run 'wget test-service:8080'
4. The wget command times out, instead of returning with a connection refused

This is not the intended behaviour, since upstream Kubernetes explicitly adds an iptables rule to reject packets for services that have no corresponding endpoints, source here:

https://github.com/openshift/origin/blob/master/vendor/k8s.io/kubernetes/pkg/proxy/iptables/proxier.go#L853

Looking closer at the rules on the node where the pod was located, I listed the iptables rules and looked for that reject rule:

<snip>
# iptables -L
...
Chain KUBE-SERVICES (3 references)
target     prot opt source               destination
REJECT     tcp  --  anywhere             ip-172-30-77-132.us-east-2.compute.internal  /* default/test-service: has no endpoints */ tcp dpt:webcache reject-with icmp-port-unreachable
...
</snip>

However, we use nftables with the iptables compat layer, thus the engine that filters packets is not iptables but nftables. Looking for the corresponding rule in nft showed this:

<snip>
chain KUBE-SERVICES {
    meta l4proto tcp ip daddr 172.30.77.132 counter packets 0 bytes 0
}
</snip>

Note there is no reject, just a counter, thus the packet hits no reject rule after all. I confirmed this by running a dummy iptables add command with REJECT, which showed the same behaviour:

<snip>
# iptables -t filter -A KUBE-SERVICES -m comment --comment "ricky test" -p tcp --destination 1.1.1.1 -j REJECT
# nft list ruleset | grep 1.1.1.1
    meta l4proto tcp ip daddr 1.1.1.1 counter packets 0 bytes 0
</snip>

Adding a rule with DROP makes the corresponding nft rule work correctly; it shows drop. So it seems like a bug in the nft/iptables compat layer for the REJECT case.
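The symptom above can be checked mechanically: a correctly translated `-j REJECT` rule should carry a verdict in the nft output, while the broken translation leaves a bare counter. A small sketch (hypothetical helper name, not part of any tool mentioned here) that scans `nft list ruleset` output for KUBE-SERVICES rules with no verdict at all:

```python
import re

def rules_missing_verdict(ruleset: str, chain: str = "KUBE-SERVICES"):
    """Return rules in `chain` that end in a bare counter with no verdict
    (reject/drop/accept/jump/...) -- the symptom described above, where
    iptables-nft turned `-j REJECT` into a count-only rule."""
    rules, in_chain = [], False
    for line in ruleset.splitlines():
        stripped = line.strip()
        if stripped.startswith(f"chain {chain}"):
            in_chain = True
            continue
        if in_chain:
            if stripped == "}":
                break
            # A healthy rule ends in a verdict keyword; flag those that don't.
            if stripped and not re.search(
                    r"\b(reject|drop|accept|return|jump|goto)\b", stripped):
                rules.append(stripped)
    return rules

# Sample taken from the nft output pasted above:
sample = """
chain KUBE-SERVICES {
    meta l4proto tcp ip daddr 172.30.77.132 counter packets 0 bytes 0
}
"""
print(rules_missing_verdict(sample))
# -> ['meta l4proto tcp ip daddr 172.30.77.132 counter packets 0 bytes 0']
```

On an affected node you would feed it the real `nft list ruleset` output instead of the sample string.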
*** Bug 1711605 has been marked as a duplicate of this bug. ***
Hi,

(In reply to Ricardo Carrillo Cruz from comment #2)
[...]
> <snip>
> chain KUBE-SERVICES {
>     meta l4proto tcp ip daddr 172.30.77.132 counter packets 0 bytes 0
> }
> </snip>
>
> Note there is no reject, but just count, thus the packet has no reject rule
> after all.

What's worse, the counters haven't increased, so no packet has hit this rule!

There is one known issue: negated matches for the input interface in POSTROUTING chains will never match in iptables-nft and always match in iptables-legacy.

Paste your full ruleset somewhere and I'll have a look at whether it is affected by that problem.

Cheers, Phil
Created attachment 1594626 [details] nftables list ruleset
(In reply to Ricardo Carrillo Cruz from comment #5)
> Created attachment 1594626 [details]
> nftables list ruleset

Sorry, but that dump contains neither address 1.1.1.1 nor 172.30.77.132. Which rule is supposed to match but doesn't?
Yeah, our dev clusters are ephemeral and torn down automatically by a pruner, so the rules I attached do not match the initial description of this bug.

On the attachment you should look for 172.30.233.246, which is on the KUBE-SERVICES chain.

Does that look like the known issue you mentioned?
(In reply to Ricardo Carrillo Cruz from comment #7)
> Yeah, our cluster for doing dev are ephemeral and torn down automatically by
> a pruner, so the rules I attached do not match the initial description of
> this bug.
> On the attachment you should look for 172.30.233.246, which is on the
> KUBE-SERVICES chain.

I see, thanks.

> Does that look like something as the known issue you mentioned?

No, it looks fine from that aspect, so it's not clear why the rule doesn't match your traffic. Did you look at tcpdump output already to make sure the SYN from wget looks as expected?
Hi there, sorry for the delay, I was out on vacation.

The rules are indeed matched; the earlier pastes are bogus, since the iptables/nft commands were not run on the node hosting the pod running the wget command:

[ricky@ricky-laptop ~]$ cat /tmp/test-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080

[ricky@ricky-laptop ~]$ oc create -f /tmp/test-service.yaml
service/my-service created

[ricky@ricky-laptop ~]$ oc get nodes
NAME                           STATUS   ROLES    AGE   VERSION
ip-10-0-130-235.ec2.internal   Ready    worker   20m   v1.14.0+f667219f4
ip-10-0-134-73.ec2.internal    Ready    master   25m   v1.14.0+f667219f4
ip-10-0-135-156.ec2.internal   Ready    master   24m   v1.14.0+f667219f4
ip-10-0-140-105.ec2.internal   Ready    worker   20m   v1.14.0+f667219f4
ip-10-0-144-208.ec2.internal   Ready    master   24m   v1.14.0+f667219f4
ip-10-0-150-179.ec2.internal   Ready    worker   20m   v1.14.0+f667219f4

[ricky@ricky-laptop ~]$ oc create deployment hello-node --image=gcr.io/hello-minikube-zero-install/hello-node
deployment.apps/hello-node created

[ricky@ricky-laptop ~]$ oc describe pod hello-node-78cd77d68f-zmr67 | grep Node
Node:           ip-10-0-130-235.ec2.internal/10.0.130.235
Node-Selectors: <none>

[ricky@ricky-laptop ~]$ oc -n openshift-sdn get pods -l app=sdn --field-selector spec.nodeName=ip-10-0-130-235.ec2.internal
NAME        READY   STATUS    RESTARTS   AGE
sdn-zlqcf   1/1     Running   0          19m

Now, open a session on the hello-node pod, and another session on sdn-zlqcf:

hello-node
----------

# wget my-service
converted 'http://my-service' (ANSI_X3.4-1968) -> 'http://my-service' (UTF-8)
--2019-08-20 08:29:51--  http://my-service/
Resolving my-service (my-service)... 172.30.121.3
Connecting to my-service (my-service)|172.30.121.3|:80...
sdn-zlqcf
---------

Chain KUBE-SERVICES (3 references)
 pkts bytes target     prot opt in     out     source               destination
    3   180 REJECT     tcp  --  any    any     anywhere             ip-172-30-121-3.ec2.internal  /* default/my-service: has no endpoints */ tcp dpt:http reject-with icmp-port-unreachable

Running a tcpdump from hello-node during the wget shows only SYN packets outgoing, no SYN-ACK reply:

# tcpdump host 172.30.121.3
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
08:33:13.323465 IP hello-node-78cd77d68f-zmr67.47536 > my-service.default.svc.cluster.local.http: Flags [S], seq 2652938099, win 26733, options [mss 8911,sackOK,TS val 1564345239 ecr 0,nop,wscale 7], length 0
08:33:50.715119 IP hello-node-78cd77d68f-zmr67.48526 > my-service.default.svc.cluster.local.http: Flags [S], seq 3364694714, win 26733, options [mss 8911,sackOK,TS val 1564382631 ecr 0,nop,wscale 7], length 0
08:33:51.723454 IP hello-node-78cd77d68f-zmr67.48526 > my-service.default.svc.cluster.local.http: Flags [S], seq 3364694714, win 26733, options [mss 8911,sackOK,TS val 1564383640 ecr 0,nop,wscale 7], length 0
08:33:53.771453 IP hello-node-78cd77d68f-zmr67.48526 > my-service.default.svc.cluster.local.http: Flags [S], seq 3364694714, win 26733, options [mss 8911,sackOK,TS val 1564385688 ecr 0,nop,wscale 7], length 0
08:33:57.803449 IP hello-node-78cd77d68f-zmr67.48526 > my-service.default.svc.cluster.local.http: Flags [S], seq 3364694714, win 26733, options [mss 8911,sackOK,TS val 1564389720 ecr 0,nop,wscale 7], length 0
08:34:06.059459 IP hello-node-78cd77d68f-zmr67.48526 > my-service.default.svc.cluster.local.http: Flags [S], seq 3364694714, win 26733, options [mss 8911,sackOK,TS val 1564397976 ecr 0,nop,wscale 7], length 0
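Note the intervals between the retransmitted SYNs (same seq 3364694714, so they are retries of one connection attempt) roughly double each time: that is the kernel's exponential backoff for an unanswered SYN, consistent with the packet being silently dropped rather than rejected with ICMP. A quick sketch checking this from the timestamps in the capture above:

```python
from datetime import datetime

# Timestamps of the retransmitted SYNs from the tcpdump output above
# (all with seq 3364694714, i.e. retries of a single connect()).
stamps = ["08:33:50.715119", "08:33:51.723454", "08:33:53.771453",
          "08:33:57.803449", "08:34:06.059459"]

times = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
gaps = [round((b - a).total_seconds()) for a, b in zip(times, times[1:])]
print(gaps)  # -> [1, 2, 4, 8]: classic doubling SYN-retransmit backoff
```

A rejected SYN would show no such retries: the very first packet would be answered with an ICMP port-unreachable and wget would give up immediately.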
It would be good to figure this out eventually; right now clients get "no route to host" for either "openshift-sdn screwed up iptables" or "your service crashed". If we fixed the ICMP thing then the latter case would get "connection refused" instead, making it easier for other people to distinguish when something is definitely not our bug.
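To make that distinction concrete: with a working REJECT rule the client fails fast with "connection refused", whereas a silent drop leaves it hanging until its timeout. A minimal sketch with plain Python sockets (the helper name is made up for illustration, and it assumes nothing is listening on localhost port 1, so the kernel answers with RST just as an ICMP reject would surface to the application):

```python
import socket

def connect_result(host, port, timeout=2.0):
    """Classify what a client sees when connecting to a dead service."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect((host, port))
        return "connected"
    except ConnectionRefusedError:
        return "refused"   # working REJECT / closed port: instant failure
    except socket.timeout:
        return "timeout"   # silent DROP: client hangs until the timeout
    finally:
        s.close()

# Port 1 on loopback is almost certainly closed, so this fails immediately:
print(connect_result("127.0.0.1", 1))  # -> refused
```

That immediate "refused" is exactly the signal users would get for "your service crashed" once the ICMP reject works, instead of the current ambiguous hang.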
Hi Ricardo,

(In reply to Ricardo Carrillo Cruz from comment #9)
[...]
> # wget my-service
> converted 'http://my-service' (ANSI_X3.4-1968) -> 'http://my-service' (UTF-8)
> --2019-08-20 08:29:51--  http://my-service/
> Resolving my-service (my-service)... 172.30.121.3
> Connecting to my-service (my-service)|172.30.121.3|:80...
>
> sdn-zlqcf
> ---------
>
> Chain KUBE-SERVICES (3 references)
>  pkts bytes target prot opt in out source destination
>     3   180 REJECT tcp -- any any anywhere ip-172-30-121-3.ec2.internal /* default/my-service: has no endpoints */ tcp dpt:http reject-with icmp-port-unreachable
>
> Running a tcpdump from hello-node during the wget shows only SYN packets
> outgoing, no SYN-ACK reply:
[...]

OK, so assuming 172.30.121.3 reverse-resolves into my-service.default.svc.cluster.local, this doesn't look too bad.

Could you please also run 'tcpdump -npi <iface> not port 22' on 172.30.121.3 itself? Seeing the output of 'ip route show' would help, as well.

In case you have a problematic system available and can get me access to it, I could have a look myself as well.

Thanks, Phil
Hey, I've run a few CI runs over the past few days and the test is passing. Not sure if it's due to the bump to 1.16 or something else. I linked the PR that re-enables the test to this bug.

The service-proxy test cannot be re-enabled, as it requires wget to be installed on the nodes and we don't have that in OpenShift.
(In reply to Ricardo Carrillo Cruz from comment #12)
> The service-proxy test cannot be reenabled, as it requires wget to be
> installed in nodes and we don't have that in OpenShift.

If you haven't already, can you file an upstream issue pointing this out?
Verified this bug according to comment 12.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0062