Bug 1438762
| Summary: | sdn traffic leaking out of the cluster | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Alexander Koksharov <akokshar> |
| Component: | Networking | Assignee: | Phil Cameron <pcameron> |
| Status: | CLOSED ERRATA | QA Contact: | Meng Bo <bmeng> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 3.4.0 | CC: | aloughla, aos-bugs, atragler, bbennett, eparis, hongli, pcameron, smunilla, weliang |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | Cause: An iptables rule to block INVALID packets was missing. Consequence: Packets escaped the cluster unmasqueraded. Fix: Add the missing rule. Result: No more leaks. | | |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-08-10 05:20:02 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Alexander Koksharov
2017-04-04 11:17:08 UTC
From https://access.redhat.com/support/cases/#/case/01807669, the customer did:

"We have changed the default SDN network CIDR accordingly: osm_cluster_network_cidr=10.254.0.0/16"

According to https://github.com/danwinship/openshift-docs/blob/d5f85deae2c227459f8146f4cbc16c28cbef7851/install_config/configuring_sdn.adoc#renumbering-the-sdn-network, after changing osm_cluster_network_cidr the master and node services need to be restarted:

systemctl restart atomic-openshift-master
systemctl restart atomic-openshift-node

I did not see the restart steps in https://access.redhat.com/support/cases/#/case/01807669.

Hello Weibin,
This case is not about re-configuring the cluster; the cluster is up and running.
The issue is that certain traffic leaving a node is not being masqueraded. Could you please clarify how restarting the node service can resolve this?
Just to clarify the situation again:
- masquerade rule is in place:
# iptables -L POSTROUTING -nv -t nat
...
12M 712M MASQUERADE all -- * * 10.254.0.0/16 0.0.0.0/0
...
- at the same time, traffic sourced from 10.254.0.0/16 is being sent out through the node's physical interface
(see the trace in the case description)
- adding a rule to drop packets in an INVALID conntrack state resolves the issue (see the sketch after this list):
# iptables -I FORWARD 1 -s 10.254.0.0/16 -m conntrack --ctstate INVALID -j DROP
- all of this could mean that the host machine cleared out the connection tracking information for some reason; it might be timing out if the container does not send any data for a certain period. This needs to be checked. The fact remains that internal subnet addresses are exposed to the outside, which is clearly a bug and might be a security issue.
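For context on why the MASQUERADE rule alone does not catch these packets: the nat table is only consulted for packets that create a NEW conntrack entry, so a packet that conntrack classifies as INVALID (for example, an RST or retransmit arriving after the entry has expired) bypasses NAT entirely and leaves with the pod's source address. A minimal sketch for confirming the leak and the workaround on an affected node (the eth0 interface name and the 10.254.0.0/16 CIDR are taken from this case; adjust for your environment):

```sh
# Watch the uplink for packets leaving with a pod source address;
# each hit is an unmasqueraded SDN packet escaping the node:
tcpdump -i eth0 -nn src net 10.254.0.0/16

# After inserting the DROP rule above, confirm it sits first in FORWARD
# and that its packet counters increase while the tcpdump stays silent:
iptables -L FORWARD 1 -nv

# Optionally, check whether conntrack still has entries for the escaping
# flows (requires the conntrack-tools package):
conntrack -L | grep 10.254.
```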
Hi Alexander,

Thank you for your good information. I can now reproduce the SDN traffic leaking issue in my environment, even when I follow the steps in https://github.com/danwinship/openshift-docs/blob/d5f85deae2c227459f8146f4cbc16c28cbef7851/install_config/configuring_sdn.adoc#renumbering-the-sdn-network.

####
[root@ip-172-18-8-121 ~]# tcpdump -i eth0 -nv host 20.128.0.5
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
11:52:37.259913 IP (tos 0x0, ttl 63, id 13932, offset 0, flags [DF], proto TCP (6), length 40)
    20.128.0.5.49490 > 173.222.212.251.https: Flags [R], cksum 0x5f03 (correct), seq 3410766628, win 0, length 0
11:52:37.456406 IP (tos 0x0, ttl 63, id 14077, offset 0, flags [DF], proto TCP
####

20.128.0.5 is the pod's IP address:

sh-4.2# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
3: eth0@if31: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8951 qdisc noqueue state UP
    link/ether 1e:d8:13:5b:65:a6 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 20.128.0.5/23 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::1cd8:13ff:fe5b:65a6/64 scope link
       valid_lft forever preferred_lft forever
sh-4.2#

PR 13680 is out for review: https://github.com/openshift/origin/pull/13680

Let's get a PR open to nuke the offending rule (per eparis and danw's comments on the above PR) and see if it passes extended network testing. Then we can merge it first thing on Monday if it is clean.

Commit pushed to master at https://github.com/openshift/origin:
https://github.com/openshift/origin/commit/2d9a8e38ee15b85670db51557bad0b7bc2a9f516

sdn traffic leaking out of the cluster

The customer discovered that traffic sourced with an IP from the SDN subnet is being sent out of the cluster non-masqueraded: TCP packets in conntrack state INVALID are escaping from the SDN. This change adds a FORWARD rule to DROP these packets:

filter chain FORWARD -s n.clusterNetworkCIDR -m conntrack --ctstate INVALID -j DROP

bug 1438762
https://bugzilla.redhat.com/show_bug.cgi?id=1438762

Verified in atomic-openshift-3.6.109-1.git.0.378bacd.el7.x86_64: traffic sourced with a pod IP is no longer seen leaving the node, and the iptables rule below has been added:

-A OPENSHIFT-FIREWALL-FORWARD -s 10.128.0.0/14 -m comment --comment "attempted resend after connection close" -m conntrack --ctstate INVALID -j DROP

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1716
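For anyone verifying a patched node, a quick check that the fix is programmed into the filter table (a small sketch; 10.128.0.0/14 is the default cluster network CIDR seen in the verification above, substitute your own):

```sh
# Dump the filter table and look for the INVALID-state drop that the fix
# programs into the OPENSHIFT-FIREWALL-FORWARD chain:
iptables-save -t filter | grep -- '--ctstate INVALID'

# Expected output, per the verification on atomic-openshift-3.6.109-1:
# -A OPENSHIFT-FIREWALL-FORWARD -s 10.128.0.0/14 -m comment --comment "attempted resend after connection close" -m conntrack --ctstate INVALID -j DROP
```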