Bug 1866495
| Summary: | OpenShift 3.11.248 fix for CVE-2020-8558 has exposed RHEL 7 source IP bug | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Brad <behle> |
| Component: | Node | Assignee: | Ryan Phillips <rphillips> |
| Status: | CLOSED DUPLICATE | QA Contact: | Sunil Choudhary <schoudha> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | ||
| Version: | 3.11.0 | CC: | aos-bugs, bbennett, bretm, bshirren, jokerman, rbost |
| Target Milestone: | --- | ||
| Target Release: | 4.6.0 | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2020-08-10 16:31:59 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Brad
2020-08-05 17:50:24 UTC
This was fixed in origin by https://github.com/openshift/origin/pull/25141/files for the bug https://bugzilla.redhat.com/show_bug.cgi?id=1849175. But that's just a backport of Casey's https://github.com/kubernetes/kubernetes/pull/91569 PR.
Just to clarify, I don't believe this bug has been fixed yet. The PRs mentioned above solve CVE-2020-8558, but I believe they CAUSE the case I detailed above. This bug is about how the fix for CVE-2020-8558 seems to have blocked liveness/readiness probes that use network calls to localhost from the local kubelet.
I was able to narrow this down even further. The problem appears only on nodes that have had a HostPort pod run on them. Running a HostPort pod on a node appears to cause the following iptables MASQUERADE rule to be added at the end of the POSTROUTING chain in the nat table:
```
Chain POSTROUTING (policy ACCEPT 55 packets, 3828 bytes)
pkts bytes target prot opt in out source destination
18772 1304K cali-POSTROUTING all -- * * 0.0.0.0/0 0.0.0.0/0 /* cali:O3lYWMrLQYEMJtB5 */
17249 1218K KUBE-POSTROUTING all -- * * 0.0.0.0/0 0.0.0.0/0 /* kubernetes postrouting rules */
0 0 MASQUERADE all -- * !docker0 172.17.0.0/16 0.0.0.0/0
975 107K MASQUERADE all -- * lo 127.0.0.0/8 0.0.0.0/0 /* SNAT for localhost access to hostports */
```
This `/* SNAT for localhost access to hostports */` rule is what is causing the behavior I described initially in this ticket. When I remove this rule manually, the behavior goes back to normal and the liveness probe succeeds.
I think it is something in the OpenShift components that adds this rule and implements HostPorts in general, but I'm not sure about that. I've looked at the "portmap" CNI plugin, which is used to implement HostPort on base k8s clusters, and they both use more specific iptables rules so that only traffic bound for a HostPort is MASQ'd. So I think something similar would fix this issue.
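To make the distinction between the broad rule and a "more specific" one concrete, here is a minimal Python sketch that flags a localhost MASQUERADE rule with no narrowing match. It assumes the rules are available as text in the form printed by `iptables -t nat -S POSTROUTING`. The function name, the set of "narrowing" options, and the scoped example rule with port 8080 are illustrative assumptions, not taken from OpenShift, Kubernetes, or the portmap plugin.

```python
import shlex

# Options that would restrict the rule to specific traffic (assumed list,
# chosen for illustration; real implementations may scope rules differently).
NARROWING_OPTIONS = {"-d", "--dport", "--dports", "--mark"}


def _value_of(tokens, flag):
    """Return the value following `flag`, or None if absent or negated with '!'."""
    try:
        i = tokens.index(flag)
    except ValueError:
        return None
    if i > 0 and tokens[i - 1] == "!":
        return None
    return tokens[i + 1] if i + 1 < len(tokens) else None


def is_broad_localhost_masquerade(rule: str) -> bool:
    """True if the rule masquerades all 127.0.0.0/8 traffic leaving via lo."""
    tokens = shlex.split(rule)  # keeps the quoted --comment text as one token
    if _value_of(tokens, "-j") != "MASQUERADE":
        return False
    if _value_of(tokens, "-s") != "127.0.0.0/8" or _value_of(tokens, "-o") != "lo":
        return False
    # Broad means: no destination, port, or mark match limits the rule.
    return not any(tok in NARROWING_OPTIONS for tok in tokens)


if __name__ == "__main__":
    rules = [
        # The rule from this ticket, written in `iptables -S` form.
        '-A POSTROUTING -s 127.0.0.0/8 -o lo -m comment --comment '
        '"SNAT for localhost access to hostports" -j MASQUERADE',
        # Hypothetical scoped variant: only traffic to one port is masqueraded.
        "-A POSTROUTING -s 127.0.0.0/8 -o lo -p tcp -m tcp --dport 8080 -j MASQUERADE",
    ]
    for rule in rules:
        print(is_broad_localhost_masquerade(rule), rule)
```

Run against a node's POSTROUTING rules, a check like this would single out the `SNAT for localhost access to hostports` rule shown above while leaving the docker0 rule and any port- or mark-scoped rules alone.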
Brad, Thanks for the excellent analysis. I think you're exactly right.
*** Bug 1866132 has been marked as a duplicate of this bug. ***
Update: we need to backport https://github.com/kubernetes/kubernetes/pull/80591 to CRI-O.
For bureaucratic reasons, marking this as a duplicate of bug 1866132. Your analysis was spot-on, and it turns out we already have a fix that needs to be backported.
*** This bug has been marked as a duplicate of bug 1866132 ***