Bug 1703261
| Summary: | podman containers unable to communicate via IP to one container when use port forward on host | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | James Hartsock <hartsjc> |
| Component: | podman | Assignee: | Matthew Heon <mheon> |
| Status: | CLOSED ERRATA | QA Contact: | atomic-bugs <atomic-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 8.3 | CC: | ailan, bbaude, berrange, cpippin, dcbw, dornelas, dwalsh, imcleod, jligon, jnovy, johannes.grumboeck, laine, lfriedma, lsm5, mburns, mcambria, mheon, mindruv, obockows, oli.wade, pasik, pmorey, pthomas, rheron, rkhan, rmanes, santiago, scohen, skrenger, smccarty, subhat, tsweeney, umohnani, veaceslav.mindru, ypu |
| Target Milestone: | rc | Keywords: | Extras |
| Target Release: | 8.2 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | podman-1.9.x and newer | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-07-21 15:31:54 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1186913, 1594286, 1793607 | | |
Description
James Hartsock
2019-04-25 22:21:54 UTC
Adding CNI subnet masquerade rules seems to fix it:

```
# iptables -t nat -I CNI-DN-bdff0f8aeb262a7d5d536 -p tcp -s 10.88.0.0/24 --dport 5001 -j CNI-HOSTPORT-SETMARK
# iptables -t nat -I CNI-DN-7e5f467dcdfe79cbd1303 -p tcp -s 10.88.0.0/24 --dport 5002 -j CNI-HOSTPORT-SETMARK
```

Showing the rules:

```
# iptables -t nat -vnL CNI-DN-bdff0f8aeb262a7d5d536
Chain CNI-DN-bdff0f8aeb262a7d5d536 (1 references)
 pkts bytes target                prot opt in  out source        destination
    0     0 CNI-HOSTPORT-SETMARK  tcp  --  *   *   10.88.0.0/24  0.0.0.0/0    tcp dpt:5001
    2   120 CNI-HOSTPORT-SETMARK  tcp  --  *   *   10.88.0.55    0.0.0.0/0    tcp dpt:5001
    0     0 CNI-HOSTPORT-SETMARK  tcp  --  *   *   127.0.0.1     0.0.0.0/0    tcp dpt:5001
    2   120 DNAT                  tcp  --  *   *   0.0.0.0/0     0.0.0.0/0    tcp dpt:5001 to:10.88.0.55:80

# iptables -t nat -vnL CNI-DN-7e5f467dcdfe79cbd1303
Chain CNI-DN-7e5f467dcdfe79cbd1303 (1 references)
 pkts bytes target                prot opt in  out source        destination
    0     0 CNI-HOSTPORT-SETMARK  tcp  --  *   *   10.88.0.0/24  0.0.0.0/0    tcp dpt:5002
    0     0 CNI-HOSTPORT-SETMARK  tcp  --  *   *   10.88.0.56    0.0.0.0/0    tcp dpt:5002
    0     0 CNI-HOSTPORT-SETMARK  tcp  --  *   *   127.0.0.1     0.0.0.0/0    tcp dpt:5002
    2   120 DNAT                  tcp  --  *   *   0.0.0.0/0     0.0.0.0/0    tcp dpt:5002 to:10.88.0.56:80
```

And now things connect as one would expect (and like Docker):

```
# podman exec hello-world-a wget -O /dev/null http://10.88.0.1:5001
Connecting to 10.88.0.1:5001 (10.88.0.1:5001)
null                 100% |*******************************|  7218  0:00:00 ETA
# podman exec hello-world-a wget -O /dev/null http://10.88.0.1:5002
Connecting to 10.88.0.1:5002 (10.88.0.1:5002)
null                 100% |*******************************|  7218  0:00:00 ETA
```

I think this is a CNI problem. As near as I can tell, the rules in question are generated by the CNI plugins on behalf of Podman, not Podman itself. Mike, would you care to comment?

I just wanted to link this issue, https://github.com/containers/libpod/issues/2886, which also references a pull request.
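The manual fix above boils down to checking whether a chain's SETMARK rules cover the whole bridge subnet rather than only individual addresses. A minimal sketch of that check, run against sample rule text in the shape quoted above rather than a live iptables chain (the chain contents and subnet here are taken from this report; this is illustrative, not part of Podman or CNI):

```shell
#!/bin/sh
# Does the DNAT chain contain a SETMARK rule covering the whole bridge
# subnet (needed so hairpin traffic gets masqueraded), not just single IPs?
set -eu
SUBNET='10.88.0.0/24'

# Sample rows in the shape of `iptables -t nat -vnL CNI-DN-...` output,
# before the subnet-wide rule was inserted.
rules='    2   120 CNI-HOSTPORT-SETMARK  tcp  --  *  *  10.88.0.55  0.0.0.0/0  tcp dpt:5001
    0     0 CNI-HOSTPORT-SETMARK  tcp  --  *  *  127.0.0.1   0.0.0.0/0  tcp dpt:5001
    2   120 DNAT                  tcp  --  *  *  0.0.0.0/0   0.0.0.0/0  tcp dpt:5001 to:10.88.0.55:80'

if printf '%s\n' "$rules" | grep -q "CNI-HOSTPORT-SETMARK.*$SUBNET"; then
    echo "subnet SETMARK rule present"
else
    echo "subnet SETMARK rule missing: hairpin traffic will not be masqueraded"
fi
```

Without the subnet-wide rule, a connection from a sibling container is DNATed but never marked for masquerade, so the destination container replies directly to the source IP and the connection hangs, which is consistent with the symptom described here.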
Showing it failing:

```
# podman exec hello-world-a wget http://10.88.0.1:5002
Connecting to 10.88.0.1:5002 (10.88.0.1:5002)
^C
```

Load the module and check the value:

```
# modprobe br_netfilter
# cat /proc/sys/net/bridge/bridge-nf-call-iptables
1
```

And then it works as stated:

```
# podman exec hello-world-a wget http://10.88.0.1:5002
Connecting to 10.88.0.1:5002 (10.88.0.1:5002)
index.html           100% |*******************************|  7218  0:00:00 ETA
```

So shouldn't podman load br_netfilter so it works out of the box like Docker?

(In reply to James Hartsock from comment #18)
> So shouldn't podman load br_netfilter so it works out of the box like Docker?

podman should ensure that br_netfilter is loaded to allow /proc/sys/net/bridge/bridge-nf-call-iptables to be set to 1 (which is the module default).

Brent, is this fixed in current podman? Matt?

I struggled with this issue for a couple of days before I realized it was a podman issue. I am adding what I did to make this work, for the benefit of others:

```
# Load br_netfilter module
modprobe br_netfilter

# Ensure it is loaded on boot
cat > /etc/modules-load.d/podman-net.conf <<EOF
br_netfilter
EOF

# Set up sysctl params; these persist across reboots
cat > /etc/sysctl.d/podman-net.conf <<EOF
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF

# Load sysctl params
sysctl --system
```

Thanks @Rodrique, but it looks like we need only br_netfilter; the sysctl params are already set like that. A question: why doesn't podman have any dependency for that module? Or at least the rpm scripts should set up something like /etc/modules-load.d/podman-net.conf.

Do we need this by default, or just for special setups? Mike? Matt? Dan?

We can turn on br_netfilter by default by dropping that file in. Does this make sense? https://bugzilla.redhat.com/show_bug.cgi?id=1703261#c24

Just to make it clear and more helpful, my problem was: containers cannot access published ports of other containers on the same host.
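After applying the workaround above, the kernel state can be spot-checked. A minimal sketch that reads the same /proc path the comments reference; the path only exists once br_netfilter is loaded, so its absence is itself diagnostic (this script is illustrative and not part of any package):

```shell
#!/bin/sh
# Report whether br_netfilter is loaded and, if so, the current value of
# the bridge-nf-call-iptables sysctl (1 = bridged traffic hits iptables).
set -eu
f=/proc/sys/net/bridge/bridge-nf-call-iptables
if [ -r "$f" ]; then
    msg="br_netfilter loaded, bridge-nf-call-iptables=$(cat "$f")"
else
    msg="br_netfilter not loaded ($f missing)"
fi
echo "$msg"
```

On a host showing the published-port hairpin problem this prints the "not loaded" branch (or a value of 0); after `modprobe br_netfilter` it should report 1, the module default noted above.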
I hit the issue on RHEL 8.1, so I am not sure it would be the same on RHEL 7.

We do face this also in RHEL 7.

Confirmed on Fedora 32 that in podman-1.9.1-1.fc32.x86_64.rpm the module is now added to `/etc/modules-load.d/` as per https://github.com/containers/libpod/issues/5316

```
$ cat etc/modules-load.d/podman.conf
br_netfilter
```

According to https://koji.fedoraproject.org/koji/taskinfo?taskID=42891689 the change is also in podman-1.8.2-3.

It is not likely that we will ship another version of Podman in RHEL 7, so this is fixed in the next release, which is RHEL 8. Should I move the bugzilla to reflect this?

Understood. I believe the workaround (to manually load `br_netfilter`) is acceptable for RHEL 7, so please move the Bugzilla to RHEL 8. Thanks.

Jindrich, setting this to you. I believe we're all set on the packaging for RHEL 8 for this. Please close this BZ if so.

FYI, modloading br_netfilter will break a *huge* number of existing installs where virtual machines are connected to bridges that are attached to an ethernet. This just showed up in Fedora (when Fedora 32 was released), and libvirt has started getting bug reports from people whose VM networking is broken by the update. (I ended up at this BZ while troubleshooting Bug 1832723, which reports that virtual machines connected to the physical network via a bridge no longer have IPv4 connectivity after an upgrade from Fedora 31 to Fedora 32.) These same reports are going to start coming in for RHEL 8 when this change goes into the downstream podman.

For many years the default behavior has been that traffic traversing bridges does *not* go through iptables, and admins expect that their VMs will have network connectivity without requiring any extra firewall rules. Forcing a change in default behavior when podman is installed is going to cause (already is causing) headaches for a lot of other people.
(BTW, podman is installed even in a basic Fedora Workstation install, so this is going to affect literally everyone with a VM connected via a host bridge.)

I agree with Laine's concerns here. If podman is installed on a host that is already using bridges for some purpose (such as VMs, but there can be others), then loading 'br_netfilter' is highly likely to cause a functional regression in the customer's networking setup. Traffic between the network layer bridge ports, instead of being handled purely at the network layer, now gets injected up into the IP layer, where firewalld filtering is likely to block traffic that previously would be allowed. The br_netfilter default settings have been a frequent source of networking breakage for customers in RHEL over the years, which is what motivated splitting it into a separate kmod, so customers don't have it loaded by default.

Apologies that we didn't consult with the Virt team here; we were unaware of the history behind this one. From what I'm hearing, it sounds like containers need `br_netfilter` loaded for this bugfix, and VMs need it unloaded (and have relied on that for some time) to ensure VM networking continues working. It sounds like we're deciding whether containers or VMs ship broken out of the box, which is not a good decision to have to make. I'll talk with the rest of my team about this one.

Daniel, do you have any suggestions on how we could fix this?

(In reply to Daniel Walsh from comment #38)
> Daniel, do you have any suggestions on how we could fix this?

I'm not sure I understand what problem podman is facing that requires the br_netfilter usage in the first place. I admit I've not looked at podman networking in any great detail, but I was under the impression podman used the same kind of conceptual approach to networking setup as libvirt did for VMs, which at a high level was:

- An isolated bridge device not connected to any physical NIC
- Containers' tap devices connected to the bridge
- Traffic between containers, and between a container and the host, is forwarded on their common (private) IP subnet
- The host firewall protects the host from traffic originating in the containers or the WAN
- The host firewall isn't involved in filtering traffic between containers
- Firewall rules NAT forwarded IP traffic from containers to the outside LAN/WAN

If I'm right that podman is following this kind of approach, then AFAIK there shouldn't be a need for br_netfilter. The only global OS setting needed would be ip_forward=1.

The fundamentals there are correct, but we also have a port-forwarding mechanism (`podman run --publish`) that uses iptables to forward traffic from a port on the host to a port in the container, to allow external reachability of container services. This in itself does not require `br_netfilter`, but people have begun using it to communicate between containers (presumably because of the difficulty of identifying the IP of the container hosting a specific service). Instead of communicating directly between containers, they forward a port into a container using `--publish` and then hit that port on the host's bridge IP (which they know because it's the default route for their interface) to get to the other container without ambiguity. We have some mechanisms for allowing containers to resolve the IPs of other containers, but their availability is inconsistent, which has hampered their adoption. I believe, but have not yet confirmed, that Docker also has this behaviour, which would also require us to attempt to support it for compatibility reasons.

Regardless, from our discussion, we don't want to break libvirt; so I think we need to figure out what the state of this is, and pull it out of the build if at all possible. The 8.2.1 target date gives me hope that it's early enough in the cycle to do this without undue inconvenience.
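The "hit the published port on the host's bridge IP" pattern described above depends on a container learning the gateway address from its default route. A minimal sketch of that extraction, run against sample route output rather than inside a live container (the addresses are the ones from this report's default podman network; the sample text is an assumption in the usual `ip route` shape):

```shell
#!/bin/sh
# Parse the gateway out of `ip route`-style output as seen inside a container.
# On the default podman bridge this is the host side of the bridge (10.88.0.1).
set -eu
routes='default via 10.88.0.1 dev eth0
10.88.0.0/16 dev eth0 proto kernel scope link src 10.88.0.55'

gw=$(printf '%s\n' "$routes" | awk '/^default/ { print $3 }')
echo "bridge gateway: $gw"
```

A container would then reach a sibling's published port via something like `wget http://$gw:5001`, which is exactly the hairpin path that needs br_netfilter (or the subnet-wide SETMARK rules from comment 1) to work.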
This is most likely related to https://bugzilla.redhat.com/show_bug.cgi?id=1703261. The same upstream fix should address this BZ and that one, but with different packaging needs.

Sorry for the confusion with my last comment; I did mean to tag https://bugzilla.redhat.com/show_bug.cgi?id=1832723.

I am experiencing something similar, though br_netfilter and the sysctl rules are set correctly. Am I doing something wrong? Should I open a separate bugzilla? I have been stuck for a couple of days, can anyone pretty please help! :) Running podman on C8.

```
[root@srv4 amb-docker]# cat /etc/redhat-release
CentOS Linux release 8.1.1911 (Core)
[root@srv4 amb-docker]# podman --version
podman version 1.6.4
[root@srv4 amb-docker]# sysctl net.ipv4.ip_forward
net.ipv4.ip_forward = 1
[root@srv4 amb-docker]# sysctl net.bridge.bridge-nf-call-iptables
net.bridge.bridge-nf-call-iptables = 1
[root@srv4 amb-docker]# lsmod | grep -i br_netfilter
br_netfilter           28672  0
bridge                196608  1 br_netfilter
ipv6                  524288  24507 bridge,ip6t_rpfilter,l2tp_core,br_netfilter,ip_gre,nf_reject_ipv6,udp_diag,nft_fib_ipv6,wireguard,ip6table_mangle,ip_vs
```

I am doing nc to an IP because not even DNS resolving works (I have Google NS servers in resolv.conf):

```
[root@srv4 amb-docker]# podman exec -ti amb-docker_ambweb_1 nc 172.217.23.238 80 -v
nc: 172.217.23.238 (172.217.23.238:80): Host is unreachable
Error: non zero exit code: 1: OCI runtime error
```

On the other hand, ICMP works:

```
[root@srv4 amb-docker]# podman exec -ti amb-docker_ambweb_1 ping 172.217.23.238 -c 2
PING 172.217.23.238 (172.217.23.238): 56 data bytes
64 bytes from 172.217.23.238: seq=0 ttl=54 time=0.534 ms
64 bytes from 172.217.23.238: seq=1 ttl=54 time=0.537 ms

--- 172.217.23.238 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.534/0.535/0.537 ms
```

The latter makes me think it's some sort of masquerade/iptables problem?
```
[root@srv4 amb-docker]# iptables -t nat -L
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
CNI-HOSTPORT-DNAT  all  --  anywhere             anywhere             ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
CNI-HOSTPORT-MASQ  all  --  anywhere             anywhere             /* CNI portfwd requiring masquerade */
CNI-f2833c0837fc08e40d556038  all  --  10.88.0.29           anywhere             /* name: "podman" id: "9d17440a8a5ad3ba4675885927f2fb1131d26184ba1b131d0c298293b6fae9ee" */

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
CNI-HOSTPORT-DNAT  all  --  anywhere             anywhere             ADDRTYPE match dst-type LOCAL

Chain CNI-HOSTPORT-SETMARK (8 references)
target     prot opt source               destination
MARK       all  --  anywhere             anywhere             /* CNI portfwd masquerade mark */ MARK or 0x2000

Chain CNI-HOSTPORT-MASQ (1 references)
target     prot opt source               destination
MASQUERADE  all  --  anywhere             anywhere             mark match 0x2000/0x2000

Chain CNI-HOSTPORT-DNAT (2 references)
target     prot opt source               destination
CNI-DN-f2833c0837fc08e40d556  tcp  --  anywhere             anywhere             /* dnat name: "podman" id: "9d17440a8a5ad3ba4675885927f2fb1131d26184ba1b131d0c298293b6fae9ee" */ multiport dports entextnetwk,mysql,8008,opsession-prxy

Chain CNI-f2833c0837fc08e40d556038 (1 references)
target     prot opt source               destination
ACCEPT     all  --  anywhere             10.88.0.0/16         /* name: "podman" id: "9d17440a8a5ad3ba4675885927f2fb1131d26184ba1b131d0c298293b6fae9ee" */
MASQUERADE  all  --  anywhere            !base-address.mcast.net/4  /* name: "podman" id: "9d17440a8a5ad3ba4675885927f2fb1131d26184ba1b131d0c298293b6fae9ee" */

Chain CNI-DN-f2833c0837fc08e40d556 (1 references)
target     prot opt source               destination
CNI-HOSTPORT-SETMARK  tcp  --  10.88.0.29           anywhere             tcp dpt:entextnetwk
CNI-HOSTPORT-SETMARK  tcp  --  srv4                 anywhere             tcp dpt:entextnetwk
DNAT       tcp  --  anywhere             anywhere             tcp dpt:entextnetwk to:10.88.0.29:12001
CNI-HOSTPORT-SETMARK  tcp  --  10.88.0.29           anywhere             tcp dpt:mysql
CNI-HOSTPORT-SETMARK  tcp  --  srv4                 anywhere             tcp dpt:mysql
DNAT       tcp  --  anywhere             anywhere             tcp dpt:mysql to:10.88.0.29:3306
CNI-HOSTPORT-SETMARK  tcp  --  10.88.0.29           anywhere             tcp dpt:8008
CNI-HOSTPORT-SETMARK  tcp  --  srv4                 anywhere             tcp dpt:8008
DNAT       tcp  --  anywhere             anywhere             tcp dpt:8008 to:10.88.0.29:8000
CNI-HOSTPORT-SETMARK  tcp  --  10.88.0.29           anywhere             tcp dpt:opsession-prxy
CNI-HOSTPORT-SETMARK  tcp  --  srv4                 anywhere             tcp dpt:opsession-prxy
DNAT       tcp  --  anywhere             anywhere             tcp dpt:opsession-prxy to:10.88.0.29:3307

[root@srv4 amb-docker]# iptables -nL
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

[root@srv4 amb-docker]# firewall-cmd --get-active-zones
home
  sources: 89.190.47.14
public
  interfaces: venet0
[root@srv4 amb-docker]# firewall-cmd --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: venet0
  sources:
  services:
  ports:
  protocols:
  masquerade: no
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:
[root@srv4 amb-docker]# firewall-cmd --list-all --zone=home
home (active)
  target: default
  icmp-block-inversion: no
  interfaces:
  sources: 89.190.47.14
  services:
  ports: 2299/tcp 8008/tcp
  protocols:
  masquerade: no
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:
```

Matt, can you take a quick look at https://bugzilla.redhat.com/show_bug.cgi?id=1703261#c53 from Veaceslav Mindru please?

Can you provide the contents of any files in `/etc/cni/net.d/`?
```
[root@srv4 ~]# cat /etc/cni/net.d/87-podman-bridge.conflist
{
  "cniVersion": "0.4.0",
  "name": "podman",
  "plugins": [
    {
      "type": "bridge",
      "bridge": "cni-podman0",
      "isGateway": true,
      "ipMasq": true,
      "ipam": {
        "type": "host-local",
        "routes": [{ "dst": "0.0.0.0/0" }],
        "ranges": [
          [
            {
              "subnet": "10.88.0.0/16",
              "gateway": "10.88.0.1"
            }
          ]
        ]
      }
    },
    {
      "type": "portmap",
      "capabilities": { "portMappings": true }
    },
    {
      "type": "tuning"
    }
  ]
}
[root@srv4 ~]# ls -l /etc/cni/net.d/
total 2
-rw-r--r-- 1 root root 574 Apr 10 13:50 87-podman-bridge.conflist
```

Tested with containernetworking-plugins-0.8.6-1.module+el8.2.1+6626+598993b4.x86_64 and podman-1.9.3-1.module+el8.2.1+6750+e53a300c.x86_64, using the steps in comment #1. It works as expected now; we can download from both ports, so setting this to verified.

```
# podman exec hello-world-a wget -O /dev/null http://10.88.0.1:5002
Connecting to 10.88.0.1:5002 (10.88.0.1:5002)
null                 100% |*******************************|  7218  0:00:00 ETA
# podman exec hello-world-a wget -O /dev/null http://10.88.0.1:5001
Connecting to 10.88.0.1:5001 (10.88.0.1:5001)
null                 100% |*******************************|  7218  0:00:00 ETA
```

All & OpenStack Team, it seems like there is a workaround, and we are happy to look at this in a future version of podman, but this is expected behavior that has been in place for a long time, so a backport to podman 1.6.4 is not really an option. I'm moving this to RHEL 8.3. We might be able to take a look at this for 8.3 or the 12-week release that comes out after 8.3.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:3053

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days.