Description of problem:
Running firewall-cmd --reload (even when there are no rule changes) causes iptables reload errors and removes the egress IP rules. To resolve it, we have to run oc patch hostsubnet to remove the egress IP and add it back for the individual namespaces.

Version-Release number of selected component (if applicable):
v3.7.46

How reproducible:
Always

Steps to Reproduce:
1. Follow the instructions below to enable static egress IP: https://docs.openshift.com/container-platform/3.7/admin_guide/managing_networking.html#enabling-static-ips-for-external-project-traffic
2. Run: firewall-cmd --reload

Actual results:
The following iptables errors are logged:
Oct 24 18:46:59 firewalld[1071]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w2 -t nat -n -L DOCKER' failed: iptables: No chain/target/match by that name.
...
...
Oct 24 18:46:59 firewalld[1071]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w2 -t nat -C POSTROUTING -s <redacted>/16 ! -o docker0 -j MASQUERADE' failed: iptables: No chain/target/match by that name.

Expected results:
Egress IP should work when firewalld is enabled.

Additional info:
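The workaround mentioned in the description (removing the egress IP with oc patch hostsubnet and adding it back) could look roughly like the sketch below. The function name reapply_egress_ip and the OC override are hypothetical and not part of the report. It assumes the node hosts only this one egress IP, because the patch replaces the whole egressIPs list, and it assumes that patching an empty list is enough to remove the IP.

```shell
# Hypothetical helper, not taken from the bug report.
# Usage: reapply_egress_ip NODE_NAME EGRESS_IP
# Set OC=echo to print the commands instead of running them.
reapply_egress_ip() {
    node="$1"
    ip="$2"
    # Assumption: this is the only egress IP on the node, since the
    # patch overwrites the complete egressIPs list of the HostSubnet.
    "${OC:-oc}" patch hostsubnet "$node" -p '{"egressIPs": []}' || return 1
    # Add the egress IP back so the node re-creates its rules.
    "${OC:-oc}" patch hostsubnet "$node" -p "{\"egressIPs\": [\"$ip\"]}"
}
```

A dry run such as OC=echo reapply_egress_ip node1 192.0.2.10 prints the two oc patch invocations without touching the cluster (node1 and 192.0.2.10 are placeholder values).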
@Meng Bo: can you try this again? It won't fail completely, but the egress traffic will end up using the node's normal IP rather than the egress IP:

1. Set up a cluster with firewalld running on the nodes
2. Set up an egress IP, test that it works
3. On the node with the egress IP, run "firewall-cmd --reload"
4. Try egress from a pod again, see that it uses the node IP rather than the egress IP
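Between steps 3 and 4 it may help to check on the node whether a source-NAT rule for the egress IP is still present. The sketch below is only an illustration: has_egress_snat is a hypothetical helper, and it assumes the egress IP is implemented as an SNAT rule with --to-source in the nat table, which this report does not spell out.

```shell
# Hypothetical helper, not taken from the bug report.
# Reads `iptables-save -t nat` output on stdin and succeeds if some rule
# rewrites the source address to the given egress IP.
# Example (as root on the node):
#   iptables-save -t nat | has_egress_snat 192.0.2.10 && echo "rule present"
has_egress_snat() {
    awk -v ip="$1" '
        {
            # Compare whole fields so that 192.0.2.1 does not match 192.0.2.10.
            for (i = 1; i < NF; i++)
                if ($i == "--to-source" && $(i + 1) == ip) found = 1
        }
        END { exit found ? 0 : 1 }
    '
}
```

If the check succeeds before the reload and fails after it, that matches the symptom described above: traffic falls back to the node's normal IP.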
Hmm... Yes, I can reproduce the problem now. After firewall-cmd --reload, the pod uses the node's IP as its source IP instead of the egress IP. The cause should be the condition that Weibin discovered. Thanks, Weibin!
(In reply to Weibin Liang from comment #5)
> But the egressIP rule can be restored in iptables by running systemctl
> restart openvswitch/docker/atomic-openshift-node.

Sure, but you're not supposed to have to do that.

Fixed by https://github.com/openshift/origin/pull/21441. I'll do backports after that merges.
So do we need this backported to 3.7 or is the customer happy with their current workaround? (Or planning to upgrade to something newer than 3.7 soon?)
Tested on OCP 3.11.50. The issue has been fixed.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0758