Bug 1541528
Summary: | Changing firewall rules with Director saves copy of all rules into /etc/sysconfig/iptables as part of a stack update with `openstack overcloud deploy` | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Andreas Karis <akaris> |
Component: | puppet-tripleo | Assignee: | Emilien Macchi <emacchi> |
Status: | CLOSED ERRATA | QA Contact: | Gurenko Alex <agurenko> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | ||
Version: | 10.0 (Newton) | CC: | akaris, amuller, aschultz, chrisw, dhill, djuran, dmacpher, dnavale, dvd, emacchi, glamb, ihrachys, jjoyce, jschluet, jslagle, mburns, mcornea, mircea.vutcovici, mschuppe, nyechiel, rchincho, rhel-osp-director-maint, sandyada, slinaber, srevivo, tvignaud |
Target Milestone: | z8 | Keywords: | Triaged, ZStream |
Target Release: | 10.0 (Newton) | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | puppet-tripleo-5.6.4-6.el7ost | Doc Type: | Bug Fix |
Doc Text: |
The iptables rules were managed by the Red Hat OpenStack Platform director and the OpenStack Networking service, which resulted in the rules created by the OpenStack Networking service to persist on to the disk. As a result, the rules that should not be loaded after an iptables restart or a system reboot would be loaded causing traffic issues.
With this update, the Red Hat OpenStack Platform director has been updated to exclude the OpenStack Networking rules from '/etc/sysconfig/iptables' when the director saves the firewall rules. As a result, iptables restart or a system reboot should work without causing traffic problems.
Note: It might be necessary to perform a rolling restart of the controller nodes to ensure that any orphaned managed neutron rules are no longer reloaded.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2018-05-17 15:40:56 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Andreas Karis
2018-02-02 19:30:52 UTC
Could you please confirm PurgeFirewallRules is not set to True in your environment? Looking at the code: https://github.com/openstack/tripleo-heat-templates/blob/master/puppet/services/tripleo-firewall.yaml#L38 https://github.com/openstack/puppet-tripleo/blob/master/manifests/firewall.pp#L57 It's disabled by default, so firewall rules shouldn't be purged by the puppetlabs-firewall module unless I missed something. Nevermind Andreas. I talked with Alex Schultz and here's our conclusion: - Neutron manages IPtables rules and make them consistent with iptables-save: https://github.com/openstack/neutron/blob/cbee0f9f88ff34f70ff19590471b5405e06ff2a9/neutron/agent/linux/iptables_manager.py#L548 - puppetlabs-firewall manages TripleO firewall rules but it could be any operator managing IPtables persistent rules while Neutron is running, the result would be the same, we would have conflict like it happens in this bug. One workaround could be to override https://github.com/openstack/neutron/blob/cbee0f9f88ff34f70ff19590471b5405e06ff2a9/etc/neutron/rootwrap.d/iptables-firewall.filters#L13 and change the content to be: iptables-save: CommandFilter, /bin/true, root ip6tables-save: CommandFilter, /bin/true, root (not tested though). Why does tripleo dump iptables-save result into the file? Is it to store rules it created during deployment to give access to API and other services for the next reboot? If so, I don't think iptables-save is the right approach. Instead of blindly dumping all rules there are at a particular moment, we should fill in the sysconfig file with exact rules needed for those services, and nothing else. To comment on #7, no it's completely wrong to override rootwrap filters like that. It will break neutron firewall because the way we use iptables-save is that we pipe its stdout as stdin into iptables-restore (after modifications). We never dump result to any persisted files. The wrongdoer here is, puppetlabs-firewall, or operator, that decides to save all rules to sysconfig while neutron agents are running. Those rules are not supposed to persist. If you need SOME rules in sysconfig, then explicitly write those rules there, and only those rules. *** Bug 1528605 has been marked as a duplicate of this bug. *** I totally agree with Ihar. ~~~ - Neutron manages IPtables rules and make them consistent with iptables-save: https://github.com/openstack/neutron/blob/cbee0f9f88ff34f70ff19590471b5405e06ff2a9/neutron/agent/linux/iptables_manager.py#L548 ~~~ This statement is 100% wrong. neutron makes no rule whatsoever consistent. All rules are written temporarily. On a node reboot, those rules are lost. The code that you are referring to above does this: ~~~ This will create a diff between the rules from the previous runs and replace them with the current set of rules. This happens atomically, thanks to iptables-restore. ~~~ I only half-heartedly read through the code, but to me it doesn't look as if it was saving anything to file. It's using iptables-save / iptables-restore to create and apply a dynamical list of changes: ~~~ if changes: # if there are changes to the table, we put on the header # and footer that iptables-save needs commands += (['# Generated by iptables_manager'] + ['*%s' % table_name] + changes + ['COMMIT', '# Completed by iptables_manager']) if not commands: continue all_commands += commands # always end with a new line commands.append('') args = ['%s-restore' % (cmd,), '-n'] if self.namespace: args = ['ip', 'netns', 'exec', self.namespace] + args err = self._run_restore(args, commands) if err: self._log_restore_err(err, commands) raise err ~~~ What Director (and if the above assumptions are correct, puppetlabs-firewall) does, is go in and blindly assume that it's completely save to take all non-persistent firewall rules from within iptables and to write them into /etc/sysconfig/iptables. And that's absolutely not o.k., because it breaks the way neutron works, which is what I outlined above when I opened this ticket. - Andreas OMG, I'm tired .. consistent, not persistent. So yes, it makes them consistent with iptables-save / -restore, but it does not make them persistent. We're on the same page then. Sorry for misreading your statement! Your conclusion however is wrong. The wrongdoer is indeed the firewall module. neutron is doing everything right. So a change here: https://github.com/openstack/neutron/blob/cbee0f9f88ff34f70ff19590471b5405e06ff2a9/etc/neutron/rootwrap.d/iptables-firewall.filters#L13 ...is the wrong approach (In reply to Ihar Hrachyshka from comment #8) > Why does tripleo dump iptables-save result into the file? Is it to store > rules it created during deployment to give access to API and other services > for the next reboot? If so, I don't think iptables-save is the right > approach. This is what TripleO does: https://github.com/puppetlabs/puppetlabs-firewall/blob/d714f122d19155afa00b2b990a9267b422e5946d/lib/puppet/util/firewall.rb#L168 /sbin/service iptables save > Instead of blindly dumping all rules there are at a particular > moment, we should fill in the sysconfig file with exact rules needed for > those services, and nothing else. How do you do it? We can't prevent operators to add their firewall rules and make them consistent on day 2 operations. > To comment on #7, no it's completely wrong to override rootwrap filters like > that. (...) ack, it makes sense. > The wrongdoer here is, puppetlabs-firewall, or operator, that decides to save > all rules to sysconfig while neutron agents are running. Those rules are not > supposed to persist. If you need SOME rules in sysconfig, then explicitly > write those rules there, and only those rules. Like I said, the puppet module is really not filtering anything now. We would have to think about something like this, but for sure we don't have it now. Also mentioned on IRC by Alex, we can't set PurgeFirewallRules to True because it would kill Neutron rules and we have no mechanism to trigger a neutron restart when firewall rules would be purged, because we run containers and not systemd services on the host. I disagree the fix will belong to neutron. Neutron has no idea that you dumped no longer valid rules and later will try to restore them when resources they reference to, like ipsets, are long gone because they are not needed from neutron data plane perspective. I would argue "service iptables save" is just not compatible with how neutron uses iptables, and there is nothing we can do about it except documenting it, and making tripleo not do what it does right now - issuing the command in the middle of operation. As for customers doing it with PostConfig, they should get guidance on how to do it properly. I imagine RHEL has better mechanism to persistently configure firewall than just dumping current rules into a file. Whatever this mechanism is, it should be used instead. To give an example, suggesting neutron must be compatible with 'service iptables save' would be similar to suggesting neutron must continue working correctly while operator adds and deletes random rules by calling to iptables directly. Neither the former nor the latter makes much sense to me, and we should just discourage issuing those tools when neutron agents are deployed. Is there any technical problem with making tripleo persist just the needed iptables rules? https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/10/html/manual_installation_procedures/sect-configuring_the_firewall_for_telemtry_service https://access.redhat.com/articles/9014 https://access.redhat.com/solutions/367653 These 3 links recommend to use `service iptables save`. We like it or not, this is how our customers are recommended to do now. For your question about TripleO & "needed" persistent rules: yes for now puppetlabs-firewall isn't able to do that. I think it would be a major change, and I'm even not sure it would be accepted by Puppetlabs. If that happens, we would have to carry some custom functions in puppet-tripleo. Note when we switch to Ansible or whatever tool in the future, we would have to rewrite it in $newframework... I'll investigate further but I might have found a way to ignore some rules: https://github.com/puppetlabs/puppetlabs-firewall/blob/master/README.markdown#parameters-1 I'll keep digging... Another option is to overload the utility: https://github.com/puppetlabs/puppetlabs-firewall/blob/d714f122d19155afa00b2b990a9267b422e5946d/lib/puppet/util/firewall.rb#L168 And grep -v "neutron-", so neutron-managed rules would be ignored by Puppet. > Instead of blindly dumping all rules there are at a particular
> moment, we should fill in the sysconfig file with exact rules needed for
> those services, and nothing else.
How do you do it? We can't prevent operators to add their firewall rules and make them consistent on day 2 operations.
+++
Of course we can. We put it into documentation and tell them not to save the rules or edit /etc/sysconfig/iptables manually instead. Indeed, we do not support *any* manual change to a file on an overcloud node, so day 2 firewall changes via iptables, via iptables-save > /etc/sysconfig/iptables or via manual modification of /etc/susconfig/iptables are unsupported and users live with the fact that these will be lost on the next `openstack overcloud deploy`. What's important, though, is that Director doesn't mess up things, and right now it's doing exactly what we have been telling users not to do since OSP 7 at least (probably even prior): saving dynamic neutron firewall rules into /etc/sysconfig/iptables
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/10/html/manual_installation_procedures/sect-configuring_the_firewall_for_telemtry_service https://access.redhat.com/articles/9014 https://access.redhat.com/solutions/367653 These 3 links recommend to use `service iptables save`. We like it or not, this is how our customers are recommended to do now. +++ Link 1) is for the manual installation procedure, a procedure that was never supported in any version from 7 to 12 and since 12 the guide is even completely retired. Link 2) is for RHEL, not OSP Link 3) is for RHEL (see the environment section) +++ The solutions for OSP are here https://access.redhat.com/solutions/2678381 and here https://access.redhat.com/solutions/3172881 +++ Additionally we all know that our own documentation may contain bugs from time to time or contradicting instructions. The KCS solutions are all written by our support representatives in the field, and my not always be 100% accurate for as much as we try. Still looking at how to fix it in puppet. The "ignore" parameter in puppetlabs-firewall is implemented for each chain (e.g. "FORWARD:filter:IPv4", which means chains have to be predictable, which isn't the case in Neutron (e.g. "neutron-openvswi-ibb9657ae-c"). See https://github.com/puppetlabs/puppetlabs-firewall/blob/4d602914b27bffbd050b44f9957ba754761f21ff/lib/puppet/type/firewallchain.rb#L231 Which is where the rule isn't manage if matching the regex in ignore. The problem is that chain has to be managed by Puppet. So we need to figure out how we can tell the puppet module to unmanage for all chains, and not only the ones we declare in the catalog. I'm unsure if that will work but I'm testing something with: https://review.openstack.org/#/c/541852/ https://review.openstack.org/#/c/541849/ I'll keep this bug updated with my findings. I'm having issues to reproduce your dependency error on Queens, see the logs on the compute: https://logs.rdoproject.org/49/541849/20/openstack-check/gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master/Z858ae5bb9aea41208c31d072659ef51b/overcloud-novacompute-0/var/log/journal.txt.gz#_Feb_15_07_33_51 I'll try to reproduce on Newton... ok Alex and I realized https://review.openstack.org/#/c/545110/ is also needed (was not backported to Newton). We'll see if that helps. More update to come. Verified on puddle 2018-05-09.2 [stack@undercloud-0 ~]$ rpm -q puppet-tripleo puppet-tripleo-5.6.8-3.el7ost.noarch hostlist=`nova list | awk '/ACTIVE/ {print $(NF-1)}' | awk -F '=' '{print $NF}' | tr '\n' ' '` for i in $hostlist ; do ssh heat-admin@$i "hostname; sudo grep -i \"neutron-\" /etc/sysconfig/iptables /etc/sysconfig/ip6tables"; done ceph-0 ceph-1 ceph-2 compute-0 compute-1 controller-0 controller-1 controller-2 Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:1593 |