Bug 2149039 - RFE: firewalld calling set_policy("DROP") on reload produces issues on hosts with busy monitoring services
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: firewalld
Version: 9.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: low
Target Milestone: rc
Target Release: ---
Assignee: Thomas Haller
QA Contact: qe-baseos-daemons
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-11-28 16:24 UTC by Juanma Sanchez
Modified: 2023-08-14 16:37 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Story
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | firewalld firewalld pull 1194 | 0 | None | Draft | [th/reload-policy] add ReloadPolicy to not drop packets during reload | 2023-08-14 16:37:14 UTC
Red Hat Issue Tracker | RHELPLAN-140693 | 0 | None | None | None | 2022-11-28 16:27:16 UTC

Description Juanma Sanchez 2022-11-28 16:24:52 UTC
Description of problem:

  When firewall-cmd --reload is called, one of the first actions performed is
  to add a new table "firewalld_policy_drop" and set everything to DROP except
  for related,established traffic.
  This ruleset is kept until reload is completed.

  On hosts that open a large number of connections per second, some
  connection attempts inevitably fail with EPERM during that window, namely
  when sendto() is called and the "connection" is UDP or ICMP.

  An example of such a host is any monitoring system that pings hundreds or
  thousands of hosts and retrieves SNMP data (UDP) from them at intervals of
  1-2 minutes per host.

  A real-world example: OpenNMS marks a service/host as down if sendto()
  fails while it is trying to retrieve SNMP data or ping a host, which is
  what happens when a netfilter rule drops that packet.
  In practical terms, a busy OpenNMS host that uses firewalld and calls
  `firewall-cmd --reload` will temporarily mark a large number of hosts as
  down, because java.io.IOException("EPERM") is raised and OpenNMS seems to
  conclude that the host/service is down since no response is received.

  I understand that OpenNMS could be blamed here: receiving EPERM from a
  call to `sendto()` doesn't mean that a host/service is down, and OpenNMS
  should retry a number of times until the request times out or a response
  is received.
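
  For illustration only, here is a minimal Python sketch of the retry
  behaviour described above. It is not OpenNMS code; the function name
  `sendto_with_retry` and its `attempts` and `delay` parameters are
  hypothetical. It assumes that an EPERM from sendto() signals a local
  netfilter drop and is worth retrying after a short pause.

```python
import errno
import time


def sendto_with_retry(sock, data, addr, attempts=5, delay=0.2):
    """Send one datagram, retrying while sendto() fails with EPERM.

    Returns the number of bytes sent. If every attempt is rejected, the
    last PermissionError is re-raised. Other errors propagate at once.
    """
    for attempt in range(1, attempts + 1):
        try:
            return sock.sendto(data, addr)
        except PermissionError as exc:
            # PermissionError also covers EACCES, so check errno explicitly.
            # EPERM here points at a local packet filter, not at the peer.
            if exc.errno != errno.EPERM or attempt == attempts:
                raise
            time.sleep(delay)
```

  A caller would pass an ordinary UDP socket, for example
  `sendto_with_retry(sock, payload, (host, 161))`, and only treat the
  target as unreachable once the retries are exhausted.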


  It would be useful in these situations if the user could have some more       
  control over this "firewalld_policy_drop" temporary state.

  For example, a configuration option like this one:

      AllowOutputOnReload=yes/no


  would let the user decide whether outbound traffic is considered safe and
  should be permitted while the ruleset is being reloaded.


  I understand that this is probably more of an RFE than a bug report, but I
  wanted to check first that I'm not missing something, in case this behavior
  can already be controlled somehow.


Version-Release number of selected component (if applicable):                   

  All firewalld releases.                                                       


How reproducible:                                                               

  Always                                                                        


Steps to Reproduce:                                                             

  1. Start listening for UDP on one host:
     # socat udp-recv:1234 stdout  >/dev/null

  2. Send UDP packets as quickly as possible to that host:

     # for i in {1..10000}; do nc -u DEST_IP 1234 <<<$i >/dev/null; done

  3. Reload the firewalld ruleset with `firewall-cmd --reload` and watch for
     EPERM messages in the shell running the for loop:

      Ncat: Operation not permitted.
      Ncat: Operation not permitted.
      Ncat: Operation not permitted.
      Ncat: Operation not permitted.
      Ncat: Operation not permitted.
      Ncat: Operation not permitted.
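
  As a rough alternative to the nc loop in step 2, the sketch below counts
  how many sends are rejected instead of printing one line per failure. It
  is an illustration, not part of the original report: the function name
  `count_eperm` and the `make_socket` parameter are hypothetical, and it
  opens a fresh socket per datagram on the assumption that this mirrors what
  each separate nc invocation does.

```python
import errno
import socket


def count_eperm(addr, count, make_socket=None):
    """Send `count` small UDP datagrams to `addr` and tally the outcomes.

    Returns (sent, rejected), where `rejected` counts sends that failed
    with EPERM. Any other error is raised to the caller.
    """
    if make_socket is None:
        def make_socket():
            return socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    sent = rejected = 0
    for i in range(1, count + 1):
        # One socket per datagram, like one nc process per iteration.
        with make_socket() as sock:
            try:
                sock.sendto(f"{i}\n".encode(), addr)
                sent += 1
            except PermissionError as exc:
                if exc.errno != errno.EPERM:
                    raise
                rejected += 1
    return sent, rejected
```

  Calling `count_eperm(("DEST_IP", 1234), 10000)` while the reload runs
  should report a non-zero rejected count if the problem reproduces. This
  loop is likely to finish much sooner than the nc loop, so a larger count
  may be needed to overlap with the reload.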



Actual results:                                                                 

  Some outbound UDP and ICMP traffic is rejected with EPERM.


Expected results:

  Outbound traffic shouldn't be forbidden while firewalld is reloading the
  ruleset

Additional info:

  No additional info

