Bug 2149039

Summary: RFE: firewalld calling set_policy("DROP") on reload produces issues on hosts with busy monitoring services
Product: Red Hat Enterprise Linux 9 Reporter: Juanma Sanchez <juasanch>
Component: firewalld Assignee: Thomas Haller <thaller>
Status: ASSIGNED --- QA Contact: qe-baseos-daemons
Severity: low Docs Contact:
Priority: unspecified    
Version: 9.1 CC: arawal, cutaylor, egarver, mcolombo, myllynen, sferguso, todoleza
Target Milestone: rc Keywords: FutureFeature
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Story
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Juanma Sanchez 2022-11-28 16:24:52 UTC
Description of problem:

  When firewall-cmd --reload is called, one of the first actions performed is
  to add a new table "firewalld_policy_drop" and set everything to DROP except
  for related,established traffic.
  This ruleset is kept until reload is completed.
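
  A minimal sketch for observing that temporary state, assuming the nftables
  backend and a root shell. The helper names `has_policy_drop` and
  `watch_policy_drop` are made up for illustration; only the table name comes
  from this report. Polling may miss the table if the window is very short.

```shell
#!/usr/bin/env bash

# Succeeds if the "nft list tables" output on stdin names the temporary table.
has_policy_drop() {
    grep -q 'firewalld_policy_drop'
}

# Poll the ruleset until the process with the given PID exits and print a
# timestamp every time the temporary table is visible.
watch_policy_drop() {
    local pid=$1
    while kill -0 "$pid" 2>/dev/null; do
        if nft list tables | has_policy_drop; then
            printf '%s firewalld_policy_drop present\n' "$(date +%T.%N)"
        fi
        sleep 0.05
    done
}
```

  Possible usage: `firewall-cmd --reload & watch_policy_drop $!`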

  On hosts creating a high number of connections per second, this inevitably
  results in a number of connection attempts returning EPERM when the
  "connection" is UDP or ICMP and sendto() is called.

  An example of a host behaving like this is any monitoring software performing
  pings and retrieving SNMP data (UDP) from hundreds or thousands of hosts at
  1-2 minute intervals per host.

  An example of a real-world problem: OpenNMS will mark a service/host as down
  if sendto() fails when trying to retrieve SNMP data or when trying to ping
  a host while a netfilter rule is in place that would drop that packet.
  In practical terms, busy OpenNMS hosts using firewalld and calling
  `firewall-cmd --reload` will temporarily mark a high number of hosts as down
  because java.io.IOException("EPERM") is raised and OpenNMS seems to
  conclude that the host/service is down because no response is received.

  I understand that we could blame OpenNMS here, because receiving EPERM as the
  result of a call to `sendto()` doesn't mean that a host/service is down, and
  it should keep retrying until the request times out or a response is
  received.
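
  The retry idea can be sketched in shell as follows. This illustrates the
  general technique only, not how OpenNMS works internally; the `retry` helper
  is hypothetical, and it assumes the wrapped command exits non-zero when the
  send fails.

```shell
#!/usr/bin/env bash

# retry ATTEMPTS DELAY CMD...
# Run CMD until it succeeds or it has been tried ATTEMPTS times,
# sleeping DELAY seconds between tries. Returns 1 if every try failed.
retry() {
    local attempts=$1 delay=$2 n=1
    shift 2
    until "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}
```

  With a wrapper like this, a probe that hits a transient EPERM during the
  reload window would be retried a few times before the target is declared
  down.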


  It would be useful in these situations if the user could have some more       
  control over this "firewalld_policy_drop" temporary state.

  For example, a configuration option like this one:

      AllowOutputOnReload=yes/no

  could allow the user to configure whether outbound traffic is considered safe
  and should be permitted while the ruleset is being reloaded.


  I understand that this is probably more of an RFE than a bug report, but I
  wanted to check first to make sure I'm not missing something and that this
  behavior cannot already be controlled somehow.


Version-Release number of selected component (if applicable):                   

  All firewalld releases.                                                       


How reproducible:                                                               

  Always                                                                        


Steps to Reproduce:                                                             

  1. Start listening for UDP on one host:
     # socat udp-recv:1234 stdout  >/dev/null

  2. Send UDP packets as quickly as possible to that host:

     # for i in {1..10000}; do nc -u DEST_IP 1234 <<<$i >/dev/null; done

  3. Reload the firewalld ruleset via `firewall-cmd --reload`. See the EPERM
     messages on the shell running the for loop:

      Ncat: Operation not permitted.
      Ncat: Operation not permitted.
      Ncat: Operation not permitted.
      Ncat: Operation not permitted.
      Ncat: Operation not permitted.
      Ncat: Operation not permitted.
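
  To put a number on the failures instead of reading them off the terminal,
  the loop from step 2 can be wrapped as sketched below. The function names
  are made up for illustration, and the sketch assumes Ncat reports the error
  on stderr with the wording shown above.

```shell
#!/usr/bin/env bash

# Send COUNT single-line UDP datagrams to DEST:PORT, forwarding only nc's
# error messages (stderr) to this function's stdout.
send_burst() {
    local dest=$1 port=$2 count=$3 i
    for ((i = 1; i <= count; i++)); do
        nc -u "$dest" "$port" <<<"$i" 2>&1 >/dev/null
    done
}

# Count the lines on stdin that report EPERM; prints 0 when there are none.
count_eperm() {
    grep -c 'Operation not permitted' || true
}
```

  Possible usage while reloading from another shell:
  `send_burst DEST_IP 1234 10000 | count_eperm`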



Actual results:                                                                 

  Some UDP and ICMP outbound traffic is forbidden with EPERM


Expected results:

  Outbound traffic shouldn't be forbidden while firewalld is reloading the
  ruleset

Additional info:

  No additional info