Bug 2238325 - MaxRequestsPerChild from tuning triggers sporadic silent response for clients using HTTP/2
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Installation
Version: 6.13.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: 6.15.0
Assignee: Ewoud Kohl van Wijngaarden
QA Contact: Griffin Sullivan
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-09-11 13:24 UTC by Pavel Moravec
Modified: 2024-08-22 04:25 UTC (History)

Fixed In Version: foreman-installer-3.9.0-0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-04-23 17:14:24 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Foreman Issue Tracker | 36784 | 0 | Normal | Closed | Drop Apache mpm_event MaxRequestPerChild values from tuning profiles | 2023-10-03 10:33:02 UTC
Red Hat Issue Tracker | SAT-19974 | 0 | None | None | None | 2023-09-11 13:27:19 UTC
Red Hat Knowledge Base (Solution) | 7032939 | 0 | None | None | None | 2023-09-11 20:18:02 UTC
Red Hat Product Errata | RHSA-2024:2010 | 0 | None | None | None | 2024-04-23 17:14:26 UTC

Description Pavel Moravec 2023-09-11 13:24:27 UTC
Description of problem:
Satellite uses an httpd version affected by the bug addressed in https://github.com/apache/httpd/pull/281: clients using HTTP/2 connections can get no response from httpd when MaxRequestsPerChild is set and a child process has just reached that threshold.

That is dangerous for two reasons:
1) Investigating the cause is very tricky: clients *randomly* get no response at all, and the httpd logs record nothing relevant. Enabling httpd debug logging is basically the only way to confirm it.
2) We recommend using MaxRequestsPerChild both in the performance guide (https://access.redhat.com/documentation/en-us/red_hat_satellite/6.13/html/tuning_performance_of_red_hat_satellite/configuring_project_for_performance_performance-tuning#tuning_apache_httpd_child_processes_performance-tuning) and in the tuning profiles:

# grep maxrequestsperchild /usr/share/foreman-installer/config/foreman.hiera/tuning/sizes/*yaml
/usr/share/foreman-installer/config/foreman.hiera/tuning/sizes/extra-extra-large.yaml:apache::mod::event::maxrequestsperchild: 4000
/usr/share/foreman-installer/config/foreman.hiera/tuning/sizes/extra-large.yaml:apache::mod::event::maxrequestsperchild: 4000
/usr/share/foreman-installer/config/foreman.hiera/tuning/sizes/large.yaml:apache::mod::event::maxrequestsperchild: 4000
/usr/share/foreman-installer/config/foreman.hiera/tuning/sizes/medium.yaml:apache::mod::event::maxrequestsperchild: 4000
#

So the bug can be hit by any customer using HTTP/2 clients, especially automation, which would then fail seemingly at random.



Version-Release number of selected component (if applicable):
Sat6.13
- httpd-2.4.37-56.module+el8.8.0+18758+b3a9c8da.6.x86_64


How reproducible:
100%


Steps to Reproduce:
1. Either apply a tuning profile or follow the tuning guide directly, so that MaxRequestsPerChild is set in /etc/httpd/conf.modules.d/event.conf. For the sake of testing, manually decrease the value from 4000 to e.g. 10 or 100, and restart the httpd service.
2. Run random API requests (or even login page requests) using HTTP/2 protocol, like:

while true; do
  cnt=0
  while true; do
    cnt=$((cnt+1))
    if [ $((cnt%1000)) -eq 0 ]; then
      echo "running $cnt-th iteration"
    fi
    if [[ $(curl -o /dev/null -s -k --http2 https://localhost/ -w '%{size_download}') == 0 ]]; then
      echo "no response received in $cnt-th iteration"
      break
    fi
  done
  sleep 1
done

(you can use any URI there, e.g. https://localhost:443/api/v2/status or https://localhost:443/katello/api/v2/organizations/1/ )

The --http2 option is crucial.
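
A bounded variant of the loop above can make runs easier to compare, since it reports a count instead of stopping at the first failure. This is only a sketch: fetch_size, count_empty_responses and the FETCH override are names made up here, and the curl options are the same ones used in the loop above.

```shell
# Fetch a URL over HTTP/2 and print the number of body bytes received.
# -k skips certificate verification, as in the original reproducer.
fetch_size() {
  curl -o /dev/null -s -k --http2 -w '%{size_download}' "$1"
}

# Issue $2 requests against URL $1 and report how many returned no data.
# FETCH (hypothetical) may name another command, e.g. a stub for a dry run.
# The body runs in a subshell so its variables do not leak into the caller.
count_empty_responses() (
  url=$1
  total=$2
  empty=0
  i=0
  while [ "$i" -lt "$total" ]; do
    i=$((i + 1))
    size=$(${FETCH:-fetch_size} "$url")
    if [ "${size:-0}" -eq 0 ]; then
      empty=$((empty + 1))
    fi
  done
  echo "$empty of $total requests returned no data"
)
```

For example, `count_empty_responses https://localhost/ 1000` on an affected system with the limit lowered to 100 should report a non-zero count.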


Actual results:
2. On average, one request in every MaxRequestsPerChild requests receives no response. For example, with the value set to 100:

no response received in 127-th iteration
no response received in 26-th iteration
no response received in 153-th iteration
no response received in 82-th iteration
no response received in 67-th iteration
no response received in 166-th iteration
no response received in 86-th iteration
no response received in 119-th iteration
no response received in 24-th iteration
no response received in 191-th iteration
no response received in 9-th iteration
no response received in 177-th iteration
no response received in 47-th iteration
no response received in 190-th iteration
no response received in 9-th iteration
no response received in 144-th iteration
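
The "once per MaxRequestsPerChild requests on average" observation can be checked against the sample output above by averaging the iteration numbers. A minimal sketch, assuming the lines have exactly the format printed by the reproducer; mean_failed_iteration is a name chosen here for illustration.

```shell
# Read reproducer output on stdin and print the mean iteration number
# of the "no response received in N-th iteration" lines.
mean_failed_iteration() {
  awk '
    /^no response received in / {
      sub(/-th$/, "", $5)   # "127-th" becomes "127"
      sum += $5
      n++
    }
    END { if (n > 0) printf "%.2f\n", sum / n }
  '
}
```

Fed the 16 lines above, it prints 101.06 (1617 / 16), which is consistent with roughly one unanswered request per 100.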


Expected results:
The script doesn't print a "no response received" error.


Additional info:

Comment 1 Ewoud Kohl van Wijngaarden 2023-09-27 11:09:16 UTC
Thanks for uncovering this. I get the impression that just raising the MaxRequestsPerChild value doesn't solve it, it only makes it less frequent. Should we push for RHEL to include the Apache bugfix? In the meantime, we can raise the default value in our installer so it's 4000 everywhere.

Comment 2 Pavel Moravec 2023-09-27 11:14:18 UTC
(In reply to Ewoud Kohl van Wijngaarden from comment #1)
> Thanks for uncovering this. I get the impression just raising the
> MaxRequestsPerChild value isn't solving it. Just making it less frequent.

Indeed, that is my understanding as well.

> Should we push for RHEL to include the Apache bugfix? In the mean time, we
> can raise the default value in our installer so it's 4000 everywhere.

I think pushing for a RHEL fix is the right long-term approach, since the bug is in the httpd component. Let me know if I should help with raising that BZ (i.e. preparing a standalone reproducer outside Satellite).

I don't know whether a better short- to mid-term solution exists.

Comment 3 Ewoud Kohl van Wijngaarden 2023-09-27 11:50:02 UTC
Looking deeper at the docs, Apache has https://httpd.apache.org/docs/2.2/mod/mpm_common.html#MaxRequestsPerChild, but that's Apache 2.2, where it defaults to 10000. In 2.4 it was renamed to MaxConnectionsPerChild: https://httpd.apache.org/docs/2.4/mod/mpm_common.html#maxconnectionsperchild (which still accepts the old MaxRequestsPerChild name) and defaults to 0.

So our default tuning doesn't limit it at all and doesn't recycle workers, but large installations do. So why do we even set this today? Back in the day with mod_wsgi and mod_passenger it was possible to have memory leaks in application code so it made sense, but now we're a pure reverse proxy with minimal code. I'd trust Apache to not leak memory and propose we drop the setting, relying on the default.
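
To see which limit a given config file actually sets, a small helper can look for either directive name. This is a rough sketch, not an httpd tool: effective_child_limit is a hypothetical name, it reads one file only (no Include handling), and it assumes the last occurrence in the file is the one that counts.

```shell
# Print the per-child recycle limit set in an httpd config file ($1).
# Matches MaxConnectionsPerChild and the older MaxRequestsPerChild name,
# case-insensitively; commented-out lines are ignored because their first
# field starts with "#". Prints 0 (the 2.4 default) if neither is present.
effective_child_limit() {
  awk '
    tolower($1) == "maxconnectionsperchild" ||
    tolower($1) == "maxrequestsperchild" { value = $2 }
    END { print (value == "" ? 0 : value) }
  ' "$1"
}
```

On a Satellite this would be run as `effective_child_limit /etc/httpd/conf.modules.d/event.conf`; an output of 0 means workers are never recycled.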

Small detail: since https://github.com/puppetlabs/puppetlabs-apache/commit/cedd45b63be89ea54bd2a596e6cd3a3f60d4faf8 the parameter doesn't exist anymore and setting apache::mod::event::maxrequestsperchild in Hiera does nothing. Foreman 3.8 includes puppetlabs-apache >= 9.0, so the setting already has no effect there; I hadn't noticed this before.

Can you perform testing with the value set to 0 and see if it still happens?

Comment 4 Ewoud Kohl van Wijngaarden 2023-09-27 11:55:30 UTC
I opened https://projects.theforeman.org/issues/36784 to drop it from the newer versions (since it's pointless). If testing shows it's better to drop the tuning, we should cherry pick it further back.

Comment 5 Pavel Moravec 2023-09-27 17:36:08 UTC
> Can you perform testing with the value set to 0 and see if it still happens?

I ran two sets of tests:

1) MaxRequestsPerChild set to zero
2) MaxRequestsPerChild setting removed entirely (which should imply zero, but let's double-check)

In both cases, I ran more than 160k iterations (individual curl requests) without an issue. So either setting prevents the bug.

Comment 7 Ewoud Kohl van Wijngaarden 2023-10-03 10:33:03 UTC
Thanks for testing. The default value is 0, so both of those test cases should be the same but still good to have confirmation.

The event MPM has been the default since Foreman 3.3 (https://projects.theforeman.org/issues/20889), so this bug affects users who chose a tuning profile on Satellite 6.12+.

Moving to POST since https://github.com/theforeman/foreman-installer/commit/4462c6d4fc34cfdfe73d31c63cbf39eb979f73e6 was merged.

Comment 11 Griffin Sullivan 2023-10-18 17:08:13 UTC
FailedQA on 6.14 snap 19

The changes are not present in the snap.


# grep maxrequestsperchild /usr/share/foreman-installer/config/foreman.hiera/tuning/sizes/*yaml
/usr/share/foreman-installer/config/foreman.hiera/tuning/sizes/extra-extra-large.yaml:apache::mod::event::maxrequestsperchild: 4000
/usr/share/foreman-installer/config/foreman.hiera/tuning/sizes/extra-large.yaml:apache::mod::event::maxrequestsperchild: 4000
/usr/share/foreman-installer/config/foreman.hiera/tuning/sizes/large.yaml:apache::mod::event::maxrequestsperchild: 4000
/usr/share/foreman-installer/config/foreman.hiera/tuning/sizes/medium.yaml:apache::mod::event::maxrequestsperchild: 4000

Comment 12 Bryan Kearney 2023-10-18 20:02:49 UTC
Upstream bug assigned to ekohlvan

Comment 17 Brad Buckingham 2023-10-30 11:29:29 UTC
Bulk setting Target Milestone = 6.15.0 where sat-6.15.0+ is set.

Comment 19 Griffin Sullivan 2023-12-13 15:54:17 UTC
Verified in 6.15.0 snap 2.1

maxrequestsperchild is not set in the tuning profiles.
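
The same verification can be scripted so that it fails loudly if a profile still carries the key. A minimal sketch, assuming the tuning profiles live in the directory shown in the grep output earlier in this bug; profiles_setting_limit is a name invented for this example.

```shell
# List tuning profile files that still set the Hiera key.
# $1 is the profile directory; it defaults to the path seen in this bug.
# Prints nothing and returns non-zero when no profile sets the key.
profiles_setting_limit() {
  dir=${1:-/usr/share/foreman-installer/config/foreman.hiera/tuning/sizes}
  grep -l '^apache::mod::event::maxrequestsperchild:' "$dir"/*.yaml 2>/dev/null
}
```

An empty result from `profiles_setting_limit` corresponds to the verified state; on the FailedQA snap it would have listed the four profile files.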

Comment 22 errata-xmlrpc 2024-04-23 17:14:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Satellite 6.15.0 release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:2010

Comment 23 Red Hat Bugzilla 2024-08-22 04:25:16 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

