Bug 2022240

Summary: Intermittent HTTP 502 Bad Gateway caused by "connection reset by peer" when the Satellite server is under high HTTP load
Product: Red Hat Satellite
Reporter: Hao Chang Yu <hyu>
Component: Installer
Assignee: satellite6-bugs <satellite6-bugs>
Status: NEW
QA Contact: Gaurav Talreja <gtalreja>
Severity: high
Priority: high
Version: 6.9.6
CC: ahumbe, alex.bron, alsouza, dleroux, egolov, ehelms, ekohlvan, gtalreja, jhutar, jjansky, jpasqual, ktordeur, mharbi, mjia, momran, ngalvin, pdwyer, pmendezh, pmoravec, psatpute, rrajput, sadas, saydas, smeyer, tasander, vchepkov
Target Milestone: Unspecified
Keywords: Patch, Performance, PrioBumpGSS, Triaged
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Embargoed:
Bug Depends On:    
Bug Blocks: 2029606    

Description Hao Chang Yu 2021-11-11 07:30:57 UTC
Description of problem:

After fixing bug 1976694, I found that clients still get "connection reset by peer" from the Puma server when we put an even higher HTTP load on the Satellite.

From what I understand, Puma uses a default backlog of 1024, but the default value of the net.core.somaxconn kernel parameter is only 128.

According to "man listen"
---------------------------
If the backlog argument is greater than the value in /proc/sys/net/core/somaxconn, then it is silently truncated to that value; the default value in this file is 128.  In kernels before 2.4.25, this limit was a hard coded value, SOMAXCONN, with the value 128.
---------------------------

This means the 1024 backlog is capped at 128. We can confirm this by running the following commands:

# sysctl net.core.somaxconn
net.core.somaxconn = 128  <========

# ss -l | grep foreman
Netid State      Recv-Q Send-Q                                                        Local Address:Port
u_str  LISTEN     0      128    /run/foreman.sock XXXXXXX               * 0   <============ Send-Q (max listen backlog) is 128
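
A quick way to see the truncation in isolation is the following standalone Ruby sketch (the socket path below is made up and has nothing to do with Foreman):

require 'socket'

path = '/tmp/backlog-demo.sock'
File.unlink(path) if File.exist?(path)

server = Socket.new(:UNIX, :STREAM)
server.bind(Addrinfo.unix(path))
server.listen(1024)                   # ask the kernel for a 1024-entry backlog
puts `ss -xl | grep backlog-demo`     # Send-Q shows the value actually granted (capped at somaxconn)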

Steps to Reproduce:

1. Run the following in irb on any host to generate 600 simultaneous requests to the Satellite server:

irb
require 'rest_client'
600.times { Thread.new { begin; RestClient::Resource.new("https://example.satellite.com/api/v2/hosts?installed_package_name=kernel&page=1&per_page=20", user: "your admin username", password: "your password", timeout: 3600, open_timeout: 3600, verify_ssl: OpenSSL::SSL::VERIFY_NONE).get; rescue StandardError => e; p e.message; end } }


2. On the Satellite server, run the following command to monitor the backlog queue. "Recv-Q" (the current listen backlog) will increase up to 129.

# watch 'ss -l | grep foreman'
Netid State      Recv-Q Send-Q                                                        Local Address:Port
u_str  LISTEN     129      128   /run/foreman.sock XXXXXXX               * 0 


3. Open another window and run the following command to monitor queue overflows and packet drops. The overflow and dropped-SYN counters will increase once "Recv-Q" reaches 129.

watch 'netstat -s | grep -i LISTEN'
    16712 times the listen queue of a socket overflowed
    17602 SYNs to LISTEN sockets dropped


Actual results:
Clients receive many "Connection reset by peer - SSL_connect" errors

Expected results:
No "Connection reset by peer - SSL_connect" error


Additional info:
I was able to avoid this issue completely after increasing the net.core.somaxconn kernel parameter.

Temporarily increase somaxconn:
-------------------------------------
sysctl -w net.core.somaxconn=1024
systemctl daemon-reload
systemctl restart foreman.socket foreman
-------------------------------------

# ss -l | grep foreman
u_str  LISTEN     0      1024   /run/foreman.sock XXXXXXX               * 0  <================== Send-Q (max listen backlog) is now 1024
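
To make the setting survive a reboot, the same value could go into a sysctl drop-in (the file name below is only an example):
-------------------------------------
# /etc/sysctl.d/90-somaxconn.conf
net.core.somaxconn = 1024
-------------------------------------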

Comment 2 Hao Chang Yu 2021-11-17 04:55:12 UTC
I think I found the actual root cause for the intermittent HTTP 502 status caused by "connection reset by peer":
------------------------------------------
# /var/log/httpd/foreman-ssl_error_ssl.log
[proxy_http:error] [pid XXXX] (104)Connection reset by peer: [client XX.XX.XX.XX:XXXX] AH01102: error reading status line from remote server httpd-UDS:0
[proxy:error] [pid XXXX] [client XX.XX.XX.XX:XXXX] AH00898: Error reading from remote server returned by /rhsm/consumers/<uuid>
-------------------------------------------

This seems like an Apache bug[1]; see also https://serverfault.com/questions/206738/intermittent-error-when-using-mod-proxy-to-do-reverse-proxy-to-soap-service. There is a race condition around connection close and reuse between the proxy (Apache) and the backend server (Puma, in this case). Also see "proxy-initial-not-pooled" in the Apache documentation[2].

[1] https://bz.apache.org/bugzilla/show_bug.cgi?id=37770
[2] http://httpd.apache.org/docs/2.4/mod/mod_proxy_http.html
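
To illustrate the race in isolation, here is a minimal standalone Ruby sketch (plain sockets and made-up timings, not Satellite's actual stack): the backend drops a keep-alive connection that has been idle for 1 second, while the "proxy" reuses that pooled connection only after 1.5 seconds and runs into the reset.

require 'socket'
require 'io/wait'

# Backend: answers requests on a keep-alive connection, then closes it
# once it has been idle for 1 second.
backend = TCPServer.new('127.0.0.1', 0)
port = backend.addr[1]

Thread.new do
  loop do
    conn = backend.accept
    begin
      while conn.wait_readable(1)   # idle timeout: 1 second
        conn.readpartial(4096)
        conn.write("HTTP/1.1 200 OK\r\nContent-Length: 2\r\nConnection: keep-alive\r\n\r\nok")
      end
    rescue EOFError
    ensure
      conn.close                    # backend gives up on the idle connection
    end
  end
end

# "Proxy": uses the connection once, keeps it in its pool, then reuses it
# after the backend's idle timeout has already expired.
client = TCPSocket.new('127.0.0.1', port)
client.write("GET /1 HTTP/1.1\r\nHost: x\r\n\r\n")
puts client.readpartial(4096)       # first request works

sleep 1.5                           # connection sits idle in the pool

begin
  client.write("GET /2 HTTP/1.1\r\nHost: x\r\n\r\n")
  puts client.readpartial(4096)
rescue Errno::ECONNRESET, Errno::EPIPE, EOFError => e
  puts "pooled connection failed: #{e.class}"   # Apache reports this as AH01102 and returns a 502
end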

After several tries, I was finally able to reproduce it reliably.

Adding either "SetEnv proxy-nokeepalive 1" or "SetEnv proxy-initial-not-pooled 1" to the Foreman httpd config solves the issue in my reproducer. In my opinion, "proxy-initial-not-pooled" is better because it does not disable keep-alive completely.

Based on my tests, I don't see any noticeable performance degradation after applying the above httpd config. However, I will leave it to the performance team to verify further.

It is still a good idea to increase "net.core.somaxconn" as described in comment 0 to match the Puma backlog, to avoid listen queue overflows and dropped SYN packets.


Script to reproduce HTTP 502 status:

require 'benchmark'
require 'rest_client'
puts Benchmark.measure { 10.times { 5.times { 250.times { Thread.new { begin; RestClient::Resource.new("https://example.satellite.com/rhsm/consumers/<consumer uuid>", user: "your admin username", password: "your password", timeout: 3600, open_timeout: 3600, verify_ssl: OpenSSL::SSL::VERIFY_NONE).get; rescue StandardError => e; p e.message; end }}; sleep 5 }; sleep 30} }

Comment 5 Pavel Moravec 2022-01-26 13:31:01 UTC
Hao++ for the BZ, the reproducer, and the ways to fix it.

Comment 21 Eric Helms 2023-03-01 02:57:23 UTC
While we look at how we present this option to the user based on our performance data (present by default or configurable to turn on), at least Satellite 6.10+ should be able to configure this for the time being persistently using custom-hiera.

Add the following to /etc/foreman-installer/custom-hiera.yaml:

foreman::config::apache::https_vhost_options: {'setenv': 'proxy-initial-not-pooled 1'}
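
The change should take effect the next time satellite-installer (or foreman-installer) runs, since the installer regenerates the Apache vhost configuration from hiera.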

Comment 23 Eric Helms 2023-03-01 19:56:15 UTC
(In reply to Eric Helms from comment #21)
> While we look at how we present this option to the user based on our
> performance data (present by default or configurable to turn on), at least
> Satellite 6.10+ should be able to configure this for the time being
> persistently using custom-hiera.
> 
> Add the following to /etc/foreman-installer/custom-hiera.yaml:
> 
> foreman::config::apache::https_vhost_options: {'setenv':
> 'proxy-initial-not-pooled 1'}

On the Satellite itself, this configuration should apply to Pulp endpoints as well, given that Foreman and Pulp share a vhost.


On a Capsule, this option should be applicable via custom hiera:

pulpcore::apache::https_vhost_options: {'setenv': 'proxy-initial-not-pooled 1'}

Comment 24 Ewoud Kohl van Wijngaarden 2023-03-08 19:30:32 UTC
I'm reading through this thread, so here are some observations.

(In reply to Hao Chang Yu from comment #2)
> Adding either "SetEnv proxy-nokeepalive 1" or "SetEnv
> proxy-initial-not-pooled 1" to the foreman httpd config solves the issue in
> my reproducer. In my opinion, "proxy-initial-not-pooled" is better because
> it is not disabling the keepalive completely.

The proxy-initial-not-pooled option has the downside of downgrading to HTTP/1.0. Note that HTTP/1.1 specifically made changes around Keep-Alive: it changed from off by default to on by default. I wonder if proxy-initial-not-pooled effectively implies proxy-nokeepalive.

https://stackoverflow.com/questions/169453/bad-gateway-502-error-with-apache-mod-proxy-and-tomcat suggests that just setting ProxyTimeout is insufficient. It should be mentioned that they talk about Apache 2.2, but the description of the scenario is very similar. Then https://iximiuz.com/en/posts/reverse-proxy-http-keep-alive-and-502s/ explains it very clearly, and even though it's a completely different stack, I suspect it's exactly what we're seeing.

It has clear mitigations:

> * Make sure the upstream (application) idle connection timeout is longer than the downstream (proxy) connection timeout

I'm not sure how feasible this is. We currently have KeepAliveTimeout 15 on Satellite, but set the proxy timeout very high on the Capsule because some requests can take very long. Setting a KeepAliveTimeout as high as we do on the Capsule feels like a disaster waiting to happen.

> * Disable connection reuse between the upstream and the proxy (turn off HTTP Keep-Alive)

I think you already observed that disabling Keep-Alive helped. And that also increases my belief that proxy-initial-not-pooled implies disabling Keep-Alive. I'd prefer proxy-nokeepalive over proxy-initial-not-pooled.

> * Employ HTTP 5xx retries on the client-side (well, they are a must-have anyway).

This would mean changing subscription-manager and I don't think that's feasible in the short term.

I'd like to investigate the HTTP/2 direction though. We recently started to support HTTP/2, so I'd like to look into whether we can utilize HTTP/2 by using mod_proxy_http2. It is marked as experimental, but should be available. Since HTTP/2 supports multiple streams over the same TCP socket, it could perhaps sidestep the whole problem. Quoting https://www.rfc-editor.org/rfc/rfc7540.html#section-8.1.2.2:

>    This means that an intermediary transforming an HTTP/1.x message to
>   HTTP/2 will need to remove any header fields nominated by the
>   Connection header field, along with the Connection header field
>   itself.  Such intermediaries SHOULD also remove other connection-
>   specific header fields, such as Keep-Alive, Proxy-Connection,
>   Transfer-Encoding, and Upgrade, even if they are not nominated by the
>   Connection header field.

Which makes me think Keep-Alive simply doesn't exist at all in HTTP/2 as it did with HTTP/1.1: there are better mechanisms for it.

https://httpd.apache.org/docs/current/mod/mod_proxy_http2.html
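
As a rough sketch of what that direction would look like in Apache terms (a generic example, not the actual Foreman vhost, which currently proxies to a UNIX socket): the backend protocol is selected via the ProxyPass scheme.

LoadModule proxy_http2_module modules/mod_proxy_http2.so
ProxyPass "/" "h2c://backend.example.com/"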

> It is still good to increase the "net.core.somaxconn" in comment 0 to match
> the Puma backlog to avoid overflow or syn packet drop.

We set the Backlog= parameter in foreman.socket. From man 5 systemd.socket:

> Backlog=
>    Takes an unsigned integer argument. Specifies the number of connections to queue that have not been accepted yet. This setting matters only for stream and sequential
>    packet sockets. See listen(2) for details. Defaults to SOMAXCONN (128).

However, I can confirm that on EL8 (systemd 239) the socket has a backlog of 128, while testing the same thing on Fedora 36 (systemd 250) does set the queue to match Backlog=. That is certainly surprising and makes me suspect a systemd bug, but I haven't looked into it.

Comment 25 Ewoud Kohl van Wijngaarden 2023-03-08 19:35:41 UTC
Apologies for the quick follow-up to my own comment.

(In reply to Ewoud Kohl van Wijngaarden from comment #24)
> (In reply to Hao Chang Yu from comment #2)
> > It is still good to increase the "net.core.somaxconn" in comment 0 to match
> > the Puma backlog to avoid overflow or syn packet drop.
> 
> We set Backlog= parameter in foreman.socket. From man 5 systemd.socket:
> 
> > Backlog=
> >    Takes an unsigned integer argument. Specifies the number of connections to queue that have not been accepted yet. This setting matters only for stream and sequential
> >    packet sockets. See listen(2) for details. Defaults to SOMAXCONN (128).
> 
> However, I can confirm I that on EL8 (systemd 239) I see it has a backlog of
> 128. Testing the same thing on Fedora 36 (systemd 250) does set the queue to
> match. Certainly something surprising and makes me suspect a systemd bug,
> but I haven't looked into this.

Disregard that: https://github.com/systemd/systemd/issues/4629#issuecomment-260311719 states:

> It is clamped to that value by the kernel

On my Fedora system net.core.somaxconn is 4096, so that explains why I was able to raise the queue using Backlog=.
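
For reference, that means both knobs have to agree: a Backlog= override in a foreman.socket drop-in (hypothetical example below) only takes effect up to whatever net.core.somaxconn allows.

# /etc/systemd/system/foreman.socket.d/backlog.conf
[Socket]
Backlog=1024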

Comment 26 Ewoud Kohl van Wijngaarden 2023-04-24 17:41:49 UTC
I didn't share this before, but I did start work on installer support so we can try it out. https://github.com/puppetlabs/puppetlabs-apache/pull/2393 aims to add support for mod_proxy_http2 in the Puppet module we rely on. Then https://github.com/theforeman/puppet-foreman_proxy_content/pull/442 builds on that. Right now it's untested.