Bug 1976694
Summary: | Http 502 Bad Gateway and 503 service not available due to Puma not accepting connections fast enough | ||
---|---|---|---|
Product: | Red Hat Satellite | Reporter: | Hao Chang Yu <hyu> |
Component: | Installation | Assignee: | satellite6-bugs <satellite6-bugs> |
Status: | CLOSED ERRATA | QA Contact: | Devendra Singh <desingh> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 6.9.0 | CC: | ahumbe, amarirom, ehelms, jjeffers, ktordeur, pcreech, pmendezh, sadas, saydas, smeyer, vcojot, yferszt |
Target Milestone: | 6.9.6 | Keywords: | Performance, PrioBumpGSS, Triaged |
Target Release: | Unused | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2021-09-21 14:37:26 UTC | Type: | Bug |
Description
Hao Chang Yu
2021-06-28 04:27:56 UTC
Placing this on the Installer; please move it if another component is more appropriate.

I tested this in Satellite 6.10 snap 6 and was not able to reproduce the issue. Puma in Satellite 6.10 appears to listen on a Unix socket instead of a TCP socket, which fixed the "connection timed out" and "connection reset by peer" errors. If I switch Puma back to a TCP socket, the errors come back. The issue is also fixed when I switch Puma in Satellite 6.9 to use a Unix socket. My conclusion is therefore that the issue remains in Puma 5.3 but can be fixed by using a Unix socket.

Verified on 6.9.6 Snap2.

Verification Points:
1- Created the base version of the Satellite setup using the 6.9.4 GA template.
2- Checked the content host count; it was around 40.
3- Ran the upgrade from 6.9.4 to 6.9.6 Snap2.
4- Upgrade completed successfully.
5- Checked for "connection reset by peer" in /var/log/httpd/foreman-ssl_error_ssl.log and did not see any errors there.
6- Triggered 300 and 400 requests from other hosts against this Satellite machine.
    irb(main):007:0> 300.times { Thread.new { begin; RestClient::Resource.new("https://xyz.com/api/v2/hosts?installed_package_name=kernel", user: "admin", password: "xyz", timeout: 3600, open_timeout: 3600, verify_ssl: OpenSSL::SSL::VERIFY_NONE).get; rescue StandardError => e; p e.message; end } }
    => 300
    irb(main):008:0> 400.times { Thread.new { begin; RestClient::Resource.new("https://xyz.com/api/v2/hosts?installed_package_name=kernel", user: "admin", password: "xyz", timeout: 3600, open_timeout: 3600, verify_ssl: OpenSSL::SSL::VERIFY_NONE).get; rescue StandardError => e; p e.message; end } }
    => 400

7- Checked the access log and confirmed that all the requests reached the server.

    # tail -f foreman-ssl_access_ssl.log
    XXX.XXX.XXX.XXX - - [13/Sep/2021:11:06:46 -0400] "GET /api/v2/hosts?installed_package_name=kernel HTTP/1.1" 200 4065 "-" "Ruby"
    XXX.XXX.XXX.XXX - - [13/Sep/2021:11:06:47 -0400] "GET /api/v2/hosts?installed_package_name=kernel HTTP/1.1" 200 4066 "-" "Ruby"
    XXX.XXX.XXX.XXX - - [13/Sep/2021:11:06:48 -0400] "GET /api/v2/hosts?installed_package_name=kernel HTTP/1.1" 200 4069 "-" "Ruby"
    XXX.XXX.XXX.XXX - - [13/Sep/2021:11:06:47 -0400] "GET /api/v2/hosts?installed_package_name=kernel HTTP/1.1" 200 4061 "-" "Ruby"
    XXX.XXX.XXX.XXX - - [13/Sep/2021:11:06:47 -0400] "GET /api/v2/hosts?installed_package_name=kernel HTTP/1.1" 200 4065 "-" "Ruby"
    XXX.XXX.XXX.XXX - - [13/Sep/2021:11:06:49 -0400] "GET /api/v2/hosts?installed_package_name=kernel HTTP/1.1" 200 4065 "-" "Ruby"

8- Checked /var/log/httpd/foreman-ssl_error_ssl.log and did not see any "connection reset by peer" errors.

    # less /var/log/httpd/foreman-ssl_error_ssl*|grep -i "connection reset by peer"
    #

For now, I am marking this bug as verified based on the verification points above. If something is missing, please let me know and I will check that part too.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Satellite 6.9.6 Async Bug Fix Update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3628

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 365 days.
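The irb one-liners in verification step 6 start the threads but never wait for them, and they only print error messages as they arrive. As a rough sketch of the same burst-of-requests check, the helper below waits for every thread and returns a tally of outcomes. The name `run_concurrently` is hypothetical and not part of the original report, and the stubbed block at the end stands in for the real HTTP call so the example runs without a Satellite server.

```ruby
# Hypothetical helper (not from the bug report): run `count` copies of the
# given block in parallel threads and tally how each one finished.
def run_concurrently(count, &block)
  threads = Array.new(count) do |i|
    Thread.new do
      begin
        block.call(i)
        :ok
      rescue StandardError => e
        # Record the error message, as the original snippet prints it.
        e.message
      end
    end
  end

  # Thread#value joins the thread and returns the result of its block.
  threads.map(&:value).each_with_object(Hash.new(0)) do |outcome, counts|
    counts[outcome] += 1
  end
end

# Stubbed demonstration: even-numbered "requests" fail with a fake error.
# With the rest-client gem installed, the block could instead wrap the
# request from step 6, for example:
#   RestClient::Resource.new(url, user: "admin", password: "xyz",
#                            verify_ssl: OpenSSL::SSL::VERIFY_NONE).get
# where `url` is the placeholder host used in the report.
results = run_concurrently(10) do |i|
  raise "connection reset by peer" if i.even?
end
p results
```

A tally with only `:ok` entries would correspond to the clean result reported in step 8. Any other keys are the error messages raised by the requests.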