Bug 2035873 - Concurrent CV Publish fails with 502 error
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Pulp
Version: 6.11.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: 6.11.0
Assignee: satellite6-bugs
QA Contact: Lai
URL:
Whiteboard:
Depends On:
Blocks: 2000769
Reported: 2021-12-28 09:36 UTC by Jitendra Yejare
Modified: 2023-07-20 13:08 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-04-07 15:39:02 UTC
Target Upstream Version:
Embargoed:




Links
System: Github pulp pulpcore issues
ID: 2250
Private: 0
Priority: None
Status: closed
Summary: Gunicorn consuming excessive amounts of memory
Last Updated: 2023-07-20 13:08:39 UTC

Description Jitendra Yejare 2021-12-28 09:36:13 UTC
Description of problem:
Concurrent (parallel) CV publishing failed with the following error:
```
<title>502 Proxy Error</title>
</head><body>
<h1>Proxy Error</h1>
<p>The proxy server received an invalid
response from an upstream server.<br />
The proxy server could not handle the request <em><a href="/pulp/api/v3/content/rpm/packages/">GET&nbsp;/pulp/api/v3/content/rpm/packages/</a></em>.<p>
Reason: <strong>Error reading from remote server</strong></p></p>
</body></html>
Error message: the server returns an error
HTTP status code: 502
```

Version-Release number of selected component (if applicable):
Satellite 7.0 snap 3

How reproducible:


Steps to Reproduce:
1. Sync 5 big RHEL repos from CDN using subscription.
2. Create 6 CVs (the Satellite has 6 workers because the Satellite server has 6 cores), each containing all 5 repos from step 1.
3. Start publishing 6 CVs in parallel.
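
Step 3 can be scripted. The following is a minimal sketch, not the tooling used for this report: `publish_cv` is a hypothetical callable standing in for whatever triggers a single publish (a CLI call or an API request), and the CV IDs are illustrative. It only shows how to fan out six publishes at once and collect the per-CV failures.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def publish_all(publish_cv, cv_ids, max_workers=6):
    """Run publish_cv(cv_id) for every ID in parallel.

    publish_cv is a hypothetical callable supplied by the caller; it is
    expected to raise an exception when a publish fails (e.g. HTTP 502).
    Returns (succeeded, failed): a sorted list of IDs and a dict that maps
    each failed ID to its error text.
    """
    succeeded, failed = [], {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit everything up front so the publishes really overlap.
        futures = {pool.submit(publish_cv, cv_id): cv_id for cv_id in cv_ids}
        for future in as_completed(futures):
            cv_id = futures[future]
            try:
                future.result()
                succeeded.append(cv_id)
            except Exception as exc:
                failed[cv_id] = str(exc)
    return sorted(succeeded), failed


if __name__ == "__main__":
    def stub_publish(cv_id):
        # Stand-in for a real publish; replace with an actual call.
        return cv_id

    ok, bad = publish_all(stub_publish, [1, 2, 3, 4, 5, 6])
    print("published:", ok, "failed:", bad)
```

Threads are enough here because each publish mostly waits on the server; the worker count of 6 mirrors the 6 CVs in the steps above.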

Actual results:
All 6 CV publishes failed with the error shown in the description above.

Expected results:
All 6 CVs should publish successfully using 6 Pulp workers, without error.

Additional info:

The error output from one CV:
```
Error message: the server returns an error
HTTP status code: 502
Response headers: {"Date"=>"Mon, 27 Dec 2021 15:26:35 GMT", "Server"=>"Apache", "Content-Length"=>"445", "Content-Type"=>"text/html; charset=iso-8859-1"}
Response body: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>502 Proxy Error</title>
</head><body>
<h1>Proxy Error</h1>
<p>The proxy server received an invalid
response from an upstream server.<br />
The proxy server could not handle the request <em><a href="/pulp/api/v3/content/rpm/packages/">GET&nbsp;/pulp/api/v3/content/rpm/packages/</a></em>.<p>
Reason: <strong>Error reading from remote server</strong></p></p>
</body></html>
Error message: the server returns an error
HTTP status code: 502
Response headers: {"Date"=>"Mon, 27 Dec 2021 15:25:43 GMT", "Server"=>"Apache", "Content-Length"=>"445", "Content-Type"=>"text/html; charset=iso-8859-1"}
Response body: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>502 Proxy Error</title>
</head><body>
<h1>Proxy Error</h1>
<p>The proxy server received an invalid
response from an upstream server.<br />
The proxy server could not handle the request <em><a href="/pulp/api/v3/content/rpm/packages/">GET&nbsp;/pulp/api/v3/content/rpm/packages/</a></em>.<p>
Reason: <strong>Error reading from remote server</strong></p></p>
</body></html>
Error message: the server returns an error
HTTP status code: 502
Response headers: {"Date"=>"Mon, 27 Dec 2021 15:25:42 GMT", "Server"=>"Apache", "Content-Length"=>"445", "Content-Type"=>"text/html; charset=iso-8859-1"}
Response body: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>502 Proxy Error</title>
</head><body>
<h1>Proxy Error</h1>
<p>The proxy server received an invalid
response from an upstream server.<br />
The proxy server could not handle the request <em><a href="/pulp/api/v3/content/rpm/packages/">GET&nbsp;/pulp/api/v3/content/rpm/packages/</a></em>.<p>
Reason: <strong>Error reading from remote server</strong></p></p>
</body></html>
```

Comment 2 Jitendra Yejare 2022-01-17 11:42:19 UTC
This bug was raised against a Satellite installed on RHEL 7.

Comment 3 Grant Gainey 2022-02-21 14:52:04 UTC
OOMKiller dropped by for a visit and killed gunicorn:

distribution_trees/?repository_version=%2Fpulp%2Fapi%2Fv3%2Frepositories%2Frpm%2Frpm%2F80c56992-3363-4f42-9e6e-ef877138754e%2Fversions%2F1%2F HTTP/1.1" 200 52 "-" "OpenAPI-Generator/3.16.1/ruby"
Dec 27 10:36:51 dhcp-3-18 kernel: gunicorn invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Dec 27 10:36:51 dhcp-3-18 kernel: gunicorn cpuset=/ mems_allowed=0
Dec 27 10:36:51 dhcp-3-18 kernel: CPU: 2 PID: 52410 Comm: gunicorn Kdump: loaded Not tainted 3.10.0-1160.49.1.el7.x86_64 #1
Dec 27 10:36:51 dhcp-3-18 kernel: Hardware name: Red Hat RHEL/RHEL-AV, BIOS 1.14.0-1.module+el8.3.0+7638+07cf13d2 04/01/2014
Dec 27 10:36:51 dhcp-3-18 kernel: Call Trace:
...
Dec 27 10:36:51 dhcp-3-18 kernel: Killed process 52410 (gunicorn), UID 993, total-vm:3940968kB, anon-rss:3555200kB, file-rss:0kB, shmem-rss:0kB

If you're going to do concurrent work, with large repos, on a memory-constrained system, WITH NO SWAP:

Dec 27 10:36:51 dhcp-3-18 kernel: Total swap = 0kB

you're going to have A Bad Time.

(note: "no swap" is NOT a supported Satellite configuration, and should never be used to open BZs...)
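
A quick way to confirm whether a test system has swap before running the reproducer is to read `SwapTotal` from `/proc/meminfo`. This is a minimal sketch assuming a Linux host; the function names are made up for illustration and are not part of Satellite or Pulp.

```python
def swap_total_kb(meminfo_text):
    """Return the SwapTotal value in kB from /proc/meminfo-style text.

    Returns None when no SwapTotal line is present.
    """
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            # Line format: "SwapTotal:       2097148 kB"
            return int(line.split()[1])
    return None


def has_swap(meminfo_path="/proc/meminfo"):
    """True if the host reports a non-zero amount of swap."""
    with open(meminfo_path) as fh:
        total = swap_total_kb(fh.read())
    return bool(total)


if __name__ == "__main__":
    print("swap configured:", has_swap())
```

On the system in this report the check would come back false, matching the `Total swap = 0kB` line in the kernel log.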

Comment 4 Grant Gainey 2022-02-22 17:07:21 UTC
Closing as dup of "sync takes more memory than it used to" BZ

*** This bug has been marked as a duplicate of bug 1994397 ***

Comment 5 Daniel Alley 2022-02-22 17:15:26 UTC
Reopening because there could be a separate issue here from #1994397

The OOM killer targeted gunicorn, which appears to have been using ~3.5-4 GB of memory. That seems excessive.

    Dec 27 10:36:51 dhcp-3-18 kernel: Killed process 52410 (gunicorn), UID 993, total-vm:3940968kB, anon-rss:3555200kB, file-rss:0kB, shmem-rss:0kB
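
To make the memory figure concrete, the kill line can be parsed and the kB values converted. This is a small sketch written for this log format only; the regular expression and function name are illustrative assumptions, not an existing tool.

```python
import re

# Matches the "Killed process" line as it appears in the kernel log above.
KILL_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\).*?"
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB"
)


def parse_oom_kill(line):
    """Extract pid, process name, and memory sizes (in GiB) from a kill line.

    Returns None if the line is not an OOM "Killed process" record.
    """
    match = KILL_RE.search(line)
    if match is None:
        return None
    kib_per_gib = 1024 ** 2
    return {
        "pid": int(match["pid"]),
        "name": match["name"],
        "total_vm_gib": int(match["total_vm"]) / kib_per_gib,
        "anon_rss_gib": int(match["anon_rss"]) / kib_per_gib,
    }


if __name__ == "__main__":
    line = (
        "Dec 27 10:36:51 dhcp-3-18 kernel: Killed process 52410 (gunicorn), "
        "UID 993, total-vm:3940968kB, anon-rss:3555200kB, file-rss:0kB, "
        "shmem-rss:0kB"
    )
    info = parse_oom_kill(line)
    print(f"{info['name']}: rss {info['anon_rss_gib']:.2f} GiB, "
          f"vm {info['total_vm_gib']:.2f} GiB")
```

For the line above this works out to roughly 3.39 GiB resident and 3.76 GiB virtual for a single gunicorn process.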

Comment 7 Jitendra Yejare 2022-03-30 07:10:59 UTC
@bbuckingham This bug is still in the NEW state and is blocking verification of an important ON_QA bug. Can we prioritize it, please?

Comment 8 Stephen Wadeley 2022-03-30 08:28:59 UTC
Hi @jyejare 

Have you retested this on a system with a swap file?


Thank you

Comment 9 Jitendra Yejare 2022-04-07 11:26:23 UTC
@swadeley, when I raised this bug I used a SatLab system, and by default SatLab systems have a swap file.

Still, I will retest and confirm.

Comment 10 Jitendra Yejare 2022-04-07 15:39:02 UTC
Retested on a Satellite system with a swap file.

Status: Closed WorksForMe

Steps to Reproduce:
1. Synced 5 big RHEL repos from CDN using subscription.
2. Created 6 CVs (the Satellite has 6 workers because the Satellite server has 6 cores), each containing all 5 repos from step 1.
3. Published 6 CVs in parallel.

Actual results:
All 6 CVs were published successfully using 6 pulp workers without error!

Comment 11 pulp-infra@redhat.com 2022-11-16 14:08:15 UTC
The Pulp upstream bug status is at closed. Updating the external tracker on this bug.

Comment 12 Robin Chan 2023-03-02 15:06:45 UTC
The Pulp upstream bug status is at open. Updating the external tracker on this bug.

Comment 13 Robin Chan 2023-07-20 13:08:40 UTC
The Pulp upstream bug status is at closed. Updating the external tracker on this bug.

