Bug 2223165

Summary: Refresh Alternate Content Source fails with HTTP status code 502 error
Product: Red Hat Satellite
Reporter: mithun kalyat <mkalyat>
Component: Alternate Content Sources
Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED DUPLICATE
QA Contact: Satellite QE Team <sat-qe-bz-list>
Severity: high
Docs Contact:
Priority: high
Version: 6.14.0
CC: ahumbe, dalley, iballou, rlavi
Target Milestone: Unspecified
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-07-26 16:11:37 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description mithun kalyat 2023-07-16 04:32:50 UTC
Description of problem:

Configuring alternate content sources on the Capsule server to sync directly from the CDN fails with the error below:

1 subtask(s) failed for task group /pulp/api/v3/task-groups/eb69bc9d-88b4-4ad6-8091-77e02c06389a/.
Errors:
 {"reason"=>"Killed by signal 9."}

Eventually:

Error message: the server returns an error
HTTP status code: 502
Response headers: {"Date"=>"Sat, 15 Jul 2023 19:06:50 GMT", "Server"=>"Apache", "Content-Length"=>"341", "Content-Type"=>"text/html; charset=iso-8859-1"}
Response body: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">

Capsule is then unavailable:

Failure: ERF50-5345 [Foreman::WrappedException]: Unable to connect ([ProxyAPI::ProxyException]: ERF12-7885 [ProxyAPI::ProxyException]: Unable to fetch logs ([RestClient::Exceptions::OpenTimeout]: Timed out connecting to server) for Capsule https://capsule.example.com:9090/logs)

[root@mkalyat-sat ~]# hammer alternate-content-source info --id 2
ID:                            2
Name:                          test
Label:                         test
Description:                   
Content type:                  yum
Alternate content source type: simplified
Products:                      
 1) Id:              10
    Organization ID: 1
    Name:            Red Hat Enterprise Linux for x86_64
    Label:           Red_Hat_Enterprise_Linux_for_x86_64
Smart proxies:                 
 1) Id:              2
    Name:            capsule.example.com
    URL:             https://capsule.example.com:9090
    Download policy: on_demand


Version-Release number of selected component (if applicable):

Satellite 6.14

How reproducible:


Steps to Reproduce:

   Content --> Alternate Content Sources --> Add Source --> Simplified --> Content type 'yum' --> Select the external Capsule 

Then:

  Content --> Alternate Content Sources --> Test_source --> Refresh

Actual results:

The refresh fails and the Capsule goes down.

Expected results:

Refresh should be successful.

Comment 4 Brad Buckingham 2023-07-17 12:09:38 UTC
Is this a regression in behavior from Satellite 6.12?  Thanks!

Comment 5 Ian Ballou 2023-07-20 18:28:16 UTC
Mithun,

An alternate content source refresh in Pulp is more or less the same thing as 'n' on_demand sync tasks happening at the same time on the capsule itself. If a user has every product in an ACS on a capsule, that capsule is thus essentially going to be "syncing" every product at refresh time. The only difference here is that the refresh never downloads any content to disk. Downloads only happen during the actual capsule sync.

Long story short, refresh is going to be a pretty resource-heavy task especially with many products at once.

If you count up all of the repositories that are being refreshed on the capsule and consider the resources taken up if you were to sync that many on a Satellite at once, does it more or less match up? If so then I think we're all good here.

Comment 6 Ian Ballou 2023-07-20 18:35:32 UTC
To add to my Comment 5, ACS refresh on a capsule shouldn't take more RAM than a Library capsule sync.

Comment 9 Ian Ballou 2023-07-25 13:25:42 UTC
It makes sense to me that refreshing RHEL 8 + RHEL 9 repositories would OOM on a capsule with only 7 GB of RAM.  I'm going to test that we can refresh those repositories with the minimum suggested RAM on a capsule. If we can, I will close this BZ out as NOTABUG.
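
One way to check whether the "Killed by signal 9." subtask was an OOM kill is to scan the kernel log on the capsule for OOM-killer messages. The following is a minimal sketch, not something taken from this report: `find_oom_kills` is a hypothetical helper name, and the patterns are the ones Linux kernels commonly emit (exact wording varies between kernel versions).

```shell
#!/bin/sh
# Sketch: filter kernel log text on stdin for OOM-killer activity.
# find_oom_kills is a hypothetical helper name, not a Satellite or Pulp tool.
# The patterns below are assumptions about typical kernel message wording.
find_oom_kills() {
    # Case-insensitive match on the usual OOM-killer message fragments.
    grep -Ei 'out of memory|oom-killer|oom-kill:'
}
```

On the capsule this could be fed from the kernel journal, for example `journalctl -k --no-pager | find_oom_kills` or `dmesg -T | find_oom_kills`. No output means no matching kernel message was found in that log window.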

Comment 10 Ian Ballou 2023-07-25 20:51:28 UTC
I wasn't able to refresh these 4 repositories with 12 GB of RAM.  Now I'm questioning though if a capsule could even sync those 4 repos together with only 12 GB RAM.

Daniel, can you (or someone from Pulp) help give us the final word on if it's unreasonable to expect a machine with 12 GB of RAM to refresh 4 sizeable ACSs at the same time? The 4 repos are RHEL 8 AppStream + BaseOS and RHEL 9 AppStream + BaseOS.

Comment 11 Daniel Alley 2023-07-25 23:54:43 UTC
You can also run this a couple of times throughout the process until the failure, to see which processes are using up the available memory.

printf "%s\n\n" "`top -b -o +%MEM | head -n 22`" >> memory_log.txt
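
To take these snapshots at a fixed interval instead of by hand, the command above could be wrapped in a small loop. This is only a sketch under assumptions: the function name `sample_memory`, its arguments, and the timestamp header are illustrative and not from this report, and `-n 1` is added so that `top` exits after a single batch iteration.

```shell
#!/bin/sh
# Sketch: append timestamped `top` snapshots to a log file.
# sample_memory is a hypothetical helper name, not a Satellite or Pulp tool.
# Usage: sample_memory LOG_FILE COUNT INTERVAL_SECONDS
sample_memory() {
    log_file=$1
    count=$2
    interval=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        {
            # Timestamp header so snapshots can be lined up with task logs.
            printf '=== %s ===\n' "$(date -u '+%Y-%m-%d %H:%M:%S UTC')"
            # One batch-mode iteration, sorted by memory use, first 22 lines.
            top -b -n 1 -o +%MEM | head -n 22
            printf '\n'
        } >> "$log_file"
        i=$((i + 1))
        # Sleep between samples, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, `sample_memory memory_log.txt 60 10` would record one snapshot every 10 seconds for roughly 10 minutes; start it just before triggering the refresh and stop it once the task fails.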

Comment 12 Daniel Alley 2023-07-25 23:57:18 UTC
>> Daniel, can you (or someone from Pulp) help give us the final word on if it's unreasonable to expect a machine with 12 GB of RAM to refresh 4 sizeable ACSs at the same time? The 4 repos are RHEL 8 AppStream + BaseOS and RHEL 9 AppStream + BaseOS.

It depends on how much other stuff is running in the background I suppose.  A full Satellite has a minimum memory requirement of 20 GB. I'd like to see some hard numbers first w/r/t how much memory Pulp is using during the refresh, re: my previous comment.
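
A rough way to get a single number for Pulp's memory use is to sum the resident set size of its processes. The sketch below is an illustration, not a method from this report: `sum_rss_kib` is a hypothetical helper, and it assumes the Pulp processes can be recognised by the string `pulpcore` in their command line. Summing RSS counts shared pages once per process, so the result is an upper estimate.

```shell
#!/bin/sh
# Sketch: sum resident memory (KiB) of processes whose command line matches
# a pattern. Expects lines of "RSS COMMAND..." on stdin, as produced by
# `ps -e -o rss= -o args=`. sum_rss_kib is a hypothetical helper name.
sum_rss_kib() {
    pattern=$1
    awk -v pat="$pattern" '
        {
            rss = $1          # first column: resident set size in KiB
            $1 = ""           # drop it so only the command line is matched
            if ($0 ~ pat) total += rss
        }
        END { printf "%d\n", total }
    '
}
```

It could be run during the refresh as `ps -e -o rss= -o args= | sum_rss_kib pulpcore` and compared against the total RAM of the capsule.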

Comment 13 Ian Ballou 2023-07-26 16:11:37 UTC
I'm closing this as a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=2219885

Both concern refreshing repositories on capsules and running out of RAM.

*** This bug has been marked as a duplicate of bug 2219885 ***