Bug 1631590

Summary: Candlepin throws 500 Internal Server Error for more than 40+ guests
Product: Red Hat Satellite Reporter: Eko <hsun>
Component: Candlepin Assignee: Justin Sherrill <jsherril>
Status: CLOSED ERRATA QA Contact: Jitendra Yejare <jyejare>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 6.4 CC: ahumbe, crog, dpeess, egolov, ehelms, jcallaha, jyejare, khowell, ldai, mmccune, omaciel, pcreech
Target Milestone: 6.4.0 Keywords: PrioBumpGSS, Regression, Triaged
Target Release: Unused   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: candlepin-2.4.8-1,tfm-rubygem-katello-3.7.0.41-1 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1632761 1632764 1633252 Environment:
Last Closed: 2018-10-16 19:16:43 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1632764    
Bug Blocks: 1586210    

Comment 4 Jitendra Yejare 2018-09-21 08:00:39 UTC
Hi Eko,


The Package Versions:

virt-what-1.18-4.el7.x86_64
tfm-rubygem-hammer_cli_foreman_virt_who_configure-0.0.3-2.el7sat.noarch
tfm-rubygem-foreman_virt_who_configure-0.2.2-1.el7sat.noarch



Also, I don't see any logs in the Satellite's /var/log/rhsm/rhsm.log during posting.



Note:
------
I can easily post up to 30 virt guests, but the issue appears with 40 or more.

Comment 5 Eko 2018-09-21 08:18:08 UTC
Hi Jitendra,
It's not the virt-what package; it should be virt-who. Please check again.

Also, /var/log/rhsm/rhsm.log should be on the host where virt-who is installed.

Comment 6 Eko 2018-09-21 08:55:28 UTC
After discussing with Jitendra: the virt-who package was never installed or used for this issue, so I'm afraid it's not a virt-who bug.

Based on the error log message, I am moving it to the Candlepin component for another look.

Comment 8 Mike McCune 2018-09-21 20:58:34 UTC
This does appear to be native to Candlepin: after posting directly to the virt-who API endpoint, the error occurs easily.

We can tune around this via configs in server.xml. I had to raise the header limit to about 100 KB (the value is in bytes) before 100 hosts would work:

               maxHttpHeaderSize="100000"

That is far more than the default of a few kilobytes, which indicates to me that something isn't working correctly in formulating the response to the API call.

At that rate, customers with 1000 virtual machines in a virt-who transaction would need roughly 1 MB of HTTP headers, which is a bit much for an HTTP response.

We need to examine how we are formulating the response headers in this API call.
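
To confirm which limit is actually in effect, the attribute can be read back from the config. A minimal sketch follows; the XML fragment, the port number, and the helper name header_limit_bytes are illustrative assumptions, not the real Satellite server.xml. Only the maxHttpHeaderSize value is taken from this comment.

```python
import xml.etree.ElementTree as ET

# Hypothetical server.xml fragment; only maxHttpHeaderSize comes from this bug.
SERVER_XML = """
<Server>
  <Service name="Catalina">
    <Connector port="8443" protocol="HTTP/1.1" maxHttpHeaderSize="100000"/>
  </Service>
</Server>
"""

def header_limit_bytes(xml_text):
    """Return {port: maxHttpHeaderSize} for every Connector that sets it."""
    root = ET.fromstring(xml_text)
    limits = {}
    for conn in root.iter("Connector"):
        value = conn.get("maxHttpHeaderSize")
        if value is not None:
            # Tomcat documents this attribute as a size in bytes.
            limits[conn.get("port")] = int(value)
    return limits

limits = header_limit_bytes(SERVER_XML)
for port, size in limits.items():
    print(f"port {port}: {size} bytes (~{size / 1024:.1f} KiB)")
```

Connectors that omit the attribute are skipped, since they fall back to the Tomcat default.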

Comment 9 Mike McCune 2018-09-21 21:04:17 UTC
Reproducer info:

I used https://hub.docker.com/r/jacobcallahan/genvirt/ to generate the API call.
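
For readers without Docker, a rough stand-in for generating test data of a given size is sketched below. The payload shape (a hypervisor ID mapped to a list of guest IDs) is an assumption about what gets posted to rhsm/hypervisors, and build_mapping is a hypothetical helper; neither is taken from the genvirt image.

```python
import json
import uuid

def build_mapping(hypervisors, guests_per_hypervisor=1):
    """Build {hypervisor_id: [guest_id, ...]} from random UUIDs (assumed shape)."""
    return {
        str(uuid.uuid4()): [str(uuid.uuid4()) for _ in range(guests_per_hypervisor)]
        for _ in range(hypervisors)
    }

# Sizes around the thresholds mentioned in this bug.
for count in (30, 40, 100, 1000):
    body = json.dumps(build_mapping(count))
    print(f"{count} hypervisors -> {len(body)} bytes of JSON")
```

The script only builds and measures the payload; it does not contact a server.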

Comment 13 Justin Sherrill 2018-09-24 16:33:29 UTC
Created redmine issue https://projects.theforeman.org/issues/25026 from this bug

Comment 14 Brad Buckingham 2018-09-24 20:33:47 UTC
The upstream katello PR has now merged as well.  Moving the BZ to POST.

Comment 17 Jitendra Yejare 2018-09-27 14:13:07 UTC
FailedQA!

Steps:
--------

1. Post JSON for 1000, 300, or 200 hypervisors to rhsm/hypervisors using https://hub.docker.com/r/jacobcallahan/genvirt


Observation:
--------------

500 ISE: https://pastebin.com/YrpBeArD


Note:
------------

Posting JSON for 100 hypervisors works.


Already discussed and shared details with jsherrill, crog.

Comment 20 Brad Buckingham 2018-09-28 12:08:12 UTC
PR merged upstream; therefore, moving BZ to POST.

Comment 21 Jitendra Yejare 2018-09-28 20:50:02 UTC
Tested the patch downstream on RC1 (snap 24). It works nicely.

Tried posting 200 and up to 1000 at a time; nothing breaks and it works perfectly.

Comment 23 Mike McCune 2018-10-02 16:25:04 UTC
Tested this with 5k hosts without error:

#   docker run -e "SATHOST=sat-r220-01.lab.eng.rdu2.redhat.com" -e "COUNT=5000" jacobcallahan/genvirt
...
Generating data with 5000 hosts.
Submitting data to sat-r220-01.lab.eng.rdu2.redhat.com. This may take a while...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  771k    0     4  100  771k      0    494  0:26:39  0:26:38  0:00:01     0
nullUnregistering from Satellite
Unregistering from: sat-r220-01.lab.eng.rdu2.redhat.com:443/rhsm
System has been unregistered.
Done!
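
As a sanity check on the curl progress line, the figures can be worked through by hand. The sketch below assumes curl's "k" suffix means 1024 bytes and that the displayed 771k is rounded, so the results are approximate.

```python
# Figures read off the curl progress line above.
uploaded_bytes = 771 * 1024      # "771k" in the Total column, 1024-byte units
elapsed_seconds = 26 * 60 + 38   # "Time Spent" of 0:26:38
hosts = 5000                     # COUNT passed to the container

upload_rate = uploaded_bytes / elapsed_seconds
bytes_per_host = uploaded_bytes / hosts
seconds_per_host = elapsed_seconds / hosts

print(f"average upload: {upload_rate:.0f} bytes/s")
print(f"payload per host: {bytes_per_host:.0f} bytes")
print(f"time per host: {seconds_per_host:.2f} s")
```

The computed average agrees with the 494 shown in curl's Upload column. That average is taken over the whole request, so it likely reflects time spent waiting on the server rather than network throughput.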

Comment 25 Bryan Kearney 2018-10-16 19:16:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2927