Bug 1297683

Summary: "Error: Request Timeout" from hammer when asked to show >400 content hosts
Product: Red Hat Satellite Reporter: Christian Horn <chorn>
Component: WebUIAssignee: Andrew Kofink <akofink>
Status: CLOSED DUPLICATE QA Contact: Katello QA List <katello-qa-list>
Severity: medium Docs Contact:
Priority: medium    
Version: 6.1.4CC: akofink, bbuckingham, bkearney, dlobatog, ehelms, erinn.looneytriggs, inecas, jsherril, mmccune, smeyer, sshtein, sthirugn
Target Milestone: UnspecifiedKeywords: Performance, Triaged
Target Release: Unused   
Hardware: All   
OS: Linux   
URL: http://projects.theforeman.org/issues/16010
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1426378 (view as bug list) Environment:
Last Closed: 2017-03-05 09:25:09 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1296845, 1426378    

Description Christian Horn 2016-01-12 08:49:47 UTC
Description of problem:
time hammer content-host list --organization <orgname>
Error: Request Timeout
real    2m2.340s
user    0m1.179s
sys     0m0.220s

Version-Release number of selected component (if applicable):
satellite 6.1.4

How reproducible:
unknown
(in this environment here with >500 content hosts the issue always exists)

Steps to Reproduce:
1. register more than 500 content hosts
2. hammer content-host list --organization <orgname>
3.

Actual results:
Error: Request Timeout

Expected results:
<result should be displayed>

Additional info:
- using ":request_timeout: -1", the result appears
- enforcing paging, the result also is displayed.  I.e. "hammer content-host list --organization <orgname> --per-page 300".  Using "--per-page 500", the timeout is hit again.
- The time for execution suggests that the code contains a bug.  From what I understand, this should be a simple database query.  Instead it takes a very long time:
--------
[root@sat6 ~]# time hammer content-host list --organization orgname
Error: Request Timeout

real    2m2.340s
user    0m1.179s
sys     0m0.220s

[root@sat6 ~]# time hammer content-host list --organization orgname --per-page 300
...
real    0m49.191s
user    0m5.763s
sys     0m0.240s

[root@sat6 ~]# time hammer content-host list --organization orgname --per-page 400
...
real    1m8.274s
user    0m9.259s
sys     0m0.250s

[root@sat6 ~]# time hammer content-host list --organization orgname --per-page 450
Error: Request Timeout

real    2m1.703s
user    0m1.100s
sys     0m0.105s

[root@sat6 ~]# time hammer content-host list --organization orgname --per-page 500
Error: Request Timeout

real    2m1.365s
user    0m1.040s
sys     0m0.087s

[root@sat6 ~]# time hammer content-host list --organization orgname --search "be*" (this should yield 470 machines)
Error: Request Timeout

real    2m1.184s
user    0m1.021s
sys     0m0.108s

[root@sat6 ~]# time hammer content-host list --organization orgname --search "a*" (this should yield about 25 machines)
...
real    0m4.573s
user    0m1.168s
sys     0m0.137s
--------

Comment 1 Christian Horn 2016-01-12 08:52:21 UTC
I suspect issues in the code, a quick fix would be if hammer would atleast use paging by itself.  I.e. paging to 400 results by default.

Comment 5 Bryan Kearney 2016-08-04 20:18:18 UTC
Moving 6.2 bugs out to sat-backlog.

Comment 6 Andrew Kofink 2016-08-08 21:08:59 UTC
Created redmine issue http://projects.theforeman.org/issues/16010 from this bug

Comment 7 Bryan Kearney 2016-08-09 00:11:13 UTC
Upstream bug component is WebUI

Comment 8 Andrew Kofink 2016-08-09 11:27:47 UTC
How is this WebUI when it deals with Hammer/API?

Comment 9 Andrew Kofink 2016-08-09 11:54:54 UTC
It seems like rendering the view takes the majority of the time:
Completed 200 OK in 68556ms (Views: 52481.4ms | ActiveRecord: 14506.9ms)

Comment 10 Andrew Kofink 2016-08-09 13:37:57 UTC
Performance improvement can be gained by including the proper tables (N+1 queries) at the beginning, rather than requesting more data from the DB in the view layer:

Before:
2016-08-09T09:28:22 ef4eff65 [app] [I] Completed 200 OK in 30798ms (Views: 28758.6ms | ActiveRecord: 1144.1ms)
2016-08-09T09:30:29 ef4eff65 [app] [I] Completed 200 OK in 29078ms (Views: 27504.6ms | ActiveRecord: 1114.1ms)
2016-08-09T09:31:04 ef4eff65 [app] [I] Completed 200 OK in 29546ms (Views: 27802.8ms | ActiveRecord: 1237.9ms)

Average: 29807.3ms

After:
2016-08-09T09:24:13 ef4eff65 [app] [I] Completed 200 OK in 24099ms (Views: 22732.9ms | ActiveRecord: 705.7ms)
2016-08-09T09:25:48 ef4eff65 [app] [I] Completed 200 OK in 22791ms (Views: 21248.9ms | ActiveRecord: 621.4ms)
2016-08-09T09:26:14 ef4eff65 [app] [I] Completed 200 OK in 21944ms (Views: 20642.8ms | ActiveRecord: 683.7ms)

Average: 22944.6

That's 23% faster.

Comment 11 Bryan Kearney 2016-08-09 19:17:56 UTC
Andrew, upstream bug is set to WebUI, so that was copied to this bug.

Comment 13 Bryan Kearney 2016-11-04 22:20:18 UTC
Moving this bug to POST for triage into Satellite 6 since the upstream issue http://projects.theforeman.org/issues/16010 has been resolved.

Comment 16 Andrew Kofink 2016-11-11 13:00:45 UTC
Bryan, it looks like the release for the Foreman issue (http://projects.theforeman.org/issues/16646) is set to 1.14.

Comment 18 Daniel Lobato Garcia 2017-01-27 08:46:21 UTC
Setting as triaged as there's a fix already upstream

Comment 19 Satellite Program 2017-02-23 21:08:09 UTC
Please add verifications steps for this bug to help QE verify

Comment 20 Andrew Kofink 2017-02-23 21:23:50 UTC
Verification could be to look for N+1 warnings in the production.log while requesting the hosts page before applying. The warnings should not show up after applying the fix, and you should be able to notice a speed improvement in the response for a Satellite with hundreds of hosts.

Comment 21 Shimon Shtein 2017-03-05 09:25:09 UTC
Cherrypicks provided in https://bugzilla.redhat.com/show_bug.cgi?id=1419667

*** This bug has been marked as a duplicate of bug 1419667 ***