Bug 1592480

Summary: reports do not generate with timeout errors in logs
Product: Red Hat CloudForms Management Engine Reporter: Felix Dewaleyne <fdewaley>
Component: ReportingAssignee: Nick LaMuro <nlamuro>
Status: CLOSED CURRENTRELEASE QA Contact: Niyaz Akhtar Ansari <nansari>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 5.8.0CC: cpelland, fdewaley, jocarter, jrafanie, nansari, nlamuro, obarenbo, simaishi
Target Milestone: GAKeywords: TestOnly, ZStream
Target Release: 5.10.0   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: 5.10.0.2 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1594027 1594028 (view as bug list) Environment:
Last Closed: 2019-02-11 14:02:23 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1594027, 1594028    

Description Felix Dewaleyne 2018-06-18 15:36:11 UTC
Description of problem:
reports do not generate with timeout errors in logs

Version-Release number of selected component (if applicable):
5.8.2.3


How reproducible:
all the time in customer environment. reproducer env couldn't.

Steps to Reproduce:
1. queue the generation of the affected report
2.
3.

Actual results:
[----] I, [2018-06-18T11:39:19.776123 #1992:1047134]  INFO -- : MIQ(MiqQueue.put) Message id: [99000001488620],  id: [], Zone: [default], Role: [], Server: [cb8105c0-06a2-11e8-9261-005056b70a03], Ident: [generic], Target id: [], Instance
 id: [], Task id: [], Command: [Session.check_session_timeout], Timeout: [600], Priority: [90], State: [ready], Deliver On: [], Data: [], Args: []
[----] I, [2018-06-18T11:39:21.551177 #1693:1047134]  INFO -- : MIQ(MiqServer#populate_queue_messages) Fetched 1 miq_queue rows for queue_name=generic, wcount=4, priority=200
[----] I, [2018-06-18T11:39:25.363208 #24923:1047134]  INFO -- : MIQ(MiqGenericWorker::Runner#get_message_via_drb) Message id: [99000001488620], MiqWorker id: [99000000043960], Zone: [default], Role: [], Server: [cb8105c0-06a2-11e8-9261-
005056b70a03], Ident: [generic], Target id: [], Instance id: [], Task id: [], Command: [Session.check_session_timeout], Timeout: [600], Priority: [90], State: [dequeue], Deliver On: [], Data: [], Args: [], Dequeued in: [6.005005463] seco
nds
[----] I, [2018-06-18T11:39:25.363385 #24923:1047134]  INFO -- : MIQ(MiqQueue#deliver) Message id: [99000001488620], Delivering...
[----] E, [2018-06-18T11:39:25.692817 #14871:1047134] ERROR -- : MIQ(MiqQueue#deliver) Message id: [99000001488195], timed out after 3600.690110955 seconds.  Timeout threshold [3600]
[----] I, [2018-06-18T11:39:25.692999 #14871:1047134]  INFO -- : MIQ(MiqReportingWorker::Runner#deliver_queue_message) MIQ(MiqReportingWorker::Runner) Reconnecting to DB after timeout error during queue deliver
[----] I, [2018-06-18T11:39:25.769606 #14871:1047134]  INFO -- : MIQ(MiqQueue#delivered) Message id: [99000001488195], State: [timeout], Delivered in [3600.767041134] seconds
[----] I, [2018-06-18T11:39:25.915681 #14871:1047134]  INFO -- : MIQ(MiqReportingWorker#log_status) [Reporting Worker] Worker ID [99000000043985], PID [14871], GUID [483dbfda-71ca-11e8-b01a-005056b70a03], Last Heartbeat [2018-06-18 10:39
:24 +0300], Process Info: Memory Usage [4891709440], Memory Size [5253500928], Proportional Set Size: [4695962000], Memory % [29.37], CPU Time [329794.0], CPU % [0.58], Priority [27]
[----] E, [2018-06-18T11:39:25.916075 #14871:1047134] ERROR -- : MIQ(MiqReportingWorker::Runner) ID [99000000043985] PID [14871] GUID [483dbfda-71ca-11e8-b01a-005056b70a03] Exiting worker due to timeout error Worker exiting.
[----] I, [2018-06-18T11:39:25.946433 #24923:1047134]  INFO -- : MIQ(Session.purge) purged stale session data, 28 entries deleted

Expected results:
the report is generated

Additional info:

Comment 15 Nick LaMuro 2018-06-20 20:36:24 UTC
Proposed patch that should solve the issue:

https://github.com/ManageIQ/manageiq/pull/17615

Comment 17 CFME Bot 2018-06-21 19:46:28 UTC
New commits detected on ManageIQ/manageiq/master:

https://github.com/ManageIQ/manageiq/commit/0ad75c2117faa8ce3678932fc8d383429d55b4e5
commit 0ad75c2117faa8ce3678932fc8d383429d55b4e5
Author:     Nick LaMuro <nicklamuro>
AuthorDate: Tue Jun 19 20:03:05 2018 -0400
Commit:     Nick LaMuro <nicklamuro>
CommitDate: Tue Jun 19 20:03:05 2018 -0400

    Add MiqReport#all_custom_attributes_are_virtual_sql_attributes?

    https://bugzilla.redhat.com/show_bug.cgi?id=1592480

    This method is responsible for checking if the virtual_attributes in the
    Rbac query will contain all of the custom_attributes virtual_attributes.

    To be used to determine if we can safely remove:

        { :custom_attributes => {} }

    From the `include` hash on MiqReport, avoiding needing to load a lot of
    records.

 app/models/miq_report.rb | 5 +
 1 file changed, 5 insertions(+)


https://github.com/ManageIQ/manageiq/commit/e7fd484acc9813462744046dd8b464a08b443527
commit e7fd484acc9813462744046dd8b464a08b443527
Author:     Nick LaMuro <nicklamuro>
AuthorDate: Tue Jun 19 20:03:48 2018 -0400
Commit:     Nick LaMuro <nicklamuro>
CommitDate: Tue Jun 19 20:03:48 2018 -0400

    Remove custom_attributes from MiqReport includes

    https://bugzilla.redhat.com/show_bug.cgi?id=1592480

    Conditionally removes `:custom_attributes => {}` from the includes
    clause if all of the `custom_attribute` columns happen to be in the
    select statement (the `@va_sql_cols` value).

    When the `custom_attribute` `virtual_attributes` can be converted to
    Arel, this will have some value, but for now this will mostly do nothing

 app/models/miq_report/generator.rb | 7 +
 1 file changed, 7 insertions(+)


https://github.com/ManageIQ/manageiq/commit/065405d49975c80b3b852ddb21d413719bec976e
commit 065405d49975c80b3b852ddb21d413719bec976e
Author:     Nick LaMuro <nicklamuro>
AuthorDate: Tue Jun 19 20:06:35 2018 -0400
Commit:     Nick LaMuro <nicklamuro>
CommitDate: Tue Jun 19 20:06:35 2018 -0400

    Convert custom_attributes to arel virtual_attributes

    https://bugzilla.redhat.com/show_bug.cgi?id=1592480

    This change allows `custom_attributes` to be accessible via SQL.

    Most of the additions are found in the `custom_attribute_arel`, included
    by `CustomAttributeMixin`, and will simply generate the Arel equivalent
    necessary for determined the `virtual_attribute` value via SQL.

    Other notable changes:

    - The method defined for accessing the `virtual_attribute` from the
      instance will now check the attributes hash prior to executing the
      rest of the method, and return something if it has a key
    - Some shared values calculated in both the method definition and would
      be in the `custom_attribute_arel` method have been extracted out so it
      is only calculated once (minor performance benefit).

 app/models/mixins/custom_attribute_mixin.rb | 34 +-
 1 file changed, 28 insertions(+), 6 deletions(-)

Comment 21 Nick LaMuro 2018-07-04 01:05:57 UTC
Steps for replicating this bug can be found here:

https://gist.github.com/NickLaMuro/87dddcfbd549b03099f8e55f632b2b57#file-bz_1592480_db_replication_script-rb

* * *

To replicate this bug, you can run the following script from below to populate a fresh database with data to replicate this bug locally:

    $ bin/rails r bz_1588082_db_replication_script.rb


From there, the easiest way to confirm the bug is fixed or not is to run the following in a console:

    $ bin/rails c
    irb> r = MiqReport.where(:name => "BZ 1592480 Example Report").first
    irb> puts Benchmark.measure { r.queue_generate_table(:report_sync => true) }

Without the patch from https://github.com/ManageIQ/manageiq/pull/17615 (or it's backported equivalent), this will take a large amount of memory and over 5 minutes to complete.

With the patch in place, the memory stays relatively low and only takes about 1 minute to complete (depending on DB latency, CPU power, etc.)