Description of problem: reports do not generate with timeout errors in logs Version-Release number of selected component (if applicable): 5.8.2.3 How reproducible: all the time in customer environment. reproducer env couldn't. Steps to Reproduce: 1. queue the generation of the affected report 2. 3. Actual results: [----] I, [2018-06-18T11:39:19.776123 #1992:1047134] INFO -- : MIQ(MiqQueue.put) Message id: [99000001488620], id: [], Zone: [default], Role: [], Server: [cb8105c0-06a2-11e8-9261-005056b70a03], Ident: [generic], Target id: [], Instance id: [], Task id: [], Command: [Session.check_session_timeout], Timeout: [600], Priority: [90], State: [ready], Deliver On: [], Data: [], Args: [] [----] I, [2018-06-18T11:39:21.551177 #1693:1047134] INFO -- : MIQ(MiqServer#populate_queue_messages) Fetched 1 miq_queue rows for queue_name=generic, wcount=4, priority=200 [----] I, [2018-06-18T11:39:25.363208 #24923:1047134] INFO -- : MIQ(MiqGenericWorker::Runner#get_message_via_drb) Message id: [99000001488620], MiqWorker id: [99000000043960], Zone: [default], Role: [], Server: [cb8105c0-06a2-11e8-9261- 005056b70a03], Ident: [generic], Target id: [], Instance id: [], Task id: [], Command: [Session.check_session_timeout], Timeout: [600], Priority: [90], State: [dequeue], Deliver On: [], Data: [], Args: [], Dequeued in: [6.005005463] seco nds [----] I, [2018-06-18T11:39:25.363385 #24923:1047134] INFO -- : MIQ(MiqQueue#deliver) Message id: [99000001488620], Delivering... [----] E, [2018-06-18T11:39:25.692817 #14871:1047134] ERROR -- : MIQ(MiqQueue#deliver) Message id: [99000001488195], timed out after 3600.690110955 seconds. Timeout threshold [3600] [----] I, [2018-06-18T11:39:25.692999 #14871:1047134] INFO -- : MIQ(MiqReportingWorker::Runner#deliver_queue_message) MIQ(MiqReportingWorker::Runner) Reconnecting to DB after timeout error during queue deliver [----] I, [2018-06-18T11:39:25.769606 #14871:1047134] INFO -- : MIQ(MiqQueue#delivered) Message id: [99000001488195], State: [timeout], Delivered in [3600.767041134] seconds [----] I, [2018-06-18T11:39:25.915681 #14871:1047134] INFO -- : MIQ(MiqReportingWorker#log_status) [Reporting Worker] Worker ID [99000000043985], PID [14871], GUID [483dbfda-71ca-11e8-b01a-005056b70a03], Last Heartbeat [2018-06-18 10:39 :24 +0300], Process Info: Memory Usage [4891709440], Memory Size [5253500928], Proportional Set Size: [4695962000], Memory % [29.37], CPU Time [329794.0], CPU % [0.58], Priority [27] [----] E, [2018-06-18T11:39:25.916075 #14871:1047134] ERROR -- : MIQ(MiqReportingWorker::Runner) ID [99000000043985] PID [14871] GUID [483dbfda-71ca-11e8-b01a-005056b70a03] Exiting worker due to timeout error Worker exiting. [----] I, [2018-06-18T11:39:25.946433 #24923:1047134] INFO -- : MIQ(Session.purge) purged stale session data, 28 entries deleted Expected results: the report is generated Additional info:
Proposed patch that should solve the issue: https://github.com/ManageIQ/manageiq/pull/17615
New commits detected on ManageIQ/manageiq/master: https://github.com/ManageIQ/manageiq/commit/0ad75c2117faa8ce3678932fc8d383429d55b4e5 commit 0ad75c2117faa8ce3678932fc8d383429d55b4e5 Author: Nick LaMuro <nicklamuro> AuthorDate: Tue Jun 19 20:03:05 2018 -0400 Commit: Nick LaMuro <nicklamuro> CommitDate: Tue Jun 19 20:03:05 2018 -0400 Add MiqReport#all_custom_attributes_are_virtual_sql_attributes? https://bugzilla.redhat.com/show_bug.cgi?id=1592480 This method is responsible for checking if the virtual_attributes in the Rbac query will contain all of the custom_attributes virtual_attributes. To be used to determine if we can safely remove: { :custom_attributes => {} } From the `include` hash on MiqReport, avoiding needing to load a lot of records. app/models/miq_report.rb | 5 + 1 file changed, 5 insertions(+) https://github.com/ManageIQ/manageiq/commit/e7fd484acc9813462744046dd8b464a08b443527 commit e7fd484acc9813462744046dd8b464a08b443527 Author: Nick LaMuro <nicklamuro> AuthorDate: Tue Jun 19 20:03:48 2018 -0400 Commit: Nick LaMuro <nicklamuro> CommitDate: Tue Jun 19 20:03:48 2018 -0400 Remove custom_attributes from MiqReport includes https://bugzilla.redhat.com/show_bug.cgi?id=1592480 Conditionally removes `:custom_attributes => {}` from the includes clause if all of the `custom_attribute` columns happen to be in the select statement (the `@va_sql_cols` value). When the `custom_attribute` `virtual_attributes` can be converted to Arel, this will have some value, but for now this will mostly do nothing app/models/miq_report/generator.rb | 7 + 1 file changed, 7 insertions(+) https://github.com/ManageIQ/manageiq/commit/065405d49975c80b3b852ddb21d413719bec976e commit 065405d49975c80b3b852ddb21d413719bec976e Author: Nick LaMuro <nicklamuro> AuthorDate: Tue Jun 19 20:06:35 2018 -0400 Commit: Nick LaMuro <nicklamuro> CommitDate: Tue Jun 19 20:06:35 2018 -0400 Convert custom_attributes to arel virtual_attributes https://bugzilla.redhat.com/show_bug.cgi?id=1592480 This change allows `custom_attributes` to be accessible via SQL. Most of the additions are found in the `custom_attribute_arel`, included by `CustomAttributeMixin`, and will simply generate the Arel equivalent necessary for determined the `virtual_attribute` value via SQL. Other notable changes: - The method defined for accessing the `virtual_attribute` from the instance will now check the attributes hash prior to executing the rest of the method, and return something if it has a key - Some shared values calculated in both the method definition and would be in the `custom_attribute_arel` method have been extracted out so it is only calculated once (minor performance benefit). app/models/mixins/custom_attribute_mixin.rb | 34 +- 1 file changed, 28 insertions(+), 6 deletions(-)
Steps for replicating this bug can be found here: https://gist.github.com/NickLaMuro/87dddcfbd549b03099f8e55f632b2b57#file-bz_1592480_db_replication_script-rb * * * To replicate this bug, you can run the following script from below to populate a fresh database with data to replicate this bug locally: $ bin/rails r bz_1588082_db_replication_script.rb From there, the easiest way to confirm the bug is fixed or not is to run the following in a console: $ bin/rails c irb> r = MiqReport.where(:name => "BZ 1592480 Example Report").first irb> puts Benchmark.measure { r.queue_generate_table(:report_sync => true) } Without the patch from https://github.com/ManageIQ/manageiq/pull/17615 (or it's backported equivalent), this will take a large amount of memory and over 5 minutes to complete. With the patch in place, the memory stays relatively low and only takes about 1 minute to complete (depending on DB latency, CPU power, etc.)