Looking for hung puma threads (having an ODD number of Started|Completed lines) (A Started without a Completed): zgrep -E 'Started|Completed' production.log | grep -Eo "#[0-9]+[:a-f0-9]+" | sort | uniq -c Produces this list of "ODD" PID and TIDs: 169 #21051:fa9330 167 #21055:1640c98 1 #21137:13925a0 1 #21148:14e5bc8 1 #21154:10b9c5c 7 #22532:10aad9c 5 #22538:a39cc4 21 #22553:3b7fdd8 1 #22630:9037c4 1 #22630:903864 1 #22630:903a08 1 #22630:903aa8 1 #22638:8f1434 1 #22643:d87340 13 #23492:37816c8 25 #23666:3714b54 25 #23666:3714d5c 11 #26805:37508fc 13 #26905:37804bc 1 #30307:b8ad6c 5 #31234:3781920
Then grep'ing for what these pid/tid combinations last did: grep "#21051:fa9330" production.log | tail -n 3 ; echo; echo; grep "#21055:1640c98" production.log | tail -n 3 ; echo; echo; grep "#21137:13925a0" production.log | tail -n 3 ; echo; echo; grep "#21148:14e5bc8" production.log | tail -n 3 ; echo; echo; grep "#21154:10b9c5c" production.log | tail -n 3 ; echo; echo; grep "#22532:10aad9c" production.log | tail -n 3 ; echo; echo; grep "#22538:a39cc4" production.log | tail -n 3 ; echo; echo; grep "#22553:3b7fdd8" production.log | tail -n 3 ; echo; echo; grep "#22630:9037c4" production.log | tail -n 3 ; echo; echo; grep "#22630:903864" production.log | tail -n 3 ; echo; echo; grep "#22630:903a08" production.log | tail -n 3 ; echo; echo; grep "#22630:903aa8" production.log | tail -n 3 ; echo; echo; grep "#22638:8f1434" production.log | tail -n 3 ; echo; echo; grep "#22643:d87340" production.log | tail -n 3 ; echo; echo; grep "#23492:37816c8" production.log | tail -n 3 ; echo; echo; grep "#23666:3714b54" production.log | tail -n 3 ; echo; echo; grep "#23666:3714d5c" production.log | tail -n 3 ; echo; echo; grep "#26805:37508fc" production.log | tail -n 3 ; echo; echo; grep "#26905:37804bc" production.log | tail -n 3 ; echo; echo; grep "#30307:b8ad6c" production.log | tail -n 3 ; echo; echo; grep "#31234:3781920" production.log | tail -n 3 ; echo; echo; Produces 7 never completing gtl_type=list view requests like: [----] I, [2018-05-30T08:30:13.708572 #23492:37816c8] INFO -- : Started POST "/vm_infra/report_data" for 127.0.0.1 at 2018-05-30 08:30:13 -0400 [----] I, [2018-05-30T08:30:13.713350 #23492:37816c8] INFO -- : Processing by VmInfraController#report_data as HTML [----] I, [2018-05-30T08:30:13.713494 #23492:37816c8] INFO -- : Parameters: {"model_name"=>"ManageIQ::Providers::InfraManager::Vm", "model"=>"ManageIQ::Providers::InfraManager::Vm", "active_tree"=>"vms_filter_tree", "explorer"=>true, "additional_options"=>{"named_scope"=>nil, "gtl_dbname"=>nil, "model"=>"ManageIQ::Providers::InfraManager::Vm", "match_via_descendants"=>nil, "parent_id"=>nil, "parent_class_name"=>nil, "parent_method"=>nil, "association"=>nil, "view_suffix"=>nil, "row_button"=>nil, "menu_click"=>nil, "sb_controller"=>nil, "listicon"=>nil, "embedded"=>nil, "showlinks"=>nil, "policy_sim"=>nil, "in_a_form"=>nil, "lastaction"=>"show_list", "display"=>nil, "gtl_type"=>"list", "supported_features_filter"=>nil, "clickable"=>nil}, "vm"=>{}} 4 never completing gtl_type=grid view requests like: [----] I, [2018-05-30T08:02:45.641007 #21051:fa9330] INFO -- : Started POST "/vm_infra/report_data" for 127.0.0.1 at 2018-05-30 08:02:45 -0400 [----] I, [2018-05-30T08:02:45.644868 #21051:fa9330] INFO -- : Processing by VmInfraController#report_data as HTML [----] I, [2018-05-30T08:02:45.644987 #21051:fa9330] INFO -- : Parameters: {"model_name"=>"ManageIQ::Providers::InfraManager::Vm", "model"=>"ManageIQ::Providers::InfraManager::Vm", "active_tree"=>"vms_filter_tree", "explorer"=>true, "additional_options"=>{"named_scope"=>nil, "gtl_dbname"=>nil, "model"=>"ManageIQ::Providers::InfraManager::Vm", "match_via_descendants"=>nil, "parent_id"=>nil, "parent_class_name"=>nil, "parent_method"=>nil, "association"=>nil, "view_suffix"=>nil, "row_button"=>nil, "menu_click"=>nil, "sb_controller"=>nil, "listicon"=>nil, "embedded"=>nil, "showlinks"=>nil, "policy_sim"=>nil, "in_a_form"=>nil, "lastaction"=>"show_list", "display"=>nil, "gtl_type"=>"grid", "supported_features_filter"=>nil, "clickable"=>nil}, "vm"=>{}}
Hi Ryan, I updated the bug with some things I found. 1) Note, the frozen hash problem could be unrelated although I'd love to have an environment to recreate this. If they can recreate it or we have their database and can recreate it in house, please open a new BZ and link to this one and another one: https://bugzilla.redhat.com/show_bug.cgi?id=1507131#c6 2) For the timing out search/UI requests. Is this is a new problem? Did the number of vms suddenly grow? I don't know that we've changed anything here to regress performance wise. 3) We'll need to better understand where the performance problem is with their database if possible. Note, some of these processes are showing up in the evm.log with memory threshold exceeding, which is probably why they never finish: [----] W, [2018-05-30T08:51:48.757682 #22101:4cf108] WARN -- : MIQ(MiqServer#validate_worker) Worker [MiqUiWorker] with ID: [xxx], PID: [30307], GUID: [51515fe9-ed77-48d6-9c4b-92ceedb4ff09] process memory usage [1430404000] exceeded limit [1073741824], requesting worker to exit
Ryan, we have a database in house that seems to exhibit similar behavior, can you have the customer provide the output from rails console against their database so we can ensure our recreation is what they're seeing: [Vm, Host, Storage, ExtManagementSystem, OperatingSystem, Hardware, Tagging, Tag, Network].each {|klass| puts "#{klass} count: #{klass.count}" }; nil
Have a patch in place that I think will address the issue: https://github.com/ManageIQ/manageiq/pull/17562
Clearing needinfo... Nick's PR has been merged on master and is marked for backport.
New commit detected on ManageIQ/manageiq/master: https://github.com/ManageIQ/manageiq/commit/1cad1ab73b477a5ffcf0af426ec1a0c163fe073d commit 1cad1ab73b477a5ffcf0af426ec1a0c163fe073d Author: Nick LaMuro <nicklamuro> AuthorDate: Fri Jun 8 18:58:54 2018 -0400 Commit: Nick LaMuro <nicklamuro> CommitDate: Fri Jun 8 18:58:54 2018 -0400 First pass at MiqExpression 'includes' as SQL This sets up support MiqExpression to introspect the current model of the expression fragment to see if there are any methods defined there that can help allow SQL to be executed in MiqExpression for that given field. Vm.ipaddresses has been the first attempt at this. Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1588082 app/models/vm_or_template.rb | 33 + lib/miq_expression.rb | 14 + spec/lib/miq_expression_spec.rb | 31 + spec/models/vm_or_template_spec.rb | 24 + 4 files changed, 102 insertions(+)
To replicate this bug, you can run the following script to populate a fresh database with data to replicate the issue: https://gist.github.com/NickLaMuro/1795ac3a02b7db76f770aa1699d3e1ab To run script, you can do the following in the project dir on an appliance: $ bin/rails r bz_1588082_db_replication_script.rb * * * To replicate the issue, the easiest way is through the UI: 1. Go to `Compute` > `Infrastructure` > `VirtualMachines` 2. Select "All VMs" from the tree on the left 3. Click the down arrow to the right of the search box 4. Create the following query (replace "XXX.XXX.XXX" with something that makes sense for your environment) Virtual Machine : IP Addresses INCLUDES ANY ['XXX.XXX.XXX'] The script will create random IPs resulting between the 10.XXX.XXX.XXX and 11.XXX.XXX.XXX, with around 25% not having any IP at all. You can start with just "10." in the search criteria and should end up with around 35% of your records in the totals.
Verified in 5.10.0.4. IP Adresses advanced search worked fast and successfully on 20k VMs.