Description of problem:
Even with only 200-300 hosts, built-in report templates such as "Host - Registered Content Hosts" or "Host - Statuses" generate in a couple of minutes, but a custom template that only parses five to ten host facts can take 30 minutes to 1 hour against the same set of hosts.

Version-Release number of selected component (if applicable):
Satellite 6.11.4

How reproducible:
In the customer's environment, with 2000+ hosts carrying a large number of subscription-manager and Ansible facts each.

Steps to Reproduce:
NA (please check the reproducer details in the private comment)

Actual results:
"Host - Statuses" -> has the least amount of data to collect -> took 30 seconds
"Host - Registered Content Hosts" -> has more data to collect per system, including applicable errata names and subscriptions -> took 3 minutes to give the results
"SSS_Access_Control" -> has only 8 Ansible facts to collect from each host -> took 28 minutes

Expected results:
The "SSS_Access_Control" report template, or any other host-facts-based template, should not take that long to produce the resulting CSV data. It should complete within 1-5 minutes for 200-2000 hosts.

Additional info:
NA
JFYI, I created this file in Ruby to collect the exact same data as the SSS_Access_Control report template for a limited number of hosts, via the rake console:

# cat hostinfo.rb
conf.echo = false
require 'csv'

file = "/tmp/host_data.csv"
hosts = Host.where(:operatingsystem_id => 35).order(:id)
column_headers = ["CTD", "TimeStamp", "Model", "MAC", "IP_address", "Owner", "building_name", "hostname"]

CSV.open(file, 'w', write_headers: true, headers: column_headers) do |writer|
  hosts.each do |h|
    writer << [h.facts['ansible_local::gls_ansible_ctd_posture::_ctd'],
               h.facts['ansible_local::gls_ansible_timestemp::_timestemp'],
               h.facts['ansible_local::gls_ansible_model::_model'],
               h.facts['ansible_local::gls_ansible_mac::_mac_address'],
               h.facts['ansible_local::gls_ansible_ipaddress::_gls_ansible_ipaddress'],
               h.facts['ansible_local::gls_ansible_owner::_owner'],
               h.facts['ansible_local::gls_ansible_lrt::_lab_lrt'],
               h.facts['ansible_local::gls_ansible_hostname::_gls_ansible_hostname']]
  end
end

And executed it, i.e.:

# time cat hostinfo.rb | foreman-rake console
Loading production environment (Rails 6.0.4.7)
Switch to inspect mode.
(the console echoes the script back, then prints the timing)

real    8m22.559s
user    7m11.535s
sys     0m5.617s

Around 403 entries are in that file (some being blank), i.e.:

# wc -l /tmp/host_data.csv
403 /tmp/host_data.csv

So yeah, it took nearly the same amount of time as the TC report template, i.e. 8-10 minutes, on the exact same set of hosts.
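The main cost in the script above is that h.facts is called per host and loads every fact value for that host. A minimal, self-contained sketch of the alternative batching idea: fetch only the wanted fact names in one pass and group them by host. The in-memory rows below stand in for a single joined SQL query against the fact tables; the query shape and all names here are illustrative assumptions, not the exact Foreman schema.

```ruby
# Fact names we actually need (a subset, for illustration).
WANTED = [
  'ansible_local::gls_ansible_mac::_mac_address',
  'ansible_local::gls_ansible_hostname::_gls_ansible_hostname',
].freeze

# rows: [host_id, fact_name, value] -- simulating the result of one
# joined query over fact_names/fact_values filtered to WANTED.
rows = [
  [1, 'ansible_local::gls_ansible_mac::_mac_address', 'aa:bb:cc:00:11:22'],
  [1, 'ansible_local::gls_ansible_hostname::_gls_ansible_hostname', 'host1'],
  [2, 'ansible_local::gls_ansible_mac::_mac_address', 'aa:bb:cc:33:44:55'],
]

# Build host_id => { fact_name => value } in a single pass,
# instead of loading all ~281 facts once per host.
facts_by_host = Hash.new { |h, k| h[k] = {} }
rows.each do |host_id, name, value|
  facts_by_host[host_id][name] = value if WANTED.include?(name)
end

facts_by_host.each do |host_id, facts|
  puts "#{host_id}: #{facts['ansible_local::gls_ansible_mac::_mac_address']}"
end
```

With this shape, writing the CSV becomes a lookup into facts_by_host per host rather than a per-host fetch of all facts.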
Hello, I've added a few optimizations to the template; take a look at the reproducer machine. The most important one: when host.facts is called, it always fetches all facts from the DB. Given we access this several times for each host, I added a cache variable per host. It can still take quite some time, since it still loads all of a host's facts (~281 per host) instead of just the 8 we're interested in. We can't improve that in the template itself without adding more optimizations to load just specific facts. OTOH I'd personally discourage using facts directly; where possible, native attributes should be used. E.g. instead of relying on a custom Ansible fact for the IP, customers should use host.ip (which is based on all fact sources, such as Puppet and subscription-manager). It's also much faster than reading it from the fact values storage (which means joining 2 SQL tables that tend to be very large). With the optimization I can render the report in 15 minutes for the entire inventory (nearly 3k hosts); I'm sure it would be much faster on the aforementioned 386 systems. Please share this update (and primarily the new version of the template) with the customer and let us know whether more optimizations are necessary.
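A minimal, self-contained sketch of the per-host caching idea described above (FakeHost and all names are illustrative stand-ins, not the actual template code): memoize the result of the expensive "load all facts" call so repeated column lookups for the same host hit the DB only once.

```ruby
class FakeHost
  attr_reader :fetch_count

  def initialize(facts)
    @all_facts = facts
    @fetch_count = 0
  end

  # Stands in for Host#facts, which loads every fact value from the DB.
  def facts
    @fetch_count += 1
    @all_facts
  end
end

sample = { 'ansible_local::example::_ip' => '10.0.0.5',
           'ansible_local::example::_owner' => 'ops' }

# Without a cache: each column lookup triggers a full fetch.
uncached = FakeHost.new(sample)
uncached.facts['ansible_local::example::_ip']
uncached.facts['ansible_local::example::_owner']
puts "uncached fetches: #{uncached.fetch_count}"

# With a cache variable per host: one fetch, many lookups.
cached = FakeHost.new(sample)
host_facts = cached.facts              # fetched once, reused below
host_facts['ansible_local::example::_ip']
host_facts['ansible_local::example::_owner']
puts "cached fetches: #{cached.fetch_count}"
```

With 8 fact columns per row, this turns 8 full-table fetches per host into 1.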
Just found the original filter; it takes 95 seconds to generate for those 386 hosts on the reproducer.
Hello, that filter was only used by me to test on a small number of hosts, but I do agree it has improved quite a lot. I believe the customer should be able to fetch these:

'MAC': host_facts['ansible_local::gls_ansible_mac::_mac_address'],
'IP_address': host_facts['ansible_local::gls_ansible_ipaddress::_gls_ansible_ipaddress'],
'hostname': host_facts['ansible_local::gls_ansible_hostname::_gls_ansible_hostname']

using host.mac, host.ip, and host.name instead. For the rest, he will still need to fetch the values from facts. I will report back here with the customer's response once I have shared these details with him. Thanks again for looking into the reproducer. -- Sayan
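Putting the two suggestions together, a hedged sketch of how the template's row could mix native attributes with the remaining fact lookups. The host_facts cache variable and gls_* fact names come from this thread; the load_hosts/report_row/report_render macros follow the usual Foreman report template style, but the surrounding loop structure is an assumption about this particular template, not its actual contents.

```
<%- load_hosts().each_record do |host| -%>
<%-   host_facts = host.facts -%>  <%# fetch all facts once per host %>
<%-   report_row(
        'hostname': host.name,
        'MAC': host.mac,
        'IP_address': host.ip,
        'Owner': host_facts['ansible_local::gls_ansible_owner::_owner'],
      ) -%>
<%- end -%>
<%= report_render -%>
```

The three native-attribute columns avoid the fact_names/fact_values join entirely; only the custom gls_* facts still go through host_facts.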