Bug 1800870 - "Ansible report timeout" setting is ignored as report origin is not properly set
Summary: "Ansible report timeout" setting is ignored as report origin is not properly set
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Ansible - Configuration Management
Version: 6.7.0
Hardware: x86_64
OS: Linux
medium
medium
Target Milestone: 6.9.0
Assignee: Ondřej Pražák
QA Contact: Lukas Pramuk
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2020-02-08 14:36 UTC by Pavel Moravec
Modified: 2024-03-25 15:40 UTC (History)
15 users (show)

Fixed In Version: tfm-rubygem-foreman_ansible-4.0.3.9-1,foreman-1.24.1.31-1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1884033 (view as bug list)
Environment:
Last Closed: 2021-04-21 13:12:25 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Foreman Issue Tracker 22804 0 Normal Closed Origin for report is not set reliably 2021-02-16 12:37:53 UTC
Foreman Issue Tracker 30408 0 Normal Closed ReportScanner should get all raw report data in case those could be useful. 2021-02-16 12:37:53 UTC
Github theforeman foreman-ansible-modules pull 870 0 None closed Add reporter to report logs sent by foreman callback 2021-02-16 12:37:52 UTC
Red Hat Bugzilla 1825761 0 medium CLOSED Ansible Role execution reports do not show Ansible Icon 2023-12-15 17:43:20 UTC
Red Hat Knowledge Base (Solution) 4801111 0 None None None 2020-02-08 14:52:58 UTC

Description Pavel Moravec 2020-02-08 14:36:17 UTC
Description of problem:
"Ansible report timeout" settings should set the timeout of "Out of sync" host. But the setting does not seem to be honoured. Rather, _double_ value of "Out of sync interval" is used.


Version-Release number of selected component (if applicable):
Sat 6.6.1


How reproducible:
100%


Steps to Reproduce:
1. Assign an Ansible Role to a Host
2. Apply the Role(s) via REX to the Host and remember the time
3. Click to the Host details repeatedly
4. Meantime, modify "Ansible report timeout" (Ansible tab) and "Out of sync interval" (General tab) in Settings and check what settings makes the Host "Out of sync".


Actual results:
"Ansible report timeout" is ignored. Rather double value of "Out of sync interval" is applied.


Expected results:
"Ansible report timeout" to be applied/used.


Additional info:
I think there are two bugs in /usr/share/foreman/app/models/host_status/configuration_status.rb :

    def out_of_sync?
      if (host && !host.enabled?) || no_reports? || out_of_sync_disabled?
        false
      else
        !reported_at.nil? && reported_at < (Time.now.utc - expected_report_interval)
      end
    end

    def expected_report_interval
      (reported_origin_interval + default_report_interval).minutes
    end

    def reported_origin_interval
      if last_report.origin &&
         (interval = Setting[:"#{last_report.origin.downcase}_interval"])
        interval.to_i
      else
        default_report_interval
      end
    end

..

    def default_report_interval
      Setting[:outofsync_interval]
    end


Bug 1: why expected_origin_interval = reported_origin_interval + default_report_interval ? Why _adding_ default? That often means the default is counted *twice*

Bug 2: why default_report_interval loads outofsync_interval Setting, while "Ansible report timeout" should be loaded here? B'cos status.last_report.origin is nil while it should be 'ansible' (to load Setting['ansible_timeout']. Dunno where in code we forget to set it.

The Bug 2 also means that "out_of_sync_disabled?" is evaluated to false, since:

    def out_of_sync_disabled?
      if last_report.origin
        Setting[:"#{last_report.origin.downcase}_out_of_sync_disabled"]
      else
        false
      end
    end



Some supporting data from rake console from a reproducer with outofsync_interval set to 60mins:

irb(main):041:0> host = Host.find_by_name("pmoravec-rhel76.gsslab.brq2.redhat.com")
=> #<Host::Managed id: 5, name: "pmoravec-rhel76.gsslab.brq2.redhat.com", last_compile: "2020-02-08 12:17:14", last_report: "2020-02-08 12:18:00", updated_at: "2020-02-08 12:18:01", created_at: "2020-02-03 12:35:58", root_pass: nil, architecture_id: 1, operatingsystem_id: 1, environment_id: nil, ptable_id: nil, medium_id: nil, build: false, comment: "", disk: nil, installed_at: nil, model_id: 1, hostgroup_id: nil, owner_id: 4, owner_type: "User", enabled: true, puppet_ca_proxy_id: nil, managed: false, use_image: nil, image_file: nil, uuid: nil, compute_resource_id: nil, puppet_proxy_id: nil, certname: nil, image_id: nil, organization_id: 1, location_id: 2, type: "Host::Managed", otp: nil, realm_id: nil, compute_profile_id: nil, provision_method: nil, grub_pass: "", discovery_rule_id: nil, global_status: 2, lookup_value_matcher: "fqdn=pmoravec-rhel76.gsslab.brq2.redhat.com", pxe_loader: nil, initiated_at: nil, build_errors: nil, openscap_proxy_id: nil>
irb(main):042:0> status = HostStatus::Status.where(:host_id => host.id, :type => 'HostStatus::ConfigurationStatus').count
=> 1
irb(main):043:0> status = HostStatus::Status.where(:host_id => host.id, :type => 'HostStatus::ConfigurationStatus').first
=> #<HostStatus::ConfigurationStatus id: 67, type: "HostStatus::ConfigurationStatus", status: 352321536, host_id: 5, reported_at: "2020-02-08 12:18:00">
irb(main):044:0> status.last_report.origin
=> nil
irb(main):045:0> status.expected_report_interval
=> 120 minutes
irb(main):046:0> Setting[:outofsync_interval]
=> 60
irb(main):047:0>

Comment 3 Pavel Moravec 2020-02-08 14:56:40 UTC
FYI there is similar upstream bug https://projects.theforeman.org/issues/25463 that rather describes opposite problem..? (imho close #25463 or recycle it for this BZ)

Comment 4 Ondřej Ezr 2020-07-13 10:10:17 UTC
This is higly likely being the same issue as BZ#1825761 as if the report status is not reported, the interval is not taken from its source.

The adding of values is probably meant as valid_for_minutes + out_of_sync so it doesn't go out_of_sync right after the report is invalid. It should not fallback to the default then probably :)

Comment 5 Ondřej Ezr 2020-07-13 11:29:41 UTC
*** Bug 1825761 has been marked as a duplicate of this bug. ***

Comment 6 Bryan Kearney 2020-07-13 12:01:16 UTC
Upstream bug assigned to oprazak

Comment 7 Bryan Kearney 2020-07-13 12:01:20 UTC
Upstream bug assigned to oprazak

Comment 8 Bryan Kearney 2020-08-03 12:01:14 UTC
Moving this bug to POST for triage into Satellite since the upstream issue https://projects.theforeman.org/issues/22804 has been resolved.

Comment 9 Ondřej Ezr 2020-08-03 19:45:00 UTC
This needs to pull in new version of the collection on the Capsule (eighter by updating ansible to upcoming release > 2.10.1) or manually.
The PR in question, that needs to get in https://github.com/theforeman/foreman-ansible-modules/pull/870#issuecomment-667656768

Comment 14 Lukas Pramuk 2020-10-22 11:41:20 UTC
FailedQA.

@Satellite 6.7.5 Snap1
tfm-rubygem-foreman_ansible-4.0.3.9-1.el7sat.noarch
foreman-1.24.1.31-1.el7sat.noarch

used reproducer described in comment#0:

>>> Host never gets out of sync now

>>> All ansible reports (from Play Roles, Ansible Command) have origin N/A

Comment 15 Ondřej Pražák 2020-10-22 13:27:09 UTC
Needs an update of Ansible. Alternatively, newer version of callback can be pulled in manually as mentioned in comment#9

Comment 16 Marek Hulan 2020-10-22 13:31:14 UTC
This must be tested with the callback from the new foreman-ansible-modules, this is not part of Satellite 6.7 as stated in comment 9 and 10. We can't backport the callback to older ansible 2.8 version.

In order to use the newer callback, do this

cd /usr/lib/python2.7/site-packages/ansible/plugins/callback/
mv foreman.py foreman.py.orig
wget https://raw.githubusercontent.com/theforeman/foreman-ansible-modules/develop/plugins/callback/foreman.py

and retest.

The origin should be set afterwards, therefore the status will be updated based on these Ansible reports.

This has been added to foreman-ansible-modules 1.1.0. 1.3.0 is published at automation hub if you want to get it from official source. I'm not sure, what exact version is built to newer ansible version. Moving back to ON_QA.

Comment 18 Evgeni Golov 2020-10-23 06:46:21 UTC
(In reply to Marek Hulan from comment #16)
> This must be tested with the callback from the new foreman-ansible-modules,
> this is not part of Satellite 6.7 as stated in comment 9 and 10. We can't
> backport the callback to older ansible 2.8 version.
> 
> In order to use the newer callback, do this
> 
> cd /usr/lib/python2.7/site-packages/ansible/plugins/callback/
> mv foreman.py foreman.py.orig
> wget
> https://raw.githubusercontent.com/theforeman/foreman-ansible-modules/develop/
> plugins/callback/foreman.py
> 
> and retest.

FWIW, this won't work as you can't use a collection callback on 2.8.
You'd have to "uncollection" it. (change the two theforeman.foreman.foreman names to "foreman").

Comment 22 Lukas Pramuk 2021-03-10 23:39:55 UTC
VERIFIED.

@Satellite 6.9.0 Snap16
foreman-2.3.1.14-1.el7sat.noarch
tfm-rubygem-foreman_ansible-6.1.1-1.el7sat.noarch
ansible-collection-redhat-satellite-2.0.1-1.el7sat.noarch
ansible-2.9.16-1.el7ae.noarch

by the manual reproducer based on description in https://bugzilla.redhat.com/show_bug.cgi?id=1884033#c0 and https://bugzilla.redhat.com/show_bug.cgi?id=1884033#c6:

1) Update foreman callback by installing ansible-collection-redhat-satellite and changing callback name from "foreman" to "redhat.satellite.foreman"

# satellite-maintain packages install ansible-collection-redhat-satellite
# sed -i 's/^callback_whitelist\s*=.*/callback_whitelist = redhat.satellite.foreman/' /etc/foreman-proxy/ansible.cfg

(expect this setting being overrun by an installer run)

>>> foreman callback is shipped by ansible so Satellite has no control over when it is updated
 
2) Change the Settings > General > Out of sync interval = 5 min , Ansible > Ansible report timeout = 10 mins
3) Assign an Ansible Role to a Host
4) Apply the Role(s) via REX to the Host

5) Refresh the Host details repeatedly to watch "Configuration" status turning from "Active" to "Out of sync"

>>> "Out of sync" status worn after "Out of sync interval" + "Ansible report timeout" time = 15 mins

6) Yet check the origin of the generated report by clicking [Reports] button at the host details page

>>> origin is set to "ansible" (A) icon, instead of "N/A"

>>> was hesitant to verify this BZ as the fix requires callback update which at the moment are manual steps since the foreman callback is currently shipped by ansible and not Satellite but I expect in near future "foreman" callback being replaced by "redhat.satellite.foreman" callback

Comment 25 errata-xmlrpc 2021-04-21 13:12:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Satellite 6.9 Release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:1313


Note You need to log in before you can comment on or make changes to this bug.