Bug 1836396 - [RFE] Expand reporting to include servers that haven't checked in recently
Summary: [RFE] Expand reporting to include servers that haven't checked in recently
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Reporting
Version: 6.7.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: Unspecified
Assignee: satellite6-bugs
QA Contact: Lukáš Hellebrandt
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-05-15 19:28 UTC by Gary Scarborough
Modified: 2021-04-28 21:49 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-06-19 15:29:54 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1842935 | 0 | unspecified | CLOSED | [RFE] Generate the report of patching results after the patching | 2023-01-20 00:58:03 UTC

Internal Links: 1842935

Description Gary Scarborough 2020-05-15 19:28:40 UTC
Can we expand the built-in reporting in Satellite to include a report on failed tasks and a report for hosts that haven't checked in recently?

Comment 1 Marek Hulan 2020-05-18 08:22:32 UTC
These are two separate RFEs: one for adding a report for failed tasks, the second for inactive hosts. Please open a new BZ for the second report and update this BZ to track just one. In both BZs, also specify what fields (columns) the user wants to see and, potentially, what inputs/customization the report should allow. Especially for "haven't checked in recently", also specify what information to base this on. There's last check-in information based on the Puppet report, package profile upload, facts upload from subscription-manager, and possibly others.

The good thing is that with 6.7.0, the inactive hosts report should already be possible without code changes. For failed tasks, what is the value of an explicit report? The tasks page now has better overview capabilities.

Related upstream issue - https://projects.theforeman.org/issues/28776

Comment 2 Marek Hulan 2020-05-18 08:39:49 UTC
Also, there are already similar existing requests:

inactive hosts
- https://bugzilla.redhat.com/show_bug.cgi?id=1731235 [RFE] Create Report Template to list inactive hosts
- https://bugzilla.redhat.com/show_bug.cgi?id=1792187 [RFE] last_checkin report template

So please review whether BZ 1731235 covers your needs; if not, add additional requests to BZ 1792187 and reuse the current BZ for just the failed tasks report.

Comment 3 JB 2020-05-18 13:50:30 UTC
It appears that BZ 1731235 may resolve this, but it's unclear as it is closed, from what I can tell (I am no expert in Red Hat Bugzillas).

In Satellite 5 we had a report that was emailed out, that would list inactive systems. Does this provide that?

Comment 4 Marek Hulan 2020-05-19 06:56:39 UTC
BZ 1731235 tracks adding access to `last_checkin(host)` in the template, which is based on the subscription-manager last check-in. This is merged upstream, so it should appear in the next Satellite release, meaning users will be able to write such a report or extend existing ones with this info. The built-in report itself is tracked in BZ 1792187 and is not yet implemented; however, it is easy to implement now.

Please note that every report result can be scheduled and emailed out, that's a generic feature of Satellite 6 reporting.

Comment 5 JB 2020-05-19 13:24:00 UTC
I have an example from Satellite 5:


This is the Red Hat Satellite Status Report for your account satadmin, as of 5/7/20 11:00:00 PM CDT.

This email will be sent when any of the following apply:

1.  The system fails to check in with Red Hat Satellite within a 24-hour window.
2.  The system registers scheduled action activity.

         



Systems Not Checking In:
-------------------------
The following systems recently stopped checking in with Red Hat Satellite:

System Id    System Name                    Last Checkin
1000147752   servername                     2020-05-06 16:22:33.386186
1000147751   servername                     2020-05-06 16:21:37.478874
1000011884   servername                     2020-05-06 16:20:45.921246
1000011900   servername                     2020-05-06 16:19:53.289969

Please note that inactive systems cannot receive any updates.

Follow this url to see the full list of inactive systems:
This was a direct link to Satellite 5 that could show all inactive systems, as it had a page dedicated to it.

Comment 6 Marek Hulan 2020-05-22 08:57:32 UTC
Thanks for the full example, that helps me better understand where the current gaps are. One more thing I need to clarify: in Sat 6, the architecture is different and there are more types of system check-in. E.g. a subscription-manager check-in happens whenever a yum operation is triggered. A Puppet agent run (every 30 minutes by default) is another example. An Ansible run or OpenSCAP scan may be another. It depends on what functionality users configure and use. So I'd like to understand whether the last check-in should take all such activities into account, or whether you're interested in e.g. one sort of system update. In other words, would you expect a system to be listed in this report if, e.g., we don't get an rpm package profile for more than 24 hours but we've seen the Puppet agent actively running on this system, so we know the machine is alive?
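
The two readings of "last check-in" in the question above can be sketched in a few lines. This is only an illustration of the decision, not how Satellite stores or evaluates check-ins: the function name `is_inactive`, the source names (`package_profile`, `puppet`) and the dictionary layout are all hypothetical.

```python
from datetime import datetime, timedelta


def is_inactive(checkins, now, threshold=timedelta(hours=24), sources=None):
    """Return True if the host has no check-in newer than now - threshold.

    checkins: dict mapping a (hypothetical) source name to its last
              check-in time as a datetime, or None if never seen.
    sources:  only these source names count; None means any source counts.
    """
    considered = [
        ts for name, ts in checkins.items()
        if ts is not None and (sources is None or name in sources)
    ]
    if not considered:
        # No relevant check-in at all: treat the host as inactive.
        return True
    return now - max(considered) > threshold


if __name__ == "__main__":
    now = datetime(2020, 5, 22, 12, 0, 0)
    host = {
        "package_profile": now - timedelta(hours=30),  # stale
        "puppet": now - timedelta(minutes=10),         # recent
    }
    print(is_inactive(host, now))                               # any source counts
    print(is_inactive(host, now, sources={"package_profile"}))  # one source only
```

With "any source counts", the host in the usage example is considered alive because of the recent Puppet run; restricted to the package profile alone, the same host is reported as inactive.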

Comment 7 JB 2020-05-22 13:19:16 UTC
From what I know, Puppet is being removed from Satellite. Also, I don't use it. I think I am relying on the subscription-manager check-in, but my example would be this: I have a VM that for whatever reason went offline, and Satellite will then do X to say it's not available for patching or any other management. I should get a report that signals it's no longer available to Satellite. However, I understand that everyone has a different setup, so you may need multiple solutions in place. Either way, I should know in a simple way when a server is not checking in for any one of the features I require.

Comment 9 Marek Hulan 2020-05-29 10:57:54 UTC
Alright. So I went ahead and created the report just to list inactive hosts based on the subscription-manager check-in. The interval is 4 hours by default. Users can customize the inactivity interval through a report input. As discussed above, the required functionality may differ per user, so we may need to extend this further, but it should already address the use case described here.

We can either add scheduling of recurring reports in the future (there's already a tracking BZ for that) or generate and send the result as an email automatically after patching. Meanwhile, once this hits a published version of Satellite, I'd advise using cron to check whether there are any inactive hosts. I can share the full curl example; the search syntax to find such hosts is the following:

>  last_checkin < "4 hours ago"

And if at least one is found through that API request, spawn another API request to generate this new report and send it out via email. Again, I could construct a curl example.
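
A minimal sketch of the first step of that cron check, in Python rather than curl. It assumes the hosts list endpoint at `/api/hosts` accepts a `search` parameter and returns a JSON body with a `subtotal` count of matching hosts; the hostname `satellite.example.com` and the helper names are hypothetical, and authentication and the actual HTTP call are left out.

```python
import json
from urllib.parse import urlencode

# Search term quoted from the comment above.
INACTIVE_SEARCH = 'last_checkin < "4 hours ago"'


def build_hosts_search_url(base_url, search, per_page=1):
    """Return a hosts API URL that applies the given search term."""
    query = urlencode({"search": search, "per_page": per_page})
    return f"{base_url.rstrip('/')}/api/hosts?{query}"


def count_matching_hosts(response_body):
    """Read the number of hosts matching the search from a JSON response.

    Assumes the list response carries a 'subtotal' field; falls back to
    the length of 'results' if it does not.
    """
    payload = json.loads(response_body)
    return int(payload.get("subtotal", len(payload.get("results", []))))


def should_send_report(response_body):
    """True if at least one inactive host was found."""
    return count_matching_hosts(response_body) > 0


if __name__ == "__main__":
    # Hypothetical Satellite hostname, for illustration only.
    print(build_hosts_search_url("https://satellite.example.com", INACTIVE_SEARCH))
```

Only when `should_send_report` returns true would the cron job go on to the second API request that renders and emails the report; that request is not sketched here.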

Comment 10 JB 2020-05-29 13:11:36 UTC
The only issue I have found with: last_checkin < "4 hours ago"
is that it also reports ESXi hosts that I have in the GUI. You need to add additional filters so that those VMware hosts don't show up.

last_checkin < Today and os = RedHat

Also, the report would need to be run before patching. Ideally, it would run daily. A server can go offline at any time, and we would otherwise have no way of knowing. I would then schedule patching and have no idea that the server is not available. In Satellite 5 this ran as a daily report with an email listing any servers that did not check in. There was also a web page dedicated to servers not checking in, and it could be added to the dashboard, so one could log on and instantly know if a server was not reporting in. Without any of those features I might as well still log on, run this command, and check manually myself. It should be automated, with emails generated when a server has not checked in for 4 hours, not just as an after-the-fact thought. That does me no good, and is no better than where we were before.

Comment 11 Marek Hulan 2020-06-01 09:18:42 UTC
Thanks again for the feedback. I'll update the new report template so that an additional condition can be specified; that's a good improvement.

> I would then schedule patching and have no idea that server is not available. In Satellite 5 this ran as a daily report with an email of any servers that did not check in.

Recurring scheduling, e.g. daily, is tracked under BZ 1838517. That is a bigger feature; however, we may also just create a new user mail notification type (to be found at User -> My Account -> Email Preferences). That would be similar to e.g. the "Discovered summary" notification. If you prefer that option, please let me know and I'll open a new RFE to add such a notification with a "Daily" option.

> As well there was a web page dedicated to servers not checking in and it could be added to the dashboard, so one could log on and instantly know if a server is not reporting in.

In Satellite 6, this can be achieved similarly. If you visit the Hosts -> All Hosts page, there's a filter bar above the table. You can enter the search term there, which you of course already know. However, you can also bookmark this search; see the bookmarks dropdown next to the Search button. The first option is "Bookmark this search". Then, whenever you want to look at the current list of inactive hosts, you just select the bookmark. You can build more complex search queries, e.g. to filter out hypervisors you can add "and hypervisor = false" (though I'd say your OS-based condition is a better fit in this case).
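
For scripted use, the same kind of combined search term can be assembled from the base condition and any extra filters mentioned in this thread. A small sketch; the helper name `build_search` is hypothetical, and it only joins conditions with "and" without validating them against the search syntax.

```python
def build_search(base, extra_conditions=()):
    """Join a base search term and optional extra conditions with 'and'."""
    parts = [base.strip()]
    # Skip empty or whitespace-only conditions.
    parts += [cond.strip() for cond in extra_conditions if cond.strip()]
    return " and ".join(parts)


if __name__ == "__main__":
    base = 'last_checkin < "4 hours ago"'
    # Conditions taken from the comments above.
    print(build_search(base, ["os = RedHat"]))
    print(build_search(base, ["hypervisor = false"]))
```

The resulting string can be pasted into the filter bar and bookmarked, or passed as the search term of an API request.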

> Without any of those features I might as well still log on, run this command and check myself manually. It should be automated with emails generated when one does not check in after 4 hours. Not just an after the fact thought. That does me no good, and is no better than where we were before.

Understood. I'm trying to split requests into their respective BZs, so we add functionality in small, logical, and trackable pieces. I see that resolving just this BZ won't address your whole RFE, so I'll link it to the related BZs based on the above.

Comment 12 JB 2020-06-01 12:53:10 UTC
Yes, sorry. I apologize for blurring the two BZs. Yes, it would be great to have a report after the job has run to notify whether there has been an error or not. I look forward to testing this.

Comment 13 Marek Hulan 2020-06-02 12:18:16 UTC
No need to apologize. I have opened a new BZ 1842935 and linked it with this one, plus attached the customer case to it. This new BZ tracks triggering the report rendering after the patching.

Comment 14 Bryan Kearney 2020-06-19 15:29:54 UTC
I am closing this out because, reading back, Marek has guided this to several other bugs:

Inactive Hosts: https://bugzilla.redhat.com/show_bug.cgi?id=1731235 (will be delivered with 6.8)
Last Checkin Report: https://bugzilla.redhat.com/show_bug.cgi?id=1792187 (not slated for a release)
Just Patched report: https://bugzilla.redhat.com/show_bug.cgi?id=1842935 (not slated for a release)

