Bug 621363 - Monitoring Details
Status: CLOSED WORKSFORME
Product: Red Hat Update Infrastructure for Cloud Providers
Classification: Red Hat
Component: Documentation
Version: 1.1
Hardware: All Linux
Priority: low
Severity: medium
Assigned To: Lana Brindley
QA Contact: wes hayutin
Depends On:
Blocks:
Reported: 2010-08-04 16:31 EDT by Jay Dobies
Modified: 2011-07-05 20:05 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-07-05 20:05:35 EDT


Description Jay Dobies 2010-08-04 16:31:23 EDT
Some information about monitoring is missing from the wiki that seeded the documentation.

The information in the docs about rhua summary is good. However, there is an additional feature that is less visible to users.

At regular intervals (every 2 minutes by default), a cron script runs the same tests as rhua summary. However, instead of printing user-readable output, it writes the results to a file.

This file is /tmp/cloude-healthcheck. (For what it's worth, "cloude" is not a typo for "cloud"; it is an artifact of the fact that we started out calling this cloude, for "Cloud Enablement".)

This file will contain two lines:
- An integer describing the state of the instance. 0 indicates everything is running successfully. A non-zero code indicates a problem.
- A timestamp of when the file was last written (expressed in seconds since the epoch, in UTC).
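The two-line layout described above could be read along these lines. This is only a sketch: the function name parse_healthcheck and the sample content are hypothetical, and it assumes the first line is the status integer and the second is the epoch timestamp, as described.

```python
from datetime import datetime, timezone


def parse_healthcheck(text):
    """Parse two-line healthcheck content into (status, written_at).

    Assumes line 1 is the integer status and line 2 is the time the
    file was written, in seconds since the epoch (UTC).
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) < 2:
        raise ValueError("expected a status line and a timestamp line")
    status = int(lines[0])
    # float() accepts both whole and fractional seconds; which one the
    # real file uses is not stated in this report.
    written_at = datetime.fromtimestamp(float(lines[1]), tz=timezone.utc)
    return status, written_at


# Made-up sample content, not taken from a real RHUA.
status, written_at = parse_healthcheck("0\n1280953883\n")
print(status, written_at.isoformat())
```

On a RHUA, the text passed in would be the content of /tmp/cloude-healthcheck.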

Customers can log into the RHUA, look at this file, and, based on the status code, report whether the RHUI is functioning correctly.
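A check like this could also take the timestamp into account, since a status of 0 in a file that has not been rewritten for a long time may simply mean the cron script is no longer running. The sketch below is an assumption, not part of the RHUI tooling: the is_healthy function and the tolerance of three missed runs are hypothetical, and only the 2-minute default interval comes from the description above.

```python
import time

CHECK_INTERVAL_SECONDS = 120  # default cron interval from the description
MAX_MISSED_RUNS = 3           # hypothetical tolerance before the file counts as stale


def is_healthy(status, written_at_epoch, now=None):
    """Return True if the status is 0 and the file was written recently.

    status: integer from the first line of the healthcheck file.
    written_at_epoch: seconds since the epoch from the second line.
    now: current time in epoch seconds (defaults to time.time()).
    """
    now = time.time() if now is None else now
    age = now - written_at_epoch
    fresh = age <= CHECK_INTERVAL_SECONDS * MAX_MISSED_RUNS
    return status == 0 and fresh


# A file written one minute ago with status 0 counts as healthy.
print(is_healthy(0, time.time() - 60))
```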

A little more detail on the status value (not sure how much of this you want to include or how best to phrase it): it is a bitwise OR of the failure codes of all the tests, so it indicates which components of the RHUI failed. The bit mappings are as follows:

1 - The channel synchronization from RHN failed.
2 - The mirror list on the RHUA is inaccessible.
4 - The RHUA could not access one or more CDS instances via HTTPS.
8 - The RHUA could not connect via yum to one or more CDS instances.
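Because the status is a bitwise OR, a single value can report several failures at once. For example, 5 is 1 | 4, meaning both the RHN synchronization and the HTTPS check failed. A minimal sketch of decoding the value, using the bit mappings listed above (the names FAILURE_BITS and decode_status are hypothetical):

```python
# Bit values and meanings as listed in this report.
FAILURE_BITS = {
    1: "The channel synchronization from RHN failed.",
    2: "The mirror list on the RHUA is inaccessible.",
    4: "The RHUA could not access one or more CDS instances via HTTPS.",
    8: "The RHUA could not connect via yum to one or more CDS instances.",
}


def decode_status(status):
    """Return the failure messages whose bits are set in status."""
    return [message for bit, message in FAILURE_BITS.items() if status & bit]


# 5 == 1 | 4: the RHN sync and the HTTPS check both failed.
for failure in decode_status(5):
    print(failure)
```

A status of 0 decodes to an empty list, which matches "everything is running successfully".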
Comment 1 John Ha 2010-08-05 17:17:31 EDT
Added these to the Monitoring chapter in the latest draft on docs stage:

http://documentation-stage.bne.redhat.com/docs/en-US/Red_Hat_Update_Infrastructure/1.1/html/Deployment_Guide/index.html

Thanks for the additions.
Comment 2 wes hayutin 2010-08-18 12:34:55 EDT
8.1. Automated Monitoring
The Red Hat Update Infrastructure also provides automated monitoring. At regular intervals (every 2 minutes by default), a cron script runs the same tests available in rhua summary. The script output is then written to the /tmp/cloude-healthcheck file, which contains the following information:
An integer describing the state of the instance. 0 indicates everything is running successfully. A non-zero code indicates that a problem was encountered.
A timestamp of when the file was last written (expressed in seconds since the epoch, in UTC).
To view the status of this automated monitoring, log into the RHUA and read the status code from this file.
The status value is a bitwise OR of the following codes, so it can indicate more than one failure:
1 — The channel synchronization from RHN has failed.
2 — The mirror list on the RHUA is inaccessible.
4 — The RHUA could not access one or more CDS instances via HTTPS.
8 — The RHUA could not connect via yum to one or more CDS instances.
