Bug 722584

Summary: Chapter 11. Monitoring the RHUI Updates
Product: Red Hat Update Infrastructure for Cloud Providers
Component: Documentation
Version: 2.0
Status: CLOSED CURRENTRELEASE
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: Jay Dobies <jason.dobies>
Assignee: Lana Brindley <lbrindle>
QA Contact: wes hayutin <whayutin>
CC: kbidarka, mhideo, sghai, tsanders
Doc Type: Bug Fix
Last Closed: 2011-07-29 04:56:58 UTC

Description Jay Dobies 2011-07-15 17:53:26 UTC
Sorry this is so late, it kinda slipped my mind that I had to write this up for you.

A lot of this goes away; it's a significantly thinner UI now and you've covered a lot of it already.

Previously, there were two commands: one user-readable and one programmatic. The thing is, most of the user-readable stuff has been moved into the synchronization screen. That screen shows the results of the most recent repo and CDS syncs, as well as whether the CDS is up.

So if you want to treat this in terms of a migration from the 1.2 monitoring docs to 2.0, you can mention that all the user-readable stuff has been moved and covered in Chapter 9.

The programmatic interface is simpler now too. The user calls a command on the rhui-manager. The exit code from that command indicates the health of the RHUI.

=====================
$ rhui-manager --username admin --password admin status

$ echo $?
0
=====================

Users should automate running that command and inspecting the exit code to determine if the RHUI is running correctly or not.
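
For illustration only, here is a minimal sketch of what that automation might look like as a shell function. The function name check_rhui and the RHUI_STATUS_CMD override are hypothetical (the override is only there so the check can be tried without a live RHUA), and the admin/admin credentials are the placeholders from the example above.

```shell
#!/bin/sh
# Sketch only: run the status command and report on its exit code.
# RHUI_STATUS_CMD is a hypothetical override for trying this out without
# a RHUA; by default the command from the example above is used.
# The admin/admin credentials are placeholders.

check_rhui() {
    # The expansion is left unquoted on purpose so the default command
    # string is split into the command name and its arguments.
    if ${RHUI_STATUS_CMD:-rhui-manager --username admin --password admin status} >/dev/null 2>&1; then
        echo "RHUI OK"
        return 0
    else
        # Any non-zero exit code means something needs attention.
        echo "RHUI PROBLEM: status command exited non-zero"
        return 1
    fi
}
```

A scheduler or monitoring tool would call check_rhui and act on its return value; how the alert is delivered is up to the site.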

Remove the list of exit code translations; they aren't valid anymore. It's zero = happy, non-zero = someone gets a text message in the wee hours of the morning.

Again, if you want to talk in terms of migration from 1.2 to 2.0, there's no timestamp field anymore since you're running the command on your own. Previously it was run from a cron job.

Also, you might want to mention that they really should use the "--username" and "--password" flags as indicated in the example above. This will bypass the authentication certificate, which is only valid for a week, at which point they'd need to log in again. So rather than having to keep updating that cert on their own, they can just pass the credentials every time and bypass it entirely.

(let me know if that last paragraph wasn't clear, I'm having trouble thinking straight today)

Comment 1 Jay Dobies 2011-07-15 17:54:53 UTC
While you're in there, add the following log files:


/var/log/pulp-cds/gofer.log
Location on CDS instances to look for errors related to the CDS function.

/var/log/gofer/agent.log
Location on CDS instances to look for communication-related errors, such as if the CDS cannot be registered. This will contain information letting the user know if the CDS was able to successfully connect to the QPID broker running on the RHUA.

Comment 2 Lana Brindley 2011-07-20 04:17:39 UTC
Updated in Revision 1-22. Please review on the stage.

LKB

Comment 3 Sachin Ghai 2011-07-20 06:25:37 UTC
Verified on the stage doc at:

http://documentation-stage.bne.redhat.com/docs/en-US/Red_Hat_Update_Infrastructure/2.0/html-single/Installation_Guide/index.html#part-Installation_Guide-Installation.

Chapter 11 is updated with the complete info described in comment 0 and comment 1.


However, I'm curious about the following statement, as I have never used Nagios. I'm not sure if it's valid in RHUI's context.

>>Status and monitoring information can also be obtained in a machine-readable state, for use with automated monitoring solutions such as Nagios. 

Any comments?

Comment 4 Lana Brindley 2011-07-25 23:27:03 UTC
That Nagios comment came from tech review feedback from Jay. Jay, can you confirm the accuracy of this statement, please?

LKB

Comment 5 Sachin Ghai 2011-07-26 06:05:49 UTC
Ah, okay. If it's from tech review and from Jay, then I'm fine with this statement. Thanks for checking this.

Moving this to verified.

Comment 6 Jay Dobies 2011-07-26 13:44:18 UTC
Here was my thinking. We don't want the customer to have to parse a human-readable status screen in order to hook into whatever monitoring solution they have. So the machine-readable state is simply a number. If that number is non-zero, they need to panic.
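
As a rough sketch of how that number could feed a Nagios-style check: Nagios plugins conventionally exit 0 for OK and 2 for CRITICAL, so a wrapper only needs to map "zero" to 0 and "anything else" to 2. The function name rhui_nagios_check and the RHUI_STATUS_CMD override are hypothetical, and the credentials are placeholders; this is not something shipped with RHUI.

```shell
#!/bin/sh
# Sketch of a Nagios-style wrapper around the status command.
# Plugin convention assumed here: exit 0 = OK, exit 2 = CRITICAL.
# RHUI_STATUS_CMD is a hypothetical override for trying this without a
# RHUA; the admin/admin credentials are placeholders.

rhui_nagios_check() {
    rc=0
    # Unquoted on purpose so the default command string splits into words.
    ${RHUI_STATUS_CMD:-rhui-manager --username admin --password admin status} >/dev/null 2>&1 || rc=$?
    if [ "$rc" -eq 0 ]; then
        echo "RHUI OK - status exited 0"
        return 0
    fi
    # No attempt is made to interpret the specific non-zero value.
    echo "RHUI CRITICAL - status exited $rc"
    return 2
}
```

The wrapper deliberately does not try to translate individual non-zero values, since only zero versus non-zero is meaningful here.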

Comment 7 Lana Brindley 2011-07-29 04:56:58 UTC
This book is now available at http://docs.redhat.com/docs/en-US/Red_Hat_Update_Infrastructure/2.0/html/Installation_Guide/index.html

Please raise a new bug for any further changes.

LKB