Bug 447637
Summary: | When a monitoring alert for CPU utilization is tripped, it never reverts | ||
---|---|---|---|
Product: | Red Hat Satellite 5 | Reporter: | Thomas Cameron <tcameron> |
Component: | Monitoring | Assignee: | Tomas Lestach <tlestach> |
Status: | CLOSED NOTABUG | QA Contact: | Preethi Thomas <pthomas> |
Severity: | low | Docs Contact: | |
Priority: | low | ||
Version: | 510 | CC: | cperry |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2009-04-20 11:12:28 UTC | Type: | --- |
Bug Blocks: | 463877 |
Description
Thomas Cameron
2008-05-20 21:54:55 UTC
Perl code is in:
trunk/eng/monitoring/PerlModules/NP/Probe/DataSource/UnixCommand.pm

Test routine to run:

    sub cpu {
        my $command = '/usr/bin/vmstat 5 2';
        my $results = `$command`;
        print "results: $results";
        my @lines = split("\n", $results);
        my @out;
        @out = split(' ', $lines[3]);
        my $cpu_pct_used;
        foreach my $o (@out) {
            print $o . "|";
        }
        if ($lines[1] =~ /.*st$/) {
            print "out[-4]: " . $out[-4] . " out[-5]: " . $out[-5] . "\n";
            $cpu_pct_used = $out[-4] + $out[-5];
        } else {
            print "out[-2]: " . $out[-2] . " out[-3]: " . $out[-3] . "\n";
            $cpu_pct_used = $out[-2] + $out[-3];
        }
        $cpu_pct_used = $out[-2] + $out[-3];
        return $cpu_pct_used;
    }

    my $pct = cpu();
    print "CPU: $pct\n";
    1;

We need to understand whether the client is failing to send the right information or whether, as I suspect, this is a display issue in the UI. I suggest Tomas look at reproducing it. I am not sure he will be able to track it down fully, but it is an interesting (to me) bug.

Cliff

The client sends the correct information (verified using rhn-runprobe). The UI also displays the correct information. The only potentially confusing point is the next probe schedule: when the probe status is not OK (for example, when a threshold is reached), a longer delay is set before the next scheduled probe run.
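The test routine quoted above picks the CPU columns by negative index, and as posted it reassigns $cpu_pct_used unconditionally after the if/else, which overrides the branch that handles the trailing "st" column. As a minimal sketch of an alternative (not the code in UnixCommand.pm), the columns can be looked up by header name instead. The helper name cpu_pct_used_from_vmstat and the sample vmstat text are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of UnixCommand.pm): compute CPU percent used
# from the text printed by `vmstat 5 2`, locating "us" and "sy" by header name
# rather than by position counted from the end of the line.
sub cpu_pct_used_from_vmstat {
    my ($results) = @_;
    my @lines = grep { /\S/ } split /\n/, $results;

    # Line 0 is the group banner, line 1 holds the column names, and the
    # last line is the second sample, i.e. the most recent interval.
    my @names  = split ' ', $lines[1];
    my @values = split ' ', $lines[-1];

    my %col;
    @col{@names} = @values;    # hash slice: column name => value

    die "vmstat output has no us/sy columns\n"
        unless defined $col{us} && defined $col{sy};

    return $col{us} + $col{sy};
}

# Sample output with the trailing "st" column (values invented for illustration).
my $sample = <<'END';
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 512000  20000 300000    0    0     5     3  100  200  2  1 96  1  0
 1  0      0 511000  20000 300000    0    0     0     8  120  250  5  2 92  1  0
END

print "CPU: ", cpu_pct_used_from_vmstat($sample), "\n";
```

Looking columns up by name means the same routine works whether or not the vmstat on the client prints the "wa" and "st" columns, so no per-format branch is needed.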
Behaviour is correct in: Satellite-5.3.0-RHEL5-re20090403.2

Example: logged probe events (with "Probe Check Interval" set to 1 minute), available in the web UI in CSV format:

Id | Data | Time | Metric
---|---|---|---
1-43-pctused | 0 | 04/17/09 03:37 PM | pctused
1-43-pctused | 0 | 04/17/09 03:38 PM | pctused
1-43-pctused | 100 | 04/17/09 03:39 PM | pctused
1-43-pctused | 0 | 04/17/09 03:44 PM | pctused
1-43-pctused | 0 | 04/17/09 03:46 PM | pctused
1-43-pctused | 0 | 04/17/09 03:47 PM | pctused
1-43-pctused | 1 | 04/17/09 03:48 PM | pctused
1-43-pctused | 0 | 04/17/09 03:49 PM | pctused
1-43-pctused | 1 | 04/17/09 03:50 PM | pctused
1-43-pctused | 6 | 04/17/09 03:51 PM | pctused
1-43-pctused | 1 | 04/17/09 03:53 PM | pctused
1-43-pctused | 3 | 04/17/09 03:54 PM | pctused
1-43-pctused | 100 | 04/17/09 03:55 PM | pctused
1-43-pctused | 3 | 04/17/09 04:01 PM | pctused
1-43-pctused | 4 | 04/17/09 04:02 PM | pctused
1-43-pctused | 1 | 04/17/09 04:03 PM | pctused
1-43-pctused | 3 | 04/17/09 04:04 PM | pctused
1-43-pctused | 3 | 04/17/09 04:05 PM | pctused
1-43-pctused | 2 | 04/17/09 04:06 PM | pctused
1-43-pctused | 5 | 04/17/09 04:08 PM | pctused
1-43-pctused | 100 | 04/17/09 04:09 PM | pctused
1-43-pctused | 4 | 04/17/09 04:14 PM | pctused

I set the probe thresholds according to the description (10% warning and 30% critical).
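The longer delay after a non-OK reading can be read directly from the logged events. The following is a rough sketch, not Satellite code, that computes the gap in minutes between each event and the next one and groups the gaps by whether the reading was at or above the 10% warning threshold. The helper names minutes_of_day and gaps_by_status are hypothetical; the values and times are copied from the log above, all dated 04/17/09.

```perl
use strict;
use warnings;

my $WARN = 10;    # warning threshold (percent) used in this bug

# (value, time) pairs copied from the logged probe events above.
my @log = (
    [0,   '03:37 PM'], [0,   '03:38 PM'], [100, '03:39 PM'], [0, '03:44 PM'],
    [0,   '03:46 PM'], [0,   '03:47 PM'], [1,   '03:48 PM'], [0, '03:49 PM'],
    [1,   '03:50 PM'], [6,   '03:51 PM'], [1,   '03:53 PM'], [3, '03:54 PM'],
    [100, '03:55 PM'], [3,   '04:01 PM'], [4,   '04:02 PM'], [1, '04:03 PM'],
    [3,   '04:04 PM'], [3,   '04:05 PM'], [2,   '04:06 PM'], [5, '04:08 PM'],
    [100, '04:09 PM'], [4,   '04:14 PM'],
);

# Convert "hh:mm AM|PM" to minutes since midnight.
sub minutes_of_day {
    my ($time) = @_;
    my ($h, $m, $ampm) = $time =~ /^(\d+):(\d+)\s+(AM|PM)$/
        or die "unparseable time: $time\n";
    $h %= 12;                    # 12 o'clock counts as hour 0 before the PM shift
    $h += 12 if $ampm eq 'PM';
    return $h * 60 + $m;
}

# Return two array refs: gaps (in minutes) that follow non-OK readings,
# and gaps that follow OK readings. Assumes all events fall on one day.
sub gaps_by_status {
    my ($warn, @events) = @_;
    my (@after_alert, @after_ok);
    for my $i (0 .. $#events - 1) {
        my ($value, $time) = @{ $events[$i] };
        my $gap = minutes_of_day($events[$i + 1][1]) - minutes_of_day($time);
        if ($value >= $warn) { push @after_alert, $gap }
        else                 { push @after_ok,    $gap }
    }
    return (\@after_alert, \@after_ok);
}

my ($after_alert, $after_ok) = gaps_by_status($WARN, @log);
print "gaps after non-OK readings: @$after_alert\n";
print "gaps after OK readings:     @$after_ok\n";
```

On this log the gaps after the three readings of 100 are 5, 6 and 5 minutes, while every gap after an OK reading is 1 or 2 minutes, which matches the explanation that the next probe run is scheduled later when the status is not OK.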