Bug 498468 - Monitoring, RHN Satellite: Disk Space probe doesn't work w/ SELinux in permissive mode
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Satellite 5
Classification: Red Hat
Component: Monitoring
Version: 530
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Tomas Lestach
QA Contact: wes hayutin
URL: http://grandprix.rhndev.redhat.com/rh...
Whiteboard:
Depends On:
Blocks: 463877
Reported: 2009-04-30 16:02 UTC by wes hayutin
Modified: 2009-09-10 18:49 UTC

Fixed In Version: sat530
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-09-10 18:49:28 UTC
Target Upstream Version:
Embargoed:



Description wes hayutin 2009-04-30 16:02:02 UTC
Description of problem:

4/24.1 

Probe(s) assigned to system have an UNKNOWN status   	 RHN Satellite: Disk Space   	 Filesystem: Cannot find match for "/dev/hda2"  	 Suite
Probe(s) assigned to system have an UNKNOWN status 	RHN Satellite: Disk Space Virtual 	Filesystem: Cannot find match for "/dev/xvda2" 


Steps to reproduce:
1. Set up monitoring with the server and clients in SELinux permissive mode.
2. Register another Satellite as a client.
3. Create an RHN Satellite: Disk Space probe and push the scout config.

Tried on two clients:


    Device Boot      Start         End      Blocks   Id  System
/dev/xvda1   *           1          13      104391   83  Linux
/dev/xvda2              14        5221    41833260   8e  Linux LVM
[root@riverraid .ssh]# 

Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/hda1   *           1          13      104391   83  Linux
/dev/hda2              14        7296    58500697+  8e  Linux LVM
[root@rlx-0-10 .ssh]#

Comment 1 Clifford Perry 2009-05-07 20:04:16 UTC
Not sure whether this is user error or something wrong with the probe. I thought the disk check ran '/bin/df -k'; maybe the user is entering /dev/hda2, so it is trying '/bin/df -k /dev/hda2' and failing. So we need to figure out whether the probe works with defaults or whether this is user error.

To test: on a Satellite with Monitoring enabled, go to any registered system, give it a Monitoring entitlement, and create an RHN Satellite: Disk Space probe. Schedule a Scout Config Push and wait for it to complete.

The Troubleshooting section of the Reference Guide for Monitoring explains how to run probes manually. If you get stuck, poke someone to help, including me if needed.
Hopefully Milan or Mirek can help locally if needed :)

Cliff.

Comment 2 wes hayutin 2009-05-07 20:16:05 UTC
This bug pertains to a Satellite with SELinux in permissive mode. When the server is in enforcing mode, you get the errors described here:
https://bugzilla.redhat.com/show_bug.cgi?id=498458

Comment 3 Clifford Perry 2009-05-07 20:18:26 UTC
This did indeed seem to function with a 5.1 Satellite:

https://rlx-1-16.rhndev.redhat.com/rhn/systems/details/probes/ProbeDetails.do?probe_id=24&sid=1000010073

Cliff

Comment 4 Tomas Lestach 2009-05-21 10:18:47 UTC
I'd say it's a misunderstanding of the "RHN Satellite" probe group. These probes are executed locally on the Satellite itself, not on registered clients. So no remote connections, no rhnmd, and no clients are necessary; the probe commands simply run locally.
Try running "df -k" directly on the Satellite to check whether it lists any "/dev/hda2" or "/dev/xvda2" device.

I agree the RHN Satellite probes are confusing, because they are "assigned" to registered clients even though they have nothing to do with them.
I filed BZ#501918 for that.
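The check suggested above (does the Satellite's own "df -k" list the configured device?) can be sketched as follows. This is a minimal illustration, not the probe's actual code: the function names `parse_df` and `find_filesystem` are hypothetical, and the sample text is a trimmed copy of the Satellite's "df -k" output posted later in this report. Note that df wraps long device names onto their own line, so the parser rejoins those rows.

```python
def parse_df(output):
    """Parse `df -k` text into {device: row}, rejoining wrapped rows."""
    rows = {}
    pending = None  # device name whose numbers appear on the next line
    for line in output.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if not fields:
            continue
        if pending is None and len(fields) == 1:
            pending = fields[0]  # long device name; numbers follow
            continue
        if pending is not None:
            fields = [pending] + fields
            pending = None
        device, blocks, used, avail, pct, mount = fields[:6]
        rows[device] = {
            "kblocks": int(blocks),
            "used_kb": int(used),
            "avail_kb": int(avail),
            "pct_used": int(pct.rstrip("%")),
            "mount": mount,
        }
    return rows


def find_filesystem(rows, name):
    """Return the row for `name`, or a message shaped like the one in this report."""
    if name not in rows:
        return 'Cannot find match for "%s"' % name
    return rows[name]


# Trimmed sample of the Satellite's own `df -k` output from this report.
SATELLITE_DF = """\
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      10482240   7552200   2397560  76% /
/dev/dasda1              99168     15932     78120  17% /boot
none                   2055200         0   2055200   0% /dev/shm
"""

rows = parse_df(SATELLITE_DF)
print(find_filesystem(rows, "/dev/hda2"))    # a client's device, absent here
print(find_filesystem(rows, "/dev/dasda1"))  # present on the Satellite
```

Under these assumptions, a device name taken from a client's partition table has no row in the Satellite's output, which matches the UNKNOWN status in the description.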

Comment 5 wes hayutin 2009-05-21 12:04:19 UTC
Comment #4 is not correct.

For the probe to get executed locally the satellite itself would have to be registered to itself.  Registering a satellite to itself as far as I know is not supported, and not practical.

Comment 6 Tomas Lestach 2009-05-21 12:35:57 UTC
I definitely do not mean registering a satellite to itself.

Try running the following as the nocpulse user on the Satellite:

# rhn-runprobe --probe <your RHN Satellite: Disk Space id> --log all=3
and compare the output of the probe command (NOCpulse::Probe::Shell::Unix::read_result stdout) with
# df -k

You should get the same disk space usage.
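The comparison described above can be sketched roughly as below. It assumes the log wraps each captured stdout in ">>>" and "<<<" markers, as in the rhn-runprobe excerpt posted later in this report; the helper names `df_block_from_log` and `device_names` are hypothetical. Because usage figures can change between two runs, the sketch compares device names rather than block counts.

```python
import re

# Captures whatever the log prints between ">>>" and "<<<" after "read_result stdout:".
STDOUT_BLOCK = re.compile(r"read_result stdout: >>>(.*?)<<<", re.DOTALL)


def df_block_from_log(log_text):
    """Return the df-style stdout block from an rhn-runprobe log, or None."""
    for block in STDOUT_BLOCK.findall(log_text):
        if block.lstrip().startswith("Filesystem"):
            return block.strip()
    return None


def device_names(df_text):
    """List first-column device names, skipping the header and wrapped number rows."""
    names = []
    for line in df_text.splitlines()[1:]:
        if line and not line[0].isspace():
            names.append(line.split()[0])
    return names


# Shortened sample modeled on the log excerpt in this report.
SAMPLE_LOG = """\
2009-08-04 13:12:17 NOCpulse::Probe::Shell::Unix::read_result stdout: >>>Linux#2.6.9-89.0.3.EL#23693
<<<
2009-08-04 13:12:17 NOCpulse::Probe::Shell::Unix::read_result stderr: >>><<<
2009-08-04 13:12:18 NOCpulse::Probe::Shell::Unix::read_result stdout: >>>Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      10482240   7552216   2397544  76% /
/dev/dasda1              99168     15932     78120  17% /boot
<<<
"""

probe_df = df_block_from_log(SAMPLE_LOG)
print(device_names(probe_df))
```

Feeding the text of a manual "df -k" run through `device_names` and comparing the two lists would show whether the probe sees the same filesystems as the shell.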

Comment 8 wes hayutin 2009-05-28 14:33:45 UTC
ok.. this is working
2009-05-28 10:31:37 
============================================================
CRITICAL: Filesystem /dev/mapper/VolGroup00-LogVol00 (/): Space used 18,348 MB (above critical threshold of 85 MB); Filesystem pct used 56%; Space available 14,958 MB
============================================================
-bash-3.2$ 

What is confusing here is that all probes, including RHN Satellite probes, need to have a client attached to them; however, the probe is actually executed on the Satellite.

bad form for us, and probably not documented.

but it passes
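The CRITICAL result above comes from comparing space used, in MB, against a critical threshold of 85. A minimal sketch of that kind of check follows, assuming a plain "value above maximum" rule; the function `classify` and its parameters are hypothetical and are not taken from the NOCpulse code. The status names are the ones that appear in the probe output in this report.

```python
def classify(value, warn_max=None, crit_max=None):
    """Return a status name for a metric, given optional maximum thresholds."""
    if crit_max is not None and value > crit_max:
        return "CRITICAL"
    if warn_max is not None and value > warn_max:
        return "WARNING"
    return "OK"


# Figures from the probe output above: 18,348 MB used, 56% used.
print(classify(18348, crit_max=85))  # threshold applied to MB used
print(classify(56, crit_max=85))     # the same number applied to percent used
```

The report does not say whether 85 was meant as a percentage; the sketch only shows that the same number gives different results depending on which metric it is attached to.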

Comment 9 wes hayutin 2009-05-28 14:36:21 UTC
heh.. I take that back.. it *is* documented

 Click create new probe and select the Satellite Probe Command Group. Next, complete the remaining fields as you would for any other probe. Refer to Section 7.5.1, “Managing Probes” for instructions.

Although the RHN Server appears to be monitored by the client system, the probe is actually run from the server on itself. Thresholds and notifications work normally.

Comment 10 John Matthews 2009-08-04 17:14:04 UTC
Moving to RELEASE_PENDING

steps below
Client: dhcp77-100
Satellite: Stage ISO 7/24 rhndev1


Created a Xen guest 77-100, gave it Monitoring entitlement, selinux is set to Permissive

[root@dhcp77-100 ~]# hostname
dhcp77-100.rhndev.redhat.com
[root@dhcp77-100 ~]# getenforce 
Permissive


Create a probe for disk usage on /dev/xvda1

[root@dhcp77-100 ~]# df -k
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                       2380564    958776   1298908  43% /
/dev/xvda1              101086     13140     82727  14% /boot
tmpfs                   262232         0    262232   0% /dev/shm


Running the probe on the Satellite:


[nocpulse@rhndev1 ~]$ rhn-catalog 
22 ServiceProbe on dhcp77-100.rhndev.redhat.com (10.10.77.100    ): Linux: Disk Usage

[nocpulse@rhndev1 ~]$ rhn-runprobe --probe 22 --log all=3
2009-08-04 13:07:20 NOCpulse::Probe::Shell::Unix::connect Execute '/usr/bin/ssh -l nocpulse -p 4545 -i /var/lib/nocpulse/.ssh/nocpulse-identity -o StrictHostKeyChecking=no -o BatchMode=yes 10.10.77.100 /bin/sh -s'
2009-08-04 13:07:20 NOCpulse::Probe::Shell::Unix::connect Opened pipe to 22190
2009-08-04 13:07:21 NOCpulse::Probe::Shell::Unix::read_result stdout: >>>Linux#2.6.18-128.el5xen#1322
<<<
2009-08-04 13:07:21 NOCpulse::Probe::Shell::Unix::read_result stderr: >>><<<
2009-08-04 13:07:21 NOCpulse::Probe::Shell::Unix::read_result status: >>>0<<<
2009-08-04 13:07:21 NOCpulse::Probe::Shell::Unix::connect OS Linux 2.6.18-128.el5xen, shell pid 1322
2009-08-04 13:07:21 NOCpulse::Probe::Shell::Unix::connect OK
2009-08-04 13:07:21 NOCpulse::Probe::Shell::AbstractShell::run /bin/df -k
2009-08-04 13:07:21 NOCpulse::Probe::Shell::Unix::read_result stdout: >>>Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                       2380564    958776   1298908  43% /
/dev/xvda1              101086     13140     82727  14% /boot
tmpfs                   262232         0    262232   0% /dev/shm
<<<
2009-08-04 13:07:21 NOCpulse::Probe::Shell::Unix::read_result stderr: >>><<<
2009-08-04 13:07:21 NOCpulse::Probe::Shell::Unix::read_result status: >>>0<<<
2009-08-04 13:07:21 NOCpulse::Probe::Threshold::value_crossed pctused did not cross
2009-08-04 13:07:21 NOCpulse::Probe::Threshold::value_crossed space_avail did not cross
2009-08-04 13:07:21 NOCpulse::Probe::Threshold::value_crossed space_used did not cross
2009-08-04 13:07:21 NOCpulse::Probe::Result::_calc_overall_status Overall status OK
2009-08-04 13:07:21 NOCpulse::Probe::Result::_calc_changed_items Overall status changed:  NO
2009-08-04 13:07:21 NOCpulse::Probe::ItemStatus::should_notify for 'space_avail'? NO
	Status OK, prior OK, due for renotify '1', worse status ''
	Notify flags CRITICAL =>  OK =>  UNKNOWN =>  WARNING =>  
2009-08-04 13:07:21 NOCpulse::Probe::Result::_calc_notifying_items Item 'space_avail'  WILL NOT notify: status OK, item thinks it should: NO, -1 elapsed seconds does trigger renotification
2009-08-04 13:07:21 NOCpulse::Probe::ItemStatus::should_notify for 'space_used'? NO
	Status OK, prior OK, due for renotify '1', worse status ''
	Notify flags CRITICAL =>  OK =>  UNKNOWN =>  WARNING =>  
2009-08-04 13:07:21 NOCpulse::Probe::Result::_calc_notifying_items Item 'space_used'  WILL NOT notify: status OK, item thinks it should: NO, -1 elapsed seconds does trigger renotification
2009-08-04 13:07:21 NOCpulse::Probe::ItemStatus::should_notify for 'pctused'? NO
	Status OK, prior OK, due for renotify '1', worse status ''
	Notify flags CRITICAL =>  OK =>  UNKNOWN =>  WARNING =>  
2009-08-04 13:07:21 NOCpulse::Probe::Result::_calc_notifying_items Item 'pctused'  WILL NOT notify: status OK, item thinks it should: NO, -1 elapsed seconds does trigger renotification
2009-08-04 13:07:21 NOCpulse::Probe::Result::_calc_notifying_items 0 items need notification
2009-08-04 13:07:21 NOCpulse::Probe::Result::_format_messages Message: Filesystem /dev/xvda1 (/boot): Filesystem pct used 14%; Space available 80 MB; Space used 12 MB
2009-08-04 13:07:21 	No items changed
2009-08-04 13:07:21 	Notification not required
2009-08-04 13:07:21 	NOTE: Running in test mode; no changes saved, nothing enqueued
2009-08-04 13:07:21 
============================================================
OK: Filesystem /dev/xvda1 (/boot): Filesystem pct used 14%; Space available 80 MB; Space used 12 MB
============================================================
2009-08-04 13:07:21 NOCpulse::Probe::Shell::Unix::disconnect Disconnecting
2009-08-04 13:07:22 NOCpulse::Probe::Shell::Unix::_kill_child Child exit code 0
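The MB figures in the probe message can be checked against the "df -k" columns. The values in this report are consistent with integer division of the 1K-block counts by 1024; this is inferred from the numbers only, not from the probe source, and the helper names `kb_to_mb` and `summarize` are hypothetical.

```python
def kb_to_mb(kb):
    """Convert 1K-blocks to whole MB, discarding the remainder."""
    return kb // 1024


def summarize(device, mount, used_kb, avail_kb, pct_used):
    """Build a message in the same shape as the probe's result line."""
    return (
        "Filesystem %s (%s): Filesystem pct used %d%%; "
        "Space available %d MB; Space used %d MB"
        % (device, mount, pct_used, kb_to_mb(avail_kb), kb_to_mb(used_kb))
    )


# /dev/xvda1 row from the client's `df -k` output above.
print(summarize("/dev/xvda1", "/boot", 13140, 82727, 14))
```

With the /dev/xvda1 row (13140 KB used, 82727 KB available), this gives 12 MB used and 80 MB available, matching the probe message.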




Also verified that the "RHN Satellite: Disk Space" probe runs correctly on the Satellite itself.




[nocpulse@rhndev1 ~]$ df -k
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      10482240   7552200   2397560  76% /
/dev/dasda1              99168     15932     78120  17% /boot
none                   2055200         0   2055200   0% /dev/shm
jwm-devel.usersys.redhat.com:/shared/export/rhndev1.z900/var/satellite
                     428169408 351453088  59316512  86% /var/satellite
/root/Satellite-5.3.0-RHEL4-re20090724.0-s390x.iso
                        517774    517774         0 100% /mnt/iso
[nocpulse@rhndev1 ~]$ rhn-catalog 
5 ServiceProbe on fjs-0-03.rhndev.redhat.com (10.10.76.130    ): Linux: Interface Traffic
6 ServiceProbe on fjs-0-03.rhndev.redhat.com (10.10.76.130    ): General: Uptime SNMP
7 ServiceProbe on fjs-0-03.rhndev.redhat.com (10.10.76.130    ): Linux: Disk I/O Throughput
8 ServiceProbe on fjs-0-03.rhndev.redhat.com (10.10.76.130    ): Linux: CPU Usage
22 ServiceProbe on dhcp77-100.rhndev.redhat.com (10.10.77.100    ): Linux: Disk Usage
24 ServiceProbe on dhcp77-100.rhndev.redhat.com (10.10.77.100    ): RHN Satellite: Disk Space
[nocpulse@rhndev1 ~]$ rhn-runprobe --probe 24 --log all=3
2009-08-04 13:12:17 NOCpulse::Probe::Shell::Unix::connect Execute '/bin/sh -s'
2009-08-04 13:12:17 NOCpulse::Probe::Shell::Unix::connect Opened pipe to 23693
2009-08-04 13:12:17 NOCpulse::Probe::Shell::Unix::read_result stdout: >>>Linux#2.6.9-89.0.3.EL#23693
<<<
2009-08-04 13:12:17 NOCpulse::Probe::Shell::Unix::read_result stderr: >>><<<
2009-08-04 13:12:17 NOCpulse::Probe::Shell::Unix::read_result status: >>>0<<<
2009-08-04 13:12:17 NOCpulse::Probe::Shell::Unix::connect OS Linux 2.6.9-89.0.3.EL, shell pid 23693
2009-08-04 13:12:17 NOCpulse::Probe::Shell::Unix::connect OK
2009-08-04 13:12:17 NOCpulse::Probe::Shell::AbstractShell::run /bin/df -k
2009-08-04 13:12:18 NOCpulse::Probe::Shell::Unix::read_result stdout: >>>Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      10482240   7552216   2397544  76% /
/dev/dasda1              99168     15932     78120  17% /boot
none                   2055200         0   2055200   0% /dev/shm
jwm-devel.usersys.redhat.com:/shared/export/rhndev1.z900/var/satellite
                     428169408 351453088  59316512  86% /var/satellite
/root/Satellite-5.3.0-RHEL4-re20090724.0-s390x.iso
                        517774    517774         0 100% /mnt/iso
<<<
2009-08-04 13:12:18 NOCpulse::Probe::Shell::Unix::read_result stderr: >>><<<
2009-08-04 13:12:18 NOCpulse::Probe::Shell::Unix::read_result status: >>>0<<<
2009-08-04 13:12:18 NOCpulse::Probe::Threshold::value_crossed pctused did not cross
2009-08-04 13:12:18 NOCpulse::Probe::Threshold::value_crossed space_avail did not cross
2009-08-04 13:12:18 NOCpulse::Probe::Threshold::value_crossed space_used did not cross
2009-08-04 13:12:18 NOCpulse::Probe::Result::_calc_overall_status Overall status OK
2009-08-04 13:12:18 NOCpulse::Probe::Result::_calc_changed_items First run, marking everything as changed
2009-08-04 13:12:18 NOCpulse::Probe::Result::_calc_changed_items Overall status changed:  YES
2009-08-04 13:12:18 NOCpulse::Probe::Result::_calc_notifying_items 0 items need notification
2009-08-04 13:12:18 NOCpulse::Probe::Result::_format_messages Message: Filesystem /dev/dasda1 (/boot): Filesystem pct used 17%; Space available 76 MB; Space used 15 MB
2009-08-04 13:12:18 	Items changed or removed:
2009-08-04 13:12:18 		space_avail '76' is OK
2009-08-04 13:12:18 		space_used '15' is OK
2009-08-04 13:12:18 		pctused '17' is OK
2009-08-04 13:12:18 	Notification not required
2009-08-04 13:12:18 	NOTE: Running in test mode; no changes saved, nothing enqueued
2009-08-04 13:12:18 
============================================================
OK: Filesystem /dev/dasda1 (/boot): Filesystem pct used 17%; Space available 76 MB; Space used 15 MB
============================================================
2009-08-04 13:12:18 NOCpulse::Probe::Shell::Unix::disconnect Disconnecting
2009-08-04 13:12:18 NOCpulse::Probe::Shell::Unix::_kill_child Child exit code 0
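The two logs above differ in their first "connect Execute" line: the "Linux: Disk Usage" probe opens an ssh session to the client, while the "RHN Satellite: Disk Space" probe starts a local shell. A small sketch for telling the two apart from a log follows; the function name `probe_transport` is hypothetical, and the rule (first word of the executed command is ssh) is an assumption based only on these two excerpts.

```python
import re

# Matches the quoted command in a "connect Execute '...'" log line.
EXECUTE_LINE = re.compile(r"connect Execute '([^']*)'")


def probe_transport(log_text):
    """Return 'remote' if the probe shell was started over ssh, 'local' otherwise."""
    match = EXECUTE_LINE.search(log_text)
    if match is None:
        return None
    parts = match.group(1).split()
    if not parts:
        return None
    program = parts[0]
    if program == "ssh" or program.endswith("/ssh"):
        return "remote"
    return "local"


# The two Execute lines from the logs in this comment.
REMOTE = ("2009-08-04 13:07:20 NOCpulse::Probe::Shell::Unix::connect Execute "
          "'/usr/bin/ssh -l nocpulse -p 4545 -i /var/lib/nocpulse/.ssh/nocpulse-identity "
          "-o StrictHostKeyChecking=no -o BatchMode=yes 10.10.77.100 /bin/sh -s'")
LOCAL = "2009-08-04 13:12:17 NOCpulse::Probe::Shell::Unix::connect Execute '/bin/sh -s'"

print(probe_transport(REMOTE))
print(probe_transport(LOCAL))
```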

Comment 11 Brandon Perkins 2009-09-10 18:49:28 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-1434.html

