Bug 498062 - monitoring fails to restart after execution of probes
Status: CLOSED CURRENTRELEASE
Product: Red Hat Satellite 5
Classification: Red Hat
Component: Monitoring
Version: 530
Hardware: All Linux
Priority: low, Severity: medium
Assigned To: Milan Zazrivec
QA Contact: wes hayutin
Depends On:
Blocks: 463877
Reported: 2009-04-28 13:33 EDT by wes hayutin
Modified: 2009-10-28 15:49 EDT

See Also:
Fixed In Version: sat530-unconfirmed
Doc Type: Bug Fix
Last Closed: 2009-10-28 15:49:33 EDT
Attachments
spacewalk-debug (6.12 MB, application/octet-stream)
2009-04-28 13:33 EDT, wes hayutin
Description wes hayutin 2009-04-28 13:33:02 EDT
Created attachment 341621
spacewalk-debug

Description of problem:

Build 4/24.1 on a RHEL 5 server.

Steps to recreate:
SELinux is permissive, set with setenforce (so the config may be different).
1. Set up monitoring.
2. Set up some clients and set up probes for the clients.
3. Probes used: linux:memory, ping, linux:virt-memory, etc.

4. Disable monitoring in the web UI admin config.
5. Restart the satellite.
6. Enable monitoring.
7. Restart the satellite.

Result:
[ OK ]
Starting MonitoringScout ...  [ FAIL ]
Starting NPBootstrap ...  2009-04-28 13:12:13 NPBootstrap:      !! ERROR FROM SHELL COMMAND: 
2009-04-28 13:12:13 NPBootstrap:        !! STDOUT: Requesting https://riverraid.rhndev.redhat.com/satconfig/cgi-bin/fetch_netsaintid.cgi?ssk=ce97bc8aea40&publickey=ssh-dss%20AAAAB3NzaC1kc3MAAACBAJf5NmWDHHheSobFRMbT8Ly0jmEBDySYAUMGMKPKpHgxAhBnDCm%2FtuqQzx5EytsebzFjoVDtOpQevUeAgee0C2xatXDkGEIIqpWWtSEWs9XVXYkA%2FOGLeQqpJcNNvEb53JwC6f9lxUPheCZU7z7UTJ76jXbmz3nwrqBu3MmfvXuHAAAAFQDsHTbj%2Fdrbv11fIjA00%2BqvN54eswAAAIEAjeo59wQmMmgxA7whEa8s6FvnVIbX1kSls%2B%2Fc5zTHQbU0o0N0VgcFMbO7bMkFug1vh4TNGfUB5fmAdYFkBGbUhFEcIYdN3Ki%2FogUeaBSb%2FR4aH3LbmXOsIu6lA0q8i3DRP6rsVP8eFv0vQ8NmpxgMJq%2FGBNymSRojLssELcbmtq0AAACAaiiCJS%2B59wIbGqFGrKCCIzFb6cm%2FW%2B4EEAo8AED6J%2B5PB%2F%2F%2BOH9VXvcGlbNRRAv22k883cXvdU09L3Yr5Jlk9yjIcoU3YrlNe84qycIGBXmGJbJakZmseiK2NmrfR7julqToeiC5rhatF6ynU%2Fj6JqgbD6pHW60NL6yHW3ef7h0%3D%20nocpulse%40riverraid%2Erhndev%2Eredhat%2Ecom%0A
Error on attempt 1:  Status: '500 Internal Server Error'; content: '
Comment 1 Miroslav Suchý 2009-04-30 09:14:56 EDT
Milan, can you investigate it, please?
Comment 2 wes hayutin 2009-05-01 14:38:55 EDT
Actually, this bug is worse than I thought. Going to change it a bit.
Comment 3 wes hayutin 2009-05-01 14:40:12 EDT
Steps to recreate:
1. Set up monitoring.
2. Execute probes, including some that use rhnmd.
3. Restart the satellite.
4. The Monitoring and MonitoringScout services will *not* restart.
Comment 4 wes hayutin 2009-05-01 14:43:33 EDT
'
Failed 5 times to get data for this node.

2009-05-01 14:35:08 NPBootstrap: 	!! STDERR: 
2009-05-01 14:35:08 NPBootstrap: 	!! EXIT: 256
2009-05-01 14:35:08 MonitoringScout: ----------- SputLite STATUS ---------------
2009-05-01 14:35:08 MonitoringScout: ----------- Dequeuer STATUS ---------------
2009-05-01 14:35:08 MonitoringScout: ----------- Dispatcher STATUS ---------------
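A note on the "EXIT: 256" line above: if NPBootstrap is printing the raw wait status of the shell command (an assumption; the log does not say so), then 256 would correspond to an exit code of 1, since a process exit code cannot exceed 255. A minimal sketch of that decoding, where decode_wait_status is a hypothetical helper name and not part of the product:

```python
def decode_wait_status(status: int) -> dict:
    """Split a Unix wait status into its conventional parts.

    Assumes 'status' is a raw 16-bit wait status (as returned by wait()),
    which is only a guess about what the NPBootstrap log prints.
    """
    return {
        "exit_code": status >> 8,          # high byte: value passed to exit()
        "signal": status & 0x7F,           # low 7 bits: terminating signal, 0 if none
        "core_dumped": bool(status & 0x80),
    }


print(decode_wait_status(256))
```

Under that assumption the shell command simply exited with a generic failure code, which fits the "500 Internal Server Error" reported in its output.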
Comment 5 wes hayutin 2009-05-01 14:44:44 EDT
Hmm, even more interesting.
A few minutes later I tried to restart the
Monitoring and MonitoringScout services individually,
and it worked.

Maybe it's a timing issue with another service?
Needs more debugging, but I think I can take it off qa-blockers.
Comment 6 wes hayutin 2009-05-01 14:49:23 EDT
Yup, it seems to be some sort of timing issue with other services.
After creating several probes, restarting the entire satellite breaks the restart of Monitoring and MonitoringScout.

However, if you go back a few minutes later and manually restart Monitoring and MonitoringScout, it works.

Done.
Starting rhn-satellite...
Starting Jabber services                                   [  OK  ]
Starting Oracle Net Listener ...                           [  OK  ]
Starting Oracle DB instance "rhnsat" ...                   [  OK  ]
Starting osa-dispatcher:                                   [  OK  ]
Starting tomcat5:                                          [  OK  ]
Starting httpd:                                            [  OK  ]
Starting Monitoring ...  Starting InstallSoftwareConfig ...  [ OK ]
Starting GenerateNotifConfig ...  [ OK ]
Starting NotifEscalator ...  [ OK ]
Starting NotifLauncher ...  [ OK ]
Starting Notifier ...  [ OK ]
Starting AckProcessor ...  [ OK ]
Starting TSDBLocalQueue ...  [ OK ]
[ OK ]
Starting MonitoringScout ...  [ FAIL ]
Starting NPBootstrap ...  
'
Failed 5 times to get data for this node.

2009-05-01 14:46:33 NPBootstrap: 	!! STDERR: 
2009-05-01 14:46:33 NPBootstrap: 	!! EXIT: 256
Starting SputLite ...  [ OK ]
Starting Dequeuer ...  [ OK ]
Starting Dispatcher ...  [ OK ]
[ OK ]
Starting rhn-search...
Starting cobbler daemon:                                   [  OK  ]
Starting RHN Taskomatic...
Done.
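
For anyone checking start-up output like the above in an automated test, the components marked as failed can be picked out mechanically. This is only a sketch: failed_components and the regular expression are hypothetical and are written against the "Starting <name> ...  [ OK ]" / "[ FAIL ]" lines as they appear in this transcript, not against any documented output format.

```python
import re

# Matches "Starting <name> ...  [ OK ]" or "[ FAIL ]" when both sit on one line.
# The pattern is inferred from the transcript in this bug, nothing more.
STATUS_RE = re.compile(r"Starting ([^.\[]+?) \.\.\.[ \t]+\[\s*(OK|FAIL)\s*\]")


def failed_components(output: str) -> list[str]:
    """Return the names of components whose start line is marked [ FAIL ]."""
    failed = []
    for line in output.splitlines():
        for name, status in STATUS_RE.findall(line):
            if status == "FAIL":
                failed.append(name)
    return failed
```

Note that this only reports what the output labels as failed. In the transcript above the FAIL marker is printed on the MonitoringScout line, while the error text follows the NPBootstrap line, so the label alone does not say which step actually broke.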


[root@grandprix admin]# /etc/init.d/Monitoring restart
Stopping Monitoring ...  Stopping TSDBLocalQueue ...  [ OK ]
Stopping AckProcessor ...  [ OK ]
Stopping Notifier ...  [ OK ]
Stopping NotifLauncher ...  [ OK ]
Stopping NotifEscalator ...  [ OK ]
Stopping GenerateNotifConfig ...  [ OK ]
Stopping InstallSoftwareConfig ...  [ OK ]
[ OK ]
Starting Monitoring ...  Starting InstallSoftwareConfig ...  [ OK ]
Starting GenerateNotifConfig ...  [ OK ]
Starting NotifEscalator ...  [ OK ]
Starting NotifLauncher ...  [ OK ]
Starting Notifier ...  [ OK ]
Starting AckProcessor ...  [ OK ]
Starting TSDBLocalQueue ...  [ OK ]
[ OK ]
[root@grandprix admin]# 

[root@grandprix admin]# /etc/init.d/MonitoringScout restart
Stopping MonitoringScout ...  Stopping Dispatcher ...  [ OK ]
Stopping Dequeuer ...  [ OK ]
Stopping SputLite ...  [ OK ]
Stopping NPBootstrap ...  [ OK ]
Stopping InstallSoftwareConfig ...  [ OK ]
[ OK ]
Starting MonitoringScout ...  Starting InstallSoftwareConfig ...  [ OK ]
Starting NPBootstrap ...  [ OK ]
Starting SputLite ...  [ OK ]
Starting Dequeuer ...  [ OK ]
Starting Dispatcher ...  [ OK ]
[ OK ]
[root@grandprix admin]#
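
The manual workaround in this comment (wait a while, then restart the two services by hand) could be scripted roughly as follows. This is a sketch under stated assumptions, not a fix: restart_with_retry, the attempt count, and the delay are hypothetical; only the two init script paths come from the transcript above; and it assumes the init scripts exit non-zero when a restart fails, which this bug does not confirm.

```python
import subprocess
import time

# Init scripts taken from the transcript above.
SERVICES = ["/etc/init.d/Monitoring", "/etc/init.d/MonitoringScout"]


def restart_with_retry(script, attempts=3, delay=120.0,
                       run=subprocess.call, sleep=time.sleep):
    """Run '<script> restart' until it exits 0 or attempts run out.

    Returns the attempt number that succeeded, or None if every attempt failed.
    'run' and 'sleep' are injectable so the logic can be tried without
    touching real services.
    """
    for attempt in range(1, attempts + 1):
        if run([script, "restart"]) == 0:
            return attempt
        if attempt < attempts:
            sleep(delay)  # give the other satellite services time to come up
    return None
```

On a satellite this would be called once per entry in SERVICES, as root. It only papers over the timing problem; it does not explain which service the scout is waiting on.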
Comment 7 Milan Zazrivec 2009-05-12 08:38:07 EDT
With Satellite-5.3.0-RHEL5-re20090507.1 on s390x, I was not able to reproduce
the problem following the steps in comment #0 and comment #3. Everything
always restarts smoothly, with no errors like the above.

Do you still see the problem on the latest ISO?
Comment 8 wes hayutin 2009-05-12 09:18:32 EDT
No, I don't think this is a problem anymore on 5/7.1.
Moving to on_qa to really test it out.
Comment 9 wes hayutin 2009-05-12 09:53:17 EDT
verified
