Bug 498062

Summary: monitoring fails to restart after execution of probes
Product: Red Hat Satellite 5
Reporter: wes hayutin <whayutin>
Component: Monitoring
Assignee: Milan Zázrivec <mzazrivec>
Status: CLOSED CURRENTRELEASE
QA Contact: wes hayutin <whayutin>
Severity: medium
Priority: low
Version: 530
CC: bperkins, msuchy
Hardware: All
OS: Linux
Fixed In Version: sat530-unconfirmed
Doc Type: Bug Fix
Last Closed: 2009-10-28 19:49:33 UTC
Bug Blocks: 463877
Attachments: spacewalk-debug

Description wes hayutin 2009-04-28 17:33:02 UTC
Created attachment 341621 (spacewalk-debug)

Description of problem:

Build 4/24.1 on a RHEL 5 server.

Steps to reproduce:
SELinux is permissive, set with setenforce (so the config file may say something different).
1. Set up monitoring.
2. Set up some clients and set up probes for the clients.
3. Used linux:memory, ping, linux:virt-memory, etc.

4. Disable monitoring in the web UI admin config.
5. Restart the satellite.
6. Enable monitoring.
7. Restart the satellite.

Actual results:
[ OK ]
Starting MonitoringScout ...  [ FAIL ]
Starting NPBootstrap ...  2009-04-28 13:12:13 NPBootstrap:      !! ERROR FROM SHELL COMMAND: 
2009-04-28 13:12:13 NPBootstrap:        !! STDOUT: Requesting https://riverraid.rhndev.redhat.com/satconfig/cgi-bin/fetch_netsaintid.cgi?ssk=ce97bc8aea40&publickey=ssh-dss%20AAAAB3NzaC1kc3MAAACBAJf5NmWDHHheSobFRMbT8Ly0jmEBDySYAUMGMKPKpHgxAhBnDCm%2FtuqQzx5EytsebzFjoVDtOpQevUeAgee0C2xatXDkGEIIqpWWtSEWs9XVXYkA%2FOGLeQqpJcNNvEb53JwC6f9lxUPheCZU7z7UTJ76jXbmz3nwrqBu3MmfvXuHAAAAFQDsHTbj%2Fdrbv11fIjA00%2BqvN54eswAAAIEAjeo59wQmMmgxA7whEa8s6FvnVIbX1kSls%2B%2Fc5zTHQbU0o0N0VgcFMbO7bMkFug1vh4TNGfUB5fmAdYFkBGbUhFEcIYdN3Ki%2FogUeaBSb%2FR4aH3LbmXOsIu6lA0q8i3DRP6rsVP8eFv0vQ8NmpxgMJq%2FGBNymSRojLssELcbmtq0AAACAaiiCJS%2B59wIbGqFGrKCCIzFb6cm%2FW%2B4EEAo8AED6J%2B5PB%2F%2F%2BOH9VXvcGlbNRRAv22k883cXvdU09L3Yr5Jlk9yjIcoU3YrlNe84qycIGBXmGJbJakZmseiK2NmrfR7julqToeiC5rhatF6ynU%2Fj6JqgbD6pHW60NL6yHW3ef7h0%3D%20nocpulse%40riverraid%2Erhndev%2Eredhat%2Ecom%0A
Error on attempt 1:  Status: '500 Internal Server Error'; content: '

Comment 1 Miroslav Suchý 2009-04-30 13:14:56 UTC
Milan can you investigate it, please?

Comment 2 wes hayutin 2009-05-01 18:38:55 UTC
Actually, this bug is worse than I thought. Going to change it a bit.

Comment 3 wes hayutin 2009-05-01 18:40:12 UTC
Steps to reproduce:
1. Set up monitoring.
2. Execute probes, including some that use rhnmd.
3. Restart the satellite.
4. The Monitoring and MonitoringScout services will *not* restart.
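
One way to check step 4 mechanically is to scan the restart output for lines ending in "[ FAIL ]". The following is a minimal sketch, assuming the "Starting <name> ...  [ FAIL ]" line format seen in the logs in this bug; the function name failed_components is hypothetical and not part of Satellite.

```shell
#!/bin/sh
# failed_components: read init-script output on stdin and print the name
# from every "Starting <name> ...  [ FAIL ]" line.
# Assumption: the line format matches the logs pasted in this bug report.
# When several "Starting X ..." fragments share one line, the last one
# before "[ FAIL ]" is reported.
failed_components() {
    sed -n 's/.*Starting \([^ ]*\) \.\.\. *\[ FAIL ] *$/\1/p'
}
```

For example, `/etc/init.d/MonitoringScout restart 2>&1 | failed_components` would print one component name per failing line and nothing if every line reports OK. Because the start-up output can interleave, the name printed is only the one that appears on the same line as the failure marker.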

Comment 4 wes hayutin 2009-05-01 18:43:33 UTC
'
Failed 5 times to get data for this node.

2009-05-01 14:35:08 NPBootstrap: 	!! STDERR: 
2009-05-01 14:35:08 NPBootstrap: 	!! EXIT: 256
2009-05-01 14:35:08 MonitoringScout: ----------- SputLite STATUS ---------------
2009-05-01 14:35:08 MonitoringScout: ----------- Dequeuer STATUS ---------------
2009-05-01 14:35:08 MonitoringScout: ----------- Dispatcher STATUS ---------------
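
A note on reading "EXIT: 256" in the log above: if that number is a raw wait status (the kind of value Perl's system() returns), the exit code sits in the high byte and the signal number in the low bits. Whether NPBootstrap logs a raw wait status is an assumption here, not something this report confirms. A minimal sketch of the decoding, with a hypothetical helper name:

```shell
#!/bin/sh
# decode_wait_status: split a raw wait status into its exit code (high byte)
# and terminating signal (low 7 bits).
# Assumption: the logged EXIT value is a raw wait status; the source does
# not state this.
decode_wait_status() {
    status=$1
    echo "exit_code=$(( (status >> 8) & 255 )) signal=$(( status & 127 ))"
}
```

Under that assumption, `decode_wait_status 256` prints `exit_code=1 signal=0`, meaning the shell command exited with code 1 rather than being killed by a signal.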

Comment 5 wes hayutin 2009-05-01 18:44:44 UTC
Hmm, even more interesting.
A few minutes later I tried to restart the
Monitoring and MonitoringScout services individually,
and it worked.

Maybe it's a timing issue with another service?
This needs more debugging, but I think I can take it off qa-blockers.

Comment 6 wes hayutin 2009-05-01 18:49:23 UTC
Yup, it seems to be some sort of timing issue with other services.
After creating several probes, restarting the entire satellite breaks the restart of Monitoring and MonitoringScout.

However, if you go back a few minutes later and manually restart Monitoring and MonitoringScout, it works.

Done.
Starting rhn-satellite...
Starting Jabber services                                   [  OK  ]
Starting Oracle Net Listener ...                           [  OK  ]
Starting Oracle DB instance "rhnsat" ...                   [  OK  ]
Starting osa-dispatcher:                                   [  OK  ]
Starting tomcat5:                                          [  OK  ]
Starting httpd:                                            [  OK  ]
Starting Monitoring ...  Starting InstallSoftwareConfig ...  [ OK ]
Starting GenerateNotifConfig ...  [ OK ]
Starting NotifEscalator ...  [ OK ]
Starting NotifLauncher ...  [ OK ]
Starting Notifier ...  [ OK ]
Starting AckProcessor ...  [ OK ]
Starting TSDBLocalQueue ...  [ OK ]
[ OK ]
Starting MonitoringScout ...  [ FAIL ]
Starting NPBootstrap ...  
'
Failed 5 times to get data for this node.

2009-05-01 14:46:33 NPBootstrap: 	!! STDERR: 
2009-05-01 14:46:33 NPBootstrap: 	!! EXIT: 256
Starting SputLite ...  [ OK ]
Starting Dequeuer ...  [ OK ]
Starting Dispatcher ...  [ OK ]
[ OK ]
Starting rhn-search...
Starting cobbler daemon:                                   [  OK  ]
Starting RHN Taskomatic...
Done.


[root@grandprix admin]# /etc/init.d/Monitoring restart
Stopping Monitoring ...  Stopping TSDBLocalQueue ...  [ OK ]
Stopping AckProcessor ...  [ OK ]
Stopping Notifier ...  [ OK ]
Stopping NotifLauncher ...  [ OK ]
Stopping NotifEscalator ...  [ OK ]
Stopping GenerateNotifConfig ...  [ OK ]
Stopping InstallSoftwareConfig ...  [ OK ]
[ OK ]
Starting Monitoring ...  Starting InstallSoftwareConfig ...  [ OK ]
Starting GenerateNotifConfig ...  [ OK ]
Starting NotifEscalator ...  [ OK ]
Starting NotifLauncher ...  [ OK ]
Starting Notifier ...  [ OK ]
Starting AckProcessor ...  [ OK ]
Starting TSDBLocalQueue ...  [ OK ]
[ OK ]
[root@grandprix admin]# 

[root@grandprix admin]# /etc/init.d/MonitoringScout restart
Stopping MonitoringScout ...  Stopping Dispatcher ...  [ OK ]
Stopping Dequeuer ...  [ OK ]
Stopping SputLite ...  [ OK ]
Stopping NPBootstrap ...  [ OK ]
Stopping InstallSoftwareConfig ...  [ OK ]
[ OK ]
Starting MonitoringScout ...  Starting InstallSoftwareConfig ...  [ OK ]
Starting NPBootstrap ...  [ OK ]
Starting SputLite ...  [ OK ]
Starting Dequeuer ...  [ OK ]
Starting Dispatcher ...  [ OK ]
[ OK ]
[root@grandprix admin]#
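
The manual workaround in this comment (wait a while, then restart the two services by hand) could be sketched as a generic retry helper. This is only an illustration: retry_cmd is a hypothetical function, the delay value is arbitrary, and it assumes the init script exits non-zero when a start fails, which the output above does not confirm.

```shell
#!/bin/sh
# retry_cmd ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Run COMMAND until it exits 0 or ATTEMPTS runs have been used,
# sleeping DELAY_SECONDS between runs.
# Returns 0 on the first success, 1 if every attempt failed.
# Assumption: COMMAND signals failure through a non-zero exit status.
retry_cmd() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $i of $attempts failed: $*" >&2
        # Only wait if another attempt will follow.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}
```

A call such as `retry_cmd 3 120 /etc/init.d/MonitoringScout restart` would try the restart up to three times, two minutes apart. If the init script returns 0 even when a component prints "[ FAIL ]", the exit status would have to be replaced by a check of the output text.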

Comment 7 Milan Zázrivec 2009-05-12 12:38:07 UTC
With Satellite-5.3.0-RHEL5-re20090507.1 on s390x, I was not able to reproduce
the problem following the steps in comment #0 and comment #3. Everything
always restarts smoothly, with no errors like the ones above.

Do you still see the problem on the latest ISO?

Comment 8 wes hayutin 2009-05-12 13:18:32 UTC
No, I don't think this is a problem anymore on 5/7.1.
Moving to on_qa to really test it out.

Comment 9 wes hayutin 2009-05-12 13:53:17 UTC
verified