Created attachment 341621 [details]
spacewalk-debug

Description of problem:
4/24.1 build, rhel 5 server.

recreate (selinux = permissive, set w/ setenforce, so the config may be different):
1. setup monitoring
2. setup some clients and setup probes for the clients
3. used linux:memory, ping, linux:virt-memory etc..
4. disable monitoring in the webui admin config
5. restart satellite
6. enable monitoring
7. restart satellite

get:
[ OK ]
Starting MonitoringScout ...
[ FAIL ]
Starting NPBootstrap ...
2009-04-28 13:12:13 NPBootstrap: !! ERROR FROM SHELL COMMAND:
2009-04-28 13:12:13 NPBootstrap: !! STDOUT: Requesting https://riverraid.rhndev.redhat.com/satconfig/cgi-bin/fetch_netsaintid.cgi?ssk=ce97bc8aea40&publickey=ssh-dss%20AAAAB3NzaC1kc3MAAACBAJf5NmWDHHheSobFRMbT8Ly0jmEBDySYAUMGMKPKpHgxAhBnDCm%2FtuqQzx5EytsebzFjoVDtOpQevUeAgee0C2xatXDkGEIIqpWWtSEWs9XVXYkA%2FOGLeQqpJcNNvEb53JwC6f9lxUPheCZU7z7UTJ76jXbmz3nwrqBu3MmfvXuHAAAAFQDsHTbj%2Fdrbv11fIjA00%2BqvN54eswAAAIEAjeo59wQmMmgxA7whEa8s6FvnVIbX1kSls%2B%2Fc5zTHQbU0o0N0VgcFMbO7bMkFug1vh4TNGfUB5fmAdYFkBGbUhFEcIYdN3Ki%2FogUeaBSb%2FR4aH3LbmXOsIu6lA0q8i3DRP6rsVP8eFv0vQ8NmpxgMJq%2FGBNymSRojLssELcbmtq0AAACAaiiCJS%2B59wIbGqFGrKCCIzFb6cm%2FW%2B4EEAo8AED6J%2B5PB%2F%2F%2BOH9VXvcGlbNRRAv22k883cXvdU09L3Yr5Jlk9yjIcoU3YrlNe84qycIGBXmGJbJakZmseiK2NmrfR7julqToeiC5rhatF6ynU%2Fj6JqgbD6pHW60NL6yHW3ef7h0%3D%20nocpulse%40riverraid%2Erhndev%2Eredhat%2Ecom%0A
Error on attempt 1: Status: '500 Internal Server Error'; content: '
Milan, can you investigate it, please?
actually this bug is worse than I thought.. going to change it a bit.
recreate:
1. setup monitoring
2. execute probes, include some that utilize rhnmd
3. restart satellite
4. Monitoring and MonitoringScout service will *not* restart
' Failed 5 times to get data for this node.
2009-05-01 14:35:08 NPBootstrap: !! STDERR:
2009-05-01 14:35:08 NPBootstrap: !! EXIT: 256
2009-05-01 14:35:08 MonitoringScout: ----------- SputLite STATUS ---------------
2009-05-01 14:35:08 MonitoringScout: ----------- Dequeuer STATUS ---------------
2009-05-01 14:35:08 MonitoringScout: ----------- Dispatcher STATUS ---------------
hrm.. even more interesting... A few minutes later I tried to restart the Monitoring and MonitoringScout services individually and it worked. Maybe it's a timing issue w/ another service? More debugging needed, but I think I can take it off qa-blockers.
yup.. seems to be some sort of timing issue w/ other services. After creating several probes, restarting the entire satellite breaks the restart of Monitoring and MonitoringScout. However, if you go back a few minutes later and manually restart Monitoring and MonitoringScout, it works.

Done.
Starting rhn-satellite...
Starting Jabber services [ OK ]
Starting Oracle Net Listener ... [ OK ]
Starting Oracle DB instance "rhnsat" ... [ OK ]
Starting osa-dispatcher: [ OK ]
Starting tomcat5: [ OK ]
Starting httpd: [ OK ]
Starting Monitoring ...
Starting InstallSoftwareConfig ... [ OK ]
Starting GenerateNotifConfig ... [ OK ]
Starting NotifEscalator ... [ OK ]
Starting NotifLauncher ... [ OK ]
Starting Notifier ... [ OK ]
Starting AckProcessor ... [ OK ]
Starting TSDBLocalQueue ... [ OK ]
[ OK ]
Starting MonitoringScout ...
[ FAIL ]
Starting NPBootstrap ...
' Failed 5 times to get data for this node.
2009-05-01 14:46:33 NPBootstrap: !! STDERR:
2009-05-01 14:46:33 NPBootstrap: !! EXIT: 256
Starting SputLite ... [ OK ]
Starting Dequeuer ... [ OK ]
Starting Dispatcher ... [ OK ]
[ OK ]
Starting rhn-search...
Starting cobbler daemon: [ OK ]
Starting RHN Taskomatic...
Done.

[root@grandprix admin]# /etc/init.d/Monitoring restart
Stopping Monitoring ...
Stopping TSDBLocalQueue ... [ OK ]
Stopping AckProcessor ... [ OK ]
Stopping Notifier ... [ OK ]
Stopping NotifLauncher ... [ OK ]
Stopping NotifEscalator ... [ OK ]
Stopping GenerateNotifConfig ... [ OK ]
Stopping InstallSoftwareConfig ... [ OK ]
[ OK ]
Starting Monitoring ...
Starting InstallSoftwareConfig ... [ OK ]
Starting GenerateNotifConfig ... [ OK ]
Starting NotifEscalator ... [ OK ]
Starting NotifLauncher ... [ OK ]
Starting Notifier ... [ OK ]
Starting AckProcessor ... [ OK ]
Starting TSDBLocalQueue ... [ OK ]
[ OK ]
[root@grandprix admin]#
[root@grandprix admin]# /etc/init.d/MonitoringScout restart
Stopping MonitoringScout ...
Stopping Dispatcher ... [ OK ]
Stopping Dequeuer ... [ OK ]
Stopping SputLite ... [ OK ]
Stopping NPBootstrap ... [ OK ]
Stopping InstallSoftwareConfig ... [ OK ]
[ OK ]
Starting MonitoringScout ...
Starting InstallSoftwareConfig ... [ OK ]
Starting NPBootstrap ... [ OK ]
Starting SputLite ... [ OK ]
Starting Dequeuer ... [ OK ]
Starting Dispatcher ... [ OK ]
[ OK ]
[root@grandprix admin]#
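
The manual workaround shown in this comment (restarting Monitoring and MonitoringScout again once the other services have settled) could be scripted roughly as follows. This is only a sketch: retry_cmd is a hypothetical helper, not part of the satellite tooling, the attempt count and delay are arbitrary, and it assumes the init scripts exit non-zero when a component fails to start, which this report does not confirm.

```shell
#!/bin/bash
# Hypothetical helper (not part of the satellite tooling):
# run a command until it succeeds or the attempts run out.
# Usage: retry_cmd ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_cmd() {
    local attempts=$1 delay=$2
    shift 2
    local i
    for ((i = 1; i <= attempts; i++)); do
        # Assumption: the command's exit status reflects success or failure.
        if "$@"; then
            return 0
        fi
        echo "attempt $i of $attempts failed: $*" >&2
        # Wait before the next attempt, but not after the last one.
        if ((i < attempts)); then
            sleep "$delay"
        fi
    done
    return 1
}

# Example usage on the satellite, as root (left commented out on purpose):
# retry_cmd 5 60 /etc/init.d/Monitoring restart
# retry_cmd 5 60 /etc/init.d/MonitoringScout restart
```

If the init scripts report success even when NPBootstrap fails, the check inside the loop would need to inspect the script output instead of the exit status.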
With Satellite-5.3.0-RHEL5-re20090507.1 on s390x, I was not able to reproduce the problem following the steps in comment #0 and comment #3. Everything restarts smoothly every time, with no errors like the above. Do you still see the problem on the latest ISO?
no.. I don't think this is a problem anymore on 5/7.1. Moving to on_qa to really test it out.
verified