Bug 513368 - webUI: Service Temporarily Unavailable, osad/jabber_lib.print_message('socket error',)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Satellite 5
Classification: Red Hat
Component: WebUI
Version: 530
Hardware: All
OS: Linux
Priority: high
Severity: urgent
Target Milestone: ---
Assignee: Jan Pazdziora (Red Hat)
QA Contact: Jeff Browning
URL:
Whiteboard:
Depends On:
Blocks: 456985 463877
 
Reported: 2009-07-23 11:08 UTC by Petr Sklenar
Modified: 2009-09-10 18:49 UTC (History)

Fixed In Version: sat530
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-09-10 18:49:56 UTC
Target Upstream Version:
Embargoed:



Description Petr Sklenar 2009-07-23 11:08:48 UTC
Description of problem:
I cannot connect to the Satellite webUI after a few days. The log shows: osad/jabber_lib.main('Unable to connect to jabber servers, sleeping 10 seconds',)

Version-Release number of selected component (if applicable):
sat530

How reproducible:
I saw this three times on one machine.

Steps to Reproduce:
1. clean i386 RHEL4U8 + sat52 with external oracle
2. upgrade to sat530
3. wait a few days
  
Actual results:
Service Temporarily Unavailable

Expected results:
webUI works

Additional info:
tail /var/log/rhn/osa-dispatcher.log
2009/07/08 07:20:26 -04:00 30903 0.0.0.0: osad/jabber_lib.__init__
2009/07/08 07:20:26 -04:00 30903 0.0.0.0: osad/jabber_lib.print_message('socket error',)
2009/07/08 07:20:26 -04:00 30903 0.0.0.0: osad/jabber_lib.print_message('Could not connect to jabber server', 'hp-bl460c-02.rhts.bos.redhat.com')
2009/07/08 07:20:26 -04:00 30903 0.0.0.0: osad/jabber_lib.setup_connection('Could not connect to any jabber server',)
2009/07/08 07:20:26 -04:00 30903 0.0.0.0: osad/jabber_lib.main('Unable to connect to jabber servers, sleeping 10 seconds',)


---
I restarted the Satellite last time and it worked for the next few days. The problem has now appeared again.
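
The 'socket error' lines above suggest that nothing was accepting connections on the jabber port when osa-dispatcher tried to connect. A minimal sketch of a quick reachability check follows. It assumes jabberd listens on the standard XMPP client port 5222, which is not stated in this report, and `can_connect` is a hypothetical helper name, not part of osad.

```python
import socket


def can_connect(host, port=5222, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    The default port 5222 is an assumption (standard XMPP client port);
    adjust it if the jabberd on the Satellite is configured differently.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Calling `can_connect("hp-bl460c-02.rhts.bos.redhat.com")` on the affected machine would be expected to return False while jabberd is down, but that only narrows the problem to the listener and says nothing about why it stopped.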

Comment 2 Petr Sklenar 2009-07-23 12:12:50 UTC
jabberd is stopped:

[root@hp-bl460c-02 ~]# rhn-satellite status
jabberd router is stopped
osa-dispatcher (pid 30904) is running...
lock file found but no process running for pid 31435
httpd (pid 31456 7059 7058 7057 7056 7055 7054 7053 7052) is running...
2009-07-08 08:28:28 Monitoring: ----------- InstallSoftwareConfig STATUS ---------------
2009-07-08 08:28:28 Monitoring: ----------- GenerateNotifConfig STATUS ---------------
2009-07-08 08:28:28 Monitoring: ----------- NotifEscalator STATUS ---------------
2009-07-08 08:28:28 Monitoring: ----------- NotifLauncher STATUS ---------------
2009-07-08 08:28:28 Monitoring: ----------- Notifier STATUS ---------------
2009-07-08 08:28:29 Monitoring: ----------- AckProcessor STATUS ---------------
2009-07-08 08:28:29 Monitoring: ----------- TSDBLocalQueue STATUS ---------------
2009-07-08 08:28:29 MonitoringScout: ----------- InstallSoftwareConfig STATUS ---------------
2009-07-08 08:28:29 MonitoringScout: ----------- NPBootstrap STATUS ---------------
2009-07-08 08:28:29 MonitoringScout: ----------- SputLite STATUS ---------------
2009-07-08 08:28:30 MonitoringScout: ----------- Dequeuer STATUS ---------------
2009-07-08 08:28:30 MonitoringScout: ----------- Dispatcher STATUS ---------------
rhn-search is running (31703).
cobblerd (pid 6986) is running...
RHN Taskomatic is running (31814).
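
Picking the stopped services out of this output by eye is easy to get wrong among the monitoring lines. The following is a rough sketch that filters the text of `rhn-satellite status` for the two patterns seen above. The function names are hypothetical and the patterns are taken only from the output pasted in this comment, so other message formats will be missed.

```python
import re


def stopped_services(status_text):
    """Return names from lines of the form '<name> is stopped'."""
    stopped = []
    for line in status_text.splitlines():
        match = re.match(r"^(?P<name>.+?) is stopped\.?\s*$", line.strip())
        if match:
            stopped.append(match.group("name"))
    return stopped


def stale_lock_pids(status_text):
    """Return pids from 'lock file found but no process running for pid N' lines."""
    pattern = re.compile(r"lock file found but no process running for pid (\d+)")
    return [int(m.group(1)) for m in pattern.finditer(status_text)]
```

Fed the output above, the first function reports the jabberd router and the second reports pid 31435.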

Comment 8 Clifford Perry 2009-07-27 16:47:19 UTC
jabberd is being restarted due to logrotate. I was able to replicate it by running:

logrotate -f /etc/logrotate.conf

I reviewed all the contents of /etc/logrotate* but see nothing there that says to rotate and HUP jabberd. Very weird.

Still looking.
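
Since nothing in the logrotate configuration names jabberd directly, one way to widen the search is to list every script section that logrotate would run and look for anything that restarts or signals a service. A sketch under that assumption is below. `script_blocks` and `blocks_mentioning` are hypothetical helpers, and the parsing handles only the plain `postrotate` ... `endscript` layout, not every form logrotate accepts.

```python
def script_blocks(config_text):
    """Return (kind, script) pairs for each script section in logrotate config text."""
    kinds = ("prerotate", "postrotate", "firstaction", "lastaction")
    blocks = []
    kind, body = None, []
    for raw in config_text.splitlines():
        line = raw.strip()
        if kind is None:
            # Outside a script: a bare keyword line opens a new section.
            if line in kinds:
                kind, body = line, []
        elif line == "endscript":
            # 'endscript' closes the section currently being collected.
            blocks.append((kind, "\n".join(body)))
            kind = None
        else:
            body.append(line)
    return blocks


def blocks_mentioning(config_text, needle):
    """Return only the script sections whose body contains needle."""
    return [(k, s) for k, s in script_blocks(config_text) if needle in s]
```

Running `blocks_mentioning(text, "restart")` over the contents of /etc/logrotate.conf and each file in /etc/logrotate.d would show which rotations restart something, which is a broader net than searching for the string jabberd.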

Comment 16 Jan Pazdziora (Red Hat) 2009-07-30 15:41:18 UTC
Fix in Spacewalk repo, master 172090659d5a0b6cba91299617794e369672ace4, VADER 7b2dc374f28c04f68c4045520cd4354bfc6c7e59.

Comment 17 Brad Buckingham 2009-07-31 17:39:36 UTC
Moving to ON_QA

Comment 19 Jan Pazdziora (Red Hat) 2009-08-03 08:20:59 UTC
Yes, the fix is in ProgAGoGo-1.11.5-2.el4sat.noarch.rpm and ProgAGoGo-1.11.5-2.el5sat.noarch.rpm.

Comment 24 Jan Pazdziora (Red Hat) 2009-08-10 15:09:59 UTC
Pulling from ON_QA, until the fix for bugzilla 516073 hits the compose.

Comment 30 Miroslav Suchý 2009-08-20 10:10:20 UTC
Verified in stage on xen5.
I forced logrotate to rotate even with only 1k of log size, and verified that monitoring was restarted and that jabberd and tomcat survived.

Comment 31 Brandon Perkins 2009-09-10 18:49:56 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-1434.html

