Bug 493665 - /usr/sbin/rhn-satellite restart fails to shutdown & restart Jabber services sometimes.
Status: CLOSED CURRENTRELEASE
Product: Red Hat Satellite 5
Classification: Red Hat
Component: Other
Version: 530
Hardware: All Linux
Priority: low
Severity: medium
Assigned To: Devan Goodwin
QA Contact: Preethi Thomas
Depends On:
Blocks: 463876
Reported: 2009-04-02 10:59 EDT by Preethi Thomas
Modified: 2009-09-10 14:54 EDT
Fixed In Version: sat530
Doc Type: Bug Fix
Last Closed: 2009-09-10 14:54:42 EDT

Attachments
Patch reverting jabberd 2.2 config changes in spacewalk-config. (54.81 KB, text/plain)
2009-04-17 14:40 EDT, Devan Goodwin

Description Preethi Thomas 2009-04-02 10:59:59 EDT
Description of problem:
/usr/sbin/rhn-satellite restart fails to shutdown & restart Jabber services.

Version-Release number of selected component (if applicable):
Satellite-5.3.0-RHEL5-re20090327.0-i386-embedded-oracle.iso

How reproducible:


Steps to Reproduce:
1. [root@rlx-3-24 ~]# /usr/sbin/rhn-satellite restart

Actual results:

Shutdown Oracle: Processing Database instance "rhnsat": log file /opt/apps/oracle/web/product/10.2.0/db_1/log/shutdown.log
                                                           [  OK  ]
Shutting down Jabber router:                               [FAILED]
Done.
Starting rhn-satellite...
Starting Jabber services                                   [FAILED]
Starting Oracle: Processing Database instance "rhnsat": log file /opt/apps/oracle/web/product/10.2.0/db_1/log/startup.log

Expected results:


Additional info:
Comment 1 Milan Zázrivec 2009-04-02 13:07:20 EDT
Actually, those jabber services don't even start properly:

# runuser -s /bin/bash - jabberd -c "ulimit -S -c 0 > /dev/null 2>&1; /usr/bin/jabberd -Db"
JBRD: debug on
JBRD: version(2.0s10)
JBRD: config_dir(/etc/jabberd)
JBRD: LaunchJob: router -> /usr/bin/router -c /etc/jabberd/router.xml -D
JBRD: LaunchJob: resolver -> /usr/bin/resolver -c /etc/jabberd/resolver.xml -D
JBRD: LaunchJob: sm -> /usr/bin/sm -c /etc/jabberd/sm.xml -D
JBRD: LaunchJob: s2s -> /usr/bin/s2s -c /etc/jabberd/s2s.xml -D
JBRD: LaunchJob: c2s -> /usr/bin/c2s -c /etc/jabberd/c2s.xml -D
C2S : WARN: Debugging not enabled.  Ignoring -D.
ERROR: c2s died.  Shutting down server.
JBRD: Got a signal... pass it on.
JBRD: It was a TERM.  Shut it all down!

# strace -f -o c2s.strace /usr/bin/c2s -c /etc/jabberd/c2s.xml -D
# grep 'open.*ENOENT' c2s.strace 
open("/var/lib/jabberd/pid/c2s.pid", O_RDWR|O_CREAT|O_TRUNC, 0666) = -1 ENOENT (No such file or directory)
# ls -d /var/lib/jabberd/pid/
ls: /var/lib/jabberd/pid/: No such file or directory
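
The strace result above points at a pidfile path whose parent directory does not exist. A minimal sketch of a check for that condition follows. It assumes each jabberd component config declares its pidfile in a <pidfile> element on a single line; the function name check_pid_dirs is hypothetical and not part of any Satellite or jabberd tooling.

```shell
#!/bin/bash
# Hypothetical helper: report pidfile directories that are named in the
# jabberd XML configs but missing on disk.
# Assumption: each config holds a single-line <pidfile>...</pidfile> element.
check_pid_dirs() {
    local config_dir="${1:-/etc/jabberd}"
    local status=0 cfg pidfile piddir
    for cfg in "$config_dir"/*.xml; do
        [ -f "$cfg" ] || continue
        # Take the first <pidfile> value; note that a commented-out
        # element would match as well.
        pidfile=$(sed -n 's:.*<pidfile>\(.*\)</pidfile>.*:\1:p' "$cfg" | head -n 1)
        [ -n "$pidfile" ] || continue
        piddir=$(dirname "$pidfile")
        if [ ! -d "$piddir" ]; then
            echo "MISSING: $piddir (from $cfg)"
            status=1
        fi
    done
    return $status
}
```

Running check_pid_dirs /etc/jabberd on the affected system would be expected to flag /var/lib/jabberd/pid for c2s.xml, given the ENOENT shown above.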
Comment 2 Clifford Perry 2009-04-07 13:01:16 EDT
This looks like a jabberd 2.2 config setting that we introduced in spacewalk 0.5 for F10 support: the /var/lib/jabberd/pid/ directory structure, which was new. We should be using /var/lib/jabberd/ instead. Assigning to Devan to review and work out how the jabberd config on Sat 5.3.0 should look, since it is based on jabberd 2.0.

Cliff.
Comment 3 Devan Goodwin 2009-04-15 10:23:52 EDT
The plan is to revert to the configuration we had before updating for jabberd 2.2 in spacewalk. This boils down to reverting about four commits by dgilmore in Satellite git; the reverts will then appear in the srpm patch. No package rename is needed because this is not Satellite-specific code, just older spacewalk code.
Comment 4 Devan Goodwin 2009-04-17 14:40:10 EDT
OK, I attempted the revert, but the jabberd service still does not start (authentication error in /var/log/messages).

Attaching a patch that reverts all the commits I could find that were done to make spacewalk work with jabberd 2.2. (full details in the commit message)

This commit is not in satellite.git yet as it does not seem to work.

Need to hand off to someone who knows the internals better, assigning to prad to take a look.

My test was to run tito build --rpm --test on spacewalk-config after applying this patch, install the resulting package on a pre-configured satellite (which may be the problem), try to clean up the jabberd config files, and then re-run install.pl while keeping the Oracle DB intact.
Comment 5 Devan Goodwin 2009-04-17 14:40:58 EDT
Created attachment 340066
Patch reverting jabberd 2.2 config changes in spacewalk-config.
Comment 6 Devan Goodwin 2009-04-27 11:54:23 EDT
Looks like the reverts do actually work, but to test you need a very fresh system that hasn't run install.pl before.

[root@sat1 satiso]# rhn-satellite status
jabberd router (pid 12332) is running...

[root@sat1 satiso]# rhn-satellite status
jabberd router (pid 12332) is running...
Oracle Net Listener (pid 12405) is running...
Oracle DB instance rhnsat (pid 12417) is running...
osa-dispatcher (pid 12465) is running...
/etc/init.d/tomcat5 is already running (12996)
httpd (pid 13379 13378 13377 13376 13375 13374 13373 13040 13039 13038 13037 13036 13035 13034 13033 13021) is running...
rhn-search is running (13073).
cobblerd (pid 13107) is running...
RHN Taskomatic is running (13134).
[root@sat1 satiso]# rhn-satellite restart
Shutting down rhn-satellite...
Stopping RHN Taskomatic...
Stopped RHN Taskomatic.
Stopping cobbler daemon:                                   [  OK  ]
Stopping rhn-search...
Stopped rhn-search.
Stopping MonitoringScout ...  [ OK ]
Stopping Monitoring ...  [ OK ]
Stopping httpd:                                            [  OK  ]
Stopping tomcat5:                                          [  OK  ]
Shutting down osa-dispatcher:                              [  OK  ]
Shutting down Oracle Net Listener ...                      [  OK  ]
Shutting down Oracle DB instance "rhnsat" ...              [  OK  ]
Shutting down Jabber router:                               [  OK  ]
Done.
Starting rhn-satellite...
Starting Jabber services                                   [  OK  ]
Starting Oracle Net Listener ...                           [  OK  ]
Starting Oracle DB instance "rhnsat" ...                   [  OK  ]
Starting osa-dispatcher: RHN 14249 2009/04/27 12:50:15 -03:00: ('Server does not support TLS - <starttls /> not in <features /> stanza',)
                                                           [  OK  ]
Starting tomcat5:                                          [  OK  ]
Starting httpd:                                            [  OK  ]
Starting Monitoring ...  [ OK ]
Starting MonitoringScout ...  [ OK ]
Starting rhn-search...
Starting cobbler daemon:                                   [  OK  ]
Starting RHN Taskomatic...
Done.



Fixed in satellite.git: 3fb4a8af35a7ea6b674fcbaa66b23e7fa4b94c78
Comment 7 Devan Goodwin 2009-04-27 15:34:48 EDT
A further related problem surfaced: server.pem was still being created in /etc/pki/spacewalk/jabberd/ (the new 2.2 location we configured in spacewalk) instead of /etc/jabberd/server.pem (the old location).

This failed very quietly. The jabberd service would start fine, but clients would be unable to connect, and this error would appear in /var/log/messages on the Satellite: "packet sent before session start, closing stream". That message doesn't give many hints, but the jabberd service startup log in the same file shows a message about a missing SSL pem, even though the service started OK.

satellite.git:  1f50679e220f30ed31a0a8f1246d6639e431123f
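
Because this failure mode is silent at service start, a quick check of both the pem file and the log symptom can help. The sketch below assumes the old-style pem location and the log message quoted in this comment; the function name check_jabberd_pem is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: flag the quiet failure described above.
# Defaults are the paths named in this comment; both can be overridden.
check_jabberd_pem() {
    local pem="${1:-/etc/jabberd/server.pem}"
    local log="${2:-/var/log/messages}"
    local status=0
    # The service can start without the pem, so check the file directly.
    if [ ! -s "$pem" ]; then
        echo "PEM missing or empty: $pem"
        status=1
    fi
    # Symptom logged on the Satellite when clients cannot connect.
    if [ -r "$log" ] && grep -q 'packet sent before session start' "$log"; then
        echo "Client rejection symptom found in $log"
        status=1
    fi
    return $status
}
```

A non-zero return only means one of the two symptoms is present. The log may still hold old entries from before the fix, so a hit there should be checked against its timestamp.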
Comment 8 Preethi Thomas 2009-05-26 13:00:39 EDT
verified
Satellite-5.3.0-RHEL4-re20090521.1-i386-embedded-oracle.iso

[root@fjs-0-07 log]# runuser -s /bin/bash - jabberd -c "ulimit -S -c 0 > /dev/null 2>&1; /usr/bin/jabberd -Db"
JBRD: debug on
JBRD: version(2.0s10)
JBRD: config_dir(/etc/jabberd)
JBRD: LaunchJob: router -> /usr/bin/router -c /etc/jabberd/router.xml -D
Comment 9 Tomas Lestach 2009-08-07 05:30:04 EDT
# rhn-satellite restart
Shutting down rhn-satellite...
Stopping RHN Taskomatic...
Stopped RHN Taskomatic.
Stopping cobbler daemon:                                   [  OK  ]
Stopping rhn-search...
Stopped rhn-search.
Stopping MonitoringScout ...  [ OK ]
Stopping Monitoring ...  [ OK ]
Stopping httpd:                                            [  OK  ]
Stopping tomcat5:                                          [  OK  ]
Shutting down osa-dispatcher:                              [  OK  ]
Shutting down Oracle Net Listener ...                      [  OK  ]
Shutting down Oracle DB instance "rhnsat" ...              [  OK  ]
Shutting down Jabber router:                               [  OK  ]
Done.
Starting rhn-satellite...
Starting Jabber services                                   [  OK  ]
Starting Oracle Net Listener ...                           [  OK  ]
Starting Oracle DB instance "rhnsat" ...                   [  OK  ]
Starting osa-dispatcher:                                   [  OK  ]
Starting tomcat5:                                          [  OK  ]
Starting httpd:                                            [  OK  ]
Starting Monitoring ...  [ OK ]
Starting MonitoringScout ...  [ OK ]
Starting rhn-search...
Starting cobbler daemon:                                   [  OK  ]
Starting RHN Taskomatic...
Done.
# rhn-satellite status
jabberd router (pid 19427) is running...
Oracle Net Listener (pid 19500) is running...
Oracle DB instance rhnsat (pid 19512) is running...
osa-dispatcher (pid 19562) is running...
/etc/init.d/tomcat5 is already running (20093)
httpd (pid 20151 20150 20149 20148 20147 20146 20145 20144 20123) is running...
rhn-search is running (20175).
cobblerd (pid 20208) is running...
RHN Taskomatic is running (20236).

Jabberd works correctly on the satellite server.

Stage validated -> RELEASE_PENDING.
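
For repeated verification, restart output like the above can be scanned for failed steps instead of being read by eye. This is a minimal sketch that assumes the init-script convention of a trailing [FAILED] marker on the same line as the step name; the function name report_failed_steps is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read init-script style output on stdin, print the
# label of every step marked [FAILED], and return 1 if any were found.
report_failed_steps() {
    local failed
    # Keep only [FAILED] lines and strip the marker plus the padding before it.
    failed=$(grep -F '[FAILED]' | sed 's/[[:space:]]*\[FAILED\].*$//') || true
    if [ -n "$failed" ]; then
        printf '%s\n' "$failed"
        return 1
    fi
    return 0
}
```

Typical use would be rhn-satellite restart 2>&1 | report_failed_steps. A status marker printed on its own line, as with the Oracle shutdown message in the original description, would produce an empty label.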
Comment 10 Brandon Perkins 2009-09-10 14:54:42 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-1434.html
