Description of problem:

Quoting Cliff:

While working with mpoole on ATT jabberd/osad/osa-dispatcher issues it seems that in upgrading to jabberd 2.2 we overlooked a couple of fixes we implemented in the init script for jabberd 2.0 for Sat 5.3 and earlier.

#1 - ulimit stuff to allow for LOTS of client connections (open files)
#2 - blowing away auth files on restart to clear up corrupted files

1) Details, in startup we used to do:

    # Set the number of open files to a larger default
    [ -z "$MAX_OPEN_FILES" ] && MAX_OPEN_FILES=2048

    start() {
        # Set ulimit
        ulimit -n $MAX_OPEN_FILES

        echo -n $"Starting Jabber services"
        #Remove the database files, if they exist.

2) Details, in startup section we used to do:

    #Remove the database files, if they exist.
    if [ -e /var/lib/jabberd ]
    then
        for i in /var/lib/jabberd/*
        do
            #Don't try to remove /var/lib/jabberd/*
            if [ -e $i ]
            then
                rm -f $i
            fi
        done
    fi

Version-Release number of selected component (if applicable):
jabberd-2.2.8-8.el5sat

How reproducible:
Deterministic.

Steps to Reproduce:
1. Look into the code.

Actual results:
ulimit and rm -rf /var/lib/jabberd/* are not there.

Expected results:
ulimit and rm -rf /var/lib/jabberd/* are there.

Additional info:
Martin Poole then commented:

for 1) should we automatically adjust the config file (c2s.xml being the normally affected beast) or make it suitably huge by default?

    <max_fds>1024</max_fds>

since without modification the daemon will be limited to max_fds rather than ulimit.nofile

for 2) note the location has now changed and appears to be /var/lib/jabberd/db
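A minimal sketch of how an init script could compare the two limits mentioned in this comment. The helper names get_max_fds and effective_limit are hypothetical, and the config file path is passed in as an argument rather than assumed. Treating the smaller of the two values as the effective ceiling is an assumption drawn from the comment, not confirmed jabberd behaviour. The sed match is naive: it takes the first max_fds element it finds and does not skip XML comments.

```shell
# Hypothetical helper: print the first <max_fds> value found in the
# given jabberd config file (prints nothing if the element is absent).
get_max_fds() {
    sed -n 's:.*<max_fds>\([0-9][0-9]*\)</max_fds>.*:\1:p' "$1" | head -n 1
}

# Hypothetical helper: print the smaller of the config limit ($1) and
# the open-files limit ($2). Assumption: the daemon cannot use more
# descriptors than either value allows. Both arguments must be integers.
effective_limit() {
    if [ "$1" -lt "$2" ]; then
        echo "$1"
    else
        echo "$2"
    fi
}
```

With MAX_OPEN_FILES set as in the old init script, a call such as `effective_limit "$(get_max_fds "$cfg")" "$MAX_OPEN_FILES"` (where $cfg points at c2s.xml) would show which of the two settings is the binding one. The call fails if the config file has no max_fds element, so a real script would need to check for an empty result first.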
(In reply to comment #0)
> 2) Details, in startup section we used to do:
>
> #Remove the database files, if they exist.
> if [ -e /var/lib/jabberd ]
> then
>     for i in /var/lib/jabberd/*
>     do
>         #Don't try to remove /var/lib/jabberd/*
>         if [ -e $i ]
>         then
>             rm -f $i
>         fi
>     done
> fi

Cliff, why can't we simply do

    rm -rf /var/lib/jabberd/*

here? I somehow fail to see what the intended logic of the code was.
Jan - looking:

Sat 5.3 - jabberd-2.0s10-3.52.el5sat :
--------------------------------------
[root@rlx-1-10 jabberd]# pwd
/var/lib/jabberd
[root@rlx-1-10 jabberd]# ls
authreg.db  __db.001  __db.002  __db.003  __db.004  __db.005  log.0000000001  sm.db
[root@rlx-1-10 jabberd]#

I can confirm that the files above will be re-created on restart as clients connect. It is transient data in that it does not need to be preserved.

Sat 5.4 - jabberd-2.2.8-8.el5sat:
---------------------------------
[root@rlx-1-18 jabberd]# pwd
/var/lib/jabberd
[root@rlx-1-18 jabberd]# find .
.
./log
./db
./db/__db.001
./db/__db.004
./db/__db.002
./db/authreg.db
./db/__db.005
./db/sm.db
./db/log.0000000001
./db/__db.003
./pid
./pid/s2s.pid
./pid/router.pid
./pid/c2s.pid
./pid/sm.pid
[root@rlx-1-18 jabberd]#

Looking on Sat 5.4 - the script would rm -rf /var/lib/jabberd/db/* here, leaving the .pid files alone.

Though, if the daemon is already shut down, removing the pid files if they still exist is good cleanup to do, before starting back up.

As for the specific question of why the loop goes through all files: I cannot remember. No doubt the developer at the time hit an issue and preferred not to use 'rm -rf'. I do not know why it was chosen to be implemented this way.

Cliff
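The cleanup described for the Sat 5.4 layout could look roughly like the sketch below. It is only an illustration, not the code that was committed: the function name clean_jabberd_db is hypothetical, and the optional directory argument is an addition (not in the original script) that makes the function easy to try against a scratch directory.

```shell
# Hypothetical helper: remove the transient database files under
# <state_dir>/db, leaving sibling directories such as pid/ and log/ alone.
# state_dir defaults to /var/lib/jabberd, the location shown in this comment.
clean_jabberd_db() {
    state_dir="${1:-/var/lib/jabberd}"
    db_dir="$state_dir/db"

    # Nothing to do if the db directory does not exist.
    [ -d "$db_dir" ] || return 0

    # Remove the contents but keep the directory itself.
    rm -rf "$db_dir"/*
}
```

This should only run while the daemons are stopped, since the files are expected to be re-created as clients connect after the restart.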
(In reply to comment #3)
> Looking on Sat 5.4 - the script would rm -rf /var/lib/jabberd/db/* here,
> leaving the .pid files alone.
>
> Though, if the daemon is already shut down, removing the pid files if they
> still exist is good cleanup to do, before starting back up.

Pid files get removed by the script already, before starting the individual programs:

    for prog in ${progs}; do
        if [ $( pidof -s ${prog} ) ]; then
            echo "process [${prog}] already running"
            continue
        fi
        echo -n "Starting ${prog}: "
        rm -f /var/lock/subsys/${prog}
        rm -f ${pidPath}/${prog}.pid
Addressed in the thirdparty repo, ced047d23d06089e0be8111222e91b2bc424c09b, f575bdfd27e508b1adaa58a1a44e8e939af6d5a0, and 3164cd44232cd83a934b308bea3b14ccbe288477. Built as jabberd-2.2.8-9.el5sat.
Fixed the fix in Satellite thirdparty, e6c16053c4f7ea9c135ed2517cc1432d6634236c, to work correctly upon upgrades.
Moving ON_QA.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0989.html