Bug 475999

Summary: Mint not shutdown by initscript
Product: Red Hat Enterprise MRG
Reporter: Jan Sarenik <jsarenik>
Component: cumin
Assignee: Nuno Santos <nsantos>
Status: CLOSED ERRATA
QA Contact: Jan Sarenik <jsarenik>
Severity: high
Priority: medium
Version: beta
CC: aortega, iboverma, jsarenik
Target Milestone: 1.1.1
Target Release: ---
Hardware: All
OS: Linux
Fixed In Version: cumin-0.1.3073-1
Doc Type: Bug Fix
Last Closed: 2009-04-21 16:18:58 UTC
Attachments:
Locally built RPM for testing fix (flags: none)

Description Jan Sarenik 2008-12-11 14:29:13 UTC
After starting cumin, the mint-server gets started as well.
But when I stop it, the mint does not get stopped as it should.

version used: cumin-0.1.2968-1.el5

100% reproducible:
1. install cumin (incl. the postgresql settings, database-init, add-user)
2. /etc/init.d/cumin start
3. pgrep mint-server && echo mint is running
4. /etc/init.d/cumin stop
5. pgrep mint-server && echo mint is still running

I expect the initscript to shut down mint-server along with cumin.

Comment 1 Justin Ross 2008-12-11 16:40:56 UTC
My theory, based on some of the instances where I've seen mint-server fail to exit, is that this occurs because of http://bugzilla.redhat.com/show_bug.cgi?id=476038, db deadlocks that can freeze mint threads.

Comment 2 Jan Sarenik 2008-12-12 09:33:41 UTC
The bug is still valid in cumin-0.1.2986-1.el5, so I doubt it is
closely related to bug 476038.

Comment 3 Jan Sarenik 2008-12-12 14:19:22 UTC
Having looked at the sources in svn trunk, it is clear that
cumin sends SIGTERM to the mint-server process it previously
started via Popen in 'start_mint_processes' in the file
'trunk/cumin/python/cumin/tools.py'.

I am wondering whether mint-server is still running under the
PID 'pop.pid' that the above-mentioned function returns, or
whether it gets restarted in the meantime with a different PID...

If I manually kill mint-server with SIGTERM right after
'/etc/init.d/cumin stop', it ends happily, so I do not
suspect it is hanging:

# /etc/init.d/cumin stop; pkill mint-server


I will continue to investigate it further next week.
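
One way to test the question above (whether the process is still running under the PID recorded at start) is to probe that PID with signal 0. The following is only a minimal sketch in modern Python; `pid_is_alive` is a hypothetical helper name, not something taken from cumin's tools.py:

```python
import os


def pid_is_alive(pid):
    """Return True if a process with this PID currently exists.

    Hypothetical helper for illustration; not part of cumin.
    """
    try:
        # Signal 0 delivers nothing; the kernel only checks that the
        # target exists and that we are allowed to signal it.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # process exists but belongs to another user
    return True
```

Note that a child which has exited but has not yet been reaped by its parent (a zombie) still counts as alive under this check, and that a PID can be reused by an unrelated process later on.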

Comment 4 Justin Ross 2008-12-15 18:52:50 UTC
In change 3001, I have added some logic to more carefully kill the mint process and check for confirmation.
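
For readers following along, "kill and check for confirmation" could look roughly like the sketch below. This is not the code from change 3001; it is an illustration in modern Python, and the function name `stop_child` and the default timeout are assumptions:

```python
import subprocess


def stop_child(proc, timeout=5.0):
    """Ask a child started via subprocess.Popen to exit and confirm it did.

    Sends SIGTERM, waits up to `timeout` seconds for the exit to be
    confirmed, then falls back to SIGKILL. Returns the child's exit code.
    Hypothetical helper for illustration; not cumin's implementation.
    """
    if proc.poll() is not None:
        return proc.returncode  # child already exited
    proc.terminate()  # SIGTERM on POSIX
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX; cannot be caught or ignored
        return proc.wait()
```

A longer `timeout` gives a busy child more time to shut down cleanly before the fallback is used.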

Comment 5 Jan Sarenik 2008-12-22 09:17:29 UTC
Still valid on cumin-0.1.3021-1.el5

After stopping cumin via '/etc/init.d/cumin stop',
the mint process starts eating all the CPU.

Comment 6 Justin Ross 2009-01-07 18:09:16 UTC
Jan, just to make sure: did you also reinstall the schema from scratch?  I.e., use cumin-database-destroy, then cumin-database-init?

Comment 7 Jan Sarenik 2009-01-08 10:46:08 UTC
Sure I did.

Again, here is what I have just done (on rev 3030):

   #(postgresql set up and running)
   yum -y remove cumin
   yum -y install cumin
   cumin-database-destroy
   cumin-database-init
   cumin-admin add-user test
   /etc/init.d/cumin start
   firefox http://localhost:45672/
     # log in, possibly add local broker
     # if qpidd is running, log out
   /etc/init.d/cumin stop
   pgrep mint-server # running
   sleep 20
   pgrep mint-server # still running
   top
     # though it's not eating all the CPU
     # as in previous version
   pkill mint-server
   pgrep mint-server # not running anymore

BTW I am testing it on my local RHEL-5.2 running in a chroot
(not jailed via vserver or anything like it).

Comment 8 Justin Ross 2009-01-08 15:55:38 UTC
The fact that it's no longer eating all the CPU is a very positive development.  I'm now comfortable (a) release noting this for 1.1 and (b) extending the time over which we attempt to kill the mint-server subprocess in the cumin process.

I extended the wait time in change 3036.

Comment 9 Justin Ross 2009-01-14 18:29:42 UTC
Reopening for 1.1.1 so we make sure to flush out any still-hidden issues.

Comment 10 Nuno Santos 2009-01-22 20:16:36 UTC
Fixed at revision 3068: added a handler for SIGTERM that catches the signal from the initscript (it's doing "killproc $servicename -TERM") and shuts down the mint process properly.
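
The general pattern described here (a parent that catches SIGTERM and stops its child before exiting) can be sketched as follows. This is not the code from revision 3068; `run_with_child` and `_raise_exit` are hypothetical names, and the sketch assumes modern Python running in the main thread:

```python
import signal
import subprocess


def _raise_exit(signum, frame):
    # Turn SIGTERM into an ordinary Python exit so that the
    # `finally` block below gets a chance to run.
    raise SystemExit(0)


def run_with_child(child_argv, main_loop, timeout=5.0):
    """Start a child, run main_loop(child), and always stop the child after.

    Hypothetical helper for illustration; not cumin's implementation.
    """
    signal.signal(signal.SIGTERM, _raise_exit)
    child = subprocess.Popen(child_argv)
    try:
        main_loop(child)
    finally:
        # Reached on normal return, on an exception, and on SIGTERM.
        if child.poll() is None:
            child.terminate()
            try:
                child.wait(timeout=timeout)
            except subprocess.TimeoutExpired:
                child.kill()
                child.wait()
    return child.returncode
```

Without such a handler, the default action for SIGTERM ends the parent immediately, and a child it started is left running.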

Comment 11 Nuno Santos 2009-01-22 22:31:48 UTC
Created attachment 329746 [details]
Locally built RPM for testing fix

You can use this temporary RPM for testing until the fix shows up in the candidates repo.

Comment 12 Jan Sarenik 2009-01-29 08:41:41 UTC
Verified on RHEL5.3 i386. Thanks for fixing it!

Comment 14 errata-xmlrpc 2009-04-21 16:18:58 UTC
An advisory has been issued which should help with the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-0434.html