Bug 596719

Summary: certmonger can start many daemons and service certmonger status won't recognize it
Product: Red Hat Enterprise Linux 6 Reporter: Aleš Mareček <amarecek>
Component: certmonger Assignee: Nalin Dahyabhai <nalin>
Status: CLOSED CURRENTRELEASE QA Contact: Aleš Mareček <amarecek>
Severity: medium Docs Contact:
Priority: low    
Version: 6.0 CC: dpal
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: certmonger-0.24-3.el6 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-11-10 19:58:02 UTC Type: ---
Bug Depends On:    
Bug Blocks: 556621    

Description Aleš Mareček 2010-05-27 11:29:58 UTC
Description of problem:
When you start the daemon from a shell rather than through "service", you can start as many daemons as you want.
Running without "-p" means certmonger does not write its PID to a file, and there is no other mechanism for recognizing that the service is already running. This is also a problem if certmonger is already running (started without "-p") and you later trust "service certmonger status", which will report that the service is not running.

Version-Release number of selected component (if applicable):
certmonger-0.21-1.el6

How reproducible:
Always

Steps to Reproduce:
1.
service certmonger stop;
for I in `seq 1 1 5`; do
  certmonger -S -n &
done;
ps aux | grep "certmonger" | grep -v "grep";
service certmonger status;
[ `echo $?` == 3 ] && echo 'Status said: "Daemon is not running" <--- FALSE!';
service certmonger start;
SERVICE_RUN_PID=`cat /var/run/certmonger.pid`;
echo ${SERVICE_RUN_PID};
service certmonger status;
ps aux | grep "certmonger" | grep -v "grep";
for I in `seq 1 1 5`; do
  certmonger -S -n -p /var/run/certmonger.pid &
  cat /var/run/certmonger.pid
done; ps aux | grep "certmonger" | grep -v "grep";
service certmonger stop;
service certmonger status;
[ `echo $?` == 3 ] && echo 'Status said: "Daemon is not running" <--- FALSE!';
[ ! -z "`ps aux | grep ${SERVICE_RUN_PID} | grep -v 'grep'`" ] && echo 'Sure! "service" killed other certmonger daemon!'
  
Actual results:
It is possible to run many daemons.

Expected results:
Only one daemon can start.

Additional info:

Comment 1 Nalin Dahyabhai 2010-05-27 16:15:28 UTC
The daemon isn't noticing that it failed to acquire the well-known name of its service on the system bus. Noticing that failure would cause it to exit here.

Comment 3 Fedora Update System 2010-05-27 22:53:40 UTC
certmonger-0.23-1.fc13 has been submitted as an update for Fedora 13.
http://admin.fedoraproject.org/updates/certmonger-0.23-1.fc13

Comment 4 Fedora Update System 2010-05-27 22:54:03 UTC
certmonger-0.23-1.fc12 has been submitted as an update for Fedora 12.
http://admin.fedoraproject.org/updates/certmonger-0.23-1.fc12

Comment 5 Fedora Update System 2010-05-27 22:56:19 UTC
certmonger-0.23-1.fc11 has been submitted as an update for Fedora 11.
http://admin.fedoraproject.org/updates/certmonger-0.23-1.fc11

Comment 6 Fedora Update System 2010-06-03 18:04:34 UTC
certmonger-0.23-1.fc11 has been pushed to the Fedora 11 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 7 Fedora Update System 2010-06-03 18:08:08 UTC
certmonger-0.23-1.fc12 has been pushed to the Fedora 12 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 8 Fedora Update System 2010-06-03 18:13:45 UTC
certmonger-0.23-1.fc13 has been pushed to the Fedora 13 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 9 Aleš Mareček 2010-06-07 15:57:44 UTC
Greetings!
I tested this bug and the "starting" problem seems to be resolved. But I have another one: when you start the daemon via service and then start it again from the shell with -p, the second daemon won't start, but it deletes the pidfile! We end up with a properly running daemon that has no pidfile, and status reports a "zombie" (dead but subsys locked).
Moving back to "ASSIGNED".

Reproducer:
service certmonger start;
echo $?;
cat /var/run/certmonger.pid;
service certmonger status;
echo $?;
/usr/sbin/certmonger -S -p /var/run/certmonger.pid;
echo $?;
cat /var/run/certmonger.pid;
service certmonger status;
echo $?

Result:
Starting certmonger: [  OK  ]
0
14152
certmonger (pid  14152) is running...
0
Error connecting to D-Bus.
1
cat: /var/run/certmonger.pid: No such file or directory
certmonger dead but subsys locked
2
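
For reference, the exit codes above follow the LSB convention for the status action of an init script: 0 means running, 1 dead but pid file exists, 2 dead but lock file exists, 3 not running. The following is a minimal sketch of how a status check could derive these codes from a pidfile and a subsys lock file. The function name status_code and its two arguments are hypothetical, and this is not the actual logic of the Red Hat "status" helper, which also consults the process table.

```shell
# Hypothetical helper: map pidfile/lock-file state to an LSB status code.
# Usage: status_code PIDFILE LOCKFILE
status_code() {
    local pidfile=$1 lockfile=$2 pid
    if [ -r "$pidfile" ]; then
        pid=$(cat "$pidfile")
        # kill -0 sends no signal; it only tests whether we could signal the PID
        if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
            return 0    # running
        fi
        return 1        # dead but pid file exists
    fi
    if [ -e "$lockfile" ]; then
        return 2        # dead but subsys locked
    fi
    return 3            # stopped
}
```

With this logic, a running daemon whose pidfile has been deleted falls through to the lock-file check and is reported with code 2, which is the situation shown in the result above.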

Comment 10 Nalin Dahyabhai 2010-06-07 20:20:03 UTC
Does this scratch build (https://brewweb.devel.redhat.com//taskinfo?taskID=2500399), featuring some changes pulled out of git, fix the issues?

Comment 11 Aleš Mareček 2010-06-08 08:02:36 UTC
Greetings!
I was testing it on i386 and it seems to be OK :)

$ service certmonger start; echo $?; /usr/sbin/certmonger -S; echo $?; /usr/sbin/certmonger -S -p /var/run/certmonger.pid; echo $?

Starting certmonger:                                       [  OK  ]
0
Error connecting to D-Bus.
1
Error locking pidfile "/var/run/certmonger.pid": Resource temporarily unavailable
1

Moving to "ON_QA". Once the official RHEL build is available I'll test it on all architectures and move it to "VERIFIED" (if OK).
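
The "Error locking pidfile ... Resource temporarily unavailable" message above is what a non-blocking lock attempt reports when another process already holds the lock. As an illustration only, the same single-instance guard can be sketched in shell with flock(1) from util-linux. The function name lock_pidfile and the choice of file descriptor 9 are assumptions for this sketch; it does not show how certmonger itself takes the lock.

```shell
# Hypothetical helper: take an exclusive, non-blocking lock on a pidfile.
# The lock is held on file descriptor 9 until the calling shell exits.
# Usage: lock_pidfile PIDFILE
lock_pidfile() {
    local pidfile=$1
    exec 9>>"$pidfile" || return 1     # open (or create) without truncating
    if ! flock -n 9; then              # fails at once if someone else holds it
        exec 9>&-
        echo "Error locking pidfile \"$pidfile\"" >&2
        return 1
    fi
    echo "$$" >"$pidfile"              # record our PID only after locking
}
```

Because the PID is written only after the lock is taken, a second instance that loses the race leaves the first instance's pidfile contents untouched.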

Comment 12 Nalin Dahyabhai 2010-06-08 15:38:30 UTC
Please continue testing with certmonger-0.24-1.el6.  (I can move this back to MODIFIED if you wish, but I suspect you'd just have to move it back to ON_QA.)

Comment 13 Aleš Mareček 2010-06-09 15:20:14 UTC
Nalin,
I have another issue. When you start a daemon while one is already running and give it a different pidfile, let's say /var/run/certmonger_2.pid, the daemon won't start (correct), but the pidfile is created anyway (and the file is empty).
I suppose we don't want this behaviour, so I'm moving this to "ASSIGNED" again.

Comment 14 Nalin Dahyabhai 2010-06-10 19:11:26 UTC
Try certmonger-0.24-2.el6, then.

Comment 15 Aleš Mareček 2010-06-15 12:52:26 UTC
Hi Nalin,
I can confirm that the issue with the existing pidfile is resolved.
I found further issues:
1) When you start certmonger manually with -p <pidfile>, service can't stop certmonger until the lock file is created (Example 1). The behaviour is the same when certmonger is started by "service" and its lock file is then deleted (Example 2).
2) When you start certmonger via service, kill it with SIGKILL, and delete the lock file, the behaviour is slightly different. Service knows that certmonger is dead and that the pid file exists, but it won't delete the pid file (Example 3).
In all three cases the result is the same: without the lock file, "service certmonger stop" can't bring certmonger back to a clean state.
Please have a look at it.

Additional info:
"service certmonger start" on a killed certmonger creates the lock file and rewrites the pid file with the new pid - CORRECT.
Also, "service certmonger start" on a running certmonger recreates the lock file and doesn't rewrite the pid file (or start a new daemon) - CORRECT.

Example 1:
[root@hp-ml370g5-01 ~]# certmonger -S -p /var/run/certmonger.pid
[root@hp-ml370g5-01 ~]# service certmonger status
certmonger (pid  9051) is running...
[root@hp-ml370g5-01 ~]# service certmonger stop
[root@hp-ml370g5-01 ~]# service certmonger status
certmonger (pid  9051) is running...
[root@hp-ml370g5-01 ~]# cat /var/run/certmonger.pid
9051
[root@hp-ml370g5-01 ~]# ps aux | grep certmonger | grep -v grep
root      9051  0.0  0.0  59384   716 ?        Ss   08:19   0:00 certmonger -S -p /var/run/certmonger.pid
[root@hp-ml370g5-01 ~]# touch /var/lock/subsys/certmonger
[root@hp-ml370g5-01 ~]# service certmonger status
certmonger (pid  9051) is running...
[root@hp-ml370g5-01 ~]# service certmonger stop
Stopping certmonger: [  OK  ]
[root@hp-ml370g5-01 ~]# service certmonger status
certmonger is stopped
[root@hp-ml370g5-01 ~]# ps aux | grep certmonger | grep -v grep
[root@hp-ml370g5-01 ~]# 

Example 2:
[root@ibm-z10-04 ~]# service certmonger start
Starting certmonger: [  OK  ]
[root@ibm-z10-04 ~]# service certmonger status
certmonger (pid  4996) is running...
[root@ibm-z10-04 ~]# rm -rf /var/lock/subsys/certmonger 
[root@ibm-z10-04 ~]# service certmonger status
certmonger (pid  4996) is running...
[root@ibm-z10-04 ~]# service certmonger stop
[root@ibm-z10-04 ~]# service certmonger status
certmonger (pid  4996) is running...
[root@ibm-z10-04 ~]# touch /var/lock/subsys/certmonger
[root@ibm-z10-04 ~]# service certmonger stop
Stopping certmonger: [  OK  ]
[root@ibm-z10-04 ~]# service certmonger status
certmonger is stopped

Example 3:
[root@ibm-z10-04 ~]# service certmonger start
Starting certmonger: [  OK  ]
[root@ibm-z10-04 ~]# service certmonger status
certmonger (pid  4936) is running...
[root@ibm-z10-04 ~]# kill -9 4936
[root@ibm-z10-04 ~]# service certmonger status
certmonger dead but pid file exists
[root@ibm-z10-04 ~]# ls /var/run/certmonger* /var/lock/subsys/certmonger*
/var/lock/subsys/certmonger  /var/run/certmonger.pid
[root@ibm-z10-04 ~]# rm -rf /var/lock/subsys/certmonger
[root@ibm-z10-04 ~]# service certmonger status
certmonger dead but pid file exists
[root@ibm-z10-04 ~]# ls /var/run/certmonger* /var/lock/subsys/certmonger*
ls: cannot access /var/lock/subsys/certmonger*: No such file or directory
/var/run/certmonger.pid
[root@ibm-z10-04 ~]# service certmonger stop
[root@ibm-z10-04 ~]# service certmonger status
certmonger dead but pid file exists
[root@ibm-z10-04 ~]# ls /var/run/certmonger* /var/lock/subsys/certmonger*
ls: cannot access /var/lock/subsys/certmonger*: No such file or directory
/var/run/certmonger.pid
[root@ibm-z10-04 ~]# touch /var/lock/subsys/certmonger
[root@ibm-z10-04 ~]# service certmonger stop
Stopping certmonger: [FAILED]
[root@ibm-z10-04 ~]# service certmonger status
certmonger is stopped
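
The three examples share one cause: the stop action is gated on the lock file. A minimal sketch of a stop routine that keys off the pidfile instead, and always cleans up both files, might look like the following. The function name stop_daemon, its arguments, and the five-second wait are hypothetical; this is not the init script change that was actually shipped.

```shell
# Hypothetical helper: stop a daemon based on its pidfile, not its lock file.
# Usage: stop_daemon PIDFILE LOCKFILE
stop_daemon() {
    local pidfile=$1 lockfile=$2 pid i
    if [ -r "$pidfile" ]; then
        pid=$(cat "$pidfile")
        if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
            kill "$pid" 2>/dev/null || true
            # give the process up to about five seconds to exit
            for i in 1 2 3 4 5; do
                kill -0 "$pid" 2>/dev/null || break
                sleep 1
            done
        fi
    fi
    # remove stale state whether or not a live process was found
    rm -f "$pidfile" "$lockfile"
}
```

With this approach a missing lock file no longer blocks the stop (Examples 1 and 2), and a stale pidfile left behind by SIGKILL is removed (Example 3).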

Comment 16 Nalin Dahyabhai 2010-06-15 17:44:00 UTC
Fixing the init script startup/shutdown logic for 0.24-3.

Comment 17 Aleš Mareček 2010-06-18 14:41:09 UTC
Hi Nalin,
it seems the changes are almost OK. Unfortunately I've found one more thing: "service certmonger start" no longer recreates the lock file. Start certmonger, delete its lock file, and then run "service certmonger start" again. Nothing happens. In older versions the lock file was recreated.
It seems to be harmless, but it is also a regression. Was that change intended? ("restart" does recreate the lock file.)


root@ibm-hs21-04 <i686> [bz596719-certmonger-can-start-many-daemons]# service certmonger start
Starting certmonger: [  OK  ]
root@ibm-hs21-04 <i686> [bz596719-certmonger-can-start-many-daemons]# rm -rf /var/lock/subsys/certmonger 
root@ibm-hs21-04 <i686> [bz596719-certmonger-can-start-many-daemons]# ls -la /var/lock/subsys/certmonger
ls: cannot access /var/lock/subsys/certmonger: No such file or directory
root@ibm-hs21-04 <i686> [bz596719-certmonger-can-start-many-daemons]# service certmonger start
root@ibm-hs21-04 <i686> [bz596719-certmonger-can-start-many-daemons]# echo $?
0
root@ibm-hs21-04 <i686> [bz596719-certmonger-can-start-many-daemons]# ls -la /var/lock/subsys/certmonger
ls: cannot access /var/lock/subsys/certmonger: No such file or directory
root@ibm-hs21-04 <i686> [bz596719-certmonger-can-start-many-daemons]# service certmonger restart
Stopping certmonger: [  OK  ]
Starting certmonger: [  OK  ]
root@ibm-hs21-04 <i686> [bz596719-certmonger-can-start-many-daemons]# ls -la /var/lock/subsys/certmonger
-rw-r--r--. 1 root root 0 Jun 18 10:38 /var/lock/subsys/certmonger
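
The behaviour asked for here is an idempotent start: if the daemon is already running, start should not launch a second copy, but it should still make sure the lock file exists. A minimal sketch under that assumption follows. The function name start_daemon and its arguments are hypothetical, and this is not the code of the certmonger init script.

```shell
# Hypothetical helper: idempotent start that always restores the lock file.
# Usage: start_daemon PIDFILE LOCKFILE COMMAND [ARGS...]
start_daemon() {
    local pidfile=$1 lockfile=$2 pid
    shift 2
    if [ -r "$pidfile" ]; then
        pid=$(cat "$pidfile")
        if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
            touch "$lockfile"   # already running: only recreate the lock file
            return 0
        fi
    fi
    # not running: launch the daemon, then record the lock on success
    "$@" && touch "$lockfile"
}
```

A call such as start_daemon /var/run/certmonger.pid /var/lock/subsys/certmonger certmonger -S -p /var/run/certmonger.pid would then restore a deleted lock file without touching the running daemon.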

Comment 18 Nalin Dahyabhai 2010-06-28 20:49:37 UTC
Fixing for 0.24-4.

Comment 20 Aleš Mareček 2010-07-01 13:27:26 UTC
Tested on i386, x86_64, ppc64, s390x.

OLD:
root@dell-pe2950-01 <x86_64> [bz596719-certmonger-can-start-many-daemons]# rpm -q certmonger
certmonger-0.22-1.el6.x86_64
- snip -
:: [   PASS   ] :: Starting certmonger manually.
:: [   FAIL   ] :: More than 1 daemon is running! 
:: [   PASS   ] :: Service shouldn't start when it's already running.
:: [   PASS   ] :: Running 'service certmonger start'
:: [   PASS   ] :: File /var/run/certmonger.pid should exist
:: [   PASS   ] :: 1 daemon is running
:: [   FAIL   ] :: Pid file has not been rewritten by new daemon! 
:: [   PASS   ] :: Running 'service certmonger start'
:: [   PASS   ] :: 1 daemon is running
:: [   FAIL   ] :: Service has not been started bud pid file has been created! 
:: [   PASS   ] :: 'service certmonger status' returned the daemon is running.
:: [   PASS   ] :: Stopping daemon's manual run by 'service certmonger stop'.
:: [   PASS   ] :: certmonger is stopped.
:: [   PASS   ] :: Running 'service certmonger start'
:: [   PASS   ] :: Removing lock file.
:: [   PASS   ] :: Running 'service certmonger stop'
:: [   FAIL   ] :: certmonger is stopped. (Expected 3, got 0)
:: [   PASS   ] :: Running 'service certmonger start'
:: [   PASS   ] :: Removing lock file.
:: [   PASS   ] :: Running 'service certmonger stop'
:: [   FAIL   ] :: certmonger is stopped. (Expected 3, got 1)
:: [   PASS   ] :: Running 'service certmonger start'
:: [   PASS   ] :: Running 'service certmonger stop'
:: [   PASS   ] :: certmonger is stopped.
:: [   PASS   ] :: Running 'service certmonger start'
:: [   PASS   ] :: Removing lock file.
:: [   PASS   ] :: Running 'service certmonger start'
:: [   PASS   ] :: 'service certmonger start' recreated lock file
:: [   PASS   ] :: Running 'service certmonger stop'
:: [   LOG    ] :: Duration: 5s
:: [   LOG    ] :: Assertions: 24 good, 5 bad
:: [   FAIL   ] :: RESULT: Test
- snip -

NEW:
root@dell-pe2950-01 <x86_64> [bz596719-certmonger-can-start-many-daemons]# rpm -q certmonger
certmonger-0.24-4.el6.x86_64
- snip -
:: [   PASS   ] :: Starting certmonger manually.
:: [   PASS   ] :: 1 daemon is running
:: [   PASS   ] :: Service shouldn't start when it's already running.
:: [   PASS   ] :: Running 'service certmonger start'
:: [   PASS   ] :: File /var/run/certmonger.pid should exist
:: [   PASS   ] :: 1 daemon is running
:: [   PASS   ] :: Pid file has not been rewritten by new daemon.
:: [   PASS   ] :: Running 'service certmonger start'
:: [   PASS   ] :: 1 daemon is running
:: [   PASS   ] :: 'service certmonger status' returned the daemon is running.
:: [   PASS   ] :: Stopping daemon's manual run by 'service certmonger stop'.
:: [   PASS   ] :: certmonger is stopped.
:: [   PASS   ] :: Running 'service certmonger start'
:: [   PASS   ] :: Removing lock file.
:: [   PASS   ] :: Running 'service certmonger stop'
:: [   PASS   ] :: certmonger is stopped.
:: [   PASS   ] :: Running 'service certmonger start'
:: [   PASS   ] :: Removing lock file.
:: [   PASS   ] :: Running 'service certmonger stop'
:: [   PASS   ] :: certmonger is stopped.
:: [   PASS   ] :: Running 'service certmonger start'
:: [   PASS   ] :: Running 'service certmonger stop'
:: [   PASS   ] :: certmonger is stopped.
:: [   PASS   ] :: Running 'service certmonger start'
:: [   PASS   ] :: Removing lock file.
:: [   PASS   ] :: Running 'service certmonger start'
:: [   PASS   ] :: 'service certmonger start' recreated lock file
:: [   PASS   ] :: Running 'service certmonger stop'
:: [   LOG    ] :: Duration: 12s
:: [   LOG    ] :: Assertions: 28 good, 0 bad
:: [   PASS   ] :: RESULT: Test
- snip -

Comment 21 releng-rhel@redhat.com 2010-11-10 19:58:02 UTC
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.