Bug 849465 - Mismatch between PID file created by daemon and systemd service unit
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: monit
Version: 17
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Maxim Burgerhout
QA Contact: Fedora Extras Quality Assurance
Reported: 2012-08-19 17:06 UTC by Ken Hall
Modified: 2012-12-09 05:56 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2012-12-09 05:56:27 UTC
Type: Bug
Embargoed:
maxim: needinfo+


Attachments
Monit RC file (5.41 KB, text/plain)
2012-08-20 17:26 UTC, Ken Hall

Description Ken Hall 2012-08-19 17:06:46 UTC
Description of problem:
Monit stops immediately after starting, killed by systemd

Version-Release number of selected component (if applicable):
Fedora 17 and monit-5.3.1-3.fc17.x86_64

How reproducible:
Every Time

Steps to Reproduce:
1. Create a valid monit configuration and start monit via "systemctl start".

Actual results:
Daemon starts, and is immediately killed by systemd 

Expected results:
Daemon should start and remain active

Additional info:
As of Fedora 17, monit includes a systemd service unit for startup rather than a sysvinit script.  The default PID file for monit is /var/run/monit.pid, but systemd defaults to looking for /var/run/monit.  Systemd does not believe the daemon started properly and kills the process, which ends when the startup delay expires.

Added following line to /monit service unit definition:

PIDFile=/var/run/monit.pid

Daemon now starts and remains active.

Oddly, overriding the PID file in /etc/monitrc did not correct the problem.  The message issued by the monit daemon indicates "PID File not found", but apparently the PID file is not properly created because systemd kills the process anyway.  Changing the startup delay had no effect.
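
Editorial sketch, not part of the original report: one way to check whether the PID file monit is configured to write matches the PIDFile= line in the unit. The helper names are hypothetical. The sketch assumes monitrc uses a "set pidfile <path>" statement, that the default is /var/run/monit.pid as stated in this report, and that /var/run is a symlink to /run as on Fedora 17.

```python
import re


def monitrc_pidfile(monitrc_text, default="/var/run/monit.pid"):
    """Return the path from a 'set pidfile <path>' statement, or the default.

    The default is the one named in this bug report (an assumption).
    """
    for line in monitrc_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop monitrc comments
        match = re.match(r'set\s+pidfile\s+"?([^"\s]+)"?', line, re.IGNORECASE)
        if match:
            return match.group(1)
    return default


def unit_pidfile(unit_text):
    """Return the PIDFile= value from a unit file, or None if it is absent."""
    for line in unit_text.splitlines():
        line = line.strip()
        if line.startswith("PIDFile="):
            return line[len("PIDFile="):].strip()
    return None


def normalize(path):
    """Treat /var/run/... and /run/... as the same location (assumed symlink)."""
    if path.startswith("/var/run/"):
        return path[len("/var"):]
    return path


def pidfiles_agree(monitrc_text, unit_text):
    """True/False if the unit names a PIDFile; None if it names none."""
    unit_path = unit_pidfile(unit_text)
    if unit_path is None:
        return None
    return normalize(unit_path) == normalize(monitrc_pidfile(monitrc_text))
```

Calling pidfiles_agree() with the contents of /etc/monitrc and the unit file returns None when the unit has no PIDFile= line at all, which is the state of the originally packaged unit quoted later in this thread.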

Comment 1 Maxim Burgerhout 2012-08-20 05:48:50 UTC
I don't seem to be able to reproduce this. If I install monit on my Fedora 17 desktop, I have no issues whatsoever:

$ sudo yum -y -q install monit

$ sudo systemctl status monit.service
monit.service - Monit process and file monitoring utility
	  Loaded: loaded (/usr/lib/systemd/system/monit.service; enabled)
	  Active: inactive (dead)
	  CGroup: name=systemd:/system/monit.service

$ sudo systemctl start monit.service

$ sudo systemctl status monit.service
monit.service - Monit process and file monitoring utility
	  Loaded: loaded (/usr/lib/systemd/system/monit.service; enabled)
	  Active: active (running) since Mon, 20 Aug 2012 07:40:25 +0200; 4s ago
	 Process: 24301 ExecStart=/usr/bin/monit (code=exited, status=0/SUCCESS)
	Main PID: 24303 (monit)
	  CGroup: name=systemd:/system/monit.service
		  └ 24303 /usr/bin/monit



Can you post your main configuration file? What does your unit file look like? Did you make any alterations in any of those?

Comment 2 Maxim Burgerhout 2012-08-20 05:51:01 UTC
And to make things complete:

$ ll /run/monit.pid 
-rw-r--r--. 1 root root 6 Aug 20 07:45 /run/monit.pid

$ cat /run/monit.pid 
24620

$ sudo systemctl stop monit.service

$ ll /run/monit.pid 
ls: cannot access /run/monit.pid: No such file or directory

Comment 3 Ken Hall 2012-08-20 17:26:05 UTC
Created attachment 605748
Monit RC file

Comment 4 Ken Hall 2012-08-20 17:26:50 UTC
Starting after a "vanilla" install does work, however my "production" monit rc file will not stay up.  Log reads:

[root@iserver monit.d]# systemctl status monit.service
monit.service - Monit process and file monitoring utility
          Loaded: loaded (/usr/lib/systemd/system/monit.service; enabled)
          Active: inactive (dead) since Mon, 20 Aug 2012 13:12:47 -0400; 20s ago
         Process: 18494 ExecStart=/usr/bin/monit (code=exited, status=0/SUCCESS)
        Main PID: 18495 (code=exited, status=0/SUCCESS)
          CGroup: name=systemd:/system/monit.service

Aug 20 13:12:32 iserver.hall171.com monit[18494]: /etc/monit.d/iserver:93: Warning: PAM group monit was added already, entry ignored 'monit'
Aug 20 13:12:47 iserver.hall171.com monit[18496]: monit daemon with pid [18496] killed
Aug 20 13:12:47 iserver.hall171.com monit[18496]: 'system_iserver.hall171.com' Monit stopped

File is attached.  Adding PIDFile parameter to systemd service unit corrects the problem, even though the PID file isn't overridden.  This is the same config file I used for Fedora 14 and is very similar to one used on Fedora 16 on another machine.

Comment 5 Maxim Burgerhout 2012-08-21 15:18:15 UTC
Still not able to reproduce this. If I take your monitrc and comment out the pieces I do not have here, like your service definitions and the SSL stuff, everything works fine for me.

Comment 6 Ken Hall 2012-08-23 20:26:02 UTC
Reading the man page explanation of how systemd determines the PID, you might want to look at the effect the startup delay has on the processes that run when the daemon starts.  That's the only thing I can think of that's different between my startup and the default.  I have a 15 second delay on startup to allow my service processes to stabilize.  (Might not be necessary on this machine, but definitely needed on my other one where my Opensim processes take a while to get going.)

From the "systemd.service" man page:

PIDFile=
   Takes an absolute file name pointing to the PID file of this daemon. Use of this option is recommended for services where Type= is set to forking. systemd will read the PID of the main process of the daemon after start-up of the service. systemd will not write to the file configured here.

The service unit for monit does have Type=forking, so the "final" process might not be the one systemd identifies.

If you still don't see the problem, it might be a race condition; the machine I'm running is fairly fast (quad-core 3.2 GHz with 16GB of RAM).
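
Editorial sketch, not part of the original comment: the PIDFile mechanism quoted above amounts to "read a number from a file after start-up and treat that process as the main one". The code below illustrates that idea only. It is not systemd's implementation, and the function names are hypothetical.

```python
import os


def read_pidfile(path):
    """Return the PID stored in the file, or None if missing or malformed."""
    try:
        with open(path) as handle:
            text = handle.read().strip()
    except OSError:
        return None
    return int(text) if text.isdigit() else None


def pid_alive(pid):
    """Check whether a process with this PID exists, without signalling it."""
    try:
        os.kill(pid, 0)  # signal 0 only performs the existence/permission check
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def main_pid(path):
    """Return the PID from the file if that process is running, else None."""
    pid = read_pidfile(path)
    if pid is None or pid <= 0 or not pid_alive(pid):
        return None
    return pid
```

If the daemon has not yet written its PID file when the supervisor looks, main_pid() returns None, which is the kind of timing window a race condition between the two would involve.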

Comment 7 Maxim Burgerhout 2012-10-02 09:25:24 UTC
I'm starting to think you might have an old unit file there. Can you post your /usr/lib/systemd/system/monit.service file?

It should read:

[Unit]
Description=Pro-active monitoring utility for unix systems
After=network.target

[Service]
Type=simple
ExecStart=/usr/bin/monit -I
ExecStop=/usr/bin/monit quit
ExecReload=/usr/bin/monit reload

[Install]
WantedBy=multi-user.target

It should *not* be forking, it should be *simple*.
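
Editorial sketch, not part of the original comment: a small check for whether a monit unit's Type= agrees with how ExecStart launches the daemon. It assumes, based on the unit above, that the -I option keeps monit in the foreground, and that an absent Type= means "simple". The function name and the warning texts are hypothetical.

```python
import configparser
import shlex


def check_unit(unit_text):
    """Return a list of warnings about Type=/ExecStart=/PIDFile= consistency."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # unit file keys are case-sensitive
    parser.read_string(unit_text)
    service = parser["Service"] if parser.has_section("Service") else {}

    svc_type = service.get("Type", "simple")  # assumed default when unset
    argv = shlex.split(service.get("ExecStart", ""))
    foreground = "-I" in argv[1:]  # assumption: -I means "do not daemonize"

    warnings = []
    if svc_type == "simple" and not foreground:
        warnings.append(
            "Type=simple but ExecStart lacks -I: the started process may fork and exit"
        )
    if svc_type == "forking" and foreground:
        warnings.append(
            "Type=forking but ExecStart has -I: the daemon will not fork"
        )
    if svc_type == "forking" and "PIDFile" not in service:
        warnings.append(
            "Type=forking without PIDFile=: the main PID has to be guessed"
        )
    return warnings
```

Run against the unit shown above, check_unit() returns an empty list. A unit with Type=forking, a plain ExecStart=/usr/bin/monit and no PIDFile= line yields the third warning.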

Comment 8 Ken Hall 2012-11-30 20:08:57 UTC
Sorry for the delay, I hadn't realized you responded.  The behavior is actually getting even more odd, but you might be right about the unit file. 

I now have two machines.  The problem originally appeared on iserver, which is an older dual-core system that runs a web server.  It has this in the unit file:

[Unit]
Description=Monit process and file monitoring utility
After=network.target

[Service]
ExecStart=/usr/bin/monit
Type=forking
ExecReload=/usr/bin/monit reload
PIDFile=/var/run/monit.pid

[Install]
WantedBy=multi-user.target

The PIDFile line was added by me to fix the crashing issue.

Another machine here, "raptor", is a quad core with a lot of memory, so it comes up much faster.  The unit file there should be the original (confirmed by rpm --verify):

[Unit]
Description=Monit process and file monitoring utility
After=network.target

[Service]
ExecStart=/usr/bin/monit
Type=forking
ExecReload=/usr/bin/monit reload

[Install]
WantedBy=multi-user.target

Both machines have the same version of monit, 5.3.1-3.fc17.

Raptor has worked properly all along without adding PIDFile to the unit file, leading me to believe this is some kind of race condition.  I can try changing the setting to "simple" and retest, but the Fedora package will need updating.

Comment 9 Ken Hall 2012-11-30 20:14:56 UTC
Did not work with "simple" and no PIDFile line:

[root@iserver multi-user.target.wants]# systemctl status monit.service
monit.service - Monit process and file monitoring utility
          Loaded: loaded (/usr/lib/systemd/system/monit.service; enabled)
          Active: deactivating (stop-sigterm) since Fri, 30 Nov 2012 15:12:47 -0500; 10s ago
         Process: 7220 ExecStart=/usr/bin/monit (code=exited, status=0/SUCCESS)
          CGroup: name=systemd:/system/monit.service
                   7222 /usr/bin/monit

Nov 30 15:12:47 iserver.hall171.com monit[7220]: Starting monit daemon with http interface at [*:8009]
[root@iserver multi-user.target.wants]# systemctl status monit.service
monit.service - Monit process and file monitoring utility
          Loaded: loaded (/usr/lib/systemd/system/monit.service; enabled)
          Active: inactive (dead) since Fri, 30 Nov 2012 15:13:03 -0500; 1s ago
         Process: 7220 ExecStart=/usr/bin/monit (code=exited, status=0/SUCCESS)
          CGroup: name=systemd:/system/monit.service

Nov 30 15:12:47 iserver.hall171.com monit[7220]: Starting monit daemon with http interface at [*:8009]
Nov 30 15:13:02 iserver.hall171.com monit[7222]: monit daemon with pid [7222] killed
Nov 30 15:13:02 iserver.hall171.com monit[7222]: 'system_iserver.hall171.com' Monit stopped

Comment 10 Maxim Burgerhout 2012-11-30 21:19:09 UTC
Gosh. I've been running on F18 for a while and I assumed I had back-ported the unit file from F18 to F17. And I had. I just hadn't rebuilt monit with it, while it had been accidentally built for F18 because of the mass rebuild a couple of months back. Sorry for that and thanks for being persistent.

I've built and pushed updates. Can you test and provide feedback in Bodhi?

Comment 11 Fedora Update System 2012-11-30 21:45:01 UTC
monit-5.3.1-4.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/monit-5.3.1-4.fc17

Comment 12 Fedora Update System 2012-12-01 08:28:20 UTC
Package monit-5.3.1-4.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing monit-5.3.1-4.fc17'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-19443/monit-5.3.1-4.fc17
then log in and leave karma (feedback).

Comment 13 Ken Hall 2012-12-01 22:11:12 UTC
That seems to have fixed it on iserver; it starts normally with the unmodified unit file now.

Thanks!

Comment 14 Fedora Update System 2012-12-09 05:56:30 UTC
monit-5.3.1-4.fc17 has been pushed to the Fedora 17 stable repository.  If problems still persist, please make note of it in this bug report.

