Bug 702621

Summary: Service status is not reported correctly when LSB header is loaded
Product: [Fedora] Fedora
Component: systemd
Version: 15
Hardware: Unspecified
OS: Unspecified
Status: CLOSED INSUFFICIENT_DATA
Severity: urgent
Priority: urgent
Keywords: Regression
Reporter: Honza Horak <hhorak>
Assignee: systemd-maint
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: andrew, fdinitto, johannbg, lhh, lpoetter, marbolangos, metherid, mschmidt, notting, plautrba, sdake
Doc Type: Bug Fix
Last Closed: 2012-01-25 17:16:48 UTC
Bug Depends On: 719931, 721375

Description Honza Horak 2011-05-06 11:55:38 UTC
Description of problem:
When systemd uses a service definition generated from an LSB header (e.g. sshd.service) and the main process is killed, systemctl still reports that the service is "running" (and it cannot be started with systemctl start ...).

Version-Release number of selected component (if applicable):
rpm -q systemd
systemd-26-1.fc15.x86_64

How reproducible:
Every time

Steps to Reproduce:
1. systemctl start sshd.service
2. systemctl status sshd.service (it is running with e.g. Main PID: 1322)
3. kill -9 1322
4. systemctl status sshd.service (it still reports that the service is "running", but the main PID is described as killed)
5. systemctl start sshd.service (nothing changes)
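
Steps 2 and 3 can be scripted so that the PID does not have to be copied by hand. The following is only a sketch: main_pid_from_status is a hypothetical helper name, and it assumes the "Main PID: <number>" line format shown in the status output in this report.

```shell
#!/bin/sh
# Hypothetical helper: print the first "Main PID" number found in
# `systemctl status` output read from stdin (prints nothing if absent).
main_pid_from_status() {
    sed -n 's/^[[:space:]]*Main PID:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Example use (needs a systemd host, so it is left commented out):
#   pid=$(systemctl status sshd.service | main_pid_from_status)
#   kill -9 "$pid"
#   systemctl status sshd.service
```

The helper only parses text, so it can be tried against a saved copy of the status output before being used on a live system.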
  
Actual results:
$ systemctl status sshd.service
sshd.service - LSB: Start up the OpenSSH server daemon
	  Loaded: loaded (/etc/rc.d/init.d/sshd)
	  Active: active (exited) since Thu, 05 May 2011 13:10:53 +0200; 24h ago
	 Process: 1274 ExecStart=/etc/rc.d/init.d/sshd start (code=exited, status=0/SUCCESS)
	Main PID: 1322 (code=killed, signal=KILL)
	  CGroup: name=systemd:/system/sshd.service


$ systemctl start sshd.service
$ systemctl status sshd.service
sshd.service - LSB: Start up the OpenSSH server daemon
	  Loaded: loaded (/etc/rc.d/init.d/sshd)
	  Active: active (exited) since Thu, 05 May 2011 13:10:53 +0200; 24h ago
	 Process: 1274 ExecStart=/etc/rc.d/init.d/sshd start (code=exited, status=0/SUCCESS)
	Main PID: 1322 (code=killed, signal=KILL)
	  CGroup: name=systemd:/system/sshd.service
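
The mismatch in the output above (an "active" state next to a Main PID that was killed) can be detected mechanically. This is a rough sketch that works on the status text only; status_is_inconsistent is a hypothetical helper name, and the patterns assume the output format shown in this report.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when the status text on stdin claims
# the unit is active while its Main PID line records a killed process.
status_is_inconsistent() {
    text=$(cat)
    printf '%s\n' "$text" | grep -q 'Active: active' &&
        printf '%s\n' "$text" | grep -q 'Main PID: .*code=killed'
}

# Example use (needs a systemd host, so it is left commented out):
#   systemctl status sshd.service | status_is_inconsistent && echo "stale state"
```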


Expected results:
$ systemctl status sshd.service
sshd.service - LSB: Start up the OpenSSH server daemon
	  Loaded: loaded (/etc/rc.d/init.d/sshd)
	  Active: failed since Fri, 06 May 2011 13:39:12 +0200; 6s ago
	 Process: 1274 ExecStart=/etc/rc.d/init.d/sshd start (code=exited, status=0/SUCCESS)
	  CGroup: name=systemd:/system/sshd.service


$ systemctl start sshd.service
$ systemctl status sshd.service
sshd.service - LSB: Start up the OpenSSH server daemon
	  Loaded: loaded (/etc/rc.d/init.d/sshd)
	  Active: active (running) since Fri, 06 May 2011 13:53:07 +0200; 2s ago
	 Process: 30650 ExecStart=/etc/rc.d/init.d/sshd start (code=exited, status=0/SUCCESS)
	Main PID: 30657 (sshd)
	  CGroup: name=systemd:/system/sshd.service
		  └ 30657 /usr/sbin/sshd

Additional info:
I know that tracking services which double-fork is a problem, but if I know the PID, I can always check whether it is running, so the status shouldn't say "running" when that is not true.

This works correctly if a native systemd service file is loaded.
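
The "if I know the PID, I can always check" point can be illustrated with a small shell sketch. It is not how systemd tracks processes; pid_is_running is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a process with the given PID exists.
# `kill -0` delivers no signal; it only performs the existence and
# permission checks. It also fails when the process exists but belongs to
# another user, so run it as root when checking system daemons.
pid_is_running() {
    kill -0 "$1" 2>/dev/null
}

# Example use with the PID from the steps above (left commented out):
#   pid_is_running 1322 || echo "main process is gone"
```

A check like this can be fooled by PID reuse, so it is a plausibility test rather than proof that the original daemon is still alive.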

Comment 1 Honza Horak 2011-05-06 12:16:31 UTC
I've found bug #629040, which made things clearer to me, and I've realized that the status "running" in comment #0 should have read "active (exited)". Sorry for the confusion.

Nevertheless, as I understand it, the problem only concerns services that use more than one main process. SSH uses one main process, and systemd recognizes it (the Main PID is found correctly), so is it still necessary to report "active (exited)" in this case?

Comment 2 Steven Dake 2011-07-04 18:17:53 UTC
Note that because systemd is not LSB compliant in Fedora 15 and rawhide (F16), rgmanager, pacemaker, and pacemaker-cloud (an F16 feature) are DOA in these distributions.

Comment 3 Fabio Massimo Di Nitto 2011-07-05 12:46:50 UTC
I can confirm the original behavior also for cman.

This is effectively a regression from the previous init system, and it affects all services that have not yet been converted to native systemd units.

Comment 4 Michal Schmidt 2011-07-05 13:00:49 UTC
Upstream patch:
http://cgit.freedesktop.org/systemd/commit/?id=f8788303929c27d0b7f7e4b8ffe22767a3d0ff67

It improves the detection of the type of services whose SysV initscripts contain the 'pidfile:' header.
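
For reference, the 'pidfile:' header is a comment line near the top of the initscript. The sketch below shows one way to read it from a shell. It is not systemd's parser; initscript_pidfile is a hypothetical helper name, and the "# pidfile: <path>" layout is an assumption about how the header is written.

```shell
#!/bin/sh
# Hypothetical helper: print the path named by the first "# pidfile:" comment
# in the SysV initscript given as $1 (prints nothing if there is none).
initscript_pidfile() {
    sed -n 's/^#[[:space:]]*pidfile:[[:space:]]*//p' "$1" | head -n 1
}

# Example use (left commented out):
#   initscript_pidfile /etc/rc.d/init.d/sshd
```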

With this change:

[root@f15 ~]# systemctl status sshd.service
sshd.service - LSB: Start up the OpenSSH server daemon
	  Loaded: loaded (/etc/rc.d/init.d/sshd)
	  Active: active (running) since Tue, 05 Jul 2011 14:58:26 +0200; 14s ago
	 Process: 1253 ExecStop=/etc/rc.d/init.d/sshd stop (code=exited, status=0/SUCCESS)
	 Process: 1284 ExecStart=/etc/rc.d/init.d/sshd start (code=exited, status=0/SUCCESS)
	Main PID: 1291 (sshd)
	  CGroup: name=systemd:/system/sshd.service
		  └ 1291 /usr/sbin/sshd
[root@f15 ~]# kill -9 1291
# systemctl status sshd.service
sshd.service - LSB: Start up the OpenSSH server daemon
	  Loaded: loaded (/etc/rc.d/init.d/sshd)
	  Active: failed since Tue, 05 Jul 2011 14:59:07 +0200; 15s ago
	 Process: 1349 ExecStop=/etc/rc.d/init.d/sshd stop (code=exited, status=0/SUCCESS)
	 Process: 1284 ExecStart=/etc/rc.d/init.d/sshd start (code=exited, status=0/SUCCESS)
	Main PID: 1291 (code=killed, signal=KILL)
	  CGroup: name=systemd:/system/sshd.service
[root@f15 ~]# echo $?
3
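
The exit status 3 above is consistent with the LSB convention for status actions, in which 3 means the program is not running. The lookup below is a sketch for decoding such values in scripts; lsb_status_text is a hypothetical helper name, and the wording of each message is paraphrased from the LSB init script specification.

```shell
#!/bin/sh
# Hypothetical helper: map an LSB "status" exit code to a short description.
lsb_status_text() {
    case "$1" in
        0) echo "program is running or service is OK" ;;
        1) echo "program is dead and /var/run pid file exists" ;;
        2) echo "program is dead and /var/lock lock file exists" ;;
        3) echo "program is not running" ;;
        4) echo "program or service status is unknown" ;;
        *) echo "reserved or unrecognized status code" ;;
    esac
}

# Example use (needs a systemd host, so it is left commented out):
#   systemctl status sshd.service >/dev/null; lsb_status_text "$?"
```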

Comment 5 Steven Dake 2011-07-05 17:45:31 UTC
Michal,

I owe you one - thanks for the rapid response!

I have done a scratch build against f15 and verified that the system is operating as expected.

Thanks!
-steve

Comment 6 Fedora Update System 2011-07-06 09:33:49 UTC
systemd-26-7.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/systemd-26-7.fc15

Comment 7 Fedora Update System 2011-07-06 21:39:51 UTC
Package systemd-26-7.fc15:
* should fix your issue,
* was pushed to the Fedora 15 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing systemd-26-7.fc15'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/systemd-26-7.fc15
then log in and leave karma (feedback).

Comment 8 Michal Schmidt 2011-07-08 22:05:57 UTC
The patch caused a regression by exposing a latent bug that needs to be fixed first.
systemd-26-8.fc15 drops the patch.

Comment 9 Steven Dake 2011-07-20 16:16:06 UTC
Michal,

Is it possible to fix the latent issue mentioned in comment #8 for F16?  Freeze is fast approaching and our high availability feature set is broken in F15 and rawhide.

Comment 10 Michal Schmidt 2011-07-21 13:31:16 UTC
The latent issues are the two bugs that this BZ "Depends on". Neither of them should be a problem in Rawhide; they are F15 only.
The patch for systemd is already included in systemd-30.fc16.
Are you sure this bug is present in the current Rawhide?

Comment 11 Fedora Admin XMLRPC Client 2011-10-20 16:29:49 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 12 Jóhann B. Guðmundsson 2012-01-24 13:11:10 UTC
Is this still a problem or can this bug be closed?