Bug 815835

Summary:	/sbin/runlevel reports "unknown" where it shouldn't
Product:	[Fedora] Fedora	Reporter:	John Florian <john>
Component:	systemd	Assignee:	systemd-maint
Status:	CLOSED WORKSFORME	QA Contact:	Fedora Extras Quality Assurance <extras-qa>
Severity:	unspecified	Docs Contact:
Priority:	unspecified
Version:	16	CC:	ikent, johannbg, maurizio.antillon, metherid, mschmidt, notting, plautrba, robatino, steve.traylen, systemd-maint
Target Milestone:	---
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2013-01-15 23:22:33 UTC	Type:	Bug

Description John Florian 2012-04-24 15:55:11 UTC

Description of problem:
Puppet uses '/sbin/chkconfig SERVICE' to learn whether a service is enabled for startup at boot.  When SERVICE is autofs and is enabled, chkconfig (à la systemd) returns the wrong exit code.

Version-Release number of selected component (if applicable):
autofs-5.0.6-5.fc16.x86_64

How reproducible:
always

Steps to Reproduce:
1. chkconfig autofs on
2. chkconfig autofs; echo $?
  
Actual results:
The echo emits '1' which has traditionally meant the service is not enabled.


Expected results:
Should emit '0' indicating the service is enabled.

Additional info:

Comment 1 John Florian 2012-04-24 16:05:41 UTC

On further review, I actually think this is a chkconfig bug.  While /etc/init.d/autofs does not declare the PID file in the header comments, I've tried adding that with:

diff -up /etc/init.d/autofs.orig /etc/init.d/autofs
--- /etc/init.d/autofs.orig     2012-01-22 18:45:05.000000000 -0500
+++ /etc/init.d/autofs  2012-04-24 11:18:41.780806815 -0400
@@ -6,6 +6,7 @@
 # processname: /usr/sbin/automount
 # config: /etc/auto.master
 # description: Automounts filesystems on demand
+# pidfile: /var/run/autofs.pid
 #
 ### BEGIN INIT INFO
 # Provides: autofs

... and found it to be working no better.  (I did also run 'systemctl --system daemon-reload'.)

Comment 2 Bill Nottingham 2012-04-24 16:28:45 UTC

? pid files have nothing to do with whether or not the service is enabled.

I can't reproduce this on F17:

# chkconfig autofs
Note: forwarding to 'systemctl is-enabled autofs.service'.
enabled
# echo $?
0

Comment 3 John Florian 2012-04-24 17:16:02 UTC

Bill, sorry about the pid file noise -- I'd misread the puppet message earlier and have continued thinking running vs. enabled.  My brain is under-caffeinated today, it seems, and you're absolutely right that it has nothing to do with the bug report.

I don't have an F17 box to test with but I'm seeing many with F16 having this problem.  Has the autofs service been migrated to be a native systemd service in F17?  With a legacy init script, how does chkconfig/systemd determine if a service is enabled "under the hood"?

Comment 4 Bill Nottingham 2012-04-24 17:47:53 UTC

I've tried the F16 updates autofs package on F17, and gotten the same result. What version of chkconfig do you have installed?

With a legacy init script, chkconfig uses the same code it always has - it's only with a systemd service that it forwards to 'systemctl is-enabled'.

Comment 5 Bill Nottingham 2012-04-24 17:54:28 UTC

Also, what's the output of 'runlevel'?

Comment 6 John Florian 2012-04-24 18:37:20 UTC

Okay, this just got weird. I was seeing this with other services (puppet, yum-cron and libvirtd), so I thought I'd capture it all here in case that revealed some pattern:

$ runlevel
N 5
$ for s in $(chkconfig --list | awk '/5:on/{print $1}'); do chkconfig $s; echo "$s returned $?"; rpm -qf /etc/init.d/$s; done

Note: This output shows SysV services only and does not include native
systemd services. SysV configuration data might be overridden by native
systemd configuration.

autofs returned 0
autofs-5.0.6-5.fc16.x86_64
jexec returned 0
jdk-1.6.0_27-fcs.x86_64
libvirtd returned 0
libvirt-0.9.6-5.fc16.x86_64
netfs returned 0
initscripts-9.34.2-1.fc16.x86_64
network returned 0
initscripts-9.34.2-1.fc16.x86_64
puppet returned 0
puppet-2.6.14-1.fc16.noarch
sandbox returned 0
policycoreutils-2.1.4-13.fc16.x86_64
yum-cron returned 0
yum-cron-3.4.3-23.fc16.noarch

Notice how these all are reporting correctly ... now.

This is after rebooting; the bug report was started before the reboot. I don't think the rebooting per se helped, as I'd done that earlier today. What *may* have helped was cleaning out /tmp of the 300K+ systemd-namespace-* directories that had accumulated due to a rapidly respawning mysqld. I wound up using single-user mode to do the cleanup, as find was acting very bizarrely, complaining that (all?) the files didn't exist when using:

find -name 'systemd-namespace*' -group mysql -exec rm -rf {} \;

Find continued complaining in single-user mode, but I confirmed the file count was going down, and half an hour later it finished. Now chkconfig seems happy too. I wasn't out of space in /tmp, but it sure seems like that mess had something to do with the issue I was having.
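
A likely explanation for the find complaints (an assumption, not confirmed anywhere in this report): without -prune or -depth, find tries to descend into each matching directory after 'rm -rf' has already deleted it, and then reports that the path does not exist. A minimal sketch that avoids this follows. The helper name cleanup_namespace_dirs is hypothetical, and the '-group mysql' test from the original command is left out so the sketch also runs on hosts without that group.

```shell
#!/bin/sh
# Hypothetical helper: delete every directory entry under a base directory
# whose name matches a glob pattern, without find descending into the
# directories it is about to remove.
cleanup_namespace_dirs() {
    base=$1
    pattern=$2
    # -prune tells find not to descend into a matching directory, so there is
    # nothing left for it to visit once rm -rf has removed that directory.
    # "{} +" batches many paths into each rm invocation.
    find "$base" -name "$pattern" -prune -exec rm -rf {} +
}

# Example call (the original command additionally filtered with -group mysql):
#   cleanup_namespace_dirs /tmp 'systemd-namespace*'
```

Batching with '{} +' should also help when there are hundreds of thousands of entries, since rm is not started once per directory.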

Comment 7 John Florian 2012-04-24 18:43:36 UTC

Or maybe not.  Here's another F16 box that's also having the same problem.  I do see that 'runlevel' might be suggesting the root cause???

$ runlevel
unknown
$ for s in $(chkconfig --list | awk '/5:on/{print $1}'); do chkconfig $s; echo $?; rpm -qf /etc/init.d/$s; done

Note: This output shows SysV services only and does not include native
      systemd services. SysV configuration data might be overridden by native
      systemd configuration.

1
autofs-5.0.6-5.fc16.x86_64
1
jdk-1.6.0_30-fcs.x86_64
jdk-1.6.0_24-fcs.x86_64
1
libvirt-client-0.9.6-5.fc16.x86_64
1
libvirt-0.9.6-5.fc16.x86_64
1
initscripts-9.34.2-1.fc16.x86_64
1
initscripts-9.34.2-1.fc16.x86_64
1
puppet-2.6.14-1.fc16.noarch
1
policycoreutils-2.1.4-13.fc16.x86_64
1
yum-cron-3.4.3-23.fc16.noarch

Comment 8 John Florian 2012-04-24 18:44:50 UTC

Oh and both of these systems have:

$ rpm -q chkconfig
chkconfig-1.3.59-1.fc16.x86_64

Comment 9 Bill Nottingham 2012-04-24 19:04:42 UTC

Correct - chkconfig <foo> reports the status of the service for the current runlevel. If 'runlevel' is unknown, chkconfig will return an error.
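
A caller such as a Puppet provider could guard against this by checking the runlevel before trusting the exit code of 'chkconfig <foo>'. A minimal sketch follows. parse_runlevel is a hypothetical helper, not part of chkconfig or initscripts, and it assumes the 'PREVIOUS CURRENT' output format shown in comment 6.

```shell
#!/bin/sh
# Hypothetical helper: take text in the format printed by `runlevel`
# (e.g. "N 5") and print the current runlevel.
# Returns 1 for anything else, including the literal "unknown".
parse_runlevel() {
    case $1 in
        # one character for the previous runlevel, a space, then the current one
        ?" "[0-6Ss]) printf '%s\n' "${1#* }" ;;
        *) return 1 ;;
    esac
}

# Example use:
#   if level=$(parse_runlevel "$(runlevel)"); then
#       chkconfig autofs; echo "enabled check for runlevel $level returned $?"
#   else
#       echo "runlevel is unknown; 'chkconfig autofs' would report an error" >&2
#   fi
```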

Comment 10 John Florian 2012-04-25 13:01:53 UTC

(In reply to comment #9)
> Correct - chkconfig <foo> reports the status of the service for the current
> runlevel. If 'runlevel' is unknown, chkconfig will return an error.

That makes sense (at least for non-native systemd services), but I don't understand this:

System 1:

$ ls -l /etc/systemd/system/default.target
lrwxrwxrwx. 1 root root 36 Jan 16 08:26 /etc/systemd/system/default.target -> /lib/systemd/system/graphical.target
$ runlevel
N 5


System 2:

$ ls -l /etc/systemd/system/default.target
lrwxrwxrwx. 1 root root 36 Feb 13 09:36 /etc/systemd/system/default.target -> /lib/systemd/system/runlevel5.target
$ runlevel
unknown


While I can understand what I see at the file system level of systemd configuration, I don't know how utmp is affected to cause the "backwards" looking results above.

Comment 11 Bill Nottingham 2012-04-25 17:50:42 UTC

If runlevel is 'unknown' when you're booted to an actual runlevel/target (as opposed to when you're in single-user mode, or something similar), then that would be a systemd issue.

Comment 12 Michal Schmidt 2012-04-25 18:04:30 UTC

Does "systemctl list-jobs" show any unfinished jobs?

Comment 13 John Florian 2012-04-25 18:05:08 UTC

Thanks Bill for the clarification; it did seem a bit fishy.

To be perfectly clear, both systems are booted normally to their default targets.  Also, System 1 is now returning 'unknown' and I can't think of any reason why it would have suddenly changed.

Comment 14 John Florian 2012-04-25 18:06:33 UTC

(In reply to comment #12)
> Does "systemctl list-jobs" show any unfinished jobs?

Nope.

System 1:
$ systemctl list-jobs
 JOB UNIT                      TYPE            STATE  

0 jobs listed.

System 2:
$  systemctl list-jobs
 JOB UNIT                      TYPE            STATE  

0 jobs listed.

Comment 15 Lennart Poettering 2013-01-14 20:58:08 UTC

Could you please append the outputs of "systemctl status runlevel5.target" as well as "systemctl show graphical.target" when this happens, after boot?

Comment 16 John Florian 2013-01-14 21:52:57 UTC

(In reply to comment #15)
> Could you please append the outputs of "systemctl status runlevel5.target"
> as well as "systemctl show graphical.target" when this happens, after boot?

Sorry Lennart, I just updated the system to F18RC1 last week.  I still have several F16 hosts around, but I can't think of any that have X installed.  I can say that I'd forgotten about this bug, so something must have gotten this fixed and IIRC, there was a chkconfig update that did the trick ... but my human memory gets swapped out too frequently to say that with any certainty.

Is there anything else I can get you?

Comment 17 Lennart Poettering 2013-01-15 23:22:33 UTC

Hmm, OK, let's close it for now. Feel free to reopen if it resurfaces and somebody can reproduce it.

Comment 18 Andre Robatino 2013-08-30 04:29:32 UTC

I am always getting "unknown" in either F20 or Rawhide, regardless of the actual runlevel. I filed https://bugzilla.redhat.com/show_bug.cgi?id=1002806 .

Comment 19 Steve Traylen 2015-04-30 19:32:53 UTC

Hmm, this crops up in Docker CentOS 6 as well.

# docker run -i -t centos:centos6 /bin/bash
# yum -y install openssh-server
# chkconfig --list sshd
sshd           	0:off	1:off	2:on	3:on	4:on	5:on	6:off

# chkconfig sshd || echo "Return is non-zero"
Return is non-zero

# runlevel 
unknown

Of course, a runlevel of unknown in Docker sort of makes sense, so I am not reopening this bug.

Steve.
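
For environments like this one, where the current runlevel is unknown, chkconfig(8) documents a --level option that queries a specific runlevel instead of the current one (e.g. 'chkconfig --level 3 sshd'); that is untested on this image. Alternatively, the --list output shown above can be parsed directly. A minimal sketch follows. enabled_at_level is a hypothetical helper, and it assumes the 'N:on' / 'N:off' column format printed above.

```shell
#!/bin/sh
# Hypothetical helper: read one line of `chkconfig --list SERVICE` output on
# stdin and succeed if the given runlevel is marked "on".
enabled_at_level() {
    # Field 1 is the service name; the remaining fields look like "3:on".
    awk -v want="$1:on" '
        { for (i = 2; i <= NF; i++) if ($i == want) found = 1 }
        END { exit !found }
    '
}

# Example use:
#   chkconfig --list sshd | enabled_at_level 3 && echo "sshd is on in runlevel 3"
```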