Description of problem:
When booting the recent kernels 2.6.17-1.2396.fc6 and 2.6.17-1.2401.fc6, everything seems normal while the startup scripts are running, but checking the status of various services from /etc/init.d/ shows output like this:

sendmail dead but subsys locked
acpid dead but subsys locked
automount is stopped
portmap dead but subsys locked
rpc.idmapd is stopped
smartd dead but subsys locked
dbus-daemon is stopped

An attempt at 'service sendmail restart' produces the following output:

Shutting down sendmail: [FAILED]
Shutting down sm-client: [ OK ]
Starting sendmail: [ OK ]
Starting sm-client: [ OK ]

The status afterwards is still "sendmail dead but subsys locked", and the logs contain these complaints:

NOQUEUE: SYSERR(root): opendaemonsocket: daemon MTA: cannot bind: Address already in use
daemon MTA: problem creating SMTP socket

This is not an SELinux problem, because SELinux is turned off on that system. Booting 2.6.17-1.2366.fc6 makes all of the above, and more, work again.

Version-Release number of selected component (if applicable):
2.6.17-1.2396.fc6 and 2.6.17-1.2401.fc6

How reproducible:
All the time (but tried only on x86_64)
Roland, could this be your utrace stuff ?
I see the same behavior for acpid under the 2405 kernel. Under kernel-2.6.17-1.2356.fc6, the command 'service acpid status' properly reports "acpid (pid xxxx) is running...". Under kernel-2.6.17-1.2405.fc6, the same command reports that acpid is dead, even though it's running. This happens on both an i386 and an x86_64 system.

[root@gadwall ~]# ps -ef | grep acpid
root 11 7 0 10:50 ? 00:00:00 [kacpid]
root 1636 1 0 10:52 ? 00:00:00 /usr/sbin/acpid
root 1976 1926 0 10:57 pts/0 00:00:00 grep acpid
[root@gadwall ~]# service acpid status
acpid dead but subsys locked
[root@gadwall ~]# service acpid stop
Stopping acpi daemon: [FAILED]
[root@gadwall ~]# ps -ef | grep acpid
root 11 7 0 10:50 ? 00:00:00 [kacpid]
root 1636 1 0 10:52 ? 00:00:00 /usr/sbin/acpid
root 2002 1926 0 10:58 pts/0 00:00:00 grep acpid
[root@gadwall ~]# service acpid status
acpid is stopped
When shutting down the computer, I see several services which fail to shut down. Joining the bug to track progress.
> ... I see several services which fail to shut down.

Failures on shutdown are a side effect of various services being "dead", or at least being reported that way. After I dropped the '-c' option to pidof in /etc/init.d/functions, all services listed in the original report, and more, are now seen as "running" with 2.6.17-1.2405.fc6. Consequently a shutdown also works without trouble. The one exception is sm-client, which appears to be shut down earlier than /etc/init.d/sendmail tries to do that explicitly, hence the reported failure. Why switching kernel versions has such an effect on pidof, and whose bug this really is, I have no idea.
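For readers following along, here is a rough sketch of why an empty pid lookup turns into "dead but subsys locked". It is a simplified, hypothetical stand-in for the status logic in /etc/init.d/functions, not the actual code: the function name report_status and its parameters are made up for illustration, and the pid-file check the real script also performs is left out.

```shell
#!/bin/sh
# Hypothetical, simplified stand-in for the status check in /etc/init.d/functions.
# report_status NAME PIDS LOCKDIR
#   NAME    - service name
#   PIDS    - whatever the pid lookup (e.g. pidof) returned; empty if it found nothing
#   LOCKDIR - directory holding subsys lock files (/var/lock/subsys on a real system)
report_status() {
    name=$1
    pids=$2
    lockdir=$3
    if [ -n "$pids" ]; then
        echo "$name (pid $pids) is running..."
    elif [ -f "$lockdir/$name" ]; then
        # A lock file was left by "start", but the lookup found no process.
        echo "$name dead but subsys locked"
    else
        echo "$name is stopped"
    fi
}
```

If the pid lookup comes back empty for a daemon that is in fact running, this logic lands in the second branch, which would match the output reported above. Once the lock file is gone, the same empty lookup reads as "is stopped".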
Unless there is some info in dmesg to go on, I have no speculation about these problems. I can believe that random instability was caused by the utrace changes, just on principle, but I don't have anything to go on.
(In reply to comment #5)
> Unless there is some info in dmesg to go on, I have no speculation about these
> problems. I can believe that random instability was caused by the utrace
> changes, just on principle, but I don't have anything to go on.

Your recommendation then is to file bugs against each individual service exhibiting the behavior?
> I can believe that random instability

There is not much "random" about it. The behaviour is consistent and depends on the kernel version in use. If you look closer, the programs in question actually do run but, with recent kernel versions, are reported as dead. In other words, kernel and user space got out of sync. Should this issue be reassigned to SysVinit and/or initscripts? It is not clear what else could be affected by what happened in the kernel.
This is definitely a kernel issue... /proc/*/root is now only readable for the current task. Is this really intentional?
Ah, the specific diagnosis of the kernel behavior makes all the difference. Now that I know what the issue is, that sounds like it's probably my bug.
FWIW, the use case here is pidof's '-c' option; we use it in the init scripts to make sure we only find/kill processes that are running in the same root as the script, so we don't kill daemons in chroots, or similar.
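As an illustration of the kind of check this implies, here is a minimal sketch that compares a task's root directory with the caller's own via /proc. It is not pidof's actual implementation: the helper name same_root is hypothetical, and it assumes a Linux /proc and a stat that supports the -L and -c options (as GNU coreutils does).

```shell
#!/bin/sh
# Hypothetical helper: succeed only if process PID has the same root directory
# as the caller. Compares device:inode of the directories the root links point to.
same_root() {
    # If the link cannot be followed (no such task, or permission denied),
    # treat the process as not being in our root.
    theirs=$(stat -L -c '%d:%i' "/proc/$1/root" 2>/dev/null) || return 1
    ours=$(stat -L -c '%d:%i' /proc/self/root 2>/dev/null) || return 1
    [ "$theirs" = "$ours" ]
}

# Usage: report whether the current shell shares our root.
if same_root "$$"; then
    echo "pid $$ runs in the same root"
fi
```

With /proc/*/root unreadable for any task other than the current one, a check of this shape fails for every other process, so every daemon looks as if it were outside the script's root.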
Handy reproducer:

-bash-3.1$ ls -ld /proc/$$/root /proc/self/root
ls: cannot read symbolic link /proc/2106/root: Permission denied
lrwxrwxrwx 1 roland roland 0 Jul 19 23:42 /proc/2106/root
lrwxrwxrwx 1 roland roland 0 Jul 20 17:05 /proc/self/root -> /

I should have a fix shortly.
This is fixed as of kernel-2.6.17-1.2431.fc6
Services now shut down properly, without the errors seen with previous kernels. Fixed in this regard.