Bug 760251 - pidof breaks after prelink
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: sysvinit
Version: 6.1
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Assigned To: Lukáš Nykrýn
QA Contact: Tereza Cerna
Blocks: 947782 1159825
Reported: 2011-12-05 12:23 EST by Dave Dykstra
Modified: 2015-07-22 03:04 EDT
Fixed In Version: sysvinit-2.87-6.dsf.el6
Doc Type: Bug Fix
Doc Text:
Cause: If a running binary is replaced, the target of its exe symlink in /proc gets " (deleted)" appended. Consequence: pidof did not account for this and falsely reported that no process was running from the original path. Fix: The " (deleted)" suffix is removed when parsing /proc. Result: pidof works in the case described.
Last Closed: 2015-07-22 03:04:18 EDT

Description Dave Dykstra 2011-12-05 12:23:56 EST
Description of problem:
After a process such as prelink changes the contents of an executable file, pidof will no longer show the process id unless arg0 of the process exactly matches the full path of the executable.  This worked properly on RHEL5 (at least RHEL5.5) but doesn't work correctly on RHEL6.1.

Version-Release number of selected component (if applicable):
At least sysvinit-tools-2.87-4.dsf.el6.x86_64 but also still fails in the upstream latest version sysvinit-2.88dsf.

How reproducible:
One way is with the squid package which has a parent process with an arg0 of /usr/sbin/squid and a child process with an arg0 of "(squid)".

Steps to Reproduce:
1. Install the squid package on a system that didn't have it, or downgrade+upgrade it so it gets a new executable.
2. As root or the squid user run /sbin/pidof /usr/sbin/squid
3. As root run prelink with /etc/cron.daily/prelink
4. Repeat step 2; you'll see only one process id.

Actual results:
The process that doesn't have an arg0 of /usr/sbin/squid doesn't show up in step 4.  This is especially a problem for squid because the pid it stores in squid.pid is the child process, the one that doesn't show up, so pidof cannot be used to match that.

Expected results:
The same process ids should appear in steps 2 & 4.
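
A minimal sketch of how this expectation could be checked from a script: capture the pidof output before and after the prelink run and compare the two lists as sets, since pidof does not promise a particular order. The helper name same_pid_set is hypothetical and is not part of sysvinit.

```shell
#!/bin/bash
# Hypothetical helper (not part of sysvinit): succeed when two
# whitespace-separated PID lists contain the same PIDs, ignoring order.
same_pid_set() {
    local a b
    # $1 and $2 are left unquoted on purpose so each PID becomes one line.
    a=$(printf '%s\n' $1 | sort -n)
    b=$(printf '%s\n' $2 | sort -n)
    [ "$a" = "$b" ]
}

# Harmless demonstration with made-up PIDs.
if same_pid_set "1201 1199" "1199 1201"; then
    echo "same set"
else
    echo "different"
fi
```

In the reproduction above, the first argument would be the output of step 2 and the second the output of step 4.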

Additional info:
I have reported this to the upstream supplier and given them a suggested simple patch at http://savannah.nongnu.org/bugs/?34992
Comment 3 Suzanne Yeghiayan 2012-02-14 18:22:44 EST
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unfortunately unable to
address this request at this time. Red Hat invites you to
ask your support representative to propose this request, if
appropriate and relevant, in the next release of Red Hat
Enterprise Linux. If you would like it considered as an
exception in the current release, please ask your support
representative.
Comment 4 Lukáš Nykrýn 2012-03-08 05:57:03 EST
Thank you for the report. I tried to reproduce this with the following script:
 
#!/bin/bash
yum -y remove squid
yum -y localinstall squid-3.1.4-1.el6.x86_64.rpm
yum -y localinstall squid-3.1.10-1.el6.x86_64.rpm
/etc/init.d/squid start
/sbin/pidof /usr/sbin/squid
/etc/cron.daily/prelink
/sbin/pidof /usr/sbin/squid

but pidof showed two pids on both calls.


You wrote that you don't have this problem on rhel-5. Could you please check whether it also occurs with the rhel5 version of pidof on rhel6? It is possible that it is caused by a regression in sysvinit.
Comment 5 Dave Dykstra 2012-03-09 16:05:59 EST
Lukáš,

Thanks for trying it.

I just made a new RHEL6.2 instance on Amazon EC2 and found out that squid is no longer a good example package, because its binaries have been made to be "not prelinkable".  If I run 'prelink -p' after /etc/cron.daily/prelink, that's what it says about /usr/sbin/squid.  I haven't found exactly why or when that change happened, but it appears it was probably made into a Position Independent Executable, most likely for security reasons.

Unfortunately, offhand I don't know another example application to demonstrate this problem, one that has a second process where arg0 is changed.  I can tell you how to simulate what prelink does to executables that are prelinkable, however.  You could do these steps as root:

yum install squid
/etc/init.d/squid start
/sbin/pidof /usr/sbin/squid  # prints 2 process ids
cp /usr/sbin/squid /usr/sbin/squid.copy
rm -f /usr/sbin/squid
mv /usr/sbin/squid.copy /usr/sbin/squid
/sbin/pidof /usr/sbin/squid  # doesn't print any process ids

I know that there was a regression in sysvinit, because when I compiled the current sysvinit on RHEL5, it exhibited the same problem until my patch was applied.  I tried copying a RHEL5 pidof binary to RHEL6 and it didn't work at all; it probably would need to be recompiled but I'm not yet convinced that it's worth the time to try that.

- Dave
Comment 6 Lukáš Nykrýn 2012-03-13 06:43:37 EDT
In this example the behavior of pidof is the same on rhel5 and rhel6. I am starting to think that this is not a bug but a feature: when I call pidof on a specific binary, I want to know its pids and not those of any other binary, even if it has the same name and is at the same path.

Another question is calling pidof with only the name of the program ("pidof squid"). It seems that in all cases this returns only one pid, and it is not the one in the .pid file.
Comment 7 Dave Dykstra 2012-03-13 10:53:22 EDT
I was surprised to find you're right that my test case in comment #5 causes the same behavior of pidof on rhel5 & rhel6.  pidof after prelink on rhel5, however, does still work even though it changes the inode & size of the binary.  The test case I gave you was bad.  Change it instead to eliminate the "rm -f /usr/sbin/squid" step.  Then on rhel5 it does the expected thing and prints both process ids.  The mv step then prompts to overwrite; just answer "y" or use mv -f to avoid the prompt.

I don't think the behavior I observed for pidof on rhel6 is a feature, it is a bug.  A big use for the pidof command is for /etc/init.d scripts to find out whether or not a copy of the same program is already running, and they need to know any process running with the same path, not just the same binary file.

This exercise has now shown me what the real difference is between rhel5 and rhel6.  It isn't that pidof has regressed, it is just that it hasn't kept up with changes to the kernel.  With rhel6, doing a copy of a running binary and then overwriting the binary with mv -f causes the /proc/NNNN/exe symlink to have a "(deleted)" appended to the name of the destination.  That doesn't happen on rhel5 unless you do a rm -f of the running binary in between.  Pidof, without my patch, can't cope with that difference.
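
A minimal shell sketch of the path-based matching that init scripts need, assuming a Linux /proc layout. It treats an exe link target with or without the " (deleted)" marker as the same path. The function name pids_for_exe and its optional proc-root argument are hypothetical, added only for illustration and testing; this is not how pidof itself is implemented.

```shell
#!/bin/bash
# Hypothetical sketch, not the sysvinit implementation: print the PIDs whose
# exe link points at the given path, with or without the " (deleted)" marker.
# The optional second argument (proc root) is an assumption to ease testing.
pids_for_exe() {
    local wanted=$1 procroot=${2:-/proc}
    local link target pid
    for link in "$procroot"/[0-9]*/exe; do
        # Links that cannot be read (other users' processes, kernel
        # threads, or an unmatched glob) are skipped.
        target=$(readlink "$link" 2>/dev/null) || continue
        # Drop one trailing " (deleted)" marker, if present.
        target=${target%' (deleted)'}
        if [ "$target" = "$wanted" ]; then
            pid=${link%/exe}
            printf '%s\n' "${pid##*/}"
        fi
    done
}

# Example call; prints nothing if no such process is visible.
pids_for_exe /usr/sbin/squid
```

Unlike pidof, this sketch looks only at the exe link and ignores arg0, so it illustrates the comparison rather than replacing the tool.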
Comment 8 Lukáš Nykrýn 2012-03-15 05:47:26 EDT
With this modification I can reproduce the difference, but I tried it with a simpler binary and the result was the same on rhel5 and rhel6:

[root@rhel5 x]# cat a.c 
#include <unistd.h>
int main()
{while (1) sleep(3600);}
[root@rhel5 x]# gcc a.c -o a
[root@rhel5 x]# ./a &
[1] 12159
[root@rhel5 x]# cp a b
[root@rhel5 x]# mv -f b a
[root@rhel5 x]# readlink /proc/12159/exe
/root/x/a (deleted)

I think there is probably a bug in the rhel5 kernel that causes the symlink to stay the same. I will discuss it with the kernel team.

As for using pidof in init scripts, you can use pidofproc from /etc/init.d/functions instead.
Comment 9 Dave Dykstra 2012-03-16 16:19:32 EDT
With your simple binary, the difference of whether or not the " (deleted)" shows up on rhel5 appears to be whether or not the filename you copy to has a single character.  I tried "b", "c", "A" which all showed " (deleted)" but "a.copy" or "a2" did not show " (deleted)".

- Dave
Comment 10 Dave Dykstra 2012-03-16 17:58:32 EDT
A patch for this has now been put into the HEAD of the upstream source, and I tested it on RHEL6.2 and it works.

https://savannah.nongnu.org/bugs/index.php?34992#comment7
Comment 16 Vladislav Bogdanov 2013-07-12 03:15:53 EDT
The bug is still there, and the clvmd process is affected.

Proposed patch:
--- a/src/killall5.c    2013-07-12 07:05:25.000000000 +0000
+++ b/src/killall5.c    2013-07-12 07:13:05.342450210 +0000
@@ -328,7 +328,12 @@
                if (readlink(path, p->pathname, PATH_MAX) == -1) {
                        p->pathname = NULL;
                } else {
+                       char *ptr = NULL;
                        p->pathname[PATH_MAX-1] = '\0';
+                       ptr = strstr(p->pathname, " (deleted)");
+                       if (ptr) {
+                               *ptr = '\0';
+                       }
                }
 
                /* Link it into the list. */
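
For comparison, a minimal shell sketch of the same idea, under one stated assumption: only a trailing " (deleted)" marker is removed. The proposed patch uses strstr(), which truncates at the first occurrence, so a path with " (deleted)" in a directory name would be cut short. Whether that matters in practice is a judgment call. The function name strip_deleted_suffix is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper, not part of sysvinit: drop one trailing " (deleted)"
# marker from a /proc/PID/exe link target and leave everything else alone.
strip_deleted_suffix() {
    local target=$1
    local marker=' (deleted)'
    # ${var%pattern} removes a match of the pattern from the END only;
    # quoting "$marker" makes it a literal string rather than a glob.
    printf '%s\n' "${target%"$marker"}"
}

strip_deleted_suffix '/usr/sbin/squid (deleted)'
strip_deleted_suffix '/usr/sbin/squid'
```

Both calls print /usr/sbin/squid. A file that is genuinely named "squid (deleted)" remains ambiguous with either approach.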
Comment 17 Dave Dykstra 2013-07-12 10:04:30 EDT
Vladislav,

In what version is the bug, and what version is your patch for?  It does not appear to match the current upstream HEAD at http://svn.savannah.nongnu.org/viewvc/sysvinit/trunk/src/killall5.c?root=sysvinit&view=log.

Dave
Comment 18 Vladislav Bogdanov 2013-07-12 10:17:40 EDT
Dave,

That is for EL6's sysvinit-2.87.
Comment 27 Tereza Cerna 2015-03-05 04:12:30 EST
======================================
Verified in version:
    sysvinit-tools-2.87-6.dsf.el6.i686
PASSED
======================================

# /root/aaa 3600 &
[4] 28006
# cp /root/aaa /root/bbb
# rm /root/aaa
rm: remove regular file ‘/root/aaa’? y
# mv /root/bbb /root/aaa
# pidof /root/aaa      # --->> there are pids ---> OK
28006 27829 27566
# rpm -Uvh sysvinit-tools-2.87-6.dsf.el6.i686.rpm 
Preparing...               ########################################### [100%]
	package sysvinit-tools-2.87-6.dsf.el6.i686 is already installed
# pidof /root/aaa      # --->> there are pids ---> OK    
28006 27829 27566
#

======================================
Reproduced in version:
    sysvinit-tools-2.87-5.dsf.el6.i686
FAIL
======================================

# /root/aaa 3600 &
[3] 27829
# cp /root/aaa /root/bbb
# rm /root/aaa
rm: remove regular file ‘/root/aaa’? y
# mv /root/bbb /root/aaa
# pidof /root/aaa      # --->> there are no pids ---> BAD
# rpm -Uvh /root/sysvinit-tools-2.87-5.dsf.el6.i686.rpm 
Preparing...               ########################################### [100%]
	package sysvinit-tools-2.87-5.dsf.el6.i686 is already installed
# pidof /root/aaa      # --->> there are no pids ---> BAD
#
Comment 29 errata-xmlrpc 2015-07-22 03:04:18 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1362.html
