Bug 614874 - mcelogd service does not honour already running service
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: mcelog
Version: 6.0
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Prarit Bhargava
QA Contact: Jan Tluka
Depends On:
Blocks:
Reported: 2010-07-15 08:56 EDT by Jan Tluka
Modified: 2015-09-27 22:09 EDT (History)

See Also:
Fixed In Version: mcelog-1.0pre3_20101112-0.3.el6
Doc Type: Bug Fix
Doc Text:
The mcelog service did not check whether another instance of mcelog was running, which could result in multiple mcelog service instances on a single system. This could result in lost or over-reported Machine Check Exceptions. mcelog now detects whether another instance is already running, preventing multiple instances from being launched on a single system simultaneously.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-05-19 07:51:42 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
mcelog lockfile patch (1.01 KB, patch)
2010-11-15 13:33 EST, Prarit Bhargava

Description Jan Tluka 2010-07-15 08:56:52 EDT
Description of problem:

I'm doing an LSB compliance review of the mcelog initscript and found an issue. According to the LSB SysVInit specification, starting an already started service should return exit code 0. Exit code 1 is returned instead, and mcelogd tries to start again. This might be the result of mcelogd not honouring an already running instance.

I inspected a bit further. It seems that there is no /var/run/mcelogd.pid file. Therefore the initscript does not detect that mcelogd is already running and attempts to start another instance, which fails.

Version-Release number of selected component (if applicable):
RHEL6.0-20100707.4
mcelog-1.0pre3-0.2.el6.x86_64

How reproducible:
Always

Steps to Reproduce:
0. service mcelogd stop; service mcelogd status
1. service mcelogd start; service mcelogd status
2. service mcelogd start

  
Actual results:

# service mcelogd start
Starting mcelog daemon
# service mcelogd status
Checking for mcelog
mcelog (pid 6270) is running...
# service mcelogd start
Starting mcelog daemon
# echo $?
1
# service mcelogd status
Checking for mcelog
mcelog (pid 6270) is running...
# tail /var/log/messages | grep mcelog
Jul 15 08:43:28 intel-s3ea2-04 mcelog: mcelog server already running

# ls -l /var/run/mcelog*
srwxr-xr-x. 1 root root 0 Jul 15 08:43 /var/run/mcelog-client
# find /var/run/ -iname mcelog*
/var/run/mcelog-client


Expected results:
mcelog correctly detects that it's already running and does not start. 

Additional info:
The issue was found while running a test in Beaker:
test name: /CoreOS/mcelog/sanity/lsb-compliance
full log: https://beaker.engineering.redhat.com/recipes/13279#task160088
Comment 1 RHEL Product and Program Management 2010-07-15 10:24:24 EDT
This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release. It has
been denied for the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **
Comment 3 Jan Tluka 2010-07-19 16:02:26 EDT
> Description of problem:
> 
> I'm doing LSB compliance  review for mcelog initscript and I found an issue.
> According to the LSB SysVInit specification starting an already started service
> should return 0 exit code. Exit code 1 is returned instead and mcelogd tries to
> start again. This might be the result of mcelogd not honouring already running
> instance.
> 
> I have inspected a bit further. Seems that there's no /var/run/mcelogd.pid
> file. Therefore initscripts does not detect the mcelogd is already running and
> attempts to start another instance which fails.
> 

Update:

I've talked with Yulia Kopkova about the importance of the pid file. She said that it is more important to have the /var/lock/subsys/mcelog file, and that not having it is considered a bug.

Another thing is that the actual cause of the initial bug report is not the absence of the pid file (or the mcelog subsys lock file). The problem is that the mcelogd script should check the status of the service (e.g. with a line containing 'status mcelogd') and, depending on the result, start the service (if not running) or return 0 (if running). Currently the script does not check this, and any attempt to start the service while it is already running returns 1.
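
The check described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the shipped /etc/init.d/mcelogd and not the attached patch: the DAEMON path, the is_running helper and the use of pidof are illustrative choices, and the lock file name follows the one seen later in this report.

```shell
#!/bin/sh
# Hypothetical sketch of an LSB-style start function.
DAEMON="${DAEMON:-/usr/sbin/mcelog}"              # assumed install path
LOCKFILE="${LOCKFILE:-/var/lock/subsys/mcelogd}"  # subsys lock file

# is_running: illustrative helper; exits 0 if a process named
# after the daemon binary already exists.
is_running() {
    pidof "$(basename "$DAEMON")" >/dev/null 2>&1
}

start() {
    if is_running; then
        # LSB: starting an already running service counts as success.
        return 0
    fi
    # Launch the daemon; report failure if it cannot start.
    "$DAEMON" --daemon || return 1
    # Record the running service for the init system.
    touch "$LOCKFILE"
}
```

With this shape, a second "service mcelogd start" returns 0 without launching another instance.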
Comment 6 Prarit Bhargava 2010-11-15 13:33:34 EST
Created attachment 460602 [details]
mcelog lockfile patch

Committed to RHEL6 mcelog.

P.
Comment 11 Prarit Bhargava 2010-12-15 09:43:24 EST
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause: The mcelogd service does not check to see if other instances are already running.  This may cause multiple mcelogd service instances on a single system.
Consequence: This may cause MCE events to be lost or over-reported.
Fix: Modify the mcelogd script to check for an existing instance of mcelogd before starting
Result: The mcelog service checks to see if other instances are already running.  This prevents multiple instances of mcelogd from running on a single system.
Comment 14 Jan Tluka 2011-02-21 09:04:24 EST
Hi, the recent version of mcelog contains a typo in the init script.

# rpm -qa mcelog
mcelog-1.0pre3_20101112-0.4.el6.x86_64

# grep subsys /etc/init.d/mcelogd 
LOCKFILE="var/lock/subsys/mcelogd"

The path is ambiguous because of the missing slash at the beginning. I've checked other initscripts (iptables, rsyslog) and they contain an absolute path (with a leading slash). Although running service mcelogd start/stop works OK even with the typo, I think this should be fixed.
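
To illustrate why the missing slash matters: a path without a leading slash is resolved against the current working directory of whatever process uses it. The sketch below is only an illustration; is_absolute_path is a hypothetical helper, not something from the mcelog package.

```shell
#!/bin/sh
# is_absolute_path: hypothetical helper; exits 0 only when the
# given path starts with a slash.
is_absolute_path() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

for path in "var/lock/subsys/mcelogd" "/var/lock/subsys/mcelogd"; do
    if is_absolute_path "$path"; then
        echo "$path: absolute"
    else
        # A relative path depends on the caller's working directory.
        echo "$path: relative, resolves to $PWD/$path"
    fi
done
```

The relative form presumably only works when the script happens to run with / as its working directory.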
Comment 15 Prarit Bhargava 2011-02-21 09:05:42 EST
(In reply to comment #14)
> Hi, the recent version of mcelog contains a typo in init script. 
> 
> # rpm -qa mcelog
> mcelog-1.0pre3_20101112-0.4.el6.x86_64
> 
> # grep subsys /etc/init.d/mcelogd 
> LOCKFILE="var/lock/subsys/mcelogd"
> 
> The path is ambiguous because of missing slash at the beginning. I've checked
> other initscripts (iptables, rsyslog) and they contain absolute path (with
> slash). However running service mcelogd start/stop works ok even with the typo
> I think this should be fixed.

Will fix ASAP.

P.
Comment 18 Jan Tluka 2011-02-23 08:45:15 EST
# rpm -qa mcelog
mcelog-1.0pre3_20101112-0.5.el6.x86_64

# tail -f /var/log/messages &
# service mcelogd start
Starting mcelog daemon
Feb 23 14:36:14 dhcp-26-223 mcelog: failed to prefill DIMM database from DMI data                [  OK  ]
# echo $?
0
# ls -l /var/lock/subsys/mcelogd 
-rw-r--r-- 1 root root 0 Feb 23 14:41 /var/lock/subsys/mcelogd
# service mcelogd start
# echo $?
0
# service mcelogd stop
Stopping mcelog                        [  OK  ]
# ls -l /var/lock/subsys/mcelogd 
ls: cannot access /var/lock/subsys/mcelogd: No such file or directory

Setting to verified.
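
The manual check above (start twice, expect exit code 0 both times) could be scripted along these lines. This is a sketch only; check_idempotent_start is a hypothetical helper and is not part of the Beaker test mentioned in the description.

```shell
#!/bin/sh
# check_idempotent_start CMD [ARGS...]: hypothetical helper that runs
# the given start command twice and succeeds only if both runs exit 0.
check_idempotent_start() {
    "$@" || { echo "first start failed"; return 1; }
    "$@" || { echo "second start failed"; return 1; }
    echo "start is idempotent"
}

# Example use, as root on a test machine:
#   check_idempotent_start service mcelogd start
```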
Comment 19 Laura Bailey 2011-05-04 23:17:43 EDT
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,4 +1 @@
-Cause: The mcelogd service does not check to see if other instances are already running.  This may cause multiple mcelogd service instances on a single system.
+The mcelog service did not check whether another instance of mcelog was running, which could result in multiple mcelog service instances on a single system. This could result in lost or over-reported Machine Check Exceptions. mcelog now detects whether another instance is already running, preventing multiple instances from being launched on a single system simultaneously.
-Consequence: This may cause MCE events to be lost or over-reported.
-Fix: Modify the mcelogd script to check for an existing instance of mcelogd before starting
-Result: The mcelog service checks to see if other instances are already running.  This prevents multiple instances of mcelogd from running on a single system.
Comment 20 errata-xmlrpc 2011-05-19 07:51:42 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0519.html
