Bug 614874 - mcelogd service does not honour already running service
Summary: mcelogd service does not honour already running service
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: mcelog
Version: 6.0
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Prarit Bhargava
QA Contact: Jan Tluka
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-07-15 12:56 UTC by Jan Tluka
Modified: 2015-09-28 02:09 UTC (History)

Fixed In Version: mcelog-1.0pre3_20101112-0.3.el6
Doc Type: Bug Fix
Doc Text:
The mcelog service did not check whether another instance of mcelog was running, which could result in multiple mcelog service instances on a single system. This could result in lost or over-reported Machine Check Exceptions. mcelog now detects whether another instance is already running, preventing multiple instances from being launched on a single system simultaneously.
Clone Of:
Environment:
Last Closed: 2011-05-19 11:51:42 UTC
Target Upstream Version:
Embargoed:


Attachments
mcelog lockfile patch (1.01 KB, patch), 2010-11-15 18:33 UTC, Prarit Bhargava


Links
System: Red Hat Product Errata
ID: RHBA-2011:0519
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: mcelog bug fix update
Last Updated: 2011-05-19 09:47:48 UTC

Description Jan Tluka 2010-07-15 12:56:52 UTC
Description of problem:

I'm doing an LSB compliance review of the mcelog initscript and found an issue. According to the LSB SysVInit specification, starting an already started service should return exit code 0. Instead, exit code 1 is returned and mcelogd tries to start again. This might be the result of mcelogd not honouring an already running instance.

I have inspected a bit further. It seems that there's no /var/run/mcelogd.pid file. Therefore the initscript does not detect that mcelogd is already running and attempts to start another instance, which fails.

Version-Release number of selected component (if applicable):
RHEL6.0-20100707.4
mcelog-1.0pre3-0.2.el6.x86_64

How reproducible:
Always

Steps to Reproduce:
0. service mcelogd stop; service mcelogd status
1. service mcelogd start; service mcelogd status
2. service mcelogd start

  
Actual results:

# service mcelogd start
Starting mcelog daemon
# service mcelogd status
Checking for mcelog
mcelog (pid 6270) is running...
# service mcelogd start
Starting mcelog daemon
# echo $?
1
# service mcelogd status
Checking for mcelog
mcelog (pid 6270) is running...
# tail /var/log/messages | grep mcelog
Jul 15 08:43:28 intel-s3ea2-04 mcelog: mcelog server already running

# ls -l /var/run/mcelog*
srwxr-xr-x. 1 root root 0 Jul 15 08:43 /var/run/mcelog-client
# find /var/run/ -iname mcelog*
/var/run/mcelog-client


Expected results:
mcelog correctly detects that it's already running and does not start. 

Additional info:
The issue was found while running a test in Beaker:
test name: /CoreOS/mcelog/sanity/lsb-compliance
full log: https://beaker.engineering.redhat.com/recipes/13279#task160088
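
The reproduction steps above can be wrapped in a small check, sketched here in shell. `check_double_start` is a hypothetical helper name, not part of the Beaker test. It runs a start command twice and passes only if the repeated start also exits 0, which is the LSB behaviour this report expects.

```shell
# check_double_start CMD [ARGS...]
# Hypothetical helper: runs the given start command twice.
# Returns 0 if both runs exit 0, 2 if the first start already fails,
# and 1 if only the second (repeated) start fails.
check_double_start() {
    if ! "$@" >/dev/null 2>&1; then
        return 2
    fi
    if ! "$@" >/dev/null 2>&1; then
        return 1
    fi
    return 0
}
```

With the service stopped beforehand, `check_double_start service mcelogd start; echo $?` would be one way to use it. Going by the transcript above, the affected package would be expected to report 1.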

Comment 1 RHEL Program Management 2010-07-15 14:24:24 UTC
This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release. It has
been denied for the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **

Comment 3 Jan Tluka 2010-07-19 20:02:26 UTC
> Description of problem:
> 
> I'm doing LSB compliance  review for mcelog initscript and I found an issue.
> According to the LSB SysVInit specification starting an already started service
> should return 0 exit code. Exit code 1 is returned instead and mcelogd tries to
> start again. This might be the result of mcelogd not honouring already running
> instance.
> 
> I have inspected a bit further. Seems that there's no /var/run/mcelogd.pid
> file. Therefore initscripts does not detect the mcelogd is already running and
> attempts to start another instance which fails.
> 

Update:

I've talked with Yulia Kopkova about the importance of the pid file. She said that it is more important to have the /var/lock/subsys/mcelog file, and that not having it is considered a bug.

Another thing is that the actual cause of the initial bug report is not the absence of the pid file (or the mcelog subsys lock file). The problem is that the mcelogd script should check the status of the service (e.g. with a line containing 'status mcelogd') and, depending on the result, start the service (if not running) or return 0 (if running). Currently the script does not check that, and any attempt to start the service while it is already running returns 1.
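
A minimal sketch of the check described above, in shell. This is not the actual mcelogd init script: `RUNNING_CHECK` and `START_CMD` are hypothetical hooks introduced here so the logic can be read on its own. A real RHEL 6 init script would more likely rely on the `status` and `daemon` helpers from /etc/init.d/functions.

```shell
# Hypothetical hooks (assumptions, not taken from the real script):
# RUNNING_CHECK succeeds when the daemon is already running,
# START_CMD launches it.
RUNNING_CHECK=${RUNNING_CHECK:-"pidof mcelog"}
START_CMD=${START_CMD:-"mcelog --daemon"}

start() {
    # LSB: starting an already running service counts as success.
    if $RUNNING_CHECK >/dev/null 2>&1; then
        return 0
    fi
    echo "Starting mcelog daemon"
    # The exit status of the start command becomes the status of start().
    $START_CMD
}
```

The point of the sketch is the ordering: the status check comes first, so a repeated start never reaches the daemon and cannot fail with exit code 1.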

Comment 6 Prarit Bhargava 2010-11-15 18:33:34 UTC
Created attachment 460602 [details]
mcelog lockfile patch

Committed to RHEL6 mcelog.

P.

Comment 11 Prarit Bhargava 2010-12-15 14:43:24 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause: The mcelogd service does not check to see if other instances are already running.  This may cause multiple mcelogd service instances on a single system.
Consequence: This may cause MCE events to be lost or over-reported.
Fix: Modify the mcelogd script to check for an existing instance of mcelogd before starting
Result: The mcelog service checks to see if other instances are already running.  This prevents multiple instances of mcelogd from running on a single system.

Comment 14 Jan Tluka 2011-02-21 14:04:24 UTC
Hi, the recent version of mcelog contains a typo in the init script.

# rpm -qa mcelog
mcelog-1.0pre3_20101112-0.4.el6.x86_64

# grep subsys /etc/init.d/mcelogd 
LOCKFILE="var/lock/subsys/mcelogd"

The path is ambiguous because the leading slash is missing. I've checked other initscripts (iptables, rsyslog) and they use an absolute path (with a leading slash). Although running service mcelogd start/stop works OK even with the typo, I think this should be fixed.
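
To illustrate why the missing slash matters: a relative path is resolved against the current working directory of whoever runs the script, so the lock file can end up in different places. The sketch below uses a hypothetical helper, `is_absolute`, to detect the problem. The repair step is for illustration only and is not the change that was committed.

```shell
# is_absolute PATH: succeed if PATH starts with a slash.
# Hypothetical helper, not part of the mcelogd init script.
is_absolute() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

LOCKFILE="var/lock/subsys/mcelogd"   # value quoted from the init script above
if ! is_absolute "$LOCKFILE"; then
    LOCKFILE="/$LOCKFILE"            # illustrative repair: prepend the slash
fi
echo "$LOCKFILE"
```

One possible reason the typo went unnoticed is that the `service` wrapper changes to `/` before running the init script, which makes both spellings point at the same file. That is an assumption worth verifying on the target system.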

Comment 15 Prarit Bhargava 2011-02-21 14:05:42 UTC
(In reply to comment #14)
> Hi, the recent version of mcelog contains a typo in init script. 
> 
> # rpm -qa mcelog
> mcelog-1.0pre3_20101112-0.4.el6.x86_64
> 
> # grep subsys /etc/init.d/mcelogd 
> LOCKFILE="var/lock/subsys/mcelogd"
> 
> The path is ambiguous because of missing slash at the beginning. I've checked
> other initscripts (iptables, rsyslog) and they contain absolute path (with
> slash). However running service mcelogd start/stop works ok even with the typo
> I think this should be fixed.

Will fix ASAP.

P.

Comment 18 Jan Tluka 2011-02-23 13:45:15 UTC
# rpm -qa mcelog
mcelog-1.0pre3_20101112-0.5.el6.x86_64

# tail -f /var/log/messages &
# service mcelogd start
Starting mcelog daemon
Feb 23 14:36:14 dhcp-26-223 mcelog: failed to prefill DIMM database from DMI data                [  OK  ]
# echo $?
0
# ls -l /var/lock/subsys/mcelogd 
-rw-r--r-- 1 root root 0 Feb 23 14:41 /var/lock/subsys/mcelogd
# service mcelogd start
# echo $?
0
# service mcelogd stop
Stopping mcelog                        [  OK  ]
# ls -l /var/lock/subsys/mcelogd 
ls: cannot access /var/lock/subsys/mcelogd: No such file or directory

Setting to verified.
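
The behaviour verified above (lock file present after start, repeated start returning 0, lock file gone after stop) can be sketched in shell as follows. This is an assumption about the general shape of such a script, not the committed patch: `START_CMD` and `STOP_CMD` are hypothetical stand-ins for the real start and stop logic.

```shell
# Default mirrors the path shown in the transcript above.
LOCKFILE=${LOCKFILE:-/var/lock/subsys/mcelogd}
# Hypothetical stand-ins for the real start/stop commands.
START_CMD=${START_CMD:-"mcelog --daemon"}
STOP_CMD=${STOP_CMD:-"killall mcelog"}

start() {
    # Lock file present: treat the service as already started.
    if [ -e "$LOCKFILE" ]; then
        return 0
    fi
    # Create the lock file only after a successful start.
    if $START_CMD; then
        touch "$LOCKFILE"
        return 0
    fi
    return 1
}

stop() {
    # Remove the lock file only after a successful stop.
    if $STOP_CMD; then
        rm -f "$LOCKFILE"
        return 0
    fi
    return 1
}
```

A lock file alone can go stale if the daemon dies unexpectedly, in which case this sketch would report success without a running process. Combining it with a process check is more robust.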

Comment 19 Laura Bailey 2011-05-05 03:17:43 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,4 +1 @@
-Cause: The mcelogd service does not check to see if other instances are already running.  This may cause multiple mcelogd service instances on a single system.
+The mcelog service did not check whether another instance of mcelog was running, which could result in multiple mcelog service instances on a single system. This could result in lost or over-reported Machine Check Exceptions. mcelog now detects whether another instance is already running, preventing multiple instances from being launched on a single system simultaneously.
-Consequence: This may cause MCE events to be lost or over-reported.
-Fix: Modify the mcelogd script to check for an existing instance of mcelogd before starting
-Result: The mcelog service checks to see if other instances are already running.  This prevents multiple instances of mcelogd from running on a single system.

Comment 20 errata-xmlrpc 2011-05-19 11:51:42 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0519.html

