Red Hat Bugzilla – Bug 639850
BMC watchdog doesn't start if setup in bios with a bootup timeout
Last modified: 2013-04-15 05:10:47 EDT
Description of problem:
If the BMC is set up in the BIOS with a boot-up timeout (for failed boots), the service will not start, because the timer is already running and bmc-watchdog requires the timer to be stopped before it will daemonise.

Version-Release number of selected component (if applicable):
freeipmi-0.5.1-6

How reproducible:
Not quite sure...

Steps to Reproduce:
1. Set up the BMC with a timeout in the BIOS
2. Set up freeipmi-bmc-watchdog

Actual results:
machine doesn't boot

Expected results:
machine boots up normally

Additional info:
Suggested fix is to force the timer to stop before launching the daemon in the init script. Stopping the timer is safe as an unconditional call in the script's "start" section: no error is generated if a stop command is sent to an already stopped timer.
Created attachment 451350: Proposed patch
I'm the person who reported this to RH support and provided the proposed patch.

The "actual results" above are wrong unless the BMC boot timeout is set to a very low value. The machine does boot, but it is killed by the BMC watchdog a few minutes later, and the BMC records a boot-up timeout. The first part of the patch addresses this by unconditionally stopping the timer before the daemon is started.

ADDITIONALLY: if the service is running, stopping the service doesn't stop the timer, so the watchdog still triggers. The second part of the patch addresses this using the same method.

The first issue is 100% reproducible on any machine with a boot-up timer option in the BMC section of the BIOS. The second is 100% reproducible on any machine fitted with a BMC watchdog timer. Freeipmi services interfere with access to the BMC timer, so they may need to be disabled in order to get a reproducer.
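For readers without access to the attachment, the two parts of the fix can be sketched roughly as follows. This is not the attached patch, only an illustration of the approach described above. The function names, the BMC_WATCHDOG variable and the stop_daemon placeholder are assumptions made for the example. The --stop and --daemon options are taken from the bmc-watchdog man page and should be checked against the installed version.

```shell
#!/bin/sh
# Illustrative excerpt only; not the shipped freeipmi init script.
# BMC_WATCHDOG and all function names below are hypothetical.
BMC_WATCHDOG="${BMC_WATCHDOG:-/usr/sbin/bmc-watchdog}"

stop_timer() {
    # Ask the BMC to stop the watchdog countdown. Per the report, this
    # raises no error when the timer is already stopped, so it is safe
    # to call unconditionally.
    "$BMC_WATCHDOG" --stop >/dev/null 2>&1
}

stop_daemon() {
    # Placeholder: a real init script would signal the running daemon
    # here (for example through its pid file).
    :
}

start() {
    # Part 1: clear any countdown left running by the BIOS boot-up
    # timeout, then launch the daemon.
    stop_timer
    "$BMC_WATCHDOG" --daemon
}

stop() {
    # Part 2: stopping the service must also stop the timer, otherwise
    # the watchdog keeps counting down and resets the machine.
    stop_daemon
    stop_timer
}
```

Calling stop_timer in both paths keeps the script's behaviour independent of whatever state the BIOS or a previous daemon left the timer in.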
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2011-1499.html