Red Hat Bugzilla – Full Text Bug Listing
|Summary:||ipmi_watchdog driver and ipmi init script broken on systems with certain types of watchdog hardware|
|Product:||Red Hat Enterprise Linux 5||Reporter:||IBM Bug Proxy <bugproxy>|
|Component:||OpenIPMI||Assignee:||Jan Safranek <jsafrane>|
|Status:||CLOSED ERRATA||QA Contact:||BaseOS QE <qe-baseos-auto>|
|Version:||5.4||CC:||curtis, jjarvis, mcermak, rvokal|
|Fixed In Version:||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2009-12-02 07:20:32 EST||Type:||---|
Description IBM Bug Proxy 2009-07-30 02:00:24 EDT
=Comment: #0=================================================
Roger Mach <email@example.com> -

---Problem Description---
ipmi_watchdog driver and ipmi init script broken on systems with certain types of watchdog hardware

Contact Information = Roger Mach <firstname.lastname@example.org> and Carol Hebert <email@example.com>

---Additional Hardware Info---
On-chip watchdog hardware supported by i6300esb driver or an Intel TCO Watchdog Timer device supported by the iTCO_wdt driver

---uname output---
2.6.18-152.el5xen

Machine Type = HS20 blade, x3550 M2

---Debugger---
A debugger is not configured

---Steps to Reproduce---
Configure the ipmi_watchdog to load by setting IPMI_WATCHDOG=yes in /etc/sysconfig/ipmi and then load ipmi service with "service ipmi start". Observe that the ipmi drivers and ipmi_watchdog driver appear to load, however these errors can be seen in dmesg:

IPMI Watchdog: Unable to register misc device
IPMI Watchdog: driver initialized

Subsequent testing proves that although the ipmi_watchdog driver appears to be loaded, "ipmitool mc watchdog get" shows that the config settings in the /etc/sysconfig/ipmi file have not been set:

# grep WATCHDOG /etc/sysconfig/ipmi
## Description: Enable IPMI_WATCHDOG if you want the IPMI watchdog
# Enable IPMI_WATCHDOG if you want the IPMI watchdog
IPMI_WATCHDOG=yes
IPMI_WATCHDOG_OPTIONS="timeout=60 action=power_cycle start_now=1"

# ipmitool mc watchdog get
Watchdog Timer Use: BIOS FRB2 (0x01)
Watchdog Timer Is: Stopped
Watchdog Timer Actions: No action (0x00)
Pre-timeout interval: 0 seconds
Timer Expiration Flags: 0x02
Initial Countdown: 0 sec
Present Countdown: 0 sec

[root@elm3a27 PAE]# lsmod |grep ipmi
ipmi_watchdog 21129 0
ipmi_devintf 13129 0
ipmi_si 42829 0
ipmi_msghandler 39153 3 ipmi_watchdog,ipmi_devintf,ipmi_si

The problem is that the i6300esb driver has the same major/minor device numbers and is already using /dev/watchdog as of boot-time:

# lsmod |grep i6300
i6300esb 10841 0
# modinfo i6300esb
filename: /lib/modules/2.6.18-152.el5xen/kernel/drivers/char/watchdog/i6300esb.ko
alias: char-major-10-130
license: GPL
description: Watchdog driver for Intel 6300ESB chipsets
author: Ross Biro and David Härdeman
srcversion: 2A37792AAD84EC032278ECA
alias: pci:v00008086d000025ABsv*sd*bc*sc*i*
depends:
vermagic: 2.6.18-152.el5xen SMP mod_unload 686 REGPARM 4KSTACKS gcc-4.1
parm: heartbeat:Watchdog heartbeat in seconds. (1<heartbeat<2046, default=30) (int)
parm: nowayout:Watchdog cannot be stopped once started (default=CONFIG_WATCHDOG_NOWAYOUT) (int)
module_sig: 883f3504a27af664a752b13a67179611257f509b6e87ea7e62e95c15ef1c26a12e3c75ff23aca409e33dcbcb4b2b5a9bbc6880e99fd6082d91b292c

Note that if the ipmi drivers are unloaded (must use "service ipmi stop-all" or an equivalent to unload the ipmi_watchdog driver) and then if the i6300esb driver is unloaded (the /dev/watchdog device node is removed at that point) and then if the ipmi drivers are reloaded ("service ipmi start"), all appears to be working well with the ipmi_watchdog driver.

So, it seems there are two problems with ipmi on systems that have a second on-board watchdog chip:
1) the init.d/ipmi script does not properly return a failure when the ipmi_watchdog driver is improperly loaded.
2) the ipmi driver does not return a failure to the startup script for reporting to the user when the ipmi_watchdog driver can not be fully loaded.
Additionally, the ipmi_watchdog driver remains "partially" loaded so a user might think it was operational but it is not.

This can also be reproduced on platforms with an Intel TCO Watchdog Timer device (supported by the iTCO_wdt driver, which supports the ICH10 TCO device):

00:1f.0 ISA bridge: Intel Corporation 82801JIB (ICH10) LPC Interface Controller

---System Management Component Data---
System management type: BMC supported by OpenIPMI driver

Note that this problem was originally reported in a comment to LTC bugzilla 50564, Mirrored to Red Hat bugzilla 475536. That bugzilla has been updated to point at this one.

=Comment: #2=================================================
Roger Mach <firstname.lastname@example.org> -
A release note or Tech Tip is needed to help customers recognize when they will need to move any on-board (very spartan) watchdog driver out of the way to allow use of the ipmi_watchdog driver and how to accomplish this move/switch.
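As a rough illustration of how a user might recognize the conflict described above, the following shell sketch filters `lsmod`-style output for the two on-board watchdog drivers named in this report. The function name `conflicting_watchdog_modules` is hypothetical, and the list of module names is an assumption limited to the drivers mentioned here; other watchdog drivers may claim /dev/watchdog as well.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenIPMI): print the names of loaded
# watchdog drivers that may already own /dev/watchdog.
# It reads lsmod-style output on stdin, so it can be tried without root
# access or real hardware.
# Assumption: only the two drivers named in this report are checked.
conflicting_watchdog_modules() {
    # The first column of lsmod output is the module name.
    awk '$1 == "i6300esb" || $1 == "iTCO_wdt" { print $1 }'
}
```

It could be used as `lsmod | conflicting_watchdog_modules`. If a module name is printed, the sequence reported to work in the description is to run "service ipmi stop-all", unload the listed driver (for example with `modprobe -r`), and then run "service ipmi start".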
Comment 1 IBM Bug Proxy 2009-08-18 03:20:48 EDT
------- Comment From email@example.com 2009-08-18 03:13 EDT-------
Hello Red Hat, are there any updates on this bug? Thanks.
Comment 3 Jan Safranek 2009-08-25 04:27:51 EDT
I have finally found HW where I can reproduce this.

(In reply to comment #0)
> 1) the init.d/ipmi script does not properly return a failure when the
> ipmi_watchdog driver is improperly loaded.

The question is *how* init.d/ipmi can tell whether the ipmi_watchdog driver is improperly loaded. Modprobe does not print anything and returns exit code 0. I don't think that watching dmesg for "Unable to register misc device" is the way to go. Do you have a better idea?

However, it might be possible to check whether /dev/watchdog is already present in the system and print something like:

Starting ipmi drivers: [ OK ]
Starting ipmi_watchdog driver:
/dev/watchdog is already present, ipmi_watchdog might not be initialized correctly [WARNING]

Does that sound acceptable to you?
Comment 4 IBM Bug Proxy 2009-08-31 12:30:30 EDT
------- Comment From firstname.lastname@example.org 2009-08-31 12:24 EDT-------
(In reply to comment #9)
> I have finally found HW where I can reproduce this.
>
> (In reply to comment #0)
> > 1) the init.d/ipmi script does not properly return a failure when the
> > ipmi_watchdog driver is improperly loaded.
>
> Question is, *how* can the init.d/ipmi tell, if ipmi_watchdog driver is
> improperly loaded. Modprobe does not say anything and returns exit code 0. I
> don't think that watching dmesg for "Unable to register misc device" is way to
> go. Do you have better idea?
>
> However, it might be possible to check, if there already is /dev/watchdog
> present in the system and print something like:
>
> Starting ipmi drivers: [ OK ]
> Starting ipmi_watchdog driver:
> /dev/watchdog is already present, ipmi_watchdog might not be initialized
> correctly [WARNING]
>
> Does it sound acceptable to you?

Yes, I believe checking for /dev/watchdog is a good approach. However, I think the init script should fail (and unload the ipmi modules) instead of simply issuing a warning, which would force the user to resolve the conflict. Otherwise we are relying on the user noticing an error message during boot, which could be easily overlooked.
Comment 5 IBM Bug Proxy 2009-09-09 01:30:38 EDT
------- Comment From email@example.com 2009-09-09 01:29 EDT-------
Hello Red Hat, does the above approach sound OK? Thanks.
Comment 6 Jan Safranek 2009-09-14 12:14:52 EDT
The ipmi service has sophisticated error handling, and it seems to use its exit code to indicate what went wrong during service startup. E.g. when /dev/watchdog cannot be created for whatever reason, the IPMI module stays loaded and the exit code is 8 (plus an appropriate [FAILED] is displayed on the console). I'd follow this approach. I changed the [WARNING] to [FAILED] and sent a patch upstream; let's follow the discussion there:
http://sourceforge.net/mailarchive/forum.php?thread_name=20090914160902.29060.78511.stgit%40honza-ntb&forum_name=openipmi-developer
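The check discussed in this thread could look roughly like the shell sketch below. This is not the patch that was sent upstream: the function name `check_watchdog_node_free` and the `WATCHDOG_NODE` variable are hypothetical, and reusing exit code 8 is an assumption based on the code mentioned above for watchdog start-up failures.

```shell
#!/bin/sh
# Sketch of a pre-load check for the ipmi init script; NOT the committed
# upstream patch. WATCHDOG_NODE and the function name are hypothetical.
# Assumption: exit code 8 is reused because the comment above mentions it
# for the case where /dev/watchdog cannot be created.
WATCHDOG_NODE="${WATCHDOG_NODE:-/dev/watchdog}"

check_watchdog_node_free() {
    # If the node already exists, another watchdog driver registered it
    # first, so ipmi_watchdog would not be able to register its own device.
    if [ -e "$WATCHDOG_NODE" ]; then
        echo "$WATCHDOG_NODE is already present, ipmi_watchdog might not be initialized correctly" >&2
        return 8
    fi
    return 0
}
```

An init script could call `check_watchdog_node_free` before loading ipmi_watchdog and report [FAILED] when it returns non-zero. Checking for the node before modprobe avoids parsing dmesg, which was rejected earlier in the thread.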
Comment 7 IBM Bug Proxy 2009-10-07 05:00:39 EDT
------- Comment From firstname.lastname@example.org 2009-10-07 04:56 EDT-------
Hello Red Hat, the patch has been committed upstream (http://openipmi.cvs.sourceforge.net/viewvc/openipmi/OpenIPMI/ipmi.init?view=diff&r1=1.10&r2=1.11). Which release will include this patch? Thanks.
Comment 8 Jan Safranek 2009-10-07 06:03:21 EDT
(In reply to comment #7)
> Patch has been committed upstream
> (http://openipmi.cvs.sourceforge.net/viewvc/openipmi/OpenIPMI/ipmi.init?view=diff&r1=1.10&r2=1.11),
> which release would be having this patch ?

Well... the next one, I guess? :) I already have a request to 'include latest version of OpenIPMI', see bug #514816, so if there is a new OpenIPMI release and everything goes well, I'll include it in RHEL 5.5. If not, I am going to include this patch anyway; it's really simple and harmless.
Comment 10 RHEL Product and Program Management 2009-11-06 13:55:42 EST
This request was evaluated by Red Hat Product Management for inclusion, but this component is not scheduled to be updated in the current Red Hat Enterprise Linux release. If you would like this request to be reviewed for the next minor release, ask your support representative to set the next rhel-x.y flag to "?".
Comment 15 IBM Bug Proxy 2009-11-11 18:00:22 EST
------- Comment From email@example.com 2009-11-11 17:58 EDT-------
(In reply to comment #17)
> This request was evaluated by Red Hat Product Management for
> inclusion, but this component is not scheduled to be updated in
> the current Red Hat Enterprise Linux release. If you would like
> this request to be reviewed for the next minor release, ask your
> support representative to set the next rhel-x.y flag to "?".

Hi RedHat, This bug along with other OpenIPMI fixes are important to us. Could you please confirm that this and RH Bugzilla 514816 will be included in RHEL 5.5? Thanks!
Comment 16 Jan Safranek 2009-11-12 06:10:39 EST
(In reply to comment #15)
> This bug along with other OpenIPMI fixes are important to us. Could you please
> confirm that this and RH Bugzilla 514816 will be included in RHEL 5.5?

This bug should be fixed in 5.5. In general, all bugs which are in the ON_QA state will be there. Bug #514816 is about updating OpenIPMI-2.0.16 to a newer release, but there has not been any new release so far; 2.0.16 is still the latest one, and the bug was closed.
Comment 19 errata-xmlrpc 2009-12-02 07:20:32 EST
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHBA-2009-1629.html