Red Hat Bugzilla – Bug 659480
UV: WAR for interrupt-IOPort deadlock
Last modified: 2011-05-19 08:04:24 EDT
Created attachment 464378 [details]
Tested patch

Description of problem:
Problem originally noticed when 'hwclock --systohc' halted a running system.

Version-Release number of selected component (if applicable):
kernel-2.6.32-71

How reproducible:
Easily, but not 100%.

Steps to Reproduce:
1. boot system on UV hardware
2. hwclock --systohc   ## usually sufficient
3. hwclock

On one system, the following also had to be done before running hwclock:
  echo 4 >/proc/irq/8/smp_affinity

Actual results:
System halts with CATERR.

Expected results:
System continues to run.

Additional info:
The attached patch has been tested inside SGI on multiple systems.

PV# 1012363
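For anyone retracing the reproduction steps: the value written to /proc/irq/8/smp_affinity is a hexadecimal CPU bitmask, so 4 (bit 2 set) routes IRQ 8, conventionally the RTC interrupt on x86, to CPU 2. The sketch below is only an illustration; the function names cpu_to_affinity_mask and reproduce are hypothetical and do not come from the attached patch, and the helper is meant for CPU indices below 32 only. The reproduce function needs root and may halt an unpatched UV system, so it is defined here but never called.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): turn a CPU index into the
# hexadecimal bitmask format expected by /proc/irq/<N>/smp_affinity.
# Intended for CPU indices below 32 only.
cpu_to_affinity_mask() {
    printf '%x\n' $((1 << $1))
}

# Reproduction sketch following the steps in the report.
# WARNING: requires root; on an affected UV system this may halt the machine.
reproduce() {
    # Route IRQ 8 to CPU 2 (mask 4), as was needed on one system.
    cpu_to_affinity_mask 2 > /proc/irq/8/smp_affinity
    hwclock --systohc
    hwclock
}
```

Calling cpu_to_affinity_mask 2 yields the same mask the report writes by hand; other CPU indices can be substituted if a different target CPU is needed to trigger the hang.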
Went ahead and posted as this was seen at a customer site.
Reposted per Don's and Aristeu's request. George
Patch(es) available on kernel-2.6.32-92.el6
This has been tested on both UV100 and a very large UV1000 system inside SGI; the problem is solved :). George
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Prior to this update, running the hwclock --systohc command could halt a running system. This was because interrupt transactions were erroneously looped back from a local IOH (Input/Output Hub), through the IOH, to a local CPU, which caused a conflict with I/O port operations and other transactions. With this update, the conflicts are avoided and the system continues to run after executing the hwclock --systohc command.
An advisory has been issued which should help with the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0542.html