Bug 110170
Summary: | [PATCH] LTC5381- rhel 3 will need to pick up the cyclone-lpj-fix patch |
---|---|
Product: | Red Hat Enterprise Linux 3 |
Component: | kernel |
Version: | 3.0 |
Hardware: | i386 |
OS: | Linux |
Status: | CLOSED ERRATA |
Severity: | medium |
Priority: | medium |
Reporter: | IBM Bug Proxy <bugproxy> |
Assignee: | Doug Ledford <dledford> |
CC: | ckloiber, echevreau, johnstul, jrichard, lcm, minowicz, petrides, riel, sebastian.wenner, tao |
Doc Type: | Bug Fix |
Last Closed: | 2004-05-12 01:07:47 UTC |
Bug Blocks: | 107562 |

Description
IBM Bug Proxy
2003-11-15 21:25:20 UTC
Created attachment 96003
linux-2.4.23-pre9_cyclone-lpj-fix_A0.patch
------ Additional Comments From khoa.com 2003-25-11 14:29 -------
Just put this bug report in the correct state/ownership....

Chris, adding you to this bug. I thought all was good with the timer fixes for x440 and x445?

Bob: Not quite. There is a subtle race in the lost-ticks compensation code. It appears to bite us only with certain combinations of CPU count, frequency and SMT (e.g., 1 CPU @ 2 GHz with HT, 8 CPUs @ 2.8 GHz without HT). It was not seen in testing RHEL 3.0, but was discovered by Andrea Arcangeli while testing for SLES8 SP3 (after RHEL 3.0 had gone gold). The patch (now accepted into 2.4.23-rc1) was ported and submitted as soon as possible after the issue was found. Let me know if you have any further questions.

RHEL 2.1 Update 4 should also take this fix.

I still see a BogoMIPS of 1.5 on an x440 with 2.4.21-7.ELsmp. Please take this patch.

We have a customer in the field seeing this issue with x445s. They report that with HyperThreading disabled they see the issue (and the clock runs so fast they can't log in), but the issue is not apparent when HyperThreading is enabled. This customer wishes to run with HyperThreading off for their application. (CRM #282865)

Please see bugs #108595 and #110999 for apparently related side effects. We are seeing both issues (SCSI hangs on the on-board disk array, and a fast system clock) on two xSeries 445s, one with dual 2.5 GHz processors and one with quad 2.8 GHz processors. We are currently running the RHEL AS 3.0 Update 1 kernel (2.4.21-9.ELsmp). We have also seen both problems in the hugemem kernel. I've attached dmesg and other details to both bugs. We see both problems with HyperThreading on and with HyperThreading off when it is set from the boot parameters; we have yet to test with HyperThreading turned off in the BIOS.

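For readers unfamiliar with the figure quoted above: the BogoMIPS value printed at boot is derived from the kernel's calibrated `loops_per_jiffy` (the "lpj" in the patch name). The sketch below reproduces only that integer arithmetic, to show how a badly miscalibrated `loops_per_jiffy` surfaces as a value like 1.5. It is not the patch, which is not reproduced in this report. The default `hz=100` and both sample `loops_per_jiffy` values are assumptions chosen for illustration, not measurements from the affected machines.

```python
def bogomips(loops_per_jiffy, hz=100):
    """Format BogoMIPS from loops_per_jiffy using the boot message's integer math.

    hz is the timer tick rate; 100 is an assumed value for illustration.
    """
    whole = loops_per_jiffy // (500000 // hz)       # integer part
    frac = (loops_per_jiffy // (5000 // hz)) % 100  # two decimal digits
    return "%d.%02d" % (whole, frac)


# Hypothetical values: a plausible calibration versus a broken one.
print(bogomips(14000000))  # 2800.00
print(bogomips(7500))      # 1.50
```

Because the kernel's busy-wait delays are scaled by the same calibrated value, a result that is too small by orders of magnitude would also make those delays far too short, which is one plausible route to the side effects mentioned in the comments.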
I opened bug #115061 to track this issue under RHEL 2.1.

The patch in comment #1 has been committed to the RHEL 3 U2 patch pool tonight, and it will first be available internally in the Red Hat Engineering build of kernel version 2.4.21-9.11.EL.

Adding to the blocker to keep track of it.

Tested the 2.4.21-11.ELsmp kernel and the issue looks to be resolved. BogoMIPS looks correct for all CPUs and the earlier attached patch is present in the kernel src directory. I believe this bug can be closed. Bug #108595 should also be closable after the reporters have verified it works for them.

----- Additional Comments From jstultz.com(prefers email via johnstul.com) 2004-04-12 19:03 -------
I'm closing the LTC bug, as this issue is resolved.

*** Bug 110999 has been marked as a duplicate of this bug. ***

An errata has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHSA-2004-188.html

*** Bug 108595 has been marked as a duplicate of this bug. ***

I have an HP ProLiant DL360 with:
Internal SCSI card (SmartArray 6i) ==> system disk
Additional SCSI card 1 (LSI Logic 53c1030) ==> SCSI disk array
Additional SCSI card 2 (LSI Logic 53c1030) ==> tape drive HP Ultrium 215

The system is RHEL3 U4 ES with kernel 2.4.21-37 (I also tried the updated kernel 2.4.21-40).
I see a similar bug when using the LTO tape drive.

/var/log/messages:
Mar 30 14:55:44 galibier kernel: scsi : aborting command due to timeout : pid 313, scsi1, channel 0, id 5, lun 0 Log Sense 00 7e 00 00 00 00 00 ff 40
Mar 30 14:55:44 galibier kernel: mptscsih: ioc1: id=5 OldAbort: scheduling ABORT SCSI IO (sc=c3588600)
Mar 30 14:55:45 galibier kernel: SCSI host 1 abort (pid 313) timed out - resetting
Mar 30 14:55:45 galibier kernel: SCSI bus is being reset for host 1 channel 0.
Mar 30 14:55:45 galibier kernel: mptscsih: ioc1: id=5 OldReset: scheduling BUS_RESET SCSI IO (sc=c3588600)
Mar 30 14:55:45 galibier kernel: mptbase: ioc1: WARNING - IOCStatus(0x0048): SCSI Task Terminated

I think it is the same bug, so I ask that it be reopened. Thanks

Emmanuel, while your particular problem may be a result of missed/lost interrupts due to bugs in the lost-tick compensation code (as described in this bug), it won't be addressed by the patch attached to this bugzilla. This bug report and the attached patch apply only to specific IBM xSeries hardware utilizing the IBM Summit chipset.
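The verification comment in this thread checks that BogoMIPS "looks correct for all CPUs". A minimal sketch of automating that check is shown here, assuming the i386 `/proc/cpuinfo` layout with one `bogomips` line per processor. The function names and the `floor` and `ratio` thresholds are hypothetical, picked for illustration; nothing in this report defines what counts as a correct value.

```python
import re


def bogomips_per_cpu(cpuinfo_text):
    """Return the bogomips value of each processor entry, in file order."""
    pattern = re.compile(r"^bogomips\s*:\s*([0-9.]+)",
                         re.MULTILINE | re.IGNORECASE)
    return [float(value) for value in pattern.findall(cpuinfo_text)]


def suspicious_cpus(values, floor=100.0, ratio=0.5):
    """Return indexes of CPUs whose value looks miscalibrated.

    floor and ratio are hypothetical thresholds: a CPU is flagged if it is
    below the absolute floor or far below the best value seen.
    """
    if not values:
        return []
    best = max(values)
    return [i for i, v in enumerate(values) if v < floor or v < best * ratio]


# Usage on a live system (path assumed to exist on Linux):
#   with open("/proc/cpuinfo") as f:
#       print(suspicious_cpus(bogomips_per_cpu(f.read())))
```

The absolute floor is there because a purely relative comparison would miss the case where every CPU is miscalibrated in the same way.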