Description of problem: On my system here in Westford, 55.14 seems to have fixed the issues reported in 226947. However, on RDU's system there appears to be another series of errors ***UNRELATED*** to my changes. Additionally, the 32BIT change I made in BZ 226947 comment #19, as well as the Legacy AHCI issue in comment #3, are BOTH in 59.EL.

On RDU's xw4550, 55.14 correctly initializes the sb600:

ahci 0000:00:12.0: controller can't do 64bit DMA, forcing 32bit
ahci 0000:00:12.0: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0xf impl SATA mode
ahci 0000:00:12.0: flags: 64bit ncq ilck pm led clo pmp pio slum part
ata1: SATA max UDMA/133 cmd 0xFFFFFF000001C100 ctl 0x0 bmdma 0x0 irq 169
ata2: SATA max UDMA/133 cmd 0xFFFFFF000001C180 ctl 0x0 bmdma 0x0 irq 169
ata3: SATA max UDMA/133 cmd 0xFFFFFF000001C200 ctl 0x0 bmdma 0x0 irq 169
ata4: SATA max UDMA/133 cmd 0xFFFFFF000001C280 ctl 0x0 bmdma 0x0 irq 169
scsi0 : ahci
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: ATA-7, max UDMA/100, 156301488 sectors: LBA48 NCQ (depth 31/32)
ata1.00: ata1: dev 0 multi count 16
ata1.00: configured for UDMA/100
scsi1 : ahci
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: ATAPI, max UDMA/100
ata2.00: configured for UDMA/100
scsi2 : ahci
ata3: SATA link down (SStatus 0 SControl 300)
scsi3 : ahci
ata4: SATA link down (SStatus 0 SControl 300)

and no longer generates errors during this phase of booting. Later on (while initscripts are running?) I see new unreported errors from the ata controller:

ata1: soft resetting port
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: configured for UDMA/33
ata1: EH complete
SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
SCSI device sda: drive cache: write back
ata1.00: limiting speed to UDMA/25
ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
ata1.00: cmd 60/00:00:fd:b7:e5/01:00:05:00:00/40 tag 0 cdb 0x0 data 131072 in
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)

This is seen on RDU's xw9400 + > 4G memory with 2.6.9-55.14.EL and higher. System can be grabbed by ping'ing dwa.
The log is interesting. It's almost as if ata1.00 is failing to get into UDMA/100 mode and finally (after a bunch of failures) gets thrown into PIO4. P.
This has nothing to do with >4G of memory. RDU system as configured only has 4 x 1G memory. /me starts looking for HW differences between RDU & Westford. P.
Upgraded to bios 1.02 -- this got rid of lost ticks & ethernet failure. P.
Created attachment 210631 [details] RHEL4 Error messages
Created attachment 210641 [details] RHEL4 32bit DMA addressing fix
From private email (from me to others): Over the past week-and-a-half I've been attempting to get the xw4550's SATA controller to work with 4 x 1G of ECC memory. The xw4550 is the "newer" model with the ALC262, and has BIOS version 1.02. The problem is that the SATA controller appears to have been initialized properly,

ahci 0000:00:12.0: controller can't do 64bit DMA, forcing 32bit
ahci 0000:00:12.0: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0xf impl SATA mode
ahci 0000:00:12.0: flags: 64bit ncq ilck pm led clo pmp pio slum part
ata1: SATA max UDMA/133 cmd 0xFFFFFF000001C100 ctl 0x0 bmdma 0x0 irq 169
ata2: SATA max UDMA/133 cmd 0xFFFFFF000001C180 ctl 0x0 bmdma 0x0 irq 169
ata3: SATA max UDMA/133 cmd 0xFFFFFF000001C200 ctl 0x0 bmdma 0x0 irq 169
ata4: SATA max UDMA/133 cmd 0xFFFFFF000001C280 ctl 0x0 bmdma 0x0 irq 169
scsi0 : ahci
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: ATA-7, max UDMA/100, 156301488 sectors: LBA48 NCQ (depth 31/32)
ata1.00: ata1: dev 0 multi count 16
ata1.00: configured for UDMA/100
scsi1 : ahci
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: ATAPI, max UDMA/100
ata2.00: configured for UDMA/100
scsi2 : ahci
ata3: SATA link down (SStatus 0 SControl 300)
scsi3 : ahci
ata4: SATA link down (SStatus 0 SControl 300)
Using cfq io scheduler
Vendor: ATA Model: ST380815AS Rev: 3.CH
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
SCSI device sda: drive cache: write back
SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
SCSI device sda: drive cache: write back
sda: sda1 sda2
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Vendor: ATAPI Model: DVD D DH16DYS Rev: XH39
Type: CD-ROM ANSI SCSI revision: 05

and then enters error handling when attempting to do a DMA read from sector 0 of ata1.00 when doing the initial mount in init.

EXT3-fs: mounted filesystem with ordered data mode.
ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
ata1.00: cmd 60/00:00:fd:b7:e5/01:00:05:00:00/40 tag 0 cdb 0x0 data 131072 in
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)

This leads to a cascading failure:

ata1: soft resetting port
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: configured for UDMA/100
ata1: EH complete
SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
SCSI device sda: drive cache: write back
ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
ata1.00: cmd 60/00:00:fd:b7:e5/01:00:05:00:00/40 tag 0 cdb 0x0 data 131072 in
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1: soft resetting port
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: configured for UDMA/100
ata1: EH complete
SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
SCSI device sda: drive cache: write back
ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
ata1.00: cmd 60/00:00:fd:b7:e5/01:00:05:00:00/40 tag 0 cdb 0x0 data 131072 in
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1: soft resetting port
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: configured for UDMA/100
ata1: EH complete
SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
SCSI device sda: drive cache: write back
ata1.00: limiting speed to UDMA/66
ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
ata1.00: cmd 60/00:00:fd:b7:e5/01:00:05:00:00/40 tag 0 cdb 0x0 data 131072 in
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1: soft resetting port
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: configured for UDMA/66
ata1: EH complete
SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
SCSI device sda: drive cache: write back
ata1.00: limiting speed to UDMA/44
ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
ata1.00: cmd 60/00:00:fd:b7:e5/01:00:05:00:00/40 tag 0 cdb 0x0 data 131072 in
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1: soft resetting port
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: configured for UDMA/44
SCSI error : <0 0 0 0> return code = 0x8000002
Invalid sda: sense key No Sense
end_request: I/O error, dev sda, sector 98940925
ata1: EH complete
SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
SCSI device sda: drive cache: write back
ata1.00: limiting speed to UDMA/33
ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
ata1.00: cmd 60/00:00:95:b9:e5/01:00:05:00:00/40 tag 0 cdb 0x0 data 131072 in
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1: soft resetting port
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: configured for UDMA/33
ata1: EH complete
SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
SCSI device sda: drive cache: write back
ata1.00: limiting speed to UDMA/25
ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
ata1.00: cmd 60/00:00:95:b9:e5/01:00:05:00:00/40 tag 0 cdb 0x0 data 131072 in
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1: soft resetting port
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: configured for UDMA/25
ata1: EH complete
SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
SCSI device sda: drive cache: write back
ata1.00: limiting speed to UDMA/16
ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
ata1.00: cmd 60/00:00:95:b9:e5/01:00:05:00:00/40 tag 0 cdb 0x0 data 131072 in
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1: soft resetting port
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: configured for UDMA/16
ata1: EH complete

and we eventually end up in PIO mode:

SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
SCSI device sda: drive cache: write back
ata1.00: limiting speed to PIO4
ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
ata1.00: cmd 60/00:00:95:b9:e5/01:00:05:00:00/40 tag 0 cdb 0x0 data 131072 in
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1: soft resetting port
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: configured for PIO4
ata1: EH complete
SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
SCSI device sda: drive cache: write back

Obviously, DMA is not working well on this system. A few things:

- RHEL5, Fedora, Upstream all seem to work on this system so I've isolated the issue to RHEL4.
- Specifying mem=4095M on the RHEL4 boot line works (mem actually is "4095M + 1M = 4096M")
- Specifying mem=4096M fails (which is "4096M + 1M = 4097M"). The moment I exceed 4G of virtual memory I hit this issue. It's almost as if there is a "forgotten" virt_to_phys in the code. I have been unable to find such an error though...
- I have verified that the returned DMA address in drivers/ata/ahci.c is a *valid* DMA'able address by comparing it back to the e820 maps. I have also verified that the physical address is below 4G (ie, we're not getting junk back from the dma_alloc_consistent call in ahci.c).
- I verified the pci_dev's DMA masks were set appropriately (0xffffffff).
- I did notice that HOST_CAP_64 was still set in ahci.c when we were in 32bit mode -- I modified the code to mask out this capability.
- The built in Ethernet controller works if I force a 32bit DMA mask on the device and the OHCI & EHCI controllers both work, as does the Audio controller.

At this point, I'm out of options as to what I could look at. I'm worried some bit is being set on the SB600 controller to allow DMA above 4G but I haven't found any code that does so.
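For anyone repeating the "is the address really below 4G" check described above, here is a minimal sketch in C of that test. It is not code from ahci.c: the function name dma_range_fits_mask and the macro DMA_MASK_32BIT are made up for illustration, and the only thing taken from the comment is the 0xffffffff mask value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The 32-bit mask value quoted in the comment (name is hypothetical). */
#define DMA_MASK_32BIT 0xffffffffULL

/*
 * Return true if every byte of [addr, addr + len) is addressable
 * under the given DMA mask. Hypothetical helper, userspace only.
 */
bool dma_range_fits_mask(uint64_t addr, uint64_t len, uint64_t mask)
{
    assert(len > 0);

    uint64_t last = addr + len - 1;   /* address of the final byte */

    if (last < addr)                  /* range wrapped past 2^64 */
        return false;

    /* Both ends must have no bits set outside the mask. */
    return (addr & ~mask) == 0 && (last & ~mask) == 0;
}
```

Checking both ends matters: a buffer that starts just under 4G can still run past the boundary, which a test of the start address alone would miss.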
From private email (sent to ronp & HP): Here is an overview of the xw4550 and BZ 300861. The patch that resolves this issue is a complex 1700+ line patch from 2.6.15-git8 which essentially rewrites the PCI DMA layer in Linux. The patch is attached to this email. It took roughly two and a half weeks to narrow the problem down to this patch because of the amount of backporting of upstream patches and rebuilding of kernels involved. This patch is much too involved to put into RHEL4.6 at this time. I'm not sure if there is another option available to resolve this issue -- artificially limiting end_pfn to 0x10000000 resolves the problem, but this must be done prior to the kernel initializing the page tables, mapping ISA space, etc. This is not a possibility. My suggestion is that during automated installs (via kickstart, network, etc.) HP specify "mem=4095M" on the boot line. The result of this workaround will be that available memory will decrease by roughly 0.8G (from 4.0G to ~3.2G) because of memory holes.
Created attachment 229871 [details] Upstream patch that resolves this issue in 2.6.15 Note that a key difference between RHEL4 and 2.6.15 is that 2.6.15 has 3 zones of memory ....
Does this problem occur with both 32 bit and 64 bit RHEL 4, or is it 32 bit only?
(In reply to comment #20) > Does this problem occur with both 32 bit and 64 bit RHEL 4, or is it 32 bit only? Our testing has been done on x86_64
From my testing with RHEL 4.6 beta, this does *not* happen on the 32 bit version, only on the x86-64 version.
RHEL4.6 booted the SATA drive in DMA mode; I didn't see the cascading errors down to PIO in the logs. The system booted is Mako, an SB600 chipset AMD development system. I will attach the dmesg and lspci logs to compare with the HP system which has the issue. This shows that there is at least one more AMD SB600 system, in addition to Prarit's pile of SB600 systems, that can boot R4.6 in DMA mode without an issue.

Linux dhcp83-162.boston.redhat.com 2.6.9-64.ELsmp #1 SMP Wed Oct 17 17:15:46 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
Created attachment 237431 [details] Mako system dmesg
Created attachment 237441 [details] Mako system lspci
Bhavana, This might be a naive question, but is there a way to compare BIOS settings between the AMD reference design and our HP BIOS on the xw4550? These kinds of problems that might or might not be related to BIOS setup are difficult to hand to the BIOS engineer and say 'go fix this'. If we were able to compare hardware setup, we might get a clue as to what setting, if any, might be causing this problem. Jeff
Jeff, So basically you are looking for a way to compare outputs. AMD has a utility called SysCat which dumps all MSR and PCI config registers. You can do a brute force comparison between the output files, but that can be time consuming as there are usually many insignificant differences to sort through. A better option might be BIOS Test Suite program. It provides a more human readable report, let me work through it on my system. We can hook up next week. Bhavana
Running an "lspci -xxx" and comparing the ATA PCI devices via the PCI config space of the two systems will be handy. The comparison is not possible at the BIOS level as the settings will be stored in different formats. I'll bring by the BIOS Check if I can find a Linux version. I'll stop by tomorrow.
I have been able to reproduce this problem on an AMD SB600-based reference motherboard we have here in Ft. Collins. An AMD-provided "moray" reference board with 4 GB installed shows the same error messages as the xw4550 does. As well, using the kernel boot parameter "mem=4095M" eliminates the error messages. Given this result, I would say there's a very good chance the problem is not specifically in the HP BIOS, nor is the problem specifically HP's. Jeff This event sent from IssueTracker by dwa issue 130852
The failure pattern (a zero sector transfer) isn't consistent with a DMA boundary problem (that would DMA to the wrong place and crash or report weird identify data) but suggests a serious bug somewhere in the bounce buffer logic. A zero sector DMA I/O will cause the symptoms described with some drives because it's an invalid nonsense command. It's also a situation the drivers don't in general consider possible, so it may have other side effects that need fixing. Trace the origin of the corrupt I/O request back and if need be BUG() in the bounce buffer logic until you find out where the zero length entry comes from (eg a merge mishandling). Even if this is the only driver which shows it up we cannot have zero length sg entries appearing out of the bounce logic. Priority set to high because I'm rather worried what else this bug will cause if it can occur elsewhere as seems likely.
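One way to do the tracing suggested above is to scan each scatter/gather list before it reaches the driver and stop at the first zero-length entry. A userspace sketch of that check in C follows. struct sg_entry and find_zero_length_sg are stand-ins invented here, not the kernel's scatterlist API; in the kernel, the place where this returns a hit is where a BUG() would go.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified scatter/gather entry: bus address plus length. */
struct sg_entry {
    uint64_t dma_addr;
    uint32_t len;       /* length in bytes */
};

/*
 * Return the index of the first zero-length entry in the list,
 * or -1 if every entry has a non-zero length.
 */
int find_zero_length_sg(const struct sg_entry *sg, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (sg[i].len == 0)
            return (int)i;
    }
    return -1;
}
```

Running the check at each layer boundary (after bounce, after merge) would show which stage first produces the bad entry, if one exists at all.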
I have one SB600 board (RS485 A12) here. This issue exists under RHEL4.6 x86_64, and the symptom is the same as the messages in Comment 13. However, we do NOT see the issue under RHEL5.1 x86_64. I find that removing the "forcing 32 bit DMA" patch can fix this issue under RHEL4.6: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c7a42156d99bcea7f8173ba7a6034bbaa2ecb77c Jeff or other HP guys: Can you try removing the "forcing 32 bit DMA" patch on your SB600 platform, rebuild the kernel, and give us the result under RHEL4.6 and RHEL5.1? I will create another patch, which can be used to remove that patch. I will contact the author of this patch (Tejun Heo) for more information about it. As to the SB600/SB700 64 bit DMA capability, we are double-checking with our hardware engineers. Thanks.
Created attachment 291697 [details] Withdraw forcing 32bit DMA patch
I don't believe this is the problem. It is merely showing up a problem in the DMA layer that must be fixed. The SB600 isn't the issue here, someone passed down a 0 length sg list which in turn issued a totally invalid 0 length DMA read to the drive. Focussing on the SB600/700 doesn't look likely to fix the bug, in fact it is probably a distraction. Re: SB600 - we saw upstream problems with > 4GB and dumps consistent to failure to support 64bit direct DMA.
>Re: SB600 - we saw upstream problems with > 4GB and dumps consistent to failure >to support 64bit direct DMA. I am 100% sure someone from AMD confirmed that the device was 32-bit only... P.
We don't care. Something is producing 0 length sg lists. Feed those into other 32bit only drivers and you have a potential serious corruptor. That bug *has* to be found and not papered over.
On my Moray board (RS690/SB600), it shows "timeout" when issuing an FPDMA read command. The sg count is 1. ahci_sg is (040e5000,00000000,00000000,0001ffff). It tries to read 256 sectors. Here is part of the log:

ata_scsi_dump_cdb: CDB (3:0,0,0) 28 00 0e c3 68 2e 00 00 20
ata_sg_setup: 1 sg elements mapped
TT:n_elem=1:(040e1000,00000000,00000000,00003fff)
ata_scsi_dump_cdb: CDB (3:0,0,0) 28 00 0e c3 6b 76 00 01 00
ata_sg_setup: 1 sg elements mapped
TT:n_elem=1:(040e5000,00000000,00000000,0001ffff)
ata_scsi_timed_out: ENTER
ata_scsi_timed_out: EXIT, ret=0
ata_scsi_error: ENTER
ata_port_flush_task: ENTER
ata_port_flush_task: flush #1
__ata_port_freeze: ata3 port frozen
ata_eh_autopsy: ENTER
ata_eh_autopsy: EXIT
ata3.00: exception Emask 0x0 SAct 0x2 SErr 0x0 action 0x2 frozen
ata3.00: cmd 60/00:08:76:6b:c3/01:00:0e:00:00/40 tag 1 cdb 0x0 data 131072 in
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
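The numbers in this trace can be cross-checked by hand. The C sketch below assumes that the last word of each ahci_sg tuple holds the byte count minus one in its low 22 bits (the PRD entry layout in the AHCI spec) and that the dumped CDB is a READ(10). The helper names are hypothetical and are not taken from libata.

```c
#include <stdint.h>

#define SECTOR_SIZE   512u
#define PRD_DBC_MASK  0x3fffffu   /* bits 21:0: data byte count, zero-based */

/* Byte count described by the last dword of a PRD entry. */
uint32_t prd_byte_count(uint32_t flags_size)
{
    return (flags_size & PRD_DBC_MASK) + 1;
}

/* Same quantity expressed in 512-byte sectors. */
uint32_t prd_sector_count(uint32_t flags_size)
{
    return prd_byte_count(flags_size) / SECTOR_SIZE;
}

/* READ(10): big-endian LBA in bytes 2..5. */
uint32_t read10_lba(const uint8_t cdb[10])
{
    return ((uint32_t)cdb[2] << 24) | ((uint32_t)cdb[3] << 16) |
           ((uint32_t)cdb[4] << 8)  |  (uint32_t)cdb[5];
}

/* READ(10): big-endian transfer length (in blocks) in bytes 7..8. */
uint32_t read10_transfer_length(const uint8_t cdb[10])
{
    return ((uint32_t)cdb[7] << 8) | (uint32_t)cdb[8];
}
```

With the values from the log, 0x00003fff decodes to 32 sectors (the request that succeeds) and 0x0001ffff decodes to 131072 bytes, i.e. 256 sectors in a single entry, which agrees with the READ(10) transfer length of 0x0100 and the "data 131072 in" field of the failing command.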
Enlightenment achieved I think

- 256 sectors is encoded as 0 for LBA28 transfers (hence I thought we got a zero sized block)
- With the MMU doing remapping it's possible (although rare) to get a linear MMU mapping of 256 pages
- It seems the SB600 doesn't understand 0 is 256 ?

If I read the AHCI spec right that would be a hardware bug in the AMD chip. Also a trivial one to work around as we do the same for some other controllers already.
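To make the first point concrete: 28-bit ATA read/write commands carry the sector count in an 8-bit register, so 256 cannot be stored directly and is written as 0. A small C sketch of that rule follows. The helper names are hypothetical, and this only illustrates the encoding named in the comment, not how the SB600 actually behaves.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Encode a sector count (1..256) into the 8-bit LBA28 count register.
 * 256 does not fit in 8 bits and is represented as 0.
 */
uint8_t lba28_encode_count(unsigned sectors)
{
    assert(sectors >= 1 && sectors <= 256);
    return (uint8_t)(sectors & 0xff);
}

/* Decode the 8-bit register value back to a sector count. */
unsigned lba28_decode_count(uint8_t reg)
{
    return reg ? reg : 256u;
}
```

A consumer that treats the register value 0 as "zero sectors" instead of 256 would see exactly the kind of empty transfer suspected earlier in this bug.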
If this is the bug then we can either split the sg entry (if the 256 is showing up a DMA transfer engine bug) or limit the sg length to 255 (also trivial) if it's a state machine problem.
Alan, do you believe the issue has been root caused? Prarit, are you working on this, or plan to?
We have a convincing explanation that would explain all the observations.
The HW engineer confirmed that the SB600 SATA AHCI controller can only handle a single PRD entry of less than 256 sectors, so a PRD entry must be split if its length is >= 256 sectors.
If an FPDMA command has more than one PRD, it can transfer more than 256 sectors of data.
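The splitting rule stated above can be sketched as follows: cut any segment into pieces of at most 255 sectors, so that no single PRD entry reaches 256. This is an illustration in plain C, not the driver fix. struct prd_chunk, split_segment and the 255-sector cap as a macro are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE      512u
#define MAX_PRD_SECTORS  255u                          /* stay below 256 */
#define MAX_PRD_BYTES    (MAX_PRD_SECTORS * SECTOR_SIZE)

/* Hypothetical, simplified PRD entry: bus address plus length in bytes. */
struct prd_chunk {
    uint64_t addr;
    uint32_t len;
};

/*
 * Split one contiguous segment into chunks of at most MAX_PRD_BYTES.
 * Returns the number of chunks written, or 0 if out[] is too small
 * (or len is 0).
 */
size_t split_segment(uint64_t addr, uint32_t len,
                     struct prd_chunk *out, size_t out_cap)
{
    size_t n = 0;

    assert(len % SECTOR_SIZE == 0);   /* whole sectors only */

    while (len > 0) {
        uint32_t piece = len > MAX_PRD_BYTES ? MAX_PRD_BYTES : len;

        if (n == out_cap)
            return 0;                 /* caller's array is too small */

        out[n].addr = addr;
        out[n].len = piece;
        n++;

        addr += piece;
        len -= piece;
    }
    return n;
}
```

Applied to the failing request from the trace (131072 bytes at 0x040e5000), this yields two entries of 130560 and 512 bytes, which together still cover the full 256 sectors.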
Excellent, that explains it all nicely.
A simple workaround: add "iommu=nomerge" to the boot parameters. By default, GART will merge small sgs into bigger ones. If nomerge is specified, the sgs are kept small. The performance is the same as with "mem=4000M".
Reassigning to jgarzik. P.
Not a good workaround - although it is very rare you can get such writes very occasionally especially when writing from 4MB pages, which some databases do - having it occasionally blow up on database data would be very bad..
There was already an issue with the vanilla kernel at the end of last year. See this commit:

commit bc84cf17b50ca5b49bec0a5fef63c58c1526d46b
Author: Ingo Molnar <mingo>
Date: Mon Nov 26 20:42:19 2007 +0100

    x86: turn off iommu merge by default

    revert this commit for now:

      commit 948062683004d13ca21c8c05ac052d387978a449
      Author: Andi Kleen <ak>
      Date: Fri Oct 19 20:35:03 2007 +0200

          x86: enable iommu_merge by default

    it's causing regressions:

      http://bugzilla.kernel.org/show_bug.cgi?id=9412

The kernel bugzilla lists the same symptoms for AMD690. And this all was caused by iommu_merge.
iommu merge is simply showing up a long standing bug in the SB600 hardware. It's making it happen more often but the real bug is not the iommu merge, and in fact we can hopefully re-enable that once the driver is fixed.
Do we have a patch for this?
Jeff provided one patch for this issue: http://marc.info/?l=linux-ide&m=120423157501944&w=2
upstream commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=a878539ef994787c447a98c2e3ba0fe3dad984ec
Committed in 68.24.EL . RPMS are available at http://people.redhat.com/vgoyal/rhel4/
There is another patch from Jeff, which fixes a regression: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=4cde32fc4b32e96a99063af3183acdfd54c563f0 Jeff: Have you added it to RHEL4.7 too?
Is there any guy who can ask HP to provide "lspci -xxx" output on their xw4550 platform, where the bug can be duplicated without any patch? Thanks
Created attachment 301502 [details] backport a878539ef994787c447a98c2e3ba0fe3dad984ec and recover AHCI_FLAG_32BIT_ONLY
Jeff and Bhavana, The git commit a878539ef994787c447a98c2e3ba0fe3dad984ec must be backported for RHEL4, otherwise the condition "if (hpriv->flags & AHCI_HFLAG_SECT255)" will always be FALSE on RHEL4. I tried to backport it and added a debug message. Besides that, I also restored the AHCI_FLAG_32BIT_ONLY flag. Can you guys review my attached patch? Thanks
Created attachment 301603 [details] backport AHCI_FLAG_SECT255 & add AHCI_FLAG_32BIT_ONLY to update the patch in comment #60
The patch in comment #60 has been replaced by the one in comment #62, to exclude any potential side effect, please review and use this updated one. The patch in comment #62 was generated on base of kernel-2.6.9-68.29.EL.src.rpm Thanks
Changing the bug status back to ASSIGNED as this bug needs another patch to complete the 32-bit SB600 limit support.
Bhavana, it may be better to track this additional patch in a new bugzilla since the new patch may not make Beta. Vivek - what do you prefer?
Vivek told me to change this back to ASSIGNED, my apologies for missing that key piece of info.
I am ok with a new bug too, but in this case it looks like this patch will fail QE testing without the new patches. Am I right, Bhavana? Bhavana's patch looks small. Pete is kind of ok with that patch. He has requested one message change. So in this case it might not be a bad idea to pull in the patch against this bug itself.
That's right Vivek. I have a patch that is ready to go and tested. I'm waiting for the go-ahead from Jeff Garzik before resubmitting.
Posted to RHML on Apr 11.
Bhavana has posted another patch which, in combination with the previously committed patch, is supposed to solve the issue fully. Committed this new patch in the 68.34 build.
This bug is already part of errata. Had to temporarily bring it back to POST state as Bhavana in the mean time posted one more patch. Putting the bug back to ON_QA.
The kernel-2.6.9-68.34.EL.x86_64.rpm can work on my SB600 board with 4.5G system memory. Is there any guy who can ask HP to test it too on their platform?
Created attachment 312501 [details] lspci -xxx
I have HP dc5070mt systems exhibiting these symptoms. I would like to get my hands on kernel-2.6.9-68.34.EL.x86_64.rpm to test and share results. See Comment #77 and its attachment of an lspci -xxx from one of these systems.

[root@auslx121 tmp]# uname -a
Linux auslx121 2.6.9-67.0.20.ELsmp #1 SMP Wed Jun 18 12:35:02 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
[root@auslx121 tmp]# rpm -q redhat-release
redhat-release-4WS-7
If you have a current Red Hat Enterprise Linux subscription please open a case with a support representative. They will be able to attach your ticket to this bugzilla and provide packages for you to test and provide feedback on. Alternately, you can find Vivek Goyal's RHEL4 testing kernels here: http://people.redhat.com/vgoyal/rhel4/ These have moved past 68.34.EL and are now at release 2.6.9-78.2.EL but these kernels will contain the patch that was merged to resolve this issue.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2008-0665.html