From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6) Gecko/20040206 Firefox/0.8

Description of problem:
When using a large memory-mapped file for high-rate access on a SATA drive, occasionally a cache-line-sized (64 byte) chunk of data is written into the wrong page.

Version-Release number of selected component (if applicable):
kernel-smp-2.4.21-20.ELsmp

How reproducible:
Sometimes

Steps to Reproduce:
1. Compile and run the following program on AMD64 with a serial ATA drive.

    #define LARGEFILE_SOURCE
    #define _FILE_OFFSET_BITS 64
    #include <stdio.h>
    #include <unistd.h>
    #include <time.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <sys/fcntl.h>
    #include <sys/mman.h>
    #include <assert.h>

    #define FILESIZE ((off_t)(2LL*1024*1024*1024*sizeof(off_t)))

    int main(int argc,char *argv[])
    {
        int fd;
        off_t i;
        off_t *p;

        fd=open("erase.me",O_CREAT|O_RDWR,0777);
        ftruncate(fd,FILESIZE);
        p=mmap((void *)0,FILESIZE,PROT_READ|PROT_WRITE,MAP_NORESERVE|MAP_SHARED,fd,(off_t)0);
        for (i=0;i<FILESIZE/sizeof(off_t);i++) {
            p[i]=i;
        }
        for (i=0;i<FILESIZE/sizeof(off_t);i++) {
            if (p[i] != i) {
                fprintf(stderr,"Error at offset %lld != %lld\n",i,p[i]);
            }
        }
        return 0;
    }

Actual Results:
The program produces output similar to:

    Error at offset 641248800 != 641248288
    Error at offset 641248801 != 641248289
    Error at offset 641248802 != 641248290
    Error at offset 641248803 != 641248291
    Error at offset 641248804 != 641248292
    Error at offset 641248805 != 641248293
    Error at offset 641248806 != 641248294
    Error at offset 641248807 != 641248295
    Error at offset 718280736 != 722050080
    Error at offset 718280737 != 722050081
    Error at offset 718280738 != 722050082
    Error at offset 718280739 != 722050083
    Error at offset 718280740 != 722050084
    Error at offset 718280741 != 722050085
    Error at offset 718280742 != 722050086
    Error at offset 718280743 != 722050087

Expected Results:
The program should produce no output.
On a SCSI RAID array attached to the same system the program produces no output.

Additional info:
System is a dual Opteron 246 with 6 GB RAM.

SATA drive syslog entries:

    Oct 7 10:58:59 zork kernel: ata1: SATA max UDMA/100 cmd 0xFFFFFF0000021080 ctl 0xFFFFFF000002108A bmdma 0xFFFFFF0000021000 irq 25
    Oct 7 10:58:59 zork kernel: ata2: SATA max UDMA/100 cmd 0xFFFFFF00000210C0 ctl 0xFFFFFF00000210CA bmdma 0xFFFFFF0000021008 irq 25
    Oct 7 10:58:59 zork kernel: ata3: SATA max UDMA/100 cmd 0xFFFFFF0000021280 ctl 0xFFFFFF000002128A bmdma 0xFFFFFF0000021200 irq 25
    Oct 7 10:58:59 zork kernel: ata4: SATA max UDMA/100 cmd 0xFFFFFF00000212C0 ctl 0xFFFFFF00000212CA bmdma 0xFFFFFF0000021208 irq 25
    Oct 7 10:58:59 zork kernel: ata1: dev 0 ATA, max UDMA/133, 488397168 sectors: lba48
    Oct 7 10:58:59 zork kernel: ata1: dev 0 configured for UDMA/100
    Oct 7 10:58:59 zork kernel: ata2: no device found (phy stat 00000000)
    Oct 7 10:58:59 zork kernel: ata3: no device found (phy stat 00000000)
    Oct 7 10:58:59 zork kernel: ata4: no device found (phy stat 00000000)
    Oct 7 10:58:59 zork kernel: scsi1 : sata_sil
    Oct 7 10:58:59 zork kernel: scsi2 : sata_sil
    Oct 7 10:58:59 zork kernel: scsi3 : sata_sil
    Oct 7 10:58:59 zork kernel: scsi4 : sata_sil
    Oct 7 10:58:59 zork kernel: Vendor: ATA Model: WDC WD2500SD-01K Rev: 08.0
    Oct 7 10:58:59 zork kernel: Type: Direct-Access ANSI SCSI revision: 05
    Oct 7 10:58:59 zork kernel: Attached scsi disk sdg at scsi1, channel 0, id 0, lun 0
    Oct 7 10:58:59 zork kernel: SCSI device sdg: 488397168 512-byte hdwr sectors (250059 MB)
    Oct 7 10:58:59 zork kernel: sdg: sdg1 sdg2 sdg3
Adding jgarzik to cc: list

Jeff - does this tickle any memories or suggest any obvious places to go look? Whole pages getting corrupted I can understand, but cache lines are just a bit odd to me...
Has this been reproduced on more than one machine? I ask because it smells like bad RAM or bad cache RAM to me.
memtest86 reports no problems after several runs. Standalone disk tests run overnight with no errors. The problem cannot be reproduced on SCSI disks on the same machine. It definitely appears to be related to a page being written to disk. The only thing I can think of would be something related to L1 or L2 cache not being fully flushed to main RAM before a page is written to disk. I don't know enough about Linux device drivers and the SMP kernel to know if this is possible.
I only have one machine to test on at present. Any suggestions as to where I could get access to a similar machine?
Do you get the same results if you limit the memory to 4G (i.e. boot with "mem=4G")? This would suggest whether it may be an IOMMU issue...
Actually mem=1G would probably be a better test (but in general, I agree w/ Jim's comment #5)
I'm out of the office today, but I will try to get back in to test it ASAP.
Using mem=1G did in fact prevent the problem from occurring.

One thing I now notice is that even though IOMMU is enabled in the BIOS, I get messages like the following in the boot log:

    Oct 10 14:57:49 zork kernel: Checking aperture...
    Oct 10 14:57:49 zork kernel: CPU 0: aperture @ 0 size 32768 KB
    Oct 10 14:57:49 zork kernel: Your BIOS doesn't leave a aperture memory hole
    Oct 10 14:57:49 zork kernel: Please enable the IOMMU option in the BIOS setup
    Oct 10 14:57:49 zork kernel: Mapping aperture over 65536 KB of RAM @ 8000000 and elsewhere
    Oct 10 14:57:50 zork kernel: PCI-DMA: aperture base @ 8000000 size 65536 KB
    Oct 10 14:57:50 zork kernel: PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture
Since the last report the drive had gotten corrupted enough that I needed to reformat and reinstall.

Additional potential hints:

    mem=4G causes a kernel panic for the SMP kernel, but is OK with non-SMP.
    mem=4G-64M is OK.
    iommu=off causes a kernel panic.
    iommu=merge causes no change in errors.
    iommu=fullflush also causes no change.
I have seen an extreme case of this that I believe is related. With Western Digital drives and more than 4GB of RAM, the system will suffer extreme file corruption. In most cases, the RHEL 3 Update 3 install will appear to succeed, but upon reboot a lot of file corruption occurs and eventually renders the system useless.

I have seen the problem on three different configs that had only two things in common: Tyan S2885 w/ onboard SiI3114 SATA and a Western Digital drive. I have checked with Eric and his drive is also Western Digital.

All my RAM configurations passed memtest86+:

    8x 1GB ATP
    8x 1GB Corsair
    8x 2GB ATP (and 4x of the same 2GB ATP)

We have used three different S2885 motherboards, each with a different video card. One system had an add-in sound card. One had an add-in 3ware SATA RAID card. One had no add-in PCI cards.

Other drives (Seagate non-blacklist and Maxtor) do not suffer from the extreme (can't reboot after install) case of this problem. We are working to determine whether these drives suffer from the more subtle case exhibited by Eric's C program.

The same Western Digital drive will not show the extreme case when attached to a 3ware RAID card.

The add-in SIIG 3114 does not suffer from the extreme (can't reboot after install) case of this issue. We are checking to see if it passes Eric's program.

The Western Digital drives that suffer extreme failure:

    WD360GD-00FNA0 (WD360 Raptor)
    WD2500JD-55HBB0
    WD2500JD-00HBB0
    WD1600JD-00HBB0
The on-board Silicon Image 3114 controller is on the 32-bit/33MHz PCI bus from the AMD-8111 south bridge. Add-in 3Ware and SIIG controllers were probably plugged into a 64-bit slot on the AMD-8131 PCI-X bridge. Could that difference be important?
I concur with the comment that this is restricted to Western Digital drives. Replacing the WD drive with a Seagate of equivalent capacity has solved the problem on the server where it was initially reported.
This is the "SATA 4GB boundary corruption" problem, which was recently fixed.
Can you provide more information on this "4GB boundary corruption"? Is there another bugzilla tracking that problem? Western Digital, Tyan and Silicon Image were able to reproduce the problem and WD reported that Silicon Image said there was an issue with the 3114 chip and memory accesses. Silicon Image and Tyan released a new BIOS for the motherboard (with new 3114 BIOS code) that solved the problem in the test system.
On x86-64 (EM64T only) and >= 4GB of memory, memory corruption would occur. However, looking at the bug report again, I see that it's AMD64 not EM64T. Nonetheless, you say a new BIOS fixed things, so I'll leave it closed.
Since this seems to have been a BIOS/firmware issue, I'm closing it as NOTABUG (not a kernel bug, that is).
For closure, the specific version of Silicon Image Option ROM BIOS code needed is:

    Silicon Image Oprom v5.0.48

Tyan released a new BIOS (Feb, 2005) for the S2885 and S4882 with that version of the Option ROM. The S2882 motherboard has a BIOS with 5.0.44, which may also be OK?