Bug 447507 - md5sum errors when copying files
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.2
Hardware: All
OS: Linux
Priority: low
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Anton Arapov
QA Contact: Martin Jenner
Reported: 2008-05-20 04:48 EDT by Carlos Lopez
Modified: 2014-06-18 04:01 EDT
Doc Type: Bug Fix
Last Closed: 2008-09-11 02:51:35 EDT
Attachments: None

Description Carlos Lopez 2008-05-20 04:48:23 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-GB; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14

Description of problem:
When I copy files to a RHEL 5.2 or 5.1 host, all md5sums are wrong, and occasionally the host freezes.

I have disabled ADMA in the sata_nv driver, with no effect: the md5sums are still wrong.

If I copy the same files to /dev/shm, everything is OK.

These problems also appear with kernel-2.6.18-53.x.x in RHEL 5.1 (on i386 and x86_64).

My motherboard is a GigaByte nForce4 SLI Intel Edition.

These problems do not appear if I install Windows 2003 R2 SP2 or Windows 2008 SP1 (using Microsoft native drivers).



Version-Release number of selected component (if applicable):
kernel-2.6.18-84.el5

How reproducible:
Always


Steps to Reproduce:
1. Copy a file over ssh or locally.
2. Check the md5sum of the copy: it is wrong.

Actual Results:
The md5sums are wrong every time.

Expected Results:
The md5sums should match.
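
The reproduction steps can be sketched as a small script. This is only an illustration, not the reporter's actual procedure: the helper names `md5_of` and `copy_and_verify` and the temporary sample file are assumptions made for the example, and a real test would point at the large file and the suspect disk.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, dest_dir):
    """Copy source into dest_dir and return (source digest, copy digest)."""
    dest = Path(dest_dir) / Path(source).name
    shutil.copyfile(source, dest)
    return md5_of(source), md5_of(dest)


if __name__ == "__main__":
    # Hypothetical test data: a small temporary file stands in for the real one.
    with tempfile.TemporaryDirectory() as work:
        src = Path(work) / "sample.bin"
        src.write_bytes(b"\x5a" * (4 << 20))
        out = Path(work) / "copy"
        out.mkdir()
        before, after = copy_and_verify(src, out)
        print("match" if before == after else "MISMATCH", before, after)
```

Note that a copy read back immediately may be served from the page cache rather than from the disk, so on an affected machine it is probably necessary to remount the filesystem or drop caches before computing the second digest.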

Additional info:
My dmesg output:

Linux version 2.6.18-84.el5 (brewbuilder@hs20-bc1-7.build.redhat.com) (gcc version 4.1.2 20071124 (Red Hat 4.1.2-39)) #1 SMP Fri Feb 29 16:26:41 EST 2008
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
 BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000bfff0000 (usable)
 BIOS-e820: 00000000bfff0000 - 00000000bfff3000 (ACPI NVS)
 BIOS-e820: 00000000bfff3000 - 00000000c0000000 (ACPI data)
 BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
 BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
2175MB HIGHMEM available.
896MB LOWMEM available.
found SMP MP-table at 000f53e0
Memory for crash kernel (0x0 to 0x0) notwithin permissible range
disabling kdump
Using x86 segment limits to approximate NX protection
On node 0 totalpages: 786416
  DMA zone: 4096 pages, LIFO batch:0
  Normal zone: 225280 pages, LIFO batch:31
  HighMem zone: 557040 pages, LIFO batch:31
DMI 2.3 present.
Using APIC driver default
ACPI: RSDP (v000 MSTEST                                ) @ 0x000f6cd0
ACPI: RSDT (v001 MSTEST TESTONLY 0x42302e31 AWRD 0x01010101) @ 0xbfff3040
ACPI: FADT (v001 MSTEST TESTONLY 0x42302e31 AWRD 0x01010101) @ 0xbfff30c0
ACPI: SLIC (v001 MSTEST TESTONLY 0x42302e31 AWRD 0x01010101) @ 0xbfff77c0
ACPI: MCFG (v001 MSTEST TESTONLY 0x42302e31 AWRD 0x01010101) @ 0xbfff7980
ACPI: MADT (v001 MSTEST TESTONLY 0x42302e31 AWRD 0x01010101) @ 0xbfff76c0
ACPI: SSDT (v001  PmRef  Cpu0Ist 0x00003000 INTL 0x20040311) @ 0xbfff7a00
ACPI: SSDT (v001  PmRef    CpuPm 0x00003000 INTL 0x20040311) @ 0xbfff7e90
ACPI: DSDT (v001 MSTEST AWRDACPI 0x00001000 MSFT 0x0100000c) @ 0x00000000
ACPI: PM-Timer IO Port: 0x1408
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 15:6 APIC version 20
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
Processor #1 15:6 APIC version 20
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] disabled)
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] disabled)
ACPI: LAPIC_NMI (acpi_id[0x00] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x03] dfl dfl lint[0x1])
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 17, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 high edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 15 global_irq 15 high edge)
ACPI: IRQ9 used by override.
ACPI: IRQ14 used by override.
ACPI: IRQ15 used by override.
Enabling APIC mode:  Flat.  Using 1 I/O APICs
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at c2000000 (gap: c0000000:20000000)
Detected 3214.047 MHz processor.
Built 1 zonelists.  Total pages: 786416
Kernel command line: ro root=LABEL=/
mapped APIC to ffffd000 (fee00000)
mapped IOAPIC to ffffc000 (fec00000)
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
CPU 0 irqstacks, hard=c074a000 soft=c072a000
PID hash table entries: 4096 (order: 12, 16384 bytes)
Console: colour VGA+ 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 3112644k/3145664k available (2095k kernel code, 31880k reserved, 875k data, 224k init, 2228160k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay using timer specific routine.. 6430.53 BogoMIPS (lpj=3215269)
Security Framework v1.0.0 initialized
SELinux:  Initializing.
SELinux:  Starting in permissive mode
selinux_register_security:  Registering secondary module capability
Capability LSM initialized as secondary
Mount-cache hash table entries: 512
CPU: After generic identify, caps: bfebfbff 20100000 00000000 00000000 0000e4bd 00000000 00000001
CPU: After vendor identify, caps: bfebfbff 20100000 00000000 00000000 0000e4bd 00000000 00000001
monitor/mwait feature present.
using mwait in idle threads.
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 2048K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
CPU: After all inits, caps: bfebf3ff 20100000 00000000 00000180 0000e4bd 00000000 00000001
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU0: Intel P4/Xeon Extended MCE MSRs (24) available
CPU0: Thermal monitoring enabled
Checking 'hlt' instruction... OK.
SMP alternatives: switching to UP code
ACPI: Core revision 20060707
CPU0: Intel(R) Pentium(R) D CPU 3.20GHz stepping 04
SMP alternatives: switching to SMP code
Booting processor 1/1 eip 3000
CPU 1 irqstacks, hard=c074b000 soft=c072b000
Initializing CPU#1
Calibrating delay using timer specific routine.. 6426.90 BogoMIPS (lpj=3213452)
CPU: After generic identify, caps: bfebfbff 20100000 00000000 00000000 0000e4bd 00000000 00000001
CPU: After vendor identify, caps: bfebfbff 20100000 00000000 00000000 0000e4bd 00000000 00000001
monitor/mwait feature present.
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 2048K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 1
CPU: After all inits, caps: bfebf3ff 20100000 00000000 00000180 0000e4bd 00000000 00000001
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#1.
CPU1: Intel P4/Xeon Extended MCE MSRs (24) available
CPU1: Thermal monitoring enabled
CPU1: Intel(R) Pentium(R) D CPU 3.20GHz stepping 04
Total of 2 processors activated (12857.44 BogoMIPS).
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 apic1=0 pin1=0 apic2=-1 pin2=-1
checking TSC synchronization across 2 CPUs: passed.
Brought up 2 CPUs
sizeof(vma)=84 bytes
sizeof(page)=32 bytes
sizeof(inode)=340 bytes
sizeof(dentry)=136 bytes
sizeof(ext3inode)=492 bytes
sizeof(buffer_head)=52 bytes
sizeof(skbuff)=172 bytes
migration_cost=2174
checking if image is initramfs... it is
Freeing initrd memory: 2347k freed
NET: Registered protocol family 16
No dock devices found.
ACPI: bus type pci registered
PCI: Using MMCONFIG
PCI: Buses that can't use MMCONFIG will use type 1 PCI conf access.
Setting up standard PCI resources
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
Boot video device is 0000:01:00.0
PCI: Transparent bridge - 0000:00:12.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.HUB0._PRT]
ACPI: PCI Interrupt Link [LNK1] (IRQs 3 4 5 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNK2] (IRQs 3 4 5 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNK3] (IRQs 3 4 *5 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNK4] (IRQs 3 4 5 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNK5] (IRQs 3 4 5 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LUBA] (IRQs 3 4 5 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LUBB] (IRQs 3 4 5 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LMAC] (IRQs 3 4 5 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LACI] (IRQs 3 4 5 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LMCI] (IRQs 3 4 5 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LSMB] (IRQs 3 4 *5 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LUB2] (IRQs 3 4 *5 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LIDE] (IRQs 3 4 5 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LSID] (IRQs 3 4 5 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LFID] (IRQs 3 4 5 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LPCA] (IRQs 3 4 5 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [APC1] (IRQs 16) *0, disabled.
ACPI: PCI Interrupt Link [APC2] (IRQs 17) *0, disabled.
ACPI: PCI Interrupt Link [APC3] (IRQs 18) *0, disabled.
ACPI: PCI Interrupt Link [APC4] (IRQs 19) *0, disabled.
ACPI: PCI Interrupt Link [APC5] (IRQs *16), disabled.
ACPI: PCI Interrupt Link [APCF] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCG] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCH] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCJ] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCK] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCS] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCL] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCZ] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APSI] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APSJ] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCP] (IRQs 20 21 22 23) *0, disabled.
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
pnp: PnP ACPI: found 15 devices
usbcore: registered new driver usbfs
usbcore: registered new driver hub
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
NetLabel: Initializing
NetLabel:  domain hash size = 128
NetLabel:  protocols = UNLABELED CIPSOv4
NetLabel:  unlabeled traffic allowed by default
pnp: 00:01: ioport range 0x1400-0x147f could not be reserved
pnp: 00:01: ioport range 0x1480-0x14ff has been reserved
pnp: 00:01: ioport range 0x1800-0x187f has been reserved
pnp: 00:01: ioport range 0x1880-0x18ff could not be reserved
pnp: 00:01: ioport range 0x1c00-0x1c7f has been reserved
pnp: 00:01: ioport range 0x1c80-0x1cff has been reserved
PCI: Bridge: 0000:00:02.0
  IO window: disabled.
  MEM window: d0000000-d2ffffff
  PREFETCH window: c0000000-cfffffff
PCI: Bridge: 0000:00:12.0
  IO window: b000-bfff
  MEM window: d3000000-d4ffffff
  PREFETCH window: disabled.
PCI: Setting latency timer of device 0000:00:02.0 to 64
PCI: Setting latency timer of device 0000:00:12.0 to 64
NET: Registered protocol family 2
IP route cache hash table entries: 32768 (order: 5, 131072 bytes)
TCP established hash table entries: 131072 (order: 8, 1048576 bytes)
TCP bind hash table entries: 65536 (order: 7, 524288 bytes)
TCP: Hash tables configured (established 131072 bind 65536)
TCP reno registered
apm: BIOS version 1.2 Flags 0x07 (Driver version 1.16ac)
apm: disabled - APM is not SMP safe.
audit: initializing netlink socket (disabled)
audit(1211278010.111:1): initialized
highmem bounce pool size: 64 pages
Total HugeTLB memory allocated, 0
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
SELinux:  Registering netfilter hooks
Initializing Cryptographic API
ksign: Installing public key data
Loading keyring
- Added public key E8484D5D19C4DEA
- User ID: Red Hat, Inc. (Kernel Module GPG key)
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
PCI: Setting latency timer of device 0000:00:02.0 to 64
assign_interrupt_mode Found MSI capability
Allocate Port Service[0000:00:02.0:pcie00]
Allocate Port Service[0000:00:02.0:pcie03]
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
ACPI (exconfig-0456): Dynamic SSDT Load - OemId [ PmRef] OemTableId [ Cpu1Ist] [20060707]
ACPI Exception (acpi_processor-0681): AE_NOT_FOUND, Processor Device is not present [20060707]
ACPI: Getting cpuindex for acpiid 0x2
ACPI Exception (acpi_processor-0681): AE_NOT_FOUND, Processor Device is not present [20060707]
ACPI: Getting cpuindex for acpiid 0x3
Real Time Clock Driver v1.12ac
Non-volatile memory driver v1.2
Linux agpgart interface v0.101 (c) Dave Jones
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
00:08: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:09: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
RAMDISK driver initialized: 16 RAM disks of 16384K size 4096 blocksize
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
NFORCE-MCP04: IDE controller at PCI slot 0000:00:0f.0
NFORCE-MCP04: chipset revision 242
NFORCE-MCP04: not 100% native mode: will probe irqs later
NFORCE-MCP04: 0000:00:0f.0 (rev f2) UDMA133 controller
    ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:DMA, hdb:DMA
    ide1: BM-DMA at 0xf008-0xf00f, BIOS settings: hdc:DMA, hdd:DMA
Probing IDE interface ide0...
hda: SONY DVD-ROM DDU1615, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
Probing IDE interface ide1...
ide-floppy driver 0.99.newide
usbcore: registered new driver hiddev
usbcore: registered new driver usbhid
drivers/usb/input/hid-core.c: v2.6:USB HID core driver
PNP: PS/2 Controller [PNP0303:PS2K,PNP0f13:PS2M] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: bitmap version 4.39
TCP bic registered
Initializing IPsec netlink socket
NET: Registered protocol family 1
NET: Registered protocol family 17
Using IPI No-Shortcut mode
ACPI: (supports S0 S1 S4 S5)
Time: tsc clocksource has been installed.
Freeing unused kernel memory: 224k freed
Write protecting the kernel read-only data: 393k
input: AT Translated Set 2 keyboard as /class/input/input0
ACPI: PCI Interrupt Link [APCL] enabled at IRQ 23
ACPI: PCI Interrupt 0000:00:0b.2[C] -> Link [APCL] -> GSI 23 (level, low) -> IRQ 193
PCI: Setting latency timer of device 0000:00:0b.2 to 64
ehci_hcd 0000:00:0b.2: EHCI Host Controller
ehci_hcd 0000:00:0b.2: new USB bus registered, assigned bus number 1
ehci_hcd 0000:00:0b.2: debug port 1
PCI: cache line size of 128 is not supported by device 0000:00:0b.2
ehci_hcd 0000:00:0b.2: irq 193, io mem 0xd5001000
ehci_hcd 0000:00:0b.2: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 10 ports detected
ohci_hcd: 2005 April 22 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
ACPI: PCI Interrupt Link [APCF] enabled at IRQ 22
ACPI: PCI Interrupt 0000:00:0b.0[A] -> Link [APCF] -> GSI 22 (level, low) -> IRQ 201
PCI: Setting latency timer of device 0000:00:0b.0 to 64
ohci_hcd 0000:00:0b.0: OHCI Host Controller
ohci_hcd 0000:00:0b.0: new USB bus registered, assigned bus number 2
ohci_hcd 0000:00:0b.0: irq 201, io mem 0xd5004000
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 5 ports detected
ACPI: PCI Interrupt Link [APCG] enabled at IRQ 21
ACPI: PCI Interrupt 0000:00:0b.1[B] -> Link [APCG] -> GSI 21 (level, low) -> IRQ 209
PCI: Setting latency timer of device 0000:00:0b.1 to 64
ohci_hcd 0000:00:0b.1: OHCI Host Controller
ohci_hcd 0000:00:0b.1: new USB bus registered, assigned bus number 3
ohci_hcd 0000:00:0b.1: irq 209, io mem 0xd5000000
usb usb3: configuration #1 chosen from 1 choice
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 5 ports detected
input: ImPS/2 Generic Wheel Mouse as /class/input/input1
USB Universal Host Controller Interface driver v3.0
SCSI subsystem initialized
libata version 3.00 loaded.
sata_nv 0000:00:10.0: version 3.5
ACPI: PCI Interrupt Link [APSI] enabled at IRQ 20
ACPI: PCI Interrupt 0000:00:10.0[A] -> Link [APSI] -> GSI 20 (level, low) -> IRQ 217
PCI: Setting latency timer of device 0000:00:10.0 to 64
scsi0 : sata_nv
scsi1 : sata_nv
ata1: SATA max UDMA/133 cmd 0x9f0 ctl 0xbf0 bmdma 0xdc00 irq 217
ata2: SATA max UDMA/133 cmd 0x970 ctl 0xb70 bmdma 0xdc08 irq 217
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: HPA detected: current 312579695, native 312581808
ata1.00: ATA-7: Maxtor 6V160E0, VA111900, max UDMA/133
ata1.00: 312579695 sectors, multi 16: LBA48 NCQ (depth 0/32)
ata1.00: configured for UDMA/133
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: ATA-6: ST380013AS, 3.18, max UDMA/133
ata2.00: 156301488 sectors, multi 16: LBA48 
ata2.00: configured for UDMA/133
  Vendor: ATA       Model: Maxtor 6V160E0    Rev: VA11
  Type:   Direct-Access                      ANSI SCSI revision: 05
SCSI device sda: 312579695 512-byte hdwr sectors (160041 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
SCSI device sda: 312579695 512-byte hdwr sectors (160041 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
 sda: sda1 sda2 sda3
sd 0:0:0:0: Attached scsi disk sda
  Vendor: ATA       Model: ST380013AS        Rev: 3.18
  Type:   Direct-Access                      ANSI SCSI revision: 05
SCSI device sdb: 156301488 512-byte hdwr sectors (80026 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
SCSI device sdb: 156301488 512-byte hdwr sectors (80026 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
 sdb: sdb1
sd 1:0:0:0: Attached scsi disk sdb
kjournald starting.  Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
audit(1211278017.798:2): enforcing=1 old_enforcing=0 auid=4294967295 ses=4294967295
security:  3 users, 6 roles, 1672 types, 213 bools, 1 sens, 1024 cats
security:  61 classes, 59381 rules
SELinux:  Completing initialization.
SELinux:  Setting up existing superblocks.
SELinux: initialized (dev sda2, type ext3), uses xattr
SELinux: initialized (dev usbfs, type usbfs), uses genfs_contexts
SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
SELinux: initialized (dev debugfs, type debugfs), uses genfs_contexts
SELinux: initialized (dev selinuxfs, type selinuxfs), uses genfs_contexts
SELinux: initialized (dev mqueue, type mqueue), uses transition SIDs
SELinux: initialized (dev hugetlbfs, type hugetlbfs), uses genfs_contexts
SELinux: initialized (dev devpts, type devpts), uses transition SIDs
SELinux: initialized (dev eventpollfs, type eventpollfs), uses task SIDs
SELinux: initialized (dev inotifyfs, type inotifyfs), uses genfs_contexts
SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
SELinux: initialized (dev futexfs, type futexfs), uses genfs_contexts
SELinux: initialized (dev pipefs, type pipefs), uses task SIDs
SELinux: initialized (dev sockfs, type sockfs), uses task SIDs
SELinux: initialized (dev cpuset, type cpuset), uses genfs_contexts
SELinux: initialized (dev proc, type proc), uses genfs_contexts
SELinux: initialized (dev bdev, type bdev), uses genfs_contexts
SELinux: initialized (dev rootfs, type rootfs), uses genfs_contexts
SELinux: initialized (dev sysfs, type sysfs), uses genfs_contexts
audit(1211278018.046:3): policy loaded auid=4294967295 ses=4294967295
input: PC Speaker as /class/input/input2
sd 0:0:0:0: Attached scsi generic sg0 type 0
sd 1:0:0:0: Attached scsi generic sg1 type 0
i2c_adapter i2c-0: nForce2 SMBus adapter at 0x700
i2c_adapter i2c-1: nForce2 SMBus adapter at 0x900
hda: ATAPI 40X DVD-ROM drive, 1725kB Cache, UDMA(66)
Uniform CD-ROM driver Revision: 3.20
8139cp: 10/100 PCI Ethernet driver v1.2 (Mar 22, 2004)
8139cp 0000:02:06.0: This (id 10ec:8139 rev 10) is not an 8139C+ compatible chip
8139cp 0000:02:06.0: Try the "8139too" driver instead.
8139cp 0000:02:07.0: This (id 10ec:8139 rev 10) is not an 8139C+ compatible chip
8139cp 0000:02:07.0: Try the "8139too" driver instead.
forcedeth.c: Reverse Engineered nForce ethernet driver. Version 0.60.
ACPI: PCI Interrupt Link [APCH] enabled at IRQ 23
ACPI: PCI Interrupt 0000:00:0e.0[A] -> Link [APCH] -> GSI 23 (level, low) -> IRQ 193
PCI: Setting latency timer of device 0000:00:0e.0 to 64
forcedeth: using HIGHDMA
8139too Fast Ethernet driver 0.9.27
parport: PnPBIOS parport detected.
parport0: PC-style at 0x378, irq 7 [PCSPP,TRISTATE]
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
eth0: forcedeth.c: subsystem: 01458:e000 bound to 0000:00:0e.0
ACPI: PCI Interrupt Link [APC3] enabled at IRQ 18
ACPI: PCI Interrupt 0000:02:06.0[A] -> Link [APC3] -> GSI 18 (level, low) -> IRQ 225
eth1: RealTek RTL8139 at 0xf8820000, 00:80:5a:4a:2b:fc, IRQ 225
eth1:  Identified 8139 chip type 'RTL-8100B/8139D'
ACPI: PCI Interrupt Link [APC4] enabled at IRQ 19
ACPI: PCI Interrupt 0000:02:07.0[A] -> Link [APC4] -> GSI 19 (level, low) -> IRQ 233
eth2: RealTek RTL8139 at 0xf8946000, 00:08:54:4c:55:60, IRQ 233
eth2:  Identified 8139 chip type 'RTL-8100B/8139D'
lp0: using parport0 (interrupt-driven).
lp0: console ready
ACPI: Power Button (FF) [PWRF]
ACPI: Power Button (CM) [PWRB]
ibm_acpi: ec object not found
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
device-mapper: uevent: version 1.0.3
device-mapper: ioctl: 4.11.5-ioctl (2007-12-12) initialised: dm-devel@redhat.com
device-mapper: multipath: version 1.0.5 loaded
EXT3 FS on sda2, internal journal
kjournald starting.  Commit interval 5 seconds
EXT3 FS on sdb1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
SELinux: initialized (dev sdb1, type ext3), uses xattr
SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
Adding 3148732k swap on /dev/sda3.  Priority:-1 extents:1 across:3148732k
SELinux: initialized (dev binfmt_misc, type binfmt_misc), uses genfs_contexts
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
IPv6 over IPv4 tunneling driver
ip6_tables: (C) 2000-2006 Netfilter Core Team
ip_tables: (C) 2000-2006 Netfilter Core Team
Netfilter messages via NETLINK v0.30.
ip_conntrack version 2.4 (8192 buckets, 65536 max) - 228 bytes per conntrack
eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
eth0: no IPv6 routers present
Comment 1 Carlos Lopez 2008-05-20 12:26:53 EDT

OK, more info. I have installed SLES 10 SP2 (released today). SLES 10 uses kernel
2.6.16, and the md5sum errors do not appear. Everything is OK.

Could this be a problem with kernels released after 2.6.16?
Comment 2 Anton Arapov 2008-06-10 04:50:29 EDT
I'd like to have more info:
1. the size of the test file that gets corrupted
2. the lspci -v output

And steps to try in order to avoid the problem:
1. Check your BIOS version and try to update it; there are some known issues with
memory mapping.
2. Try disabling memory hole mapping in the BIOS.
3. Pass mem=3072M to the kernel.
4. Pass iommu=soft to the kernel.
Please try each step separately and let me know the results.
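
When trying steps 3 and 4 one at a time, it may help to confirm which parameters the running kernel actually booted with. The sketch below reads /proc/cmdline and reports the mem and iommu values; the function names `parse_cmdline` and `report` are hypothetical, the fallback sample line is made up for systems without /proc, and quoted parameters containing spaces are not handled.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        # Split on the first '=' only, so root=LABEL=/ keeps its full value.
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def report(cmdline, wanted=("mem", "iommu")):
    """Return the values of the parameters under test, or None if absent."""
    params = parse_cmdline(cmdline)
    return {name: params.get(name) for name in wanted}


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as handle:
            line = handle.read()
    except OSError:
        # Assumed sample line, used only when /proc/cmdline is unavailable.
        line = "ro root=LABEL=/ iommu=soft"
    print(report(line))
```

Recording this output next to each md5sum result makes it easier to tell which boot configuration a given test run belongs to.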
Comment 3 Carlos Lopez 2008-06-11 04:15:07 EDT
Dear Anton,

 My results:

 1. The file is 1.8 GB. It is an official ISO image of Windows 2008 i386 from the
Microsoft website: 89fbc4c7baafc0b0c05f0fa32c192a17  win2k8stdeval.iso

 2. My lspci -v output:

00:00.0 Host bridge: nVidia Corporation: Unknown device 0071 (rev a3)
        Subsystem: Giga-byte Technology: Unknown device 5000
        Flags: bus master, 66Mhz, fast devsel, latency 0
        Capabilities: [40] HyperTransport: Host or Secondary Interface

00:00.1 RAM memory: nVidia Corporation: Unknown device 007f (rev a1)
        Subsystem: Giga-byte Technology: Unknown device 5000
        Flags: 66Mhz, fast devsel

00:00.2 RAM memory: nVidia Corporation: Unknown device 0075 (rev a1)
        Subsystem: Giga-byte Technology: Unknown device 5000
        Flags: 66Mhz, fast devsel

00:00.3 RAM memory: nVidia Corporation: Unknown device 006f (rev a1)
        Subsystem: Giga-byte Technology: Unknown device 5000
        Flags: 66Mhz, fast devsel

00:00.4 RAM memory: nVidia Corporation: Unknown device 00b4 (rev a1)
        Subsystem: Giga-byte Technology: Unknown device 5000
        Flags: bus master, 66Mhz, fast devsel, latency 0

00:01.0 RAM memory: nVidia Corporation: Unknown device 0076 (rev a1)
        Subsystem: Giga-byte Technology: Unknown device 5000
        Flags: 66Mhz, fast devsel

00:01.1 RAM memory: nVidia Corporation: Unknown device 0078 (rev a1)
        Subsystem: Giga-byte Technology: Unknown device 5000
        Flags: 66Mhz, fast devsel

00:01.2 RAM memory: nVidia Corporation: Unknown device 0079 (rev a1)
        Subsystem: Giga-byte Technology: Unknown device 5000
        Flags: 66Mhz, fast devsel

00:01.3 RAM memory: nVidia Corporation: Unknown device 007a (rev a1)
        Subsystem: Giga-byte Technology: Unknown device 5000
        Flags: 66Mhz, fast devsel

00:01.4 RAM memory: nVidia Corporation: Unknown device 007b (rev a1)
        Subsystem: Giga-byte Technology: Unknown device 5000
        Flags: 66Mhz, fast devsel

00:01.5 RAM memory: nVidia Corporation: Unknown device 007c (rev a1)
        Subsystem: Giga-byte Technology: Unknown device 5000
        Flags: 66Mhz, fast devsel

00:01.6 RAM memory: nVidia Corporation: Unknown device 007d (rev a1)
        Subsystem: Giga-byte Technology: Unknown device 5000
        Flags: 66Mhz, fast devsel

00:02.0 PCI bridge: nVidia Corporation: Unknown device 007e (rev a2) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
        Memory behind bridge: d0000000-d2ffffff
        Prefetchable memory behind bridge: 00000000c0000000-00000000cff00000
        Capabilities: [40] Power Management version 2
        Capabilities: [48] Message Signalled Interrupts: 64bit+ Queue=0/1 Enable-
        Capabilities: [80] Express Root Port (Slot+) IRQ 0
        Capabilities: [100] Virtual Channel

00:09.0 RAM memory: nVidia Corporation: Unknown device 003f (rev a1)
        Subsystem: Giga-byte Technology: Unknown device 5001
        Flags: bus master, 66Mhz, fast devsel, latency 0
        Capabilities: [44] HyperTransport: Slave or Primary Interface

00:0a.0 ISA bridge: nVidia Corporation: Unknown device 0030 (rev a3)
        Subsystem: Giga-byte Technology: Unknown device 5001
        Flags: bus master, 66Mhz, fast devsel, latency 0

00:0a.1 SMBus: nVidia Corporation MCP04 SMBus (rev a2)
        Subsystem: Giga-byte Technology: Unknown device 0034
        Flags: 66Mhz, fast devsel, IRQ 177
        I/O ports at e000 [size=32]
        I/O ports at 0700 [size=64]
        I/O ports at 0900 [size=64]
        Capabilities: [44] Power Management version 2

00:0b.0 USB Controller: nVidia Corporation MCP04 USB Controller (rev a1) (prog-if 10 [OHCI])
        Subsystem: Giga-byte Technology: Unknown device 5004
        Flags: bus master, 66Mhz, fast devsel, latency 0, IRQ 185
        Memory at d5004000 (32-bit, non-prefetchable) [size=4K]
        Capabilities: [44] Power Management version 2

00:0b.1 USB Controller: nVidia Corporation MCP04 USB Controller (rev a1) (prog-if 10 [OHCI])
        Subsystem: Giga-byte Technology: Unknown device 5004
        Flags: bus master, 66Mhz, fast devsel, latency 0, IRQ 193
        Memory at d5000000 (32-bit, non-prefetchable) [size=4K]
        Capabilities: [44] Power Management version 2

00:0b.2 USB Controller: nVidia Corporation MCP04 USB Controller (rev a2) (prog-if 20 [EHCI])
        Subsystem: Giga-byte Technology: Unknown device 5004
        Flags: bus master, 66Mhz, fast devsel, latency 0, IRQ 201
        Memory at d5001000 (32-bit, non-prefetchable) [size=256]
        Capabilities: [44] Debug port
        Capabilities: [80] Power Management version 2

00:0e.0 Ethernet controller: nVidia Corporation MCP04 Ethernet Controller (rev a2)
        Subsystem: Giga-byte Technology: Unknown device e000
        Flags: bus master, 66Mhz, fast devsel, latency 0, IRQ 177
        Memory at d5002000 (32-bit, non-prefetchable) [size=4K]
        I/O ports at c800 [size=8]
        Capabilities: [44] Power Management version 2
        Capabilities: [50] Message Signalled Interrupts: 64bit+ Queue=0/2 Enable-

00:0f.0 IDE interface: nVidia Corporation MCP04 IDE (rev f2) (prog-if 8a [Master SecP PriP])
        Subsystem: Giga-byte Technology: Unknown device b000
        Flags: bus master, 66Mhz, fast devsel, latency 0
        I/O ports at f000 [size=16]
        Capabilities: [44] Power Management version 2

00:10.0 IDE interface: nVidia Corporation MCP04 Serial ATA Controller (rev f2) (prog-if 85 [Master SecO PriO])
        Subsystem: Giga-byte Technology: Unknown device b002
        Flags: bus master, 66Mhz, fast devsel, latency 0, IRQ 185
        I/O ports at 09f0 [size=8]
        I/O ports at 0bf0 [size=4]
        I/O ports at 0970 [size=8]
        I/O ports at 0b70 [size=4]
        I/O ports at dc00 [size=16]
        Memory at d5003000 (32-bit, non-prefetchable) [size=4K]
        Capabilities: [44] Power Management version 2
        Capabilities: [b0] Message Signalled Interrupts: 64bit+ Queue=0/2 Enable-

00:12.0 PCI bridge: nVidia Corporation MCP04 PCI Bridge (rev a2) (prog-if 01 [Subtractive decode])
        Flags: bus master, 66Mhz, fast devsel, latency 0
        Bus: primary=00, secondary=02, subordinate=02, sec-latency=32
        I/O behind bridge: 0000b000-0000bfff
        Memory behind bridge: d3000000-d4ffffff

01:00.0 VGA compatible controller: nVidia Corporation NV44 [GeForce 6200 LE] (rev a1) (prog-if 00 [VGA])
        Flags: bus master, fast devsel, latency 0, IRQ 209
        Memory at d0000000 (32-bit, non-prefetchable) [size=16M]
        Memory at c0000000 (64-bit, prefetchable) [size=256M]
        Memory at d1000000 (64-bit, non-prefetchable) [size=16M]
        Capabilities: [60] Power Management version 2
        Capabilities: [68] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
        Capabilities: [78] Express Endpoint IRQ 0
        Capabilities: [100] Virtual Channel
        Capabilities: [128] Power Budgeting

02:06.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
        Subsystem: Realtek Semiconductor Co., Ltd. RT8139
        Flags: bus master, medium devsel, latency 64, IRQ 217
        I/O ports at b000 [size=256]
        Memory at d4000000 (32-bit, non-prefetchable) [size=256]
        Capabilities: [50] Power Management version 2

02:07.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
        Subsystem: Realtek Semiconductor Co., Ltd. RT8139
        Flags: bus master, medium devsel, latency 64, IRQ 225
        I/O ports at b400 [size=256]
        Memory at d4001000 (32-bit, non-prefetchable) [size=256]
        Capabilities: [50] Power Management version 2

Steps:

 1. My BIOS version is up to date.
 2. I don't see this option in the BIOS setup (Award BIOS).
 3. Same result: md5sums are wrong.
 4. md5sums are occasionally wrong.

 I ran another test: I ran memtest86 all night and it did not report any
errors.

 And another test: I switched this server to RHEL 4.6 x86_64, fully patched. The
host almost never freezes, unlike with RHEL 5.1 and RHEL 5.2, but the md5sums
are still wrong. With the iommu=soft option they are sometimes wrong and
sometimes not, as with RHEL 5.1 and RHEL 5.2.

 Many thanks.
Comment 4 Anton Arapov 2008-06-11 05:34:24 EDT
Carlos, have you seen any oops/debug messages when the host froze? Did your
host work afterwards, or was a reboot needed? 
Comment 5 Carlos Lopez 2008-06-11 06:57:17 EDT
> Carlos, have you seen any oops/debug messages when the host froze?
No. The console goes black.
> Did your host work afterwards, or was a reboot needed?
I need to do a hard reboot. The host doesn't respond to ping, ssh, etc. 

Comment 6 Carlos Lopez 2008-06-11 07:42:33 EDT
Dear Anton,

 The host just froze, this time under rhel4.6 x86_64. In summary: md5sums are wrong
and the host occasionally freezes under rhel5.1, rhel5.2 and rhel4.6 (i386 and
x86_64) with all patches applied.
Comment 7 Anton Arapov 2008-06-11 09:19:39 EDT
Carlos, I'd like to ask you to reproduce this problem again in RHEL5, crash the
kernel and provide me with the vmcore file.

I have some thoughts at the moment, but I am still investigating. The vmcore may
be helpful, though that is not certain.

http://kbase.redhat.com/faq/FAQ_105_9036.shtm
Comment 8 Anton Arapov 2008-06-11 09:25:28 EDT
Forgot to add in comment #7...

Please give me the output of:
  hdparm -i /dev/sda
  hdparm -i /dev/sdb

Also, which hard drive was used to read and write the file?
Can you reproduce this issue on both of your drives?

Comment 9 Carlos Lopez 2008-06-11 10:04:16 EDT
hdparm -i /dev/sd(a,b) returns this error:

/dev/sd(a,b):
 HDIO_GET_IDENTITY failed: Inappropriate ioctl for device

But running hdparm /dev/sd(a,b) gives:

/dev/sdb:
 HDIO_GET_MULTCOUNT failed: Inappropriate ioctl for device
 IO_support   =  0 (default 16-bit)
 readonly     =  0 (off)
 readahead    = 256 (on)
 geometry     = 9729/255/63, sectors = 80026361856, start = 0
[root@asfaloth ~]# hdparm /dev/sda

/dev/sda:
 HDIO_GET_MULTCOUNT failed: Inappropriate ioctl for device
 IO_support   =  0 (default 16-bit)
 readonly     =  0 (off)
 readahead    = 256 (on)
 geometry     = 19457/255/63, sectors = 160040803840, start = 0

 I use both drives to read and write at the same time. And yes I can reproduce
this issue on both drives.

 I will install Rhel5.2 as soon as possible.

 Many thanks Anton.

Comment 10 Anton Arapov 2008-06-11 10:15:08 EDT
Hmm, it seems you ran hdparm on RHEL4? Try it on RHEL5.

And the question was about reproducing the issue on each drive independently...
Comment 11 Carlos Lopez 2008-06-13 04:34:09 EDT
Dear Anton,

 I have re-installed this host with rhel5.2 x86_64 with kernel 2.6.18-92.1.1.el5.

 Command output:
 
 [root@test ~]# hdparm -i /dev/sda

/dev/sda:

 Model=Maxtor 6V160E0                          , FwRev=VA111900,
SerialNo=V39FSLNG            
 Config={ Fixed }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
 BuffType=DualPortCache, BuffSize=8192kB, MaxMultSect=16, MultSect=?16?
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=268435455
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes:  pio0 pio1 pio2 pio3 pio4 
 DMA modes:  mdma0 mdma1 mdma2 
 UDMA modes: udma0 udma1 udma2 
 AdvancedPM=yes: disabled (255) WriteCache=enabled
 Drive conforms to: ATA/ATAPI-7 T13 1532D revision 0:  ATA/ATAPI-1 ATA/ATAPI-2
ATA/ATAPI-3 ATA/ATAPI-4 ATA/ATAPI-5 ATA/ATAPI-6 ATA/ATAPI-7

 * signifies the current active mode



[root@test ~]# hdparm -i /dev/sdb

/dev/sdb:

 Model=ST380013AS                              , FwRev=3.18    ,
SerialNo=4JV72CVW            
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
 BuffType=unknown, BuffSize=8192kB, MaxMultSect=16, MultSect=?16?
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=156301488
 IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes:  pio0 pio1 pio2 pio3 pio4 
 DMA modes:  mdma0 mdma1 mdma2 
 UDMA modes: udma0 udma1 udma2 
 AdvancedPM=no WriteCache=enabled
 Drive conforms to: ATA/ATAPI-6 T13 1410D revision 2:  ATA/ATAPI-1 ATA/ATAPI-2
ATA/ATAPI-3 ATA/ATAPI-4 ATA/ATAPI-5 ATA/ATAPI-6

 * signifies the current active mode
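
A side note on the numbers, as a minimal arithmetic sketch (Python is used here only for illustration and was not part of the original exchange). It takes values printed in this report and suggests that the "sectors" figure shown by the plain hdparm run in comment 9 is really a byte count: divided by an assumed 512-byte sector size, the sdb value equals the LBAsects reported for sdb above. The helper name `as_sectors` is hypothetical.

```python
SECTOR_SIZE = 512  # assumption: 512-byte logical sectors

# Values copied from this report.
plain_hdparm_sectors = {"sda": 160040803840, "sdb": 80026361856}  # comment 9
lbasects_sdb = 156301488  # hdparm -i /dev/sdb, comment 11


def as_sectors(byte_count, sector_size=SECTOR_SIZE):
    """Interpret a value as bytes and convert it to whole sectors."""
    sectors, remainder = divmod(byte_count, sector_size)
    if remainder:
        raise ValueError("not a whole number of sectors")
    return sectors


for dev in sorted(plain_hdparm_sectors):
    print(dev, as_sectors(plain_hdparm_sectors[dev]))

# True if the comment 9 figure for sdb is a byte count.
print(as_sectors(plain_hdparm_sectors["sdb"]) == lbasects_sdb)
```

The same comparison does not work for sda, because the LBAsects=268435455 shown for it is 2**28 - 1, the ceiling of 28-bit LBA addressing.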


 I am using two drives to read and write the test file. First I copied the iso file
via ssh to both drives. Then I copied the iso file from one drive to the other and
vice versa. The results are the same: md5sums are wrong.

 And yes, I can reproduce this issue on both drives. Sometimes the md5sum is ok on
sdb and wrong on sda, and sometimes it is good on sda and bad on sdb.
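
To narrow down where a bad copy differs from the original, the two files can be compared block by block instead of only by md5sum. The following is a minimal sketch and was not run as part of this report; the function name `differing_blocks` and the 4096-byte block size are assumptions chosen for illustration.

```python
import itertools


def differing_blocks(path_a, path_b, block_size=4096):
    """Return the byte offsets of the blocks at which two files differ.

    A length mismatch shows up as differing blocks at the tail of the
    longer file.
    """
    offsets = []
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        for index in itertools.count():
            block_a = file_a.read(block_size)
            block_b = file_b.read(block_size)
            if not block_a and not block_b:
                break  # both files exhausted
            if block_a != block_b:
                offsets.append(index * block_size)
    return offsets
```

Calling `differing_blocks(original, copy)` with the known-good iso and the copy on the affected host returns a list of offsets. Whether those offsets repeat across copies or follow a regular pattern may say more about the cause than the checksum mismatch alone.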

 I will send you the vmcore when the host freezes, but it takes up 3GB. How can I send it to you?

 
Comment 12 Anton Arapov 2008-06-13 04:55:18 EDT
Perfect... can you share the vmcore somehow? I will download it myself.
Comment 13 Carlos Lopez 2008-06-13 05:10:29 EDT
The server hasn't frozen so far. I'll let you know as soon as a crash occurs.
I can only share it via my public server on a home ADSL line (160Kb download).
Comment 14 Anton Arapov 2008-06-13 05:12:51 EDT
Let's wait for a freeze; I'll try to find a place where you can upload the vmcore.
Comment 15 Anton Arapov 2008-06-13 05:34:34 EDT
place for upload:
  http://kbase.redhat.com/faq/FAQ_80_11089.shtm
Comment 16 Carlos Lopez 2008-06-13 06:22:43 EDT
Oops, sorry Anton, but rhel5.2 is reporting these errors:

ata1.00: exception Emask 0x1 SAct 0x0 SErr 0x0 action 0x0
ata1.00: CPB resp_flags 0x11: , CMD error
ata1.00: cmd b0/da:00:00:4f:c2/00:00:00:00:00/00 tag 0
         res 51/04:00:0b:4f:c2/00:00:00:00:00/00 Emask 0x1 (device error)
ata1.00: status: { DRDY ERR }
ata1.00: error: { ABRT }
ata2.00: exception Emask 0x1 SAct 0x0 SErr 0x0 action 0x0
ata2.00: CPB resp_flags 0x11: , CMD error
ata2.00: cmd b0/da:00:00:4f:c2/00:00:00:00:00/00 tag 0
         res 51/04:00:00:4f:c2/00:00:00:00:00/00 Emask 0x1 (device error)
ata2.00: status: { DRDY ERR }
ata2.00: error: { ABRT }

 Under rhel4.6 x86_64 they don't appear... What does this mean?
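
As a reading aid for the messages above, here is a small sketch that decodes the first two bytes of a `res` field (the status and error registers) into ATA bit names. The bit tables use the names from older ATA revisions (some are obsolete in later ones), and `decode_res` is a hypothetical helper, not a kernel or libata function. In the `cmd` field, command 0xb0 with feature 0xda is SMART RETURN STATUS in the ATA command set, so these lines record an aborted SMART status query. The error byte 0x04 does not have the interface-CRC bit (0x80) set.

```python
# ATA status register bits (names from older ATA revisions).
STATUS_BITS = [
    (0x80, "BSY"), (0x40, "DRDY"), (0x20, "DF"), (0x10, "DSC"),
    (0x08, "DRQ"), (0x04, "CORR"), (0x02, "IDX"), (0x01, "ERR"),
]
# ATA error register bits.
ERROR_BITS = [
    (0x80, "ICRC"), (0x40, "UNC"), (0x20, "MC"), (0x10, "IDNF"),
    (0x08, "MCR"), (0x04, "ABRT"), (0x02, "NM"), (0x01, "AMNF"),
]


def bit_names(value, table):
    """List the names of all bits from `table` that are set in `value`."""
    return [name for mask, name in table if value & mask]


def decode_res(res):
    """Decode the leading 'SS/EE' of a libata 'res' field into bit names."""
    status_hex, error_hex = res.split(":", 1)[0].split("/")
    status = bit_names(int(status_hex, 16), STATUS_BITS)
    error = bit_names(int(error_hex, 16), ERROR_BITS)
    return status, error


status, error = decode_res("51/04:00:0b:4f:c2/00:00:00:00:00/00")
print("status:", status)
print("error:", error)
```

The kernel lines above name only DRDY and ERR for the status byte; bit 0x10 is also set in 0x51, which is why the sketch lists one extra name.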

Comment 17 Anton Arapov 2008-06-13 07:53:40 EDT
Hmmm... sweet, this looks like a problem with the SATA cables. Make sure
everything is firmly attached, and try swapping your SATA cables for other ones.
Comment 18 Anton Arapov 2008-06-16 04:21:48 EDT
Putting to NEEDINFO:
  - check with other cables, because of the last kernel messages
  - vmcore, to obtain even more info... if the cables are good, this seems to be
an as yet unknown issue.
  - it would also be good to know whether the issue is reproducible in the latest Fedora
Comment 19 Carlos Lopez 2008-07-01 12:53:20 EDT
(In reply to comment #18)
> Putting to NEEDINFO:
>   - check with other cables, because of the last kernel messages
>   - vmcore, to obtain even more info... if the cables are good, this seems to be
> an as yet unknown issue.
>   - it would also be good to know whether the issue is reproducible in the latest Fedora
> 
Hi Anton,

 I have tried 6 different SATA cables and all of them give the same error that I
posted. The server has halted without producing any vmcore file...
Comment 20 Anton Arapov 2008-07-02 03:23:01 EDT
Thanks Carlos, for such an extensive cable test. 

Have you tried to crash the kernel manually when the server halted? *Alt-SysRq-c*
...I don't think it will work, but...
Comment 21 Anton Arapov 2008-07-02 07:26:37 EDT
Carlos, I have no idea why I didn't ask you this before...
Please try to get the md5sum of several identical big (a couple of gigs) files
consecutively. Try to do it several times.
Do this without any copying on your broken system; we need to be sure that the
files are really identical. Prepare the files on some well-working hardware.

And please, try to reproduce the issue on the latest Fedora.

/me still puzzled... it seems nobody else has complained about MCP04 @ Intel Edition...
but the issue definitely requires further investigation. And now, I really hope
to get different md5sums from the identical files... :-\
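
The repeated-checksum test described above can be sketched as follows. This is a minimal illustration, not the procedure actually used in this report; the names `md5_of` and `repeated_md5`, the default of five runs and the 1 MiB chunk size are assumptions.

```python
import hashlib


def md5_of(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def repeated_md5(path, runs=5):
    """Hash the same file several times and return the set of digests seen.

    One element means every run agreed; more than one means the reads
    themselves are not stable.
    """
    return {md5_of(path) for _ in range(runs)}
```

The resulting set can then be compared with the digest computed on known-good hardware. Repeated reads of a small file may be served from the page cache rather than the disk, so a file larger than RAM is likely to be more telling.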
Comment 22 Carlos Lopez 2008-07-03 05:46:36 EDT
Hi Anton,

 When I use rhel5.x:

 I can crash the kernel manually with *Alt-SysRq-c*, but not when the server is halted.

 I used the official RHEL5.2 DVD iso for new md5sum tests:

 The original iso on my laptop running CentOS 5.2, which doesn't show any errors:

 [carlos@silmarillion iso-images]$ md5sum rhel-5.2-server-x86_64-dvd.iso 
 5390f9f703e083cf1470fb438ea49082  rhel-5.2-server-x86_64-dvd.iso

 When I run md5sum several times on the problematic server, the md5sums are wrong,
but it is the same wrong md5sum every time:

 [root@pepe tmp]# md5sum rhel-5.2-server-x86_64-dvd.iso 
 ffeb4fe9997771500a825a0ad1e6a322  rhel-5.2-server-x86_64-dvd.iso
 [root@pepe tmp]# md5sum rhel-5.2-server-x86_64-dvd.iso 
 ffeb4fe9997771500a825a0ad1e6a322  rhel-5.2-server-x86_64-dvd.iso
 [root@pepe tmp]# md5sum rhel-5.2-server-x86_64-dvd.iso 
 ffeb4fe9997771500a825a0ad1e6a322  rhel-5.2-server-x86_64-dvd.iso
 [root@pepe tmp]# md5sum rhel-5.2-server-x86_64-dvd.iso 
 ffeb4fe9997771500a825a0ad1e6a322  rhel-5.2-server-x86_64-dvd.iso
 [root@pepe tmp]# md5sum rhel-5.2-server-x86_64-dvd.iso 
 ffeb4fe9997771500a825a0ad1e6a322  rhel-5.2-server-x86_64-dvd.iso
 [root@pepe tmp]# md5sum rhel-5.2-server-x86_64-dvd.iso 
 ffeb4fe9997771500a825a0ad1e6a322  rhel-5.2-server-x86_64-dvd.iso


 When I use Fedora 9 x86_64 (default install) on this problematic server:

 The md5sums are still wrong, and the behaviour is the same as on rhel5.2: the same wrong md5sum on every run (though a different value from the one seen on rhel5.2).

 [root@fedorasrv tmp]# md5sum rhel-5.2-server-x86_64-dvd.iso 
 8d712c92acbd9878098a3dccb7da4845  rhel-5.2-server-x86_64-dvd.iso
 [root@fedorasrv tmp]# md5sum rhel-5.2-server-x86_64-dvd.iso 
 8d712c92acbd9878098a3dccb7da4845  rhel-5.2-server-x86_64-dvd.iso
 [root@fedorasrv tmp]# md5sum rhel-5.2-server-x86_64-dvd.iso 
 8d712c92acbd9878098a3dccb7da4845  rhel-5.2-server-x86_64-dvd.iso
 [root@fedorasrv tmp]# md5sum rhel-5.2-server-x86_64-dvd.iso 
 8d712c92acbd9878098a3dccb7da4845  rhel-5.2-server-x86_64-dvd.iso
 [root@fedorasrv tmp]# md5sum rhel-5.2-server-x86_64-dvd.iso 
 8d712c92acbd9878098a3dccb7da4845  rhel-5.2-server-x86_64-dvd.iso

 
 Any ideas, Anton?

 P.S.: Below is the dmesg output from Fedora; the earlier SATA bus errors
do not appear in it:

 Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Linux version 2.6.25-14.fc9.x86_64 (mockbuild@) (gcc version 4.3.0 20080428 (Red
Hat 4.3.0-8) (GCC) ) #1 SMP Thu May 1 06:06:21 EDT 2008
Command line: ro root=UUID=e08a19c7-fc79-4285-850d-c15fbf3e1bc3
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
 BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000bfff0000 (usable)
 BIOS-e820: 00000000bfff0000 - 00000000bfff3000 (ACPI NVS)
 BIOS-e820: 00000000bfff3000 - 00000000c0000000 (ACPI data)
 BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
 BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
Entering add_active_range(0, 0, 159) 0 entries of 3200 used
Entering add_active_range(0, 256, 786416) 1 entries of 3200 used
end_pfn_map = 1048576
DMI 2.3 present.
ACPI: RSDP 000F6CD0, 0014 (r0 MSTEST)
ACPI: RSDT BFFF3040, 003C (r1 MSTEST TESTONLY 42302E31 AWRD  1010101)
ACPI: FACP BFFF30C0, 0074 (r1 MSTEST TESTONLY 42302E31 AWRD  1010101)
ACPI: DSDT BFFF3180, 44E0 (r1 MSTEST AWRDACPI     1000 MSFT  100000C)
ACPI: FACS BFFF0000, 0040
ACPI: SLIC BFFF77C0, 0176 (r1 MSTEST TESTONLY 42302E31 AWRD  1010101)
ACPI: MCFG BFFF7980, 003C (r1 MSTEST TESTONLY 42302E31 AWRD  1010101)
ACPI: APIC BFFF76C0, 008E (r1 MSTEST TESTONLY 42302E31 AWRD  1010101)
ACPI: SSDT BFFF7A00, 019E (r1  PmRef  Cpu0Ist     3000 INTL 20040311)
ACPI: SSDT BFFF7E90, 0275 (r1  PmRef    CpuPm     3000 INTL 20040311)
No NUMA configuration found
Faking a node at 0000000000000000-00000000bfff0000
Entering add_active_range(0, 0, 159) 0 entries of 3200 used
Entering add_active_range(0, 256, 786416) 1 entries of 3200 used
Bootmem setup node 0 0000000000000000-00000000bfff0000
  NODE_DATA [000000000000d000 - 0000000000014fff]
  bootmap [0000000000015000 -  000000000002cfff] pages 18
early res: 0 [0-fff] BIOS data page
early res: 1 [6000-7fff] SMP_TRAMPOLINE
early res: 2 [200000-75db3b] TEXT DATA BSS
early res: 3 [37c88000-37fefa7a] RAMDISK
early res: 4 [9f800-a07ff] EBDA
early res: 5 [8000-cfff] PGTABLE
 [ffffe20000000000-ffffe200001fffff] PMD ->ffff810001200000 on node 0
 [ffffe20000200000-ffffe200003fffff] PMD ->ffff810001600000 on node 0
 [ffffe20000400000-ffffe200005fffff] PMD ->ffff810001a00000 on node 0
 [ffffe20000600000-ffffe200007fffff] PMD ->ffff810001e00000 on node 0
 [ffffe20000800000-ffffe200009fffff] PMD ->ffff810002200000 on node 0
 [ffffe20000a00000-ffffe20000bfffff] PMD ->ffff810002600000 on node 0
 [ffffe20000c00000-ffffe20000dfffff] PMD ->ffff810002a00000 on node 0
 [ffffe20000e00000-ffffe20000ffffff] PMD ->ffff810002e00000 on node 0
 [ffffe20001000000-ffffe200011fffff] PMD ->ffff810003200000 on node 0
 [ffffe20001200000-ffffe200013fffff] PMD ->ffff810003600000 on node 0
 [ffffe20001400000-ffffe200015fffff] PMD ->ffff810003a00000 on node 0
 [ffffe20001600000-ffffe200017fffff] PMD ->ffff810003e00000 on node 0
 [ffffe20001800000-ffffe200019fffff] PMD ->ffff810004200000 on node 0
 [ffffe20001a00000-ffffe20001bfffff] PMD ->ffff810004600000 on node 0
 [ffffe20001c00000-ffffe20001dfffff] PMD ->ffff810004a00000 on node 0
 [ffffe20001e00000-ffffe20001ffffff] PMD ->ffff810004e00000 on node 0
 [ffffe20002000000-ffffe200021fffff] PMD ->ffff810005200000 on node 0
 [ffffe20002200000-ffffe200023fffff] PMD ->ffff810005600000 on node 0
 [ffffe20002400000-ffffe200025fffff] PMD ->ffff810005a00000 on node 0
 [ffffe20002600000-ffffe200027fffff] PMD ->ffff810005e00000 on node 0
 [ffffe20002800000-ffffe200029fffff] PMD ->ffff810006200000 on node 0
Zone PFN ranges:
  DMA             0 ->     4096
  DMA32        4096 ->  1048576
  Normal    1048576 ->  1048576
Movable zone start PFN for each node
early_node_map[2] active PFN ranges
    0:        0 ->      159
    0:      256 ->   786416
On node 0 totalpages: 786319
  DMA zone: 56 pages used for memmap
  DMA zone: 1384 pages reserved
  DMA zone: 2559 pages, LIFO batch:0
  DMA32 zone: 10695 pages used for memmap
  DMA32 zone: 771625 pages, LIFO batch:31
  Normal zone: 0 pages used for memmap
  Movable zone: 0 pages used for memmap
Nvidia board detected. Ignoring ACPI timer override.
If you got timer trouble try acpi_use_timer_override
ACPI: PM-Timer IO Port: 0x1408
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 (Bootup-CPU)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
Processor #1
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] disabled)
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] disabled)
ACPI: LAPIC_NMI (acpi_id[0x00] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x03] dfl dfl lint[0x1])
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 high edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 15 global_irq 15 high edge)
ACPI: IRQ9 used by override.
ACPI: IRQ14 used by override.
ACPI: IRQ15 used by override.
Setting APIC routing to flat
Using ACPI (MADT) for SMP configuration information
PM: Registered nosave memory: 000000000009f000 - 00000000000a0000
PM: Registered nosave memory: 00000000000a0000 - 00000000000f0000
PM: Registered nosave memory: 00000000000f0000 - 0000000000100000
Allocating PCI resources starting at c2000000 (gap: c0000000:20000000)
SMP: Allowing 4 CPUs, 2 hotplug CPUs
PERCPU: Allocating 45232 bytes of per cpu data
Built 1 zonelists in Node order, mobility grouping on.  Total pages: 774184
Policy zone: DMA32
Kernel command line: ro root=UUID=e08a19c7-fc79-4285-850d-c15fbf3e1bc3
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
TSC calibrated against PM_TIMER
time.c: Detected 3213.851 MHz processor.
spurious 8259A interrupt: IRQ7.
Console: colour VGA+ 80x25
console [tty0] enabled
Checking aperture...
Calgary: detecting Calgary via BIOS EBDA area
Calgary: Unable to locate Rio Grande table in EBDA - bailing!
Memory: 3092712k/3145664k available (2656k kernel code, 52564k reserved, 1396k
data, 356k init)
CPA: page pool initialized 1 of 1 pages preallocated
SLUB: Genslabs=13, HWalign=64, Order=0-1, MinObjects=4, CPUs=4, Nodes=1
Calibrating delay using timer specific routine.. 6430.88 BogoMIPS (lpj=3215444)
Security Framework initialized
SELinux:  Initializing.
SELinux:  Starting in permissive mode
selinux_register_security:  Registering secondary module capability
Capability LSM initialized as secondary
Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
Mount-cache hash table entries: 256
Initializing cgroup subsys ns
Initializing cgroup subsys cpuacct
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 2048K
CPU 0/0 -> Node 0
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
CPU0: Thermal monitoring enabled (TM1)
ACPI: Core revision 20070126
Using local APIC timer interrupts.
APIC timer calibration result 12554101
Detected 12.554 MHz APIC timer.
Booting processor 1/2 APIC 0x1
Initializing CPU#1
Calibrating delay using timer specific routine.. 6427.83 BogoMIPS (lpj=3213917)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 2048K
CPU 1/1 -> Node 0
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 1
CPU1: Thermal monitoring enabled (TM1)
              Intel(R) Pentium(R) D CPU 3.20GHz stepping 04
checking TSC synchronization [CPU#0 -> CPU#1]: passed.
Brought up 2 CPUs
sizeof(vma)=176 bytes
sizeof(page)=56 bytes
sizeof(inode)=560 bytes
sizeof(dentry)=208 bytes
sizeof(ext3inode)=760 bytes
sizeof(buffer_head)=104 bytes
sizeof(skbuff)=224 bytes
sizeof(task_struct)=6272 bytes
CPU0 attaching sched-domain:
 domain 0: span 00000000,00000003
  groups: 00000000,00000001 00000000,00000002
  domain 1: span 00000000,00000003
   groups: 00000000,00000003
CPU1 attaching sched-domain:
 domain 0: span 00000000,00000003
  groups: 00000000,00000002 00000000,00000001
  domain 1: span 00000000,00000003
   groups: 00000000,00000003
net_namespace: 1016 bytes
Time:  9:15:36  Date: 07/03/08
NET: Registered protocol family 16
No dock devices found.
ACPI: bus type pci registered
PCI: Using MMCONFIG at e0000000 - efffffff
PCI: Using configuration type 1
ACPI: EC: Look up EC in DSDT
ACPI: Interpreter enabled
ACPI: (supports S0 S1 S4 S5)
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Transparent bridge - 0000:00:12.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.HUB0._PRT]
ACPI: PCI Interrupt Link [LNK1] (IRQs 3 4 *5 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNK2] (IRQs 3 4 5 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNK3] (IRQs 3 4 5 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNK4] (IRQs 3 4 *5 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNK5] (IRQs 3 4 5 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LUBA] (IRQs 3 4 *5 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LUBB] (IRQs 3 4 5 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LMAC] (IRQs 3 4 5 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LACI] (IRQs 3 4 5 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LMCI] (IRQs 3 4 5 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LSMB] (IRQs 3 4 5 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LUB2] (IRQs 3 4 5 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LIDE] (IRQs 3 4 5 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LSID] (IRQs 3 4 5 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LFID] (IRQs 3 4 5 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LPCA] (IRQs 3 4 5 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [APC1] (IRQs 16) *0
ACPI: PCI Interrupt Link [APC2] (IRQs 17) *0, disabled.
ACPI: PCI Interrupt Link [APC3] (IRQs 18) *0
ACPI: PCI Interrupt Link [APC4] (IRQs 19) *0
ACPI: PCI Interrupt Link [APC5] (IRQs *16), disabled.
ACPI: PCI Interrupt Link [APCF] (IRQs 20 21 22 23) *0
ACPI: PCI Interrupt Link [APCG] (IRQs 20 21 22 23) *0
ACPI: PCI Interrupt Link [APCH] (IRQs 20 21 22 23) *0
ACPI: PCI Interrupt Link [APCJ] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCK] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCS] (IRQs 20 21 22 23) *0
ACPI: PCI Interrupt Link [APCL] (IRQs 20 21 22 23) *0
ACPI: PCI Interrupt Link [APCZ] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APSI] (IRQs 20 21 22 23) *0
ACPI: PCI Interrupt Link [APSJ] (IRQs 20 21 22 23) *0
ACPI: PCI Interrupt Link [APCP] (IRQs 20 21 22 23) *0, disabled.
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
ACPI: bus type pnp registered
pnp: PnP ACPI: found 15 devices
ACPI: ACPI bus type pnp unregistered
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
NetLabel: Initializing
NetLabel:  domain hash size = 128
NetLabel:  protocols = UNLABELED CIPSOv4
NetLabel:  unlabeled traffic allowed by default
DMAR:parse DMAR table failure.
PCI-GART: No AMD northbridge found.
ACPI: RTC can wake from S4
system 00:01: ioport range 0x1400-0x147f has been reserved
system 00:01: ioport range 0x1480-0x14ff has been reserved
system 00:01: ioport range 0x1800-0x187f has been reserved
system 00:01: ioport range 0x1880-0x18ff has been reserved
system 00:01: ioport range 0x1c00-0x1c7f has been reserved
system 00:01: ioport range 0x1c80-0x1cff has been reserved
system 00:01: iomem range 0x0-0x0 could not be reserved
system 00:02: ioport range 0x4d0-0x4d1 has been reserved
system 00:02: ioport range 0x800-0x87f has been reserved
system 00:02: ioport range 0x290-0x29f has been reserved
system 00:02: ioport range 0x290-0x294 has been reserved
system 00:02: ioport range 0x880-0x88f has been reserved
system 00:0d: iomem range 0xe0000000-0xefffffff could not be reserved
system 00:0e: iomem range 0xcee00-0xcffff has been reserved
system 00:0e: iomem range 0xf0000-0xf7fff could not be reserved
system 00:0e: iomem range 0xf8000-0xfbfff could not be reserved
system 00:0e: iomem range 0xfc000-0xfffff could not be reserved
system 00:0e: iomem range 0xbfff0000-0xbfffffff could not be reserved
system 00:0e: iomem range 0xffff0000-0xffffffff has been reserved
system 00:0e: iomem range 0x0-0x9ffff could not be reserved
system 00:0e: iomem range 0x100000-0xbffeffff could not be reserved
system 00:0e: iomem range 0xfec00000-0xfec00fff has been reserved
system 00:0e: iomem range 0xfee00000-0xfee00fff could not be reserved
PCI: Bridge: 0000:00:02.0
  IO window: disabled.
  MEM window: 0xd0000000-0xd2ffffff
  PREFETCH window: 0x00000000c0000000-0x00000000cfffffff
PCI: Bridge: 0000:00:12.0
  IO window: a000-afff
  MEM window: 0xd3000000-0xd4ffffff
  PREFETCH window: disabled.
PCI: Setting latency timer of device 0000:00:02.0 to 64
PCI: Setting latency timer of device 0000:00:12.0 to 64
NET: Registered protocol family 2
IP route cache hash table entries: 131072 (order: 8, 1048576 bytes)
TCP established hash table entries: 524288 (order: 11, 8388608 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 524288 bind 65536)
TCP reno registered
checking if image is initramfs... it is
Freeing initrd memory: 3486k freed
audit: initializing netlink socket (disabled)
type=2000 audit(1215076536.679:1): initialized
Total HugeTLB memory allocated, 0
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
SELinux:  Registering netfilter hooks
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252)
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
pci 0000:01:00.0: Boot video device
PCI: Setting latency timer of device 0000:00:02.0 to 64
assign_interrupt_mode Found MSI capability
Allocate Port Service[0000:00:02.0:pcie00]
Allocate Port Service[0000:00:02.0:pcie03]
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
ACPI: ACPI0007:00 is registered as cooling_device0
ACPI: SSDT BFFF7E00, 0087 (r1  PmRef  Cpu1Ist     3000 INTL 20040311)
ACPI: ACPI0007:01 is registered as cooling_device1
Non-volatile memory driver v1.2
Linux agpgart interface v0.103
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
Switched to high resolution mode on CPU 1
Switched to high resolution mode on CPU 0
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
00:08: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:09: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
brd: module loaded
input: Macintosh mouse button emulation as /devices/virtual/input/input0
PNP: PS/2 Controller [PNP0303:PS2K,PNP0f13:PS2M] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input1
rtc_cmos 00:04: rtc core: registered rtc_cmos as rtc0
rtc0: alarms up to one year, y3k
cpuidle: using governor ladder
cpuidle: using governor menu
usbcore: registered new interface driver hiddev
usbcore: registered new interface driver usbhid
drivers/hid/usbhid/hid-core.c: v2.6:USB HID core driver
TCP cubic registered
Initializing XFRM netlink socket
NET: Registered protocol family 1
NET: Registered protocol family 17
registered taskstats version 1
  Magic number: 0:901:271
Freeing unused kernel memory: 356k freed
Write protecting the kernel read-only data: 1120k
ohci_hcd: 2006 August 04 USB 1.1 'Open' Host Controller (OHCI) Driver
PCI: Enabling device 0000:00:0b.0 (0004 -> 0006)
ACPI: PCI Interrupt Link [APCG] enabled at IRQ 23
ACPI: PCI Interrupt 0000:00:0b.0[B] -> Link [APCG] -> GSI 23 (level, low) -> IRQ 23
PCI: Setting latency timer of device 0000:00:0b.0 to 64
ohci_hcd 0000:00:0b.0: OHCI Host Controller
ohci_hcd 0000:00:0b.0: new USB bus registered, assigned bus number 1
ohci_hcd 0000:00:0b.0: irq 23, io mem 0xd5000000
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 5 ports detected
usb usb1: New USB device found, idVendor=1d6b, idProduct=0001
usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb1: Product: OHCI Host Controller
usb usb1: Manufacturer: Linux 2.6.25-14.fc9.x86_64 ohci_hcd
usb usb1: SerialNumber: 0000:00:0b.0
USB Universal Host Controller Interface driver v3.0
SCSI subsystem initialized
Driver 'sd' needs updating - please use bus_type methods
libata version 3.00 loaded.
sata_nv 0000:00:10.0: version 3.5
ACPI: PCI Interrupt Link [APSI] enabled at IRQ 22
ACPI: PCI Interrupt 0000:00:10.0[A] -> Link [APSI] -> GSI 22 (level, low) -> IRQ 22
sata_nv 0000:00:10.0: Using ADMA mode
PCI: Setting latency timer of device 0000:00:10.0 to 64
scsi0 : sata_nv
scsi1 : sata_nv
ata1: SATA max UDMA/133 cmd 0x9f0 ctl 0xbf0 bmdma 0xcc00 irq 22
ata2: SATA max UDMA/133 cmd 0x970 ctl 0xb70 bmdma 0xcc08 irq 22
input: ImPS/2 Generic Wheel Mouse as /devices/platform/i8042/serio1/input/input2
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: HPA detected: current 156299375, native 156301488
ata1.00: ATA-6: ST380013AS, 3.18, max UDMA/133
ata1.00: 156299375 sectors, multi 16: LBA48 
ata1.00: configured for UDMA/133
ata2: SATA link down (SStatus 0 SControl 300)
scsi 0:0:0:0: Direct-Access     ATA      ST380013AS       3.18 PQ: 0 ANSI: 5
ata1: DMA mask 0xFFFFFFFFFFFFFFFF, segment boundary 0xFFFFFFFF, hw segs 61
sd 0:0:0:0: [sda] 156299375 512-byte hardware sectors (80025 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO
or FUA
sd 0:0:0:0: [sda] 156299375 512-byte hardware sectors (80025 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO
or FUA
 sda: sda1 sda2
sd 0:0:0:0: [sda] Attached SCSI disk
ACPI: PCI Interrupt Link [APSJ] enabled at IRQ 21
ACPI: PCI Interrupt 0000:00:11.0[A] -> Link [APSJ] -> GSI 21 (level, low) -> IRQ 21
sata_nv 0000:00:11.0: Using ADMA mode
PCI: Setting latency timer of device 0000:00:11.0 to 64
scsi2 : sata_nv
scsi3 : sata_nv
ata3: SATA max UDMA/133 cmd 0x9e0 ctl 0xbe0 bmdma 0xe000 irq 21
ata4: SATA max UDMA/133 cmd 0x960 ctl 0xb60 bmdma 0xe008 irq 21
ata3: SATA link down (SStatus 0 SControl 300)
ata4: SATA link down (SStatus 0 SControl 300)
PCI: Setting latency timer of device 0000:00:0f.0 to 64
device-mapper: uevent: version 1.0.3
device-mapper: ioctl: 4.13.0-ioctl (2007-10-18) initialised: dm-devel@redhat.com
kjournald starting.  Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
type=1404 audit(1215076544.602:2): enforcing=1 old_enforcing=0 auid=4294967295
ses=4294967295
SELinux:8192 avtab hash slots allocated. Num of rules:172315
SELinux:8192 avtab hash slots allocated. Num of rules:172315
security:  8 users, 12 roles, 2301 types, 116 bools, 1 sens, 1024 cats
security:  72 classes, 172315 rules
SELinux:  Completing initialization.
SELinux:  Setting up existing superblocks.
SELinux: initialized (dev dm-0, type ext3), uses xattr
SELinux: initialized (dev usbfs, type usbfs), uses genfs_contexts
SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
SELinux: initialized (dev selinuxfs, type selinuxfs), uses genfs_contexts
SELinux: initialized (dev mqueue, type mqueue), uses transition SIDs
SELinux: initialized (dev hugetlbfs, type hugetlbfs), uses genfs_contexts
SELinux: initialized (dev devpts, type devpts), uses transition SIDs
SELinux: initialized (dev inotifyfs, type inotifyfs), uses genfs_contexts
SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
SELinux: initialized (dev futexfs, type futexfs), uses genfs_contexts
SELinux: initialized (dev anon_inodefs, type anon_inodefs), uses genfs_contexts
SELinux: initialized (dev pipefs, type pipefs), uses task SIDs
SELinux: initialized (dev debugfs, type debugfs), uses genfs_contexts
SELinux: initialized (dev sockfs, type sockfs), uses task SIDs
SELinux: initialized (dev proc, type proc), uses genfs_contexts
SELinux: initialized (dev bdev, type bdev), uses genfs_contexts
SELinux: initialized (dev rootfs, type rootfs), uses genfs_contexts
SELinux: initialized (dev sysfs, type sysfs), uses genfs_contexts
SELinux: policy loaded with handle_unknown=allow
type=1403 audit(1215076545.212:3): policy loaded auid=4294967295 ses=4294967295
input: PC Speaker as /devices/platform/pcspkr/input/input3
forcedeth: Reverse Engineered nForce ethernet driver. Version 0.61.
ACPI: PCI Interrupt Link [APCH] enabled at IRQ 20
ACPI: PCI Interrupt 0000:00:0e.0[A] -> Link [APCH] -> GSI 20 (level, low) -> IRQ 20
PCI: Setting latency timer of device 0000:00:0e.0 to 64
sd 0:0:0:0: Attached scsi generic sg0 type 0
input: Power Button (FF) as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input4
ACPI: Power Button (FF) [PWRF]
input: Power Button (CM) as /devices/LNXSYSTM:00/device:00/PNP0C0C:00/input/input5
ACPI: Power Button (CM) [PWRB]
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
parport_pc 00:0a: reported by Plug and Play ACPI
parport0: PC-style at 0x378, irq 7 [PCSPP,TRISTATE]
8139too Fast Ethernet driver 0.9.28
forcedeth 0000:00:0e.0: ifname eth0, PHY OUI 0x5043 @ 1, addr 00:14:85:ed:e6:03
forcedeth 0000:00:0e.0: highdma csum timirq gbit lnktim desc-v3
i2c-adapter i2c-0: nForce2 SMBus adapter at 0x700
i2c-adapter i2c-1: nForce2 SMBus adapter at 0x900
pata_amd 0000:00:0f.0: version 0.3.10
PCI: Setting latency timer of device 0000:00:0f.0 to 64
scsi4 : pata_amd
scsi5 : pata_amd
ata5: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xf000 irq 14
ata6: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xf008 irq 15
ata5.00: ATAPI: SONY DVD-ROM DDU1615, GYS1, max UDMA/66
ata5: nv_mode_filter: 0x1f39f&0x1f01f->0x1f01f, BIOS=0x1f000 (0xc5000000) ACPI=0x1f01f (30:600:0x13)
ata5.00: configured for UDMA/66
ppdev: user-space parallel port driver
scsi 4:0:0:0: CD-ROM            SONY     DVD-ROM DDU1615  GYS1 PQ: 0 ANSI: 5
scsi 4:0:0:0: Attached scsi generic sg1 type 5
ACPI: PCI Interrupt Link [APC3] enabled at IRQ 18
ACPI: PCI Interrupt 0000:02:06.0[A] -> Link [APC3] -> GSI 18 (level, low) -> IRQ 18
eth1: RealTek RTL8139 at 0xffffc2000063a000, 00:80:5a:4a:2b:fc, IRQ 18
eth1:  Identified 8139 chip type 'RTL-8100B/8139D'
ACPI: PCI Interrupt Link [APC4] enabled at IRQ 19
ACPI: PCI Interrupt 0000:02:07.0[A] -> Link [APC4] -> GSI 19 (level, low) -> IRQ 19
eth2: RealTek RTL8139 at 0xffffc20000642000, 00:08:54:4c:55:60, IRQ 19
eth2:  Identified 8139 chip type 'RTL-8100B/8139D'
8139cp: 10/100 PCI Ethernet driver v1.3 (Mar 22, 2004)
Driver 'sr' needs updating - please use bus_type methods
sr0: scsi3-mmc drive: 4x/40x cd/rw xa/form2 cdda tray
Uniform CD-ROM driver Revision: 3.20
sr 4:0:0:0: Attached scsi CD-ROM sr0
loop: module loaded
EXT3 FS on dm-0, internal journal
kjournald starting.  Commit interval 5 seconds
EXT3 FS on sda1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
SELinux: initialized (dev sda1, type ext3), uses xattr
SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
Adding 2031608k swap on /dev/mapper/VolGroup00-LogVol01.  Priority:-1 extents:1 across:2031608k
SELinux: initialized (dev binfmt_misc, type binfmt_misc), uses genfs_contexts
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
ip6_tables: (C) 2000-2006 Netfilter Core Team
nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
ip_tables: (C) 2000-2006 Netfilter Core Team
eth2: link up, 100Mbps, full-duplex, lpa 0x45E1
eth1: link up, 100Mbps, full-duplex, lpa 0x45E1
eth0: no link during initialization.
ADDRCONF(NETDEV_UP): eth0: link is not ready
eth2: no IPv6 routers present
eth1: no IPv6 routers present

 
Comment 23 Anton Arapov 2008-07-03 06:02:50 EDT
Thanks a lot, Carlos. I need to think about this some more. I will let you know once I have
a possible solution.
Comment 24 Anton Arapov 2008-07-03 06:34:40 EDT
The issue has been escalated to kernel.org's Bugzilla to get more eyes on
it.

http://bugzilla.kernel.org/show_bug.cgi?id=11029
Comment 25 Carlos Lopez 2008-07-30 16:44:36 EDT
Hi Anton,

 Any news about this?
Comment 26 Anton Arapov 2008-07-31 01:43:35 EDT
  Unfortunately, no news. Nobody else has reported the same problem, and I have no
similar hardware available.
Comment 27 Carlos Lopez 2008-07-31 04:08:23 EDT
Thanks Anton, just one question. Do you think the problem is in the SATA
controller or in the memory?
Comment 28 Anton Arapov 2008-07-31 04:15:46 EDT
It does not look like a memory problem at all. It could be a physical fault in the
south bridge or the SATA controller, and/or the way the Linux kernel handles this chipset.
Comment 29 Carlos Lopez 2008-07-31 05:24:24 EDT
OK, I will do one last test when I can. I will install a new PCI SATA controller,
attach the hard disks to it, and rerun all the tests.
Comment 30 Anton Arapov 2008-07-31 05:26:16 EDT
This would be nice! Thanks!
Comment 31 Anton Arapov 2008-08-07 07:31:11 EDT
any results?
Comment 32 Carlos Lopez 2008-08-07 07:49:54 EDT
(In reply to comment #31)
> any results?

Yes. I replaced the SATA controller with another one based on a VIA chip, and the results are the same: bad md5sums. But after I disabled one RAM module, all md5sums are OK.

 Currently I am testing this server to see whether it still freezes occasionally. I am using it as an NFS server to serve disks to two ESX 3.5 servers, and I am waiting to see.
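
 For reference, the copy-and-compare check used in these tests could be scripted along the following lines. This is only a minimal sketch, not the procedure actually run in this report: the helper names md5_of and count_bad_copies, the file names, and the number of rounds are all made up for illustration. Note also that a freshly written copy is usually read back from the page cache rather than from disk, so a check like this exercises memory at least as much as the storage path.

```python
import hashlib
import os
import shutil
import tempfile


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def count_bad_copies(source, dest_dir, rounds=10):
    """Copy source into dest_dir `rounds` times; return how many copies differ."""
    expected = md5_of(source)
    mismatches = 0
    for i in range(rounds):
        target = os.path.join(dest_dir, "copy_%d.bin" % i)
        shutil.copyfile(source, target)
        if md5_of(target) != expected:
            mismatches += 1
        os.remove(target)
    return mismatches


if __name__ == "__main__":
    # Hypothetical test data: 4 MiB of random bytes in a temporary directory.
    with tempfile.TemporaryDirectory() as work:
        src = os.path.join(work, "source.bin")
        with open(src, "wb") as handle:
            handle.write(os.urandom(4 * 1024 * 1024))
        print("mismatched copies:", count_bad_copies(src, work, rounds=5))
```

 On healthy hardware the mismatch count should stay at zero; pointing dest_dir at the filesystem under suspicion and using larger files makes a marginal fault more likely to show up.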
Comment 33 Anton Arapov 2008-08-07 08:06:50 EDT
Hmm... very, _very_ strange: the corruption was so stable (repeatable), and yet disabling one RAM module helped. But I'm happy. :)
Please let me know the results.
Comment 34 Carlos Lopez 2008-09-10 06:00:18 EDT
Hi Anton,

 After 30 days with this server in a production environment, everything works OK. I think we can close this bug. What is your opinion, Anton?

 There is only one thing I don't understand: why doesn't memtest report any errors?
Comment 35 Anton Arapov 2008-09-11 02:50:43 EDT
Frankly, the only time I have seen such an issue with RAM was about 5 years ago. It is very strange, and I can't say anything specific; I would need the hardware to play with in order to investigate what could have happened. In that earlier case, two RAM modules from different vendors didn't like each other, even though they were identical by specification; even the chips were identical.

Anyway, I'm happy that the cause was found and that it can be fixed by disabling the RAM module.
Comment 36 Anton Arapov 2008-09-11 02:51:35 EDT
CLOSED as NOTABUG: problem in RAM module.
