Bug 5986

Summary: Data corruption in MSDOS and VFAT filesystems with SMP kernel 2.2.12
Product: [Retired] Red Hat Linux
Reporter: fjp
Component: kernel
Assignee: Michael K. Johnson <johnsonm>
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: high
Version: 6.1
CC: fjp, graeme, juanco
Hardware: i386   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2000-02-13 02:36:48 UTC

Description fjp 1999-10-15 15:53:37 UTC
I think I have found what seems to be a very serious bug,
because it causes data corruption without sending
any warning to the user.

I just upgraded Red Hat 6.0 to 6.1 and started to have
problems when reading data from VFAT and MSDOS filesystems.

This problem only happens with the new kernel-smp-2.2.12
that comes with Red Hat 6.1.

The old SMP kernel 2.2.5 from Red Hat 6.0 doesn't have this
problem.
Also, the new 2.2.12 without SMP support works fine.

My system configuration is:
1 Tekram DUAL BX mainboard (upgraded to the latest BIOS)
2 Pentium III 450 CPUs
1 DIMM, 64 MB of PC100 SDRAM
1 Tekram DC390F (Ultra Wide SCSI), NCR/Symbios 53C875
(using module NCR53c8xx)
1 Quantum Atlas III Ultra Wide SCSI 8 GB hard drive
1 Fujitsu 230 MB MO drive, SCSI-2
1 Plextor 12Plex CD-ROM drive, SCSI-2
1 Plextor PlexWriter 4/12X, Ultra SCSI
1 NVIDIA RIVA TNT 16 MB AGP graphics card
1 RTL 8139 10/100 PCI Ethernet card
1 Tekram TV card (Bt848)
1 AWE 32 sound card
1 standard floppy disk drive (1.44 MB)

I swapped the DIMM memory module with another one, and the
error persisted.
I physically swapped processor 1 with processor 2, and the
problem persisted.
The CPUs are not overclocked and are not overheating.

This is the procedure I used to detect the problem:

1 - Copy a large amount of files (20 to 30 MB) from an
MSDOS or VFAT filesystem to a regular ext2 filesystem.

2 - Use diff to check for any differences between the
copied files and the original ones.
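
For anyone who wants to reproduce the two steps above without
diff, here is a minimal Python sketch. It is not the tool used in
this report; the function names (files_identical, compare_trees,
copy_and_compare) are made up for illustration, and the comparison
is a plain byte-by-byte read of both files.

```python
import os
import shutil


def files_identical(path_a, path_b, chunk_size=65536):
    """Compare two files byte by byte, reading both in fixed-size chunks."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                return False
            if not a:  # both files ended at the same point
                return True


def compare_trees(src_dir, dst_dir):
    """Return the sorted relative paths under src_dir whose copy in dst_dir differs."""
    mismatches = []
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            src = os.path.join(root, name)
            rel = os.path.relpath(src, src_dir)
            dst = os.path.join(dst_dir, rel)
            # A missing copy counts as a mismatch too
            if not os.path.isfile(dst) or not files_identical(src, dst):
                mismatches.append(rel)
    return sorted(mismatches)


def copy_and_compare(src_dir, dst_dir):
    """Step 1: copy the tree (dst_dir must not exist yet). Step 2: compare it."""
    shutil.copytree(src_dir, dst_dir)
    return compare_trees(src_dir, dst_dir)
```

Here src_dir would be a directory on the MSDOS or VFAT mount and
dst_dir a new directory on ext2. An empty result means no
differences were found on that run.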

If this procedure is run under kernel-2.2.12-20smp, "diff"
will start detecting random errors in some (random) files.

If we run "diff" several times, it will detect differences
in different files/locations.
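
The observation that repeated runs report different locations can
be checked more directly by comparing the same pair of files
several times and recording where the first difference appears.
The sketch below is only an illustration under that assumption;
first_difference and compare_passes are hypothetical helper names,
and the whole file is read into memory, which is fine for files of
the size mentioned here.

```python
def read_all(path):
    """Read a whole file into memory as bytes."""
    with open(path, "rb") as f:
        return f.read()


def first_difference(a, b):
    """Return the offset of the first differing byte, or None if a == b."""
    if a == b:
        return None
    limit = min(len(a), len(b))
    for i in range(limit):
        if a[i] != b[i]:
            return i
    return limit  # one buffer is a strict prefix of the other


def compare_passes(original, copy, passes=3):
    """Re-read and compare both files `passes` times.

    Returns one entry per pass: the first differing offset, or None.
    """
    return [
        first_difference(read_all(original), read_all(copy))
        for _ in range(passes)
    ]
```

If every pass returns the same offset, the copy itself was probably
written wrong. If the offsets change between passes, the data is
more likely being damaged at read time. Note that repeated reads
may be served from the kernel's cache rather than from the disk, so
identical results across passes do not prove the data on disk is
good.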

If I run this test using kernel 2.2.5-15smp, or kernel
2.2.12-20 (without smp) it runs fine. Diff won't find any
differences.

I tried this test using other filesystems, such as ext2 and
iso9660, but I never had any errors.

To rule out surface errors on the disk, I formatted the same
partition first with mkfs.ext2 and then with mkfs.msdos.

On that same partition, I got errors with the MSDOS or
VFAT filesystem, but no errors with ext2.

I verified the integrity of the kernel-smp-2.2.12.20smp.rpm
RPM file, and the MD5 checksum is correct. This seems to
indicate that there were no errors in the download.
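
The report does not say which tool was used for the MD5 check. As
a rough illustration only, a downloaded file can be checked against
a published MD5 value as follows; md5_of_file and checksum_matches
are hypothetical names, and the expected value has to come from a
trusted source.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def checksum_matches(path, expected_hex):
    """Compare the file's MD5 digest with a published hex string."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

A match only shows that the file arrived intact. It says nothing
about whether the kernel inside the package behaves correctly.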

Right now, I'm using the old kernel 2.2.5 with SMP and
it's working fine.

I hope that this report will help to solve this
problem.

Best Regards,

Fernando Pereira

PS: I also had problems installing RPMs from DOS/VFAT
filesystems under the 2.2.12smp kernel. The installation stops
with a "cpio" bad magic error.

Also, all my "MSDOS" filesystems were formatted using
mkfs.msdos and never with DOS/Windows, since I only run Linux on
my systems.

Comment 1 Cristian Gafton 2000-01-04 22:27:59 UTC
Assigned to dledford

Comment 2 Graeme Vetterlein 2000-02-12 23:38:59 UTC
I may have hit a related problem.

Attempts to mount a VFAT file system while running an SMP kernel fail. Running
a UP kernel works just dandy. Simply having the line in /etc/fstab is enough
to stop the boot in its tracks. Taking it out boots fine. Then doing the mount
by hand results in a 'hung' mount command. This will not die with a kill -9
(prty > PZERO?) so I guess it's not hung on the user side. This also causes
the shutdown to fail as the filesystems cannot be unmounted.
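
A process that ignores kill -9 is often in uninterruptible sleep
inside the kernel, shown as state "D" by ps. Whether that is what
happened here is not confirmed in this report. The sketch below
reads the state from /proc/<pid>/status, assuming a Linux system
where that file has a "State:" line; parse_state, process_state and
is_uninterruptible are hypothetical names.

```python
def parse_state(status_text):
    """Return the one-letter state code from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            fields = line.split()
            return fields[1] if len(fields) > 1 else None
    return None


def process_state(pid):
    """Read /proc/<pid>/status (Linux only) and return the state letter."""
    with open("/proc/%d/status" % pid) as f:
        return parse_state(f.read())


def is_uninterruptible(pid):
    """True if the process is in state D (uninterruptible sleep)."""
    return process_state(pid) == "D"
```

Calling is_uninterruptible with the PID of the hung mount command
would indicate whether the process is blocked inside the kernel
rather than in user space.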

Comment 3 fjp 2000-02-13 02:36:59 UTC
Just upgraded my system to kernel 2.2.14 (SMP) that I downloaded from
kernel.org.

It seems to be working just fine.

Fernando Pereira
(fjp)

Comment 4 Alan Cox 2000-08-08 13:44:38 UTC
This seems to have been sorted in 2.2.14 (thus in 6.2) - reopen if not