Bug 129436

Summary: Kernel 2.6.7-1.494.2.2 oops while writing large dataset to vfat partition
Product: [Fedora] Fedora
Reporter: Paul W. Frields <stickster>
Component: kernel
Assignee: Arjan van de Ven <arjanv>
Status: CLOSED CURRENTRELEASE
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 2
CC: wtogami
Hardware: i686   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2004-09-01 21:39:27 UTC
Attachments:
Description                            Flags
Kernel oops message (hand copied)      none
Kernel panic messages (hand-copied)    none

Description Paul W. Frields 2004-08-09 02:05:10 UTC
Description of problem:
The kernel oopses during a large copy from an NTFS partition (/dev/hda1)
to a VFAT partition (/dev/hdc1), done for backup purposes. I've also seen
the kernel oops when doing a tar over the network to the same partition.
The source does not appear to matter, only that the destination is the
VFAT partition. The /dev/hdc drive is SMART capable and enabled, with no
failure warnings to date.

Version-Release number of selected component (if applicable):
2.6.7-1.494.2.2

How reproducible:
Every time

Steps to Reproduce:
1. cd /mnt/ntfs/source/path
2. find . | cpio -paumd /mnt/vfat/targetdir
 - OR -
1. On remote machine:
   tar c . | ssh user@fc2box "cat - > /mnt/vfat/targetdir/backup.tar"
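
Since the source of the data does not seem to matter, the write load can probably be exercised without NTFS or a second machine. The following is only a sketch under that assumption: the function name stress_write and the file name stress-test.bin are hypothetical, and /dev/zero stands in for the real data. It pipes a stream of known size into a file on the target and compares checksums.

```shell
#!/bin/sh
# Sketch (hypothetical helper, not part of the original report):
# stress_write DIR SIZE_MIB
# Pipes SIZE_MIB mebibytes into DIR/stress-test.bin and verifies the result.
stress_write() {
    target_dir=$1
    size_mib=$2
    out="$target_dir/stress-test.bin"   # hypothetical file name

    # Checksum of the source stream (zeros stand in for the real data).
    want=$(dd if=/dev/zero bs=1048576 count="$size_mib" 2>/dev/null | cksum)

    # Write the same stream through a pipe, as in the ssh example above.
    dd if=/dev/zero bs=1048576 count="$size_mib" 2>/dev/null | cat - > "$out" || return 1
    sync

    # Read the file back and compare CRC and byte count.
    got=$(cksum < "$out")
    if [ "$want" = "$got" ]; then
        echo ok
    else
        echo mismatch
        return 1
    fi
}
```

For example, stress_write /mnt/vfat/targetdir 2048 would push 2 GiB at the VFAT mount. On an affected kernel the expected failure is the oops itself rather than a "mismatch" line.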
  
Actual results:
Kernel oops. I have not been able to determine whether it occurs at the
same place every time; judging by the varying timing of the oops, it
does not appear to.

Expected results:
Flawless copy.

Additional info:
Kernel oops messages follow. (These are copied by hand and thus could
contain an error, although I do not think so.)

Comment 1 Paul W. Frields 2004-08-09 02:05:38 UTC
Created attachment 102507
Kernel oops message (hand copied)

Comment 2 Arjan van de Ven 2004-08-09 06:28:55 UTC
Just to rule it out: can you reproduce this without the ntfs module in play?

Comment 3 Paul W. Frields 2004-08-09 12:07:39 UTC
Just as an FYI, the second case I cited (unfortunately not the one for
which I wrote down the oops messages), in which a tar was written via ssh
to the same partition, did not involve ntfs. That module is not normally
loaded on my system unless I mount the /dev/hda1 partition, since I
use autofs to take care of it.

The machine's at home, so I will try to do this tonight and post the
resulting messages.

Comment 4 Paul W. Frields 2004-08-09 22:10:57 UTC
Created attachment 102548
Kernel panic messages (hand-copied)

Arjan,

I took the following steps to eliminate a few variables from the testing:

1. Did not mount NTFS.
2. Mounted VFAT partition at /mnt/prob.
3. Booted machine2 using FC2 install CD 1, rescue mode.
4. At shell on machine2 (192.168.1.2 = system running 494 kernel):
    dd if=/dev/hda5 bs=32k | ssh 192.168.1.2 "cat - > /mnt/prob/hda5-dd.img"

The attachment is a hand-copied text dump of the kernel panic that happened
this time. I'm not a programmer and don't know exactly what to make of this,
but I see a few areas of coincidence. The call trace on the "oops" I caught
last night started with ide_build_sglist, and tonight the panic has the EIP at
ide_build_sglist. Last night there were a lot of symbols for IDE/DMA stuff
listed; tonight the kernel BUG message flags include/asm/dma-mapping.h.
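
The comparison between the two traces could also be done mechanically. This is a minimal sketch, assuming both attachments are saved as text files and that symbols appear in the usual symbol+0xOFFSET/0xSIZE form; the function names trace_symbols and common_symbols are hypothetical.

```shell
#!/bin/sh
# Sketch (hypothetical helpers): list symbols that two trace files share.

# trace_symbols FILE
# Prints each distinct symbol name that appears as "name+0x..." in FILE.
trace_symbols() {
    grep -oE '[A-Za-z_][A-Za-z0-9_]*\+0x[0-9a-fA-F]+' "$1" \
        | sed 's/+0x.*//' \
        | sort -u
}

# common_symbols FILE_A FILE_B
# Prints the symbol names found in both files, one per line.
common_symbols() {
    a=$(mktemp)
    b=$(mktemp)
    trace_symbols "$1" > "$a"
    trace_symbols "$2" > "$b"
    comm -12 "$a" "$b"      # keep only lines present in both sorted lists
    rm -f "$a" "$b"
}
```

Running common_symbols on the two saved files (the file names are up to you) prints only the functions that show up in both nights' traces. Hand-copying errors in either file would hide a match, so an empty result is not conclusive.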

Hope this helps. Please feel free to ask for more information if I can provide
it. In case it helps, this machine is an Athlon Tbird 1.4GHz, Iwill KK266+ (VIA
KT133a) mobo, VIA VT82C686b southbridge (UDMA100); both drives are reported by
/proc/ide/via and hdparm to be running at UDMA5 (100).

Comment 5 Paul W. Frields 2004-08-22 03:06:07 UTC
Repeated test from comment #4 under new kernel 2.6.8-1.521 and results
are now as expected. Recommend closing this bug. Thanks Arjan.

Comment 6 Paul W. Frields 2004-08-22 03:09:09 UTC
Actually, don't close it quite yet. I am going to try the NTFS -> VFAT
test again, on the off chance that the problem has to do with
exercising IDE0 -> IDE1 on this system. My NTFS partition is on
/dev/hda1, the VFAT partition on /dev/hdc1. (And yes, I *do* keep
threatening my wife with removing W2K.)

Comment 7 Paul W. Frields 2004-09-01 21:39:27 UTC
Under 2.6.8-1.521, I ran:

  dd if=/dev/hda of=/mnt/vfat/ddtest.bin bs=32k

Again, /mnt/vfat is on /dev/hdc. I skipped using NTFS as the source
since that wouldn't tell us anything about IDE bus exercising per se.
I let it go for 4 GB (max reached; that partition uses 64K clusters)
with no problems. Consider this CLOSED, CURRENTRELEASE.
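
As a rough check on how much I/O that test covered: the comment does not say which limit was hit, so this sketch assumes it was the FAT per-file ceiling of 2^32 - 1 bytes. The helper name full_blocks is hypothetical; it just computes how many whole dd blocks fit under a byte limit.

```shell
#!/bin/sh
# Sketch (hypothetical helper): full_blocks LIMIT_BYTES BLOCK_SIZE
# Prints how many whole blocks of BLOCK_SIZE fit within LIMIT_BYTES.
full_blocks() {
    echo $(( $1 / $2 ))
}

# Assumption: the ceiling is the FAT per-file limit of 2^32 - 1 bytes.
fat_max_file=$(( 4294967296 - 1 ))
```

Under that assumption, full_blocks "$fat_max_file" 32768 prints 131071, i.e. the bs=32k run would have completed 131071 full writes before stopping on a partial block.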