Bug 474078 - Kernel crash after copying files between partitions
Summary: Kernel crash after copying files between partitions
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 10
Hardware: i386
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-12-02 01:03 UTC by Wellington Uemura
Modified: 2009-02-02 23:42 UTC (History)
CC List: 5 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-02-02 23:42:28 UTC
Type: ---
Embargoed:


Attachments
dmesg output (34.98 KB, text/plain)
2008-12-02 01:03 UTC, Wellington Uemura
Fedora 10 Kernel crash 1 (608.71 KB, image/jpeg)
2008-12-11 01:41 UTC, Wellington Uemura
Fedora 10 Kernel crash 2 (continuation) (589.94 KB, image/jpeg)
2008-12-11 01:42 UTC, Wellington Uemura
Fedora 10 Kernel crash 3 (continuation) (565.01 KB, image/jpeg)
2008-12-11 01:44 UTC, Wellington Uemura
Kernel Panic 1280x1024 FB (545.01 KB, image/jpeg)
2008-12-11 11:29 UTC, Wellington Uemura
One more kernel crash (2.38 KB, text/plain)
2008-12-18 23:48 UTC, Wellington Uemura

Description Wellington Uemura 2008-12-02 01:03:09 UTC
Created attachment 325321
dmesg output

Description of problem:
Similar to Bug 473757.

1) I also had no problems with Fedora 9.
2) The system is fully updated (kernel 2.6.27.5-117.fc10.i686).

If you copy a file of 512 MB or larger from the Linux partition (ext3) to an NTFS partition (XP), the system crashes and all mounted partitions get corrupted, including the NTFS one.

How reproducible:

1) Copy the Fedora 10 ISO from ext3 to a mounted NTFS partition.
2) Copy the Fedora 10 ISO to any other ext3 partition.

Steps to Reproduce:
1. cp any_big_file.iso /media/disk
2. cp any_big_file.iso /boot
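
The steps above can be scripted so that a damaged copy is also noticed when the machine survives the transfer. This is only a minimal sketch, assuming GNU coreutils (`cp`, `sync`, `sha256sum`) are installed; the function name `copy_and_verify` and the example paths are hypothetical and not part of the original report.

```shell
#!/bin/sh
# copy_and_verify SRC DEST_DIR
# Copies SRC into DEST_DIR, flushes dirty pages to disk, then compares
# checksums of the source and the copy. Returns 0 if they match.
copy_and_verify() {
    src=$1
    dest_dir=$2
    dest="$dest_dir/$(basename "$src")"

    cp -- "$src" "$dest" || return 1
    sync    # force the write-out; the crash was reported during large copies

    src_sum=$(sha256sum < "$src" | cut -d' ' -f1)
    dest_sum=$(sha256sum < "$dest" | cut -d' ' -f1)

    if [ "$src_sum" = "$dest_sum" ]; then
        echo "OK: $dest matches $src"
    else
        echo "MISMATCH: $dest differs from $src" >&2
        return 1
    fi
}

# Example (hypothetical paths, mirroring the steps in the report):
#   copy_and_verify any_big_file.iso /media/disk
```

Note that the read-back right after the copy may be served from the page cache, so a match here does not prove the on-disk data is intact; remounting the target before comparing would be a stricter check.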

  
Actual results:

1) The system becomes inoperative.
2) There is no way to restart.
3) The computer stops responding.
4) You are forced to press the reset button.

Additional info:

lspci:

00:00.0 Host bridge: Advanced Micro Devices [AMD] RS780 Host Bridge
00:01.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge (int gfx)
00:0a.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge (PCIE port 5)
00:11.0 SATA controller: ATI Technologies Inc SB700/SB800 SATA Controller [IDE mode]
00:12.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller
00:12.1 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI1 Controller
00:12.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI Controller
00:13.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller
00:13.1 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI1 Controller
00:13.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI Controller
00:14.0 SMBus: ATI Technologies Inc SBx00 SMBus Controller (rev 3a)
00:14.1 IDE interface: ATI Technologies Inc SB700/SB800 IDE Controller
00:14.2 Audio device: ATI Technologies Inc SBx00 Azalia
00:14.3 ISA bridge: ATI Technologies Inc SB700/SB800 LPC host controller
00:14.4 PCI bridge: ATI Technologies Inc SBx00 PCI to PCI Bridge
00:14.5 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI2 Controller
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:05.0 VGA compatible controller: ATI Technologies Inc Radeon HD 3200 Graphics
01:05.1 Audio device: ATI Technologies Inc RS780 Azalia controller
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 02)
03:0e.0 FireWire (IEEE 1394): Texas Instruments TSB43AB23 IEEE-1394a-2000 Controller (PHY/Link)

CPU0: AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ stepping 01
CPU1: AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ stepping 01
powernow-k8: Found 1 AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ processors 
Athlon 4400+

scsi 0:0:0:0: Direct-Access     ATA      ST3400620AS      3.AA PQ: 0 ANSI: 5
scsi 2:0:0:0: Direct-Access     ATA      ST3400620AS      3.AA PQ: 0 ANSI: 5
ata1.00: ATA-7: ST3400620AS, 3.AAK, max UDMA/133
ata3.00: ATA-7: ST3400620AS, 3.AAK, max UDMA/133

ata5.00: ATAPI: SONY    DVD RW DRU-830A, SS25, max UDMA/66
ata6.00: ATAPI: PIONEER DVD-RW  DVR-216D, 1.09, max UDMA/66

Comment 1 Eric Sandeen 2008-12-10 17:51:05 UTC
Do you only see the corruption when copying to NTFS?  What if it's ext3->ext3?

Comment 2 Wellington Uemura 2008-12-10 20:22:07 UTC
Yes, there is also corruption with ext3, because the kernel crashes immediately when you do a copy. Once, a message appeared saying that the kernel had crashed and asking whether I wanted to send the report to the developers, but that happened only one time. In 99% of cases there is no log to send: no error messages, no core dump, nothing.

Comment 3 Eric Sandeen 2008-12-10 20:30:01 UTC
That kernel message will be critical to getting to the root cause of this problem.

A digital photo of the screen is fine if that is all you can do.

Comment 4 Wellington Uemura 2008-12-10 22:56:19 UTC
Believe me, I've tried that.

I've run 'tail -f /var/log/messages' full screen to see if I could catch any error before the crash, but the kernel crashes before the system can log anything.

I've searched the partitions for any core dump or error messages, but nothing was found. I believe it is impossible for any application to work or log an error if the kernel just crashes.

All I know for sure is that the problem lies with the F10 kernel and the latest F9 kernel update, because if I install F9 and leave it running all day long, it works just fine. The problem begins after updating F9 or after installing F10.

The common error that I see right after the "Kernel is Alive" message is:

ata1: device not ready
ata3: device not ready

This appears right after F10 is installed, and on F9 after the update.

Comment 5 Eric Sandeen 2008-12-10 23:07:31 UTC
So, try getting into single user mode or at least no X; tailing /var/log/messages won't help as that's a file that has to be written, and the crash will likely stop that from happening.

So ctrl-alt-F1 or whatnot to the primary console, then log in and do your copy from the commandline; you'll see the kernel panic (if that's what it is) on the text console.

Maybe do "dmesg -n 8" first for good measure.
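
For context on the `dmesg -n 8` suggestion: the console log level is the first of the four numbers in `/proc/sys/kernel/printk`, and only messages with a priority value below it are printed on the console. The following is a small sketch for checking the current value before starting the copy; the function name `console_loglevel` is an illustrative assumption, and the optional argument exists only so the parsing can be exercised without reading `/proc`.

```shell
#!/bin/sh
# console_loglevel [PRINTK_LINE]
# Prints the console log level, i.e. the first field of
# /proc/sys/kernel/printk. An explicit line may be passed for testing.
console_loglevel() {
    if [ $# -gt 0 ]; then
        line=$1
    else
        line=$(cat /proc/sys/kernel/printk) || return 1
    fi
    # Word-split on whitespace (spaces or tabs) and keep the first field.
    set -- $line
    echo "$1"
}

# Example: warn if the console would not show every kernel message.
#   [ "$(console_loglevel)" -ge 8 ] || echo "run 'dmesg -n 8' as root first"
```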

-Eric

Comment 6 Wellington Uemura 2008-12-11 01:41:25 UTC
Created attachment 326569
Fedora 10 Kernel crash 1

Fedora 10 Kernel crash 1, screen photo.

Comment 7 Wellington Uemura 2008-12-11 01:42:35 UTC
Created attachment 326570
Fedora 10 Kernel crash 2 (continuation)

Fedora 10 Kernel crash 1, screen photo. (continuation)

Comment 8 Wellington Uemura 2008-12-11 01:44:45 UTC
Created attachment 326571
Fedora 10 Kernel crash 3 (continuation)

Fedora 10 Kernel crash 1, screen photo. (continuation)

Comment 9 Wellington Uemura 2008-12-11 01:50:30 UTC
I've tried my best to give you as much information as possible.

The crash happens as the bug title says: "Kernel crash after copying files between partitions".

I've just copied a 4GB ISO file from one ext3 partition to another ext3 partition.

I did try to copy the /var/log/messages file that had logged the kernel panic, but as soon as I try, the computer gives a continuous, non-stop 'beep'.

Not to forget the message at the boot:

ata1: softreset device not ready
ata3: softreset device not ready

Comment 10 Eric Sandeen 2008-12-11 02:43:04 UTC
It's getting better :)  can you scroll up to get the beginning of the oops, please.  (pgup, or shift-pgup, or ctrl-pgup, etc... I never remember)

Comment 11 Wellington Uemura 2008-12-11 02:51:26 UTC
Sorry, the terminal freezes when this happens. It would help if I could remember the kernel option that changes the resolution; I think it was something related to the framebuffer. Maybe a 1280x1024 terminal would have enough space for the panic information.

Comment 12 Wellington Uemura 2008-12-11 11:29:30 UTC
Created attachment 326606
Kernel Panic 1280x1024 FB

This is the best I can do because the terminal freezes after the kernel panic. I've set the framebuffer to 1280x1024 (by adding vga=794 after the quiet option).

This was done under the same conditions: copying a 4GB file from one partition to another (both ext3).
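
The `vga=` boot option takes a video mode number, and the mode tables usually list these in hexadecimal. Here is a quick sketch of the conversion for the value used above; the helper name `vga_mode_hex` is hypothetical. Decimal 794 is 0x31A, which the commonly cited VESA framebuffer mode tables give as 1280x1024 at 16 bits per pixel.

```shell
#!/bin/sh
# vga_mode_hex DECIMAL_MODE
# Prints the hexadecimal form of a decimal vga= mode number.
vga_mode_hex() {
    printf '0x%X\n' "$1"
}

vga_mode_hex 794    # prints 0x31A
```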

Comment 13 Eric Sandeen 2008-12-11 17:21:09 UTC
Ok, that's a bit more of a clue... but, <shift><PgUp> does not do anything in this state?

I ask because from comment #8 it appears that you were still able to type at the prompt, even.  Scrolling back to get the beginning of the oops would be super-helpful.

The line I can see, "Taint: G      W" indicates that a warning happened some time in the past.... it'd be nice to know what that was.
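
As background for the taint flags: the kernel also exposes its taint state as a bitmask in `/proc/sys/kernel/tainted`, and the `W` letter corresponds to the "a warning was issued" bit (bit 9, value 512, in the mainline kernel documentation). Below is a minimal sketch for checking that bit on a system that is still running; the function name `taint_has_warning` is hypothetical, and the bit position is assumed to match the running kernel.

```shell
#!/bin/sh
# taint_has_warning [MASK]
# Succeeds if the "warning issued" taint bit (bit 9, value 512) is set.
# MASK defaults to the running kernel's /proc/sys/kernel/tainted.
taint_has_warning() {
    if [ $# -gt 0 ]; then
        mask=$1
    else
        mask=$(cat /proc/sys/kernel/tainted) || return 2
    fi
    [ $(( mask & 512 )) -ne 0 ]
}

# Example:
#   taint_has_warning && echo "a kernel warning was logged since boot; check dmesg"
```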

This isn't looking like an ext3 bug per se; it's looking more like something in the VM, but let's see if we can possibly find out any more.

This looks vaguely like http://www.kerneloops.org/raw.php?rawid=65514&msgid= and others reported....

Comment 14 Wellington Uemura 2008-12-11 18:23:33 UTC
Sorry, no go.

That comment #8 thing was the only time that I could do something; it never happened again. Whatever key I press after the oops, the computer gives a continuous 'BEEP'. I tried 10 more times after that photo, but I stopped because every time I start Windows XP and run a disk scan it shows a lot of errors.

Comment 15 Eric Sandeen 2008-12-11 21:18:59 UTC
Is there any chance you're using an ext3 driver under Windows?

Comment 16 Wellington Uemura 2008-12-11 22:50:09 UTC
ext3 driver for windows?
Never heard of it.

Comment 17 Eric Sandeen 2008-12-11 22:55:55 UTC
(In reply to comment #16)
> ext3 driver for windows?
> Never heard of it.

Good!   Now forget I mentioned it :)

Comment 18 Wellington Uemura 2008-12-18 23:48:19 UTC
Created attachment 327391
One more kernel crash

This time I removed gnome-desktop and xorg, keeping only the terminal. While doing nothing, not even copying files, the kernel crashed by itself.

I've set up kdump to try to capture more information.

Comment 19 Eric Sandeen 2008-12-18 23:55:00 UTC
the list corruption is *probably* the result of some other memory corruption... 

Here's an idea; can you try running the debug kernel variant, http://kojipkgs.fedoraproject.org/packages/kernel/2.6.27.5/117.fc10/i686/kernel-debug-2.6.27.5-117.fc10.i686.rpm, and see if you get any more info?

(yum install kernel-debug could do it for you)

-Eric

Comment 20 Eric Sandeen 2008-12-18 23:59:15 UTC
Also, thank you for setting up kdump btw!  Perhaps you can get the whole dmesg out of the crash image?

Comment 21 Wellington Uemura 2008-12-19 00:30:25 UTC
I will try Eric, thanks for the info about kernel-debug.

Comment 22 Dave Airlie 2009-01-29 23:18:15 UTC
Okay I've found a bug in the -ati driver if you are running X when this happens.

please grab 
http://kojipkgs.fedoraproject.org/packages/xorg-x11-drv-ati/6.10.0/2.fc10/

and let me know if it helps.

Comment 23 Wellington Uemura 2009-01-30 01:12:29 UTC
Thanks Dave Airlie.

Could you please reply to Bug 473757 at https://bugzilla.redhat.com/show_bug.cgi?id=473757

Sorry, I'm not using Fedora 10 anymore; too many things are broken in this Fedora version. I was a Fedora user for many years, and Fedora 7 and 8 were the best Fedora releases I ever used.

The reporter of Bug 473757 has the same issues as I do and also has the same motherboard model. Maybe he can run these tests with the new driver.

Thank you.

Comment 24 François Cami 2009-02-02 23:42:28 UTC
Wellington, if ever you have some time to test, that would be greatly appreciated.
Closing as INSUFFICIENT_DATA for now, feel free to reopen and add comments especially if Dave's build fixes the problem.

