Red Hat Bugzilla – Bug 474078
Kernel crash after copying files between partitions
Last modified: 2009-02-02 18:42:28 EST
Created attachment 325321
Description of problem:
This looks similar to bug 473757:
1) I also had no problems with Fedora 9.
2) The system is fully updated (kernel 2.6.27.5-117.fc10.i686).
If you copy a file of 512MB or larger from the Linux partition (ext3) to an NTFS partition (XP), the system crashes and all mounted partitions get corrupted, including the NTFS one.
1) Copy the Fedora 10 ISO from ext3 to a mounted NTFS partition.
2) Copy the Fedora 10 ISO to any other ext3 partition.
Steps to Reproduce:
1. cp any_big_file.iso /media/disk
2. cp any_big_file.iso /boot
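For a run where the machine survives the copy, the steps above can be extended with a read-back check to see whether the data itself was damaged. This is only a minimal sketch, not part of the original report: the helper names `sha256_of` and `copy_and_verify` are made up for illustration, and the destination directories are whatever mount points you are testing.

```python
import hashlib
import os
import shutil


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src, dst_dir):
    """Copy src into dst_dir and report whether the copy reads back identically."""
    dst = os.path.join(dst_dir, os.path.basename(src))
    shutil.copyfile(src, dst)
    return sha256_of(src) == sha256_of(dst)
```

Calling `copy_and_verify("any_big_file.iso", "/media/disk")` mirrors step 1. Note that the read-back may be served from the page cache rather than the disk, so a matching checksum does not prove the on-disk copy is intact.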
1) The system becomes inoperative.
2) There is no way to restart.
3) The computer stops responding.
4) You are forced to press reset button.
00:00.0 Host bridge: Advanced Micro Devices [AMD] RS780 Host Bridge
00:01.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge (int gfx)
00:0a.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge (PCIE port 5)
00:11.0 SATA controller: ATI Technologies Inc SB700/SB800 SATA Controller [IDE mode]
00:12.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller
00:12.1 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI1 Controller
00:12.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI Controller
00:13.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller
00:13.1 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI1 Controller
00:13.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI Controller
00:14.0 SMBus: ATI Technologies Inc SBx00 SMBus Controller (rev 3a)
00:14.1 IDE interface: ATI Technologies Inc SB700/SB800 IDE Controller
00:14.2 Audio device: ATI Technologies Inc SBx00 Azalia
00:14.3 ISA bridge: ATI Technologies Inc SB700/SB800 LPC host controller
00:14.4 PCI bridge: ATI Technologies Inc SBx00 PCI to PCI Bridge
00:14.5 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI2 Controller
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:05.0 VGA compatible controller: ATI Technologies Inc Radeon HD 3200 Graphics
01:05.1 Audio device: ATI Technologies Inc RS780 Azalia controller
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 02)
03:0e.0 FireWire (IEEE 1394): Texas Instruments TSB43AB23 IEEE-1394a-2000 Controller (PHY/Link)
CPU0: AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ stepping 01
CPU1: AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ stepping 01
powernow-k8: Found 1 AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ processors
scsi 0:0:0:0: Direct-Access ATA ST3400620AS 3.AA PQ: 0 ANSI: 5
scsi 2:0:0:0: Direct-Access ATA ST3400620AS 3.AA PQ: 0 ANSI: 5
ata1.00: ATA-7: ST3400620AS, 3.AAK, max UDMA/133
ata3.00: ATA-7: ST3400620AS, 3.AAK, max UDMA/133
ata5.00: ATAPI: SONY DVD RW DRU-830A, SS25, max UDMA/66
ata6.00: ATAPI: PIONEER DVD-RW DVR-216D, 1.09, max UDMA/66
Do you only see the corruption when copying to NTFS? What if it's ext3->ext3?
Yes, there is also corruption with ext3, because the kernel crashes immediately while the copy is running. Once it showed a message saying that the kernel had crashed and asking whether I wanted to send the report to the developers, but that message appeared only one time. In 99% of cases there is no log to send: no error messages, no core dump, nothing.
That kernel message will be critical to getting to the root cause of this problem.
A digital photo of the screen is fine if that is all you can do.
Believe me, I've tried that.
I've run 'tail -f /var/log/messages' full screen to see if I could catch any error before the crash, but the kernel crashes before the system can log anything.
I've searched the partitions for any core dump or error messages, but nothing was found. I believe it is impossible for any application to keep working or log an error if the kernel just crashes.
All I know for sure is that the problem lies with the F10 kernel and the latest F9 kernel update, because if I install F9 and leave it running all day, it works just fine. The problem begins after updating F9 or after installing F10.
The error I commonly see right after the "Kernel is Alive" message is:
ata1: device not ready
ata3: device not ready
This appears right after F10 is installed, and on F9 after the update.
So, try getting into single user mode or at least no X; tailing /var/log/messages won't help as that's a file that has to be written, and the crash will likely stop that from happening.
So ctrl-alt-F1 or whatnot to the primary console, then log in and do your copy from the commandline; you'll see the kernel panic (if that's what it is) on the text console.
Maybe do "dmesg -n 8" first for good measure.
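As a quick check that "dmesg -n 8" took effect, the console log level can be read back from /proc/sys/kernel/printk, whose first field is the current console level. A rough sketch, assuming the usual four-field layout of that file; the function names are illustrative only:

```python
from pathlib import Path

# Standard procfs location of the printk log-level settings on Linux.
PRINTK = Path("/proc/sys/kernel/printk")

FIELDS = (
    "console_loglevel",
    "default_message_loglevel",
    "minimum_console_loglevel",
    "default_console_loglevel",
)


def parse_printk(text):
    """Map the four whitespace-separated integers to their field names."""
    return dict(zip(FIELDS, map(int, text.split())))


def console_shows_everything(text):
    """True if messages of every priority (0-7) reach the console.

    The kernel prints a message on the console only when its priority
    number is lower than console_loglevel, so 8 lets all of them through.
    """
    return parse_printk(text)["console_loglevel"] >= 8
```

On the affected machine, `console_shows_everything(PRINTK.read_text())` should return True after running "dmesg -n 8" as root.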
Created attachment 326569
Fedora 10 Kernel crash 1
Fedora 10 Kernel crash 1, screen photo.
Created attachment 326570
Fedora 10 Kernel crash 2 (continuation)
Fedora 10 Kernel crash 1, screen photo. (continuation)
Created attachment 326571
Fedora 10 Kernel crash 3 (continuation)
Fedora 10 Kernel crash 1, screen photo. (continuation)
I've tried my best to give you as much information as possible.
The crash happens as the bug title says "Kernel crash after copying files between partitions".
I've just copied a 4GB ISO file from one ext3 partition to another ext3 partition.
I did try to copy the /var/log/messages that had logged the kernel panic, but as soon as I try, the computer gives a continuous, non-stop 'beep'.
Not to forget the message at the boot:
ata1: softreset device not ready
ata3: softreset device not ready
It's getting better :) can you scroll up to get the beginning of the oops, please. (pgup, or shift-pgup, or ctrl-pgup, etc... I never remember)
Sorry, the terminal freezes when this happens, but it would help if I could remember the kernel option that changes the resolution. I think it was something related to the frame buffer; maybe a 1280x1024 terminal would have enough space for the panic information.
Created attachment 326606
Kernel Panic 1280x1024 FB
This is the best I can do because the terminal freezes after the kernel panic. I've set the FB to 1280x1024 (added vga=794 after the quiet option).
This was done under the same conditions: copying a 4GB file from one partition to another (both ext3).
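For anyone wondering where 794 comes from: the vga= value is the decimal form of a VESA mode number, and 794 is 0x31A. A small sketch that decodes it, using the commonly documented VESA framebuffer mode table; whether a given mode is actually available depends on the video BIOS, and the function name is just for illustration:

```python
# Commonly documented VESA framebuffer modes: mode -> (width, height, bits per pixel).
VESA_MODES = {
    0x301: (640, 480, 8),   0x303: (800, 600, 8),
    0x305: (1024, 768, 8),  0x307: (1280, 1024, 8),
    0x311: (640, 480, 16),  0x314: (800, 600, 16),
    0x317: (1024, 768, 16), 0x31A: (1280, 1024, 16),
    0x312: (640, 480, 24),  0x315: (800, 600, 24),
    0x318: (1024, 768, 24), 0x31B: (1280, 1024, 24),
}


def describe_vga(value):
    """Describe a vga= value given as decimal ("794") or hex ("0x31A")."""
    mode = int(value, 0)  # base 0 accepts both decimal and 0x-prefixed input
    if mode not in VESA_MODES:
        return None
    width, height, bpp = VESA_MODES[mode]
    return f"{width}x{height}, {bpp} bpp (VESA mode 0x{mode:03X})"
```

Going the other way, `int("0x31B", 0)` gives the decimal value to put on the kernel command line for the 24-bit variant of the same resolution.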
Ok, that's a bit more of a clue... but, <shift><PgUp> does not do anything in this state?
I ask because from comment #8 it appears that you were still able to type at the prompt, even. Scrolling back to get the beginning of the oops would be super-helpful.
The line I can see, "Taint: G W" indicates that a warning happened some time in the past.... it'd be nice to know what that was.
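For reference, the taint letters can be decoded with a small lookup. This is a sketch based on the flag letters documented for 2.6-era kernels, not something taken from the attachments; newer kernels define more letters, and the function name is illustrative:

```python
# Taint letters as documented for 2.6-era kernels.
TAINT_FLAGS = {
    "G": "only GPL-compatible modules loaded (counterpart of 'P')",
    "P": "proprietary module loaded",
    "F": "module was force-loaded",
    "S": "SMP kernel running on CPUs not designed for SMP",
    "R": "module was force-unloaded",
    "M": "machine check exception occurred",
    "B": "bad page reference or unexpected page flags",
    "U": "taint requested by userspace",
    "D": "kernel has died recently (OOPS or BUG)",
    "A": "ACPI table overridden",
    "W": "a kernel warning was issued earlier",
}


def decode_taint(flags):
    """Return (letter, meaning) pairs for a taint string such as "G W"."""
    return [
        (letter, TAINT_FLAGS.get(letter, "unknown flag"))
        for letter in flags
        if not letter.isspace()
    ]
```

For the string in the photo, `decode_taint("G W")` yields the 'G' entry (no proprietary modules) and the 'W' entry (an earlier warning), which is the warning worth tracking down.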
This isn't looking like an ext3 bug per se; it's looking more like something in the VM, but let's see if we can possibly find out any more.
This looks vaguely like http://www.kerneloops.org/raw.php?rawid=65514&msgid= and others reported....
Sorry, no go.
That comment #8 thing was the only time I could do anything; it never happened again. Whatever key I press after the oops, the computer gives a continuous 'BEEP'. I tried 10 more times after that photo, but I stopped because every time I start Windows XP and run a disk scan it shows a lot of errors.
Is there any chance you're using an ext3 driver under Windows?
ext3 driver for windows?
Never heard of it.
(In reply to comment #16)
> ext3 driver for windows?
> Never heard of it.
Good! Now forget I mentioned it :)
Created attachment 327391
One more kernel crash
This time I removed gnome-desktop and xorg, keeping only the terminal, and did nothing, not even copying files; the kernel crashed by itself.
I've set up kdump to try to dump more information.
the list corruption is *probably* the result of some other memory corruption...
Here's an idea; can you try running the debug kernel variant, http://kojipkgs.fedoraproject.org/packages/kernel/2.6.27.5/117.fc10/i686/kernel-debug-2.6.27.5-117.fc10.i686.rpm, and see if you get any more info?
(yum install kernel-debug could do it for you)
Also, thank you for setting up kdump btw! Perhaps you can get the whole dmesg out of the crash image?
I will try Eric, thanks for the info about kernel-debug.
Okay I've found a bug in the -ati driver if you are running X when this happens.
and let me know if it helps.
Thanks Dave Airlie.
Could you please reply to Bug 473757 at https://bugzilla.redhat.com/show_bug.cgi?id=473757
Sorry, I'm not using Fedora 10 anymore; too many things are broken in this Fedora version. I was a Fedora user for many years, and Fedora 7 and 8 were the best Fedora releases I ever used.
The reporter of Bug 473757 has the same issues as I do and also has the same motherboard model. Maybe he can run these tests with the new driver.
Wellington, if ever you have some time to test, that would be greatly appreciated.
Closing as INSUFFICIENT_DATA for now, feel free to reopen and add comments especially if Dave's build fixes the problem.