Bug 473757 - Filesystem corruption after install of Fedora 10
Status: CLOSED WONTFIX
Product: Fedora
Classification: Fedora
Component: xorg-x11-drv-ati
Version: 10
Hardware: x86_64 Linux
Priority: medium Severity: high
Assigned To: Dave Airlie
QA Contact: Fedora Extras Quality Assurance
Keywords: Triaged
Reported: 2008-11-30 05:51 EST by Darryl Bond
Modified: 2009-12-18 02:02 EST
Last Closed: 2009-12-18 02:02:44 EST

Attachments:
Xorg log for ATI driver when machine segfaulting on pretty much everything (89.02 KB, text/plain)
2008-12-18 03:42 EST, Darryl Bond
Xorg log for drv-ati-6.9.0-62, which hasn't had any problems (so far) (132.08 KB, text/plain)
2008-12-18 03:43 EST, Darryl Bond
Xorg log for radeonhd which did not display any problems (181.49 KB, text/plain)
2008-12-18 03:44 EST, Darryl Bond

Description Darryl Bond 2008-11-30 05:51:35 EST
Description of problem:
Severe root filesystem corruption after running F10 for a few hours. The box has run F9 for some time, and a re-install of F9 has not exhibited the problem.

Version-Release number of selected component (if applicable):
Fedora 10 kernel-2.6.27.5-117.fc10.x86_64

How reproducible:
I did 4 clean installs with similar results.
* Errors such as directories seen as Stale NFS handles.
* Files with attributes like ?--- --- ---
* Filesystem would end up so corrupt that the machine would freeze and would not reboot.



Steps to Reproduce:
1. Install F10 from DVD (successful).
2. Run yum update (successful).
3. Reboot the machine after the update.
4. Leave the machine running for a few hours.
 
Actual results:
The output of fsck -n /dev/sda1 (very truncated). The machine would not boot after 3 hours running after I did this fsck.

fsck 1.41.3 (12-Oct-2008)
Warning!  /dev/sda1 is mounted.
Warning: skipping journal recovery because doing a read-only filesystem check.
/ contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Inode 229106 is in use, but has dtime set.  Fix? no

Inode 229107 is in use, but has dtime set.  Fix? no

Inode 229107, i_blocks is 33554432, should be 0.  Fix? no

Inode 229108 is in use, but has dtime set.  Fix? no

Inode 229108, i_blocks is 83886080, should be 0.  Fix? no

Inode 229109 is in use, but has dtime set.  Fix? no

Inode 229109, i_blocks is 218103808, should be 0.  Fix? no

Inode 229110 is in use, but has dtime set.  Fix? no

Inode 229110, i_blocks is 874652194, should be 0.  Fix? no

Inode 229111 is in use, but has dtime set.  Fix? no

Inode 229111 has imagic flag set.  Clear? no

Inode 229111 has compression flag set on filesystem without compression support.  Clear? no

Inode 229111, i_blocks is 4294967295, should be 0.  Fix? no

Inode 229112 has EXTENTS_FL flag set on filesystem without extents support.
Clear? no

Inode 229112 is in use, but has dtime set.  Fix? no

Inode 229112 has imagic flag set.  Clear? no

Inode 229112 has compression flag set on filesystem without compression support.  Clear? no

Inode 229112 has INDEX_FL flag set but is not a directory.
Clear HTree index? no

HTREE directory inode 229112 has an invalid root node.
Clear HTree index? no

Inode 229112, i_blocks is 4289111718, should be 0.  Fix? no

Inode 229113 is in use, but has dtime set.  Fix? no

Inode 229113 has imagic flag set.  Clear? no

Inode 229113 has compression flag set on filesystem without compression support.  Clear? no

Inode 229113, i_blocks is 4279638318, should be 0.  Fix? no

Inode 229114 has EXTENTS_FL flag set on filesystem without extents support.
Clear? no

Inode 229114 is in use, but has dtime set.  Fix? no

Inode 229114 has compression flag set on filesystem without compression support.  Clear? no

Inode 229114, i_blocks is 4278914589, should be 0.  Fix? no

Inode 229115 is in use, but has dtime set.  Fix? no

Inode 229115, i_blocks is 4278848025, should be 0.  Fix? no

Inode 229116 is in use, but has dtime set.  Fix? no

Inode 229116 has compression flag set on filesystem without compression support.  Clear? no

Inode 229116, i_blocks is 4278716435, should be 0.  Fix? no

Inode 230210 has INDEX_FL flag set but is not a directory.
Clear HTree index? no

HTREE directory inode 230210 has an invalid root node.
Clear HTree index? no

Inode 230210, i_blocks is 4293914608, should be 0.  Fix? no

... etc

Expected results:


Additional info:
Tried a few things:
* Tried different RAM, same results.
* Updated BIOS
* Motherboard GA-MA78GM-S2H (onboard ATI HD 3200) with Sempron CPU
* Reinstalled F9 and did yum update, no problems.
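
One way to look at the bogus i_blocks values in the fsck output above is to print them in hexadecimal. This is only a small sketch: the helper name `to_hex32` is made up for illustration, and the conversion on its own says nothing about what wrote the values.

```bash
#!/usr/bin/env bash
# to_hex32 (hypothetical helper): print each decimal argument
# next to its value as 8 hexadecimal digits.
to_hex32() {
    local n
    for n in "$@"; do
        printf '%s = 0x%08X\n' "$n" "$n"
    done
}

# A few i_blocks values copied from the fsck output in the description.
to_hex32 33554432 83886080 218103808 4294967295
```

The first three values have only the top byte set (0x02000000, 0x05000000, 0x0D000000) and the last is 0xFFFFFFFF, which are unlikely to be real block counts for inodes that fsck expects to be empty.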
Comment 1 Wellington Uemura 2008-12-01 19:35:36 EST
I can confirm this bug. I have a GA-MA78GM-S2H Rev 1.0, F5 BIOS, Athlon 64 X2 4400.
Comment 2 Darryl Bond 2008-12-09 04:48:11 EST
What information would help get this moving?

The machine has run under F9 perfectly since I posted the bug.

I re-installed F10 on another partition and it was useless within an hour. There does not seem to be corruption on any partition other than /.
The home partition and filesystem was created under F9.
Comment 3 Eric Sandeen 2008-12-09 15:02:34 EST
Are there any other relevant error messages, any IO errors or the like?

For the posted fsck output in the original description:

> Warning!  /dev/sda1 is mounted.
> Warning: skipping journal recovery because doing a read-only filesystem check.

Pointing fsck at a mounted filesystem is expected to show errors.  What sort of user-visible errors do you see when you say "useless within an hour?"

Can you please look at system logs etc to see if there is any more info, and/or fsck the root filesystem from a livecd or rescue disk, to get a better picture of what things look like...

Thanks,
-Eric
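
A minimal sketch of the point about not checking a mounted filesystem: refuse to run the read-only check if the device shows up as a mount source. The helper names `is_mounted` and `safe_fsck` are hypothetical and not part of e2fsprogs. The only real commands used are `awk` and `fsck -f -n`, which already appears in this report.

```bash
#!/usr/bin/env bash
# is_mounted DEVICE [MOUNTS_FILE]  (hypothetical helper)
# Succeeds if DEVICE appears as a mount source in MOUNTS_FILE,
# which defaults to /proc/mounts.
is_mounted() {
    local dev=$1 mounts=${2:-/proc/mounts}
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$mounts"
}

# safe_fsck DEVICE  (hypothetical helper)
# Run a forced, read-only check, but only on a device that is not mounted.
safe_fsck() {
    local dev=$1
    if is_mounted "$dev"; then
        echo "refusing: $dev is mounted; use a live CD or rescue disk" >&2
        return 1
    fi
    fsck -f -n "$dev"
}
```

Usage would be something like `safe_fsck /dev/sda6` from the other install. The root filesystem can be listed in /proc/mounts under another name such as /dev/root, so a "not mounted" answer for the root device should be treated with care.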
Comment 4 Darryl Bond 2008-12-10 04:48:04 EST
There are no disk errors, if that is what you mean.
I set up grub on F9 to point to and boot F10 soon after installing it. I also turned off the rhgb and quiet options. I then booted the machine and it ran fine. I did several reboots and ran it for a while without anything obvious.

I then did a yum update of the kernel to pick up the 2.6.27.7 kernel and copied the grub config to the F9 grub.conf but forgot to get rid of the rhgb and quiet.

I booted it and it pretty much died straight away; the gdm login did not come up. I tried to ssh in from elsewhere and got:
[dbond@gold ~]$ ssh root@192.168.12.120
root@192.168.12.120's password: 
Last login: Wed Dec 10 18:11:41 2008 from 192.168.12.100
/bin/bash: Exec format error
Connection to 192.168.12.120 closed.

I could not reboot it and had to reset it. Next time, the result was similar but the ssh session looked like this:
[dbond@gold ~]$ ssh root@192.168.12.120
root@192.168.12.120's password: 
Last login: Wed Dec 10 18:11:50 2008 from 192.168.12.100
-bash: error while loading shared libraries: !!/: cannot open shared object file: No such file or directory
Connection to 192.168.12.120 closed.

I reset again (had to, it was unresponsive on the console) and booted with rhgb and quiet off but on the new kernel. The boot completed and I could log in.

I rebooted onto F9 and did a fsck of the F10 root filesystem:
[root@tin ~]# fsck -f -n /dev/sda6
fsck 1.41.3 (12-Oct-2008)
e2fsck 1.41.3 (12-Oct-2008)
Pass 1: Checking inodes, blocks, and sizes
Inode 179953 is in use, but has dtime set.  Fix? no

Inode 179953, i_blocks is 352321536, should be 0.  Fix? no

Inode 179954 is in use, but has dtime set.  Fix? no

Inode 179954, i_blocks is 621743887, should be 0.  Fix? no

Inode 179955 is in use, but has dtime set.  Fix? no

Inode 179955 has imagic flag set.  Clear? no

Inode 179955 has compression flag set on filesystem without compression support.  Clear? no

Inode 179955 has INDEX_FL flag set but is not a directory.
Clear HTree index? no

HTREE directory inode 179955 has an invalid root node.
Clear HTree index? no

Inode 179955, i_blocks is 3215369894, should be 0.  Fix? no

Inode 179956 is in use, but has dtime set.  Fix? no

Inode 179956 has imagic flag set.  Clear? no

Inode 179956 has INDEX_FL flag set but is not a directory.
Clear HTree index? no

HTREE directory inode 179956 has an invalid root node.
Clear HTree index? no

Inode 179956, i_blocks is 4282400832, should be 0.  Fix? no

Inode 179957 is in use, but has dtime set.  Fix? no

Inode 179957, i_blocks is 469762048, should be 0.  Fix? no

Inode 179958 is in use, but has dtime set.  Fix? no

Inode 179958, i_blocks is 251658240, should be 0.  Fix? no

Inode 179959 is in use, but has dtime set.  Fix? no

Inode 179959, i_blocks is 83886080, should be 0.  Fix? no

Inode 179960 is in use, but has dtime set.  Fix? no

Inode 179960, i_blocks is 16777216, should be 0.  Fix? no

Inode 344513 is in use, but has dtime set.  Fix? no

Inode 344513, i_blocks is 16777216, should be 0.  Fix? no

Inode 344514 is in use, but has dtime set.  Fix? no

Inode 344514, i_blocks is 83886080, should be 0.  Fix? no

Inode 344515 is in use, but has dtime set.  Fix? no

Inode 344515 has imagic flag set.  Clear? no

Inode 344515 has INDEX_FL flag set but is not a directory.
Clear HTree index? no

HTREE directory inode 344515 has an invalid root node.
Clear HTree index? no

Inode 344515, i_blocks is 4282400832, should be 0.  Fix? no

Inode 344516 has EXTENTS_FL flag set on filesystem without extents support.
Clear? no

Inode 344516 is in use, but has dtime set.  Fix? no
.....
Inode 344524 (/usr/lib64/python2.5/lib-dynload/cStringIO.so) has invalid mode (00).
Clear? no

Entry 'cStringIO.so' in /usr/lib64/python2.5/lib-dynload (344489) has an incorrect filetype (was 1, should be 0).
Fix? no

Inode 344525 (/usr/lib64/python2.5/lib-dynload/cmathmodule.so) has invalid mode (00).
Clear? no

Entry 'cmathmodule.so' in /usr/lib64/python2.5/lib-dynload (344489) has an incorrect filetype (was 1, should be 0).
Fix? no

Inode 344527 (/usr/lib64/python2.5/lib-dynload/cryptmodule.so) has invalid mode (00).
Clear? no

Entry 'cryptmodule.so' in /usr/lib64/python2.5/lib-dynload (344489) has an incorrect filetype (was 1, should be 0).
Fix? no

Inode 344528 (/usr/lib64/python2.5/lib-dynload/datetime.so) has invalid mode (00).
Clear? no

Entry 'datetime.so' in /usr/lib64/python2.5/lib-dynload (344489) has an incorrect filetype (was 1, should be 0).
Fix? no

Entry 'und' in /usr/share/locale (179254) has deleted/unused inode 179700.  Clear? no

Entry 'und' in /usr/share/locale (179254) has an incorrect filetype (was 2, should be 0).
Fix? no

Entry 'kg' in /usr/share/locale (179254) has deleted/unused inode 179697.  Clear? no

Entry 'kg' in /usr/share/locale (179254) has an incorrect filetype (was 2, should be 0).
Fix? no

Entry 'mis' in /usr/share/locale (179254) has deleted/unused inode 179698.  Clear? no

Entry 'mis' in /usr/share/locale (179254) has an incorrect filetype (was 2, should be 0).
Fix? no

Entry 'smn' in /usr/share/locale (179254) has deleted/unused inode 179699.  Clear? no

Entry 'smn' in /usr/share/locale (179254) has an incorrect filetype (was 2, should be 0).
Fix? no

Entry 'libsqlite3.so.0' in /usr/lib64 (179204) has deleted/unused inode 179965.  Clear? no

Entry 'libsqlite3.so.0' in /usr/lib64 (179204) has an incorrect filetype (was 7, should be 0).
Fix? no

Inode 179954 (/usr/lib64/libusbpp-0.1.so.4.4.4) has invalid mode (00).
Clear? no

Entry 'libusbpp-0.1.so.4.4.4' in /usr/lib64 (179204) has an incorrect filetype (was 1, should be 0).
Fix? no

Inode 179953 (/usr/lib64/libusbpp-0.1.so.4) has invalid mode (00).
Clear? no

Entry 'libusbpp-0.1.so.4' in /usr/lib64 (179204) has an incorrect filetype (was 7, should be 0).
Fix? no

Entry 'bmp2tiff' in /usr/bin (179196) has deleted/unused inode 179967.  Clear? no

Entry 'bmp2tiff' in /usr/bin (179196) has an incorrect filetype (was 1, should be 0).
Fix? no

Entry 'fax2ps' in /usr/bin (179196) has deleted/unused inode 179968.  Clear? no

Entry 'fax2ps' in /usr/bin (179196) has an incorrect filetype (was 1, should be 0).
Fix? no

Entry 'libpaper-1.1.23' in /usr/share/doc (179172) has deleted/unused inode 229920.  Clear? no

Entry 'libpaper-1.1.23' in /usr/share/doc (179172) has an incorrect filetype (was 2, should be 0).
Fix? no

Entry 'portreserve-0.0.3' in /usr/share/doc (179172) has deleted/unused inode 229906.  Clear? no

Entry 'portreserve-0.0.3' in /usr/share/doc (179172) has an incorrect filetype (was 2, should be 0).
Fix? no

Inode 179960 (/usr/lib64/libhistory.so.5) has invalid mode (00).
Clear? no

Entry 'libhistory.so.5' in /usr/lib64 (179204) has an incorrect filetype (was 7, should be 0).
Fix? no

Entry 'sensors.1.gz' in /usr/share/man/man1 (195603) has deleted/unused inode 229914.  Clear? no

etc.

I did a fsck -y and rebooted and tried different combinations of rhgb and quiet. It is hard to say but the quiet option seemed to break every time and rhgb sometimes. I expect that the filesystem faults that were repaired have left some debris and may be causing some problems.

I think it is the quiet option that does the damage. I will do another clean install and try turning off the quiet option and see if it is reliable then.
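
Since the fsck output runs to many screens, a rough tally per message type can make two runs easier to compare. The following is a sketch only: `fsck_summary` is a made-up helper, and the patterns are simply the message fragments that appear in the output pasted above, not a complete list of e2fsck messages.

```bash
#!/usr/bin/env bash
# fsck_summary LOGFILE  (hypothetical helper)
# Count how many lines of a saved fsck log match each message fragment.
fsck_summary() {
    local log=$1 pattern
    for pattern in \
        'is in use, but has dtime set' \
        'i_blocks is [0-9]+, should be' \
        'has imagic flag set' \
        'has invalid mode' \
        'has an incorrect filetype' \
        'has deleted/unused inode'
    do
        printf '%6d  %s\n' "$(grep -Ec "$pattern" "$log")" "$pattern"
    done
}
```

For example, `fsck -f -n /dev/sda6 > fsck.log 2>&1; fsck_summary fsck.log` (the log file name is arbitrary) prints one count per fragment, with 0 for fragments that never occur.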
Comment 5 Darryl Bond 2008-12-10 06:32:18 EST
Scratch that.
I did a clean install, and fixed the grub.conf to not use rhgb or quiet. I never did a graphical or quiet boot. The first boot was fine. I rebooted into F9 and fscked the filesystem (it was clean).
I did a yum update kernel.

Booted again and could not log in, also the CTRL-ALT-Fx keys did not respond.
I could not reboot so reset and did a fsck -f in F9. It was clean even after using the reset button.

I booted F10 again and successfully logged in and after a while saw this in the messages file:
Dec 10 21:19:57 localhost gnome-session[2538]: WARNING: Could not parse desktop file /etc/xdg/autostart/gnome-user-share.desktop: Stale NFS file handle
Dec 10 21:19:57 localhost gnome-session[2538]: WARNING: could not read /etc/xdg/autostart/gnome-user-share.desktop
[root@tin ~]# ls -l /etc/xdg/autostart/
ls: cannot access /etc/xdg/autostart/gnome-user-share.desktop: Stale NFS file handle
total 56
-rw-r--r-- 1 root root 6418 2008-11-07 10:46 gnome-at-session.desktop
-rw-r--r-- 1 root root 1923 2008-10-24 23:32 gnome-settings-daemon.desktop
-????????? ? ?    ?       ?                ? gnome-user-share.desktop
-rw-r--r-- 1 root root 4966 2008-11-11 00:57 gpk-update-icon.desktop
-rw-r--r-- 1 root root  286 2008-10-27 22:50 imsettings-applet.desktop
-rw-r--r-- 1 root root  238 2008-09-14 01:42 kerneloops-applet.desktop
-rw-r--r-- 1 root root  206 2008-02-18 12:56 krb5-auth-dialog.desktop
-rw-r--r-- 1 root root  290 2008-10-28 13:46 nm-applet.desktop
-rw-r--r-- 1 root root  197 2008-11-02 07:14 pulseaudio.desktop
-rw-r--r-- 1 root root 6431 2008-10-22 03:01 redhat-print-applet.desktop
-rw-r--r-- 1 root root  212 2008-10-16 07:36 sealertauto.desktop
-rw-r--r-- 1 root root 4039 2008-09-08 19:58 user-dirs-update-gtk.desktop

I did a find / -print for a while and dmesg shows
init_special_inode: bogus i_mode (0)
init_special_inode: bogus i_mode (0)
init_special_inode: bogus i_mode (0)
init_special_inode: bogus i_mode (0)
init_special_inode: bogus i_mode (0)
init_special_inode: bogus i_mode (0)
init_special_inode: bogus i_mode (0)
init_special_inode: bogus i_mode (0)
init_special_inode: bogus i_mode (0)
init_special_inode: bogus i_mode (0)
init_special_inode: bogus i_mode (0)
init_special_inode: bogus i_mode (0)
Comment 6 Eric Sandeen 2008-12-10 12:08:49 EST
I would be very surprised if the kernel's "quiet" option made this difference, but I guess I've seen stranger things.

Are you using any special mount options, or is there anything unique about /dev/sda?

Gathering the earliest kernel messages which indicate trouble would probably be most useful; if you boot into runlevel 3 (not X) and then ssh in, do some work, try to invoke a disaster, and see what showed up on the console, that might offer more clues.  Things like "/bin/bash: Exec format error" are too far removed from what is likely the underlying problem.

Does this machine go through suspend/resume of any sort?
Comment 7 Wellington Uemura 2008-12-10 12:45:47 EST
I have the same problem as the person who opened this bug ticket, and I also have the same GA-MA78GM-S2H board and two SATA HDDs. I also opened this bug ticket: https://bugzilla.redhat.com/show_bug.cgi?id=474078

I decided to let F10 go and went back to F9, but after 650M or so of updates F9 is having the same issues as F10, along with the same error message at the "Kernel Alive" screen, and I'm not using any type of RAID.

ata1: device not ready
ata3: device not ready

If you leave the machine running for some time it locks up or crashes, and only pressing the reset button gets it going again. The problem also happens if you copy a large amount of data from one partition to another, create an ISO file with mkisofs, or copy a 450M video from CD to your desktop (ext3 partition).

The "quiet" option did not change the situation, and I don't suspend/resume my computer either.
Comment 8 Darryl Bond 2008-12-10 16:17:20 EST
I have not used suspend/resume. 

I have also not seen the errors that Wellington saw on F9. I have 2 almost identical machines, one that I have installed F9 and F10 and one with just F9. I have kept up with the F9 updates on both machines and have not seen the same problem as F10 on either box.
I have seen the 'ata1: device not ready' message since 2.6.26 on F9, though, but it has not caused any issues on F9.

There is nothing special (just defaults after 6 attempts) about the F10 install except that I use:
/dev/sda1 / F9
/dev/sda6 / F10

The last install does not mount any other partitions except swap.
I shall reinstall F10 without X and see what the result is.
Comment 9 Darryl Bond 2008-12-11 05:31:05 EST
I did an install and unchecked gnome. 
The machine installed and ran up in run level 3.
I updated the kernel and rebooted onto it (like last night)
I ran a script from a remote ssh session
unalias cp
while true
do
cp -ar /lib /tmp
sync
sync
rm -rf /tmp/lib
sync
sync
done
for about 2 hours, nothing on the console.
While that was going, I thought that the Xserver might be causing the problem
I started Xorg -ac and ran glxgears from another box onto the display for about 1 hour.
I rebooted under F9 and fsck -y -f /dev/sda6. 
No errors???

What now?
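
The loop above copies and deletes but never compares the copy with the source, so silent corruption would go unnoticed until fsck. A bounded variant that also compares each round could look like the sketch below. `copy_verify_loop` is a hypothetical helper, not what was actually run in this report.

```bash
#!/usr/bin/env bash
# copy_verify_loop SRC SCRATCH ROUNDS  (hypothetical helper)
# Each round: copy SRC to SCRATCH, flush, compare the two trees.
# Stops at the first round in which the copy differs from the source.
copy_verify_loop() {
    local src=$1 scratch=$2 rounds=$3 i
    [ -n "$scratch" ] || return 2          # never rm -rf an empty path
    for ((i = 1; i <= rounds; i++)); do
        rm -rf -- "$scratch"
        # "command" skips any shell function named cp; aliases are not
        # expanded here either, so no "unalias cp" is needed.
        command cp -a -- "$src" "$scratch" || return 2
        sync
        if ! diff -r -q -- "$src" "$scratch" >/dev/null; then
            echo "round $i: copy differs from source" >&2
            return 1
        fi
    done
    rm -rf -- "$scratch"
    echo "$rounds rounds completed without a mismatch"
}
```

Something like `copy_verify_loop /lib /tmp/lib 500` would mirror the test above. Two caveats: the comparison right after the copy is likely served from the page cache rather than from disk, and `diff -r` follows symbolic links, so dangling links in the source tree would be reported as a mismatch.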
Comment 10 Darryl Bond 2008-12-11 06:31:05 EST
I did a fresh install with gnome this time. Before it had booted, I edited /etc/inittab and grub.conf to disable quiet, rhgb and start in run level 3.

I booted F10, installed the latest kernel; and rebooted.
I created a user account and (thinking that the screen saver was suspending the machine) did a startx, disabled the screen saver, and ran my filesystem test script for 10 minutes. It was fine.

I rebooted and logged in as root on tty0 and ordinary user on tty1 and ran startx.
The gnome display did not complete so I changed to tty0 just as root was kicked out and I could not log back in.

These are the /var/log/messages entries from the relevant time (I did a ctrl-alt-delete and it rebooted):
Dec 11 21:17:36 localhost acpid: client connected from 2577[0:500]
Dec 11 21:17:36 localhost kernel: pci 0000:01:05.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
Dec 11 21:17:39 localhost kernel: fuse init (API version 7.9)
Dec 11 21:17:44 localhost pulseaudio[3199]: main.c: Called SUID root and real-time/high-priority scheduling was requested in the configuration. However, we lack the necessary privileges:
Dec 11 21:17:44 localhost pulseaudio[3199]: main.c: We are not in group 'pulse-rt' and PolicyKit refuse to grant us privileges. Dropping SUID again.
Dec 11 21:17:44 localhost pulseaudio[3199]: main.c: For enabling real-time scheduling please acquire the appropriate PolicyKit privileges, or become a member of 'pulse-rt', or increase the RLIMIT_NICE/RLIMIT_RTPRIO resource limits for this user.
Dec 11 21:17:44 localhost pulseaudio[3199]: main.c: High-priority scheduling enabled in configuration but not allowed by policy.
Dec 11 21:17:44 localhost pulseaudio[3199]: core-util.c: setpriority(): Permission denied
Dec 11 21:17:44 localhost pulseaudio[3205]: pid.c: Stale PID file, overwriting.
Dec 11 21:17:45 localhost pulseaudio[3211]: main.c: Called SUID root and real-time/high-priority scheduling was requested in the configuration. However, we lack the necessary privileges:
Dec 11 21:17:45 localhost pulseaudio[3211]: main.c: We are not in group 'pulse-rt' and PolicyKit refuse to grant us privileges. Dropping SUID again.
Dec 11 21:17:45 localhost pulseaudio[3211]: main.c: For enabling real-time scheduling please acquire the appropriate PolicyKit privileges, or become a member of 'pulse-rt', or increase the RLIMIT_NICE/RLIMIT_RTPRIO resource limits for this user.
Dec 11 21:17:45 localhost pulseaudio[3211]: main.c: High-priority scheduling enabled in configuration but not allowed by policy.
Dec 11 21:17:45 localhost pulseaudio[3211]: core-util.c: setpriority(): Permission denied
Dec 11 21:17:45 localhost kernel: hda-intel: Invalid position buffer, using LPIB read method instead.
Dec 11 21:17:45 localhost pulseaudio[3217]: pid.c: Daemon already running.
Dec 11 21:17:54 localhost gnome-session[2883]: WARNING: Application 'libcanberra-login-sound.desktop' failed to register before timeout
Dec 11 21:17:54 localhost pulseaudio[3301]: pid.c: Daemon already running.
Dec 11 21:18:13 localhost init: tty1 main process ended, respawning
Dec 11 21:18:13 localhost kernel: console-kit-dae[1699]: segfault at 3120726f66 ip 0000000000415785 sp 00007fff09e11830 error 4 in console-kit-daemon[400000+21000]
Dec 11 21:18:23 localhost init: tty1 main process ended, respawning
Dec 11 21:18:46 localhost init: tty4 main process ended, respawning
Dec 11 21:18:48 localhost acpid: client connected from 2577[0:500]
Dec 11 21:18:53 localhost init: tty2 main process ended, respawning
Dec 11 21:19:02 localhost init: tty3 main process ended, respawning
Dec 11 21:19:04 localhost init: tty5 main process (2074) killed by TERM signal
Dec 11 21:19:04 localhost init: tty6 main process (2082) killed by TERM signal
Dec 11 21:19:04 localhost init: tty1 main process (3832) killed by TERM signal
Dec 11 21:19:04 localhost init: tty4 main process (3941) killed by TERM signal
Dec 11 21:19:04 localhost init: tty2 main process (3965) killed by TERM signal
Dec 11 21:19:04 localhost init: tty3 main process (4044) killed by TERM signal
Dec 11 21:19:05 localhost avahi-daemon[2053]: Got SIGTERM, quitting.
Dec 11 21:19:05 localhost avahi-daemon[2053]: Leaving mDNS multicast group on interface eth0.IPv4 with address 192.168.12.120.
Dec 11 21:19:05 localhost kernel: K05atd[4101]: segfault at 0 ip 0000000000b81f20 sp 00007fff5de64478 error 6 in libc-2.9.so[aee000+168000]
Dec 11 21:19:05 localhost kernel: K10cups[4104]: segfault at 0 ip 00007ffdd07b7f20 sp 00007fffd8ca59b8 error 6 in libc-2.9.so[7ffdd0724000+168000]
Dec 11 21:19:05 localhost kernel: K15gpm[4107]: segfault at 0 ip 00007f32d8b78f20 sp 00007fffe1066e38 error 6 in libc-2.9.so[7f32d8ae5000+168000]
Dec 11 21:19:05 localhost kernel: K25sshd[4115]: segfault at 0 ip 0000000000b81f20 sp 00007ffff6f47808 error 6 in libc-2.9.so[aee000+168000]
Dec 11 21:19:05 localhost kernel: K30sendmail[4120]: segfault at 0 ip 0000000000b81f20 sp 00007fff09ec3fa8 error 6 in libc-2.9.so[aee000+168000]
Dec 11 21:19:05 localhost kernel: K60crond[4123]: segfault at 0 ip 0000000000b81f20 sp 00007fff8b024ab8 error 6 in libc-2.9.so[aee000+168000]
Dec 11 21:19:05 localhost acpid: exiting
Dec 11 21:19:05 localhost kernel: K74haldaemon[4135]: segfault at 0 ip 00007f5cfcdd6f20 sp 00007fff052c6098 error 6 in libc-2.9.so[7f5cfcd43000+168000]
Dec 11 21:19:05 localhost kernel: K74pcscd[4147]: segfault at 0 ip 0000000000b81f20 sp 00007fff1ef3ec48 error 6 in libc-2.9.so[aee000+168000]
Dec 11 21:19:05 localhost bluetoothd[1927]: bridge pan0 removed
Dec 11 21:19:05 localhost bluetoothd[1927]: Stopping SDP server
Dec 11 21:19:05 localhost bluetoothd[1927]: Exit
Dec 11 21:19:05 localhost kernel: K84NetworkManag[4186]: segfault at 0 ip 0000000000b81f20 sp 00007fff4cfe4da8 error 6 in libc-2.9.so[aee000+168000]
Dec 11 21:19:06 localhost rpcbind: rpcbind terminating on signal. Restart with "rpcbind -w"

Strange thing is, a fsck -f -y on F9 came up clean.

Might it be sound/pulse audio that is doing it? That is about all that is left that runs with high privileges???
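
The kernel segfault lines above give both the faulting instruction pointer and the base of the mapping it fell in, so the offset inside the library can be computed as ip minus base. The sketch below does that for a saved log. `segfault_offsets` is a made-up helper, and the regular expression is written against the line format shown above only.

```bash
#!/usr/bin/env bash
# segfault_offsets LOGFILE  (hypothetical helper)
# For each kernel "segfault at ..." line, print the mapped object and the
# faulting address relative to the start of that mapping (ip - base),
# then count identical results.
segfault_offsets() {
    local line ip obj base
    local re='segfault at [0-9a-f]+ ip ([0-9a-f]+) .* in ([^[]+)\[([0-9a-f]+)\+[0-9a-f]+\]'
    while IFS= read -r line; do
        if [[ $line =~ $re ]]; then
            ip=${BASH_REMATCH[1]}
            obj=${BASH_REMATCH[2]}
            base=${BASH_REMATCH[3]}
            printf '%s +0x%x\n' "$obj" $(( 0x$ip - 0x$base ))
        fi
    done < "$1" | sort | uniq -c
}
```

On the messages above, all nine libc-2.9.so segfaults work out to the same offset, +0x93f20, even though the load address differs between processes. That is at least consistent with a single bad page of libc being shared by every process, though it does not prove where the damage came from.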
Comment 11 Eric Sandeen 2008-12-11 16:21:37 EST
Are you using any 3rd party drivers, or accessing the ext3 partition from Windows, or anything like that?
Comment 12 Darryl Bond 2008-12-11 20:35:19 EST
No. Last night I broke it within 20 minutes of the install.
* Clean install from DVD, choosing to place the boot loader on the first sector of the partition, no shared filesystems from F9, no other operating system on the box.
* Boot to F9
* fsck the F10 filesystem (clean)
* mount F10 fs and fix the grub.conf and inittab for a text boot
* Boot to F10 run level 3, log in as root 
* scp the new kernel RPM from another box (did not use yum)
* rpm -Uvh kernel...
* Create an ordinary user using useradd and passwd (no GUI)
* Reboot under F9
* fsck the F10 filesystem - clean
* reboot to F10
* Log in as the ordinary user (text terminal)
* Log in as root on another VT
* startx
* Do some stuff (including  the copying of /lib to /tmp and syncs for 10 min)
* Turned off screensaver, in case the screensaver was suspending the box.
* check dmesg & /var/log/messages
* Log out of X session as ordinary user
* Log out of text terminal
* init 6 as root
* reboot to F10
* log in as root (tty0)
* log in as ordinary user  (tty1)
( everything seems fine at this point there was absolutely nothing out of the ordinary)
* startx as ordinary user
* Gnome backdrop screen came up but no menu bars. X still seems alive.
* Ctrl-Alt-F1 to get to the root session. This exited back to the login prompt just as the tty switched, ie I just saw it exit as it switched.
* tried to log in (several times) as root, just returned back to the login prompt
* Switched to another tty, same result.
* Found that the X session had died, with my ordinary user kicked out the same as root.
* Did a Ctrl-Alt-Delete and the machine began rebooting, except that there were lots of segfaults as it was shutting down.
* Booted to F9
* Fsck the F10 filesystem, but it was clean??
* Submitted the above bug report.

There was definitely:
* No custom drivers
* No access from Windows
* No suspend/resume
* I did access from F9
* I ran gnome from a tty (not GDM)
* Sound was working

I think my next test will be to reinstall and put this disk into my other identical box (I have swapped RAM before).
Comment 13 Wellington Uemura 2008-12-12 10:34:38 EST
Try this in F9:

dd if=/dev/zero of=dummy bs=8192 count=512000

It will create a 4GB dummy file. Now, in F10, try to copy this file to any other partition; you should get a kernel panic.

If it does panic, and you have another computer with a serial port, try this: add the option 'console=ttyS0,9600n8' to the kernel line before booting F10.

http://www.linuxjournal.com/article/7206

Do not log in to or use X; press Ctrl+Alt+F2 to switch to a free terminal, log in as root (or use 'su') and run 'dmesg -n 8'.

Connect the other computer by serial cable using Tera Term or telnet, then repeat the copy command; you will be able to record the kernel panic log.

Paste all that information here.
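
For reference, the size of the dummy file follows from bs times count, and the copy step can be checked byte for byte rather than only watched for a panic. A rough sketch follows; `make_and_copy` is a hypothetical helper and the paths in the usage note are placeholders.

```bash
#!/usr/bin/env bash
# make_and_copy SRC DST BLOCK_SIZE COUNT  (hypothetical helper)
# Create a zero-filled file of BLOCK_SIZE * COUNT bytes, copy it,
# and compare source and copy byte for byte.
make_and_copy() {
    local src=$1 dst=$2 bs=$3 count=$4
    echo "expected size: $(( bs * count )) bytes"
    dd if=/dev/zero of="$src" bs="$bs" count="$count" 2>/dev/null || return 2
    cp -- "$src" "$dst" || return 2
    sync
    cmp -- "$src" "$dst" && echo "copy matches source"
}

# Size produced by the dd command above (bs=8192 count=512000):
echo $(( 8192 * 512000 ))   # 4194304000 bytes, about 3.9 GiB
```

A call such as `make_and_copy /home/dummy /tmp/dummy 8192 512000` would repeat the suggested test. Because the file is all zeros, this cannot detect corruption that happens to write zeros.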
Comment 14 Darryl Bond 2008-12-14 03:47:35 EST
I plugged the disk into my other box. It is not quite the same, it has a rev1.0 rather than a rev1.1 motherboard.

I logged in and out many times and used it without any worries on the rev1.0 motherboard, but it would panic every time I tried to create a 4G file on F10 as described above (instead of creating it on F9 and copying it with F10). I did have X running but used tty2 for the test.

I don't have the hardware to set up the serial console as the mobo has a header for the serial port and an old one header to 9pin cable doesn't seem to work. I will get the HW tomorrow.

I could not break the filesystem in over 2 hours of fiddling on the r1.0 box, logging into gdm as ordinary user, surfing etc.

I plugged the disk back into the original box and booted it in runlevel 3.
I logged in as root and created the 4G file, and it went fine, no panic??

I simply did a startx and the machine locked solid. I reset it and on bootup the filesystem was busted again.

It seems to be related to X running, and the difference between the rev 1.0 and rev 1.1 boards also seems significant.
Comment 15 Wellington Uemura 2008-12-14 06:00:38 EST
There is my problem, I have a GA-MA78GM-S2H rev 1.0 but the panic happens only on F10.
Comment 16 Wellington Uemura 2008-12-14 06:01:38 EST
Correction: happens on F10 and F9 with the last kernel update.
Comment 17 Eric Sandeen 2008-12-14 09:28:32 EST
Thanks, I think a serial console will make a huge difference in being able to get to the bottom of what's going on here.

In the meantime, I should probably ask some other normal questions like: 

Is the board overclocked?  If so, does it work better if it's not?
Have you run memtest to test the memory?  (how much memory is in the box?)

(Just a note, I'm asking very high-level questions & waiting on that serial console in part because I'm not even sure that this is in fact a filesystem problem; while you see filesystem problems post-crash, we still don't know what the triggering event was...)
Comment 18 Darryl Bond 2008-12-14 16:36:43 EST
The board is not overclocked.

I think I have the right hardware to try the serial console again tonight.
I am certain that you are right that the fault is not in the fs. It never fails in run level 3.
Comment 19 Darryl Bond 2008-12-16 05:40:11 EST
After making a cable to fit the COM header on my motherboard I set up the serial console. I tried everything I could to make it crash or corrupt the fs without success. 
I thought that perhaps the console itself was the problem, since in F10 tty1 is now both the console on startup and where the X server runs. The serial console had a login prompt.

I changed the console to tty12 to see if it would break. I rebooted and checked what was happening.  With tty12, the kernel messages went to tty1, console messages to tty12 and the Xserver opened on tty1.
I used the box for about 10 minutes with no problems.
I logged out and the Xserver restarted. I tried to log in but the login failed part way through. I tried to change to tty12 to check the console messages but found that the Xserver was now on tty12!!!

I could change to tty2 and log in as root but lots of executables were segfaulting when I checked the syslog.

What if the problem was that the Xserver and console cannot write to the same tty on my hardware???
Comment 20 Eric Sandeen 2008-12-16 11:08:44 EST
You might try that memory test, too.
Comment 21 Darryl Bond 2008-12-16 16:11:27 EST
Sorry, I forgot to mention. I swapped the memory from my other machine just in case (better quality dual channel) and did a memcheck. No errors.
This machine exhibits no problems on Fedora 9.
Comment 22 Darryl Bond 2008-12-17 05:50:39 EST
I think I have it.
I re-installed and set up the console on ttyS0: no problems. I tried to configure it so that the X server started on vt7, where it used to start, but couldn't find how to do it. Setting FirstVT=7 in custom.conf did nothing.

I tried a different tack: I installed xorg-x11-drv-radeonhd and set up an xorg.conf that uses the radeonhd driver instead of the ati driver, and now it has run for an hour with the console back to normal (vt1).

So, it appears to be my hardware, normal radeon xorg driver and the console on vt1 that bangs up the filesystem on F10.

Should I check to see if blacklisting the radeon kernel module and using the ati driver fixes it as well?
Comment 23 Eric Sandeen 2008-12-17 11:15:51 EST
(In reply to comment #22)
> I think I have it.

Sorry if I'm being dense; can you be explicit about which driver(s) seem(s) to work and which fails?  If I understand correctly:

xorg-x11-drv-ati-6.9.0-54.fc10.x86_64 fails
xorg-x11-drv-radeonhd-1.2.3-1.6.20081128git.fc10.x86_64 works?

Wellington, can you try the same thing?

-Eric
Comment 24 Eric Sandeen 2008-12-17 12:53:13 EST
cebbert tells me that xorg-x11-drv-ati-6.9.0-62.fc10.x86_64 is in testing, if you want to give that a whirl.
Comment 25 Wellington Uemura 2008-12-17 14:13:27 EST
Eric, I can try.

Just to let you know, in the tests that I've done I was not using the graphical interface, and X was down as you suggested.

Does F10 have kdump available?
http://kbase.redhat.com/faq/docs/DOC-6039

So we can stop guessing and find a solution.
Comment 26 Dave Airlie 2008-12-17 15:18:13 EST
Let's get some logs from X. However, there is no kernel module for this card, and the userspace drivers operate pretty much the same.

If you can, attach the logs from both radeonhd and ati, but I don't think they do anything different on that card; it's not accelerated.
Comment 27 Darryl Bond 2008-12-18 03:42:00 EST
Created attachment 327303 [details]
Xorg log for ATI driver when machine segfaulting on pretty much everything

This Xorg.0.log was showing these errors:
Dec 18 18:24:00 localhost gdm[3167]: 0x00007f65c3f5eb25 in waitpid () from /lib64/libc.so.6
Dec 18 18:24:00 localhost gdm[3167]: #0  0x00007f65c3f5eb25 in waitpid () from /lib64/libc.so.6
Dec 18 18:24:00 localhost gdm[3167]: #1  0x00000000004298cb in ?? ()
Dec 18 18:24:00 localhost gdm[3167]: #2  0x0000000000429976 in ?? ()
Dec 18 18:24:00 localhost gdm[3167]: #3  <signal handler called>
Dec 18 18:24:00 localhost gdm[3167]: #4  0x00007f65c3ee9ed5 in raise () from /lib64/libc.so.6
Dec 18 18:24:00 localhost gdm[3167]: #5  0x00007f65c3eeba43 in abort () from /lib64/libc.so.6
Dec 18 18:24:00 localhost gdm[3167]: #6  0x00007f65c45e3733 in g_assertion_message () from /lib64/libglib-2.0.so.0
Dec 18 18:24:00 localhost gdm[3167]: #7  0x00007f65c45e3bd2 in g_assertion_message_expr ()
Dec 18 18:24:00 localhost gdm[3167]:    from /lib64/libglib-2.0.so.0
Dec 18 18:24:00 localhost gdm[3167]: #8  0x0000000000421e4c in cairo_move_to ()
Dec 18 18:24:00 localhost gdm[3167]: #9  0x0000000000422cc7 in cairo_move_to ()
Dec 18 18:24:00 localhost gdm[3167]: #10 0x000000000041cdcf in cairo_move_to ()
Dec 18 18:24:00 localhost gdm[3167]: #11 0x00007f65c48727bd in g_closure_invoke () from /lib64/libgobject-2.0.so.0

Plus lots of segfaults on shutdown that were not written to the log
Comment 28 Darryl Bond 2008-12-18 03:43:33 EST
Created attachment 327304 [details]
Xorg for drv-ati-6.9.0-62 which hasn't had any problems (so far)
Comment 29 Darryl Bond 2008-12-18 03:44:46 EST
Created attachment 327305 [details]
Xorg log for radeonhd which did not display any problems
Comment 30 Darryl Bond 2008-12-18 04:31:52 EST
I copied the radeonhd log and updated the ati driver to 6.9.0-62.
I tested that (ati) and it worked just fine, with no problems that I could see (and I took a copy of its log).

I reinstalled F10 with a clean install. On reboot the thing could not bring up the setup screen.

I used another console and installed the latest kernel. I also created an ordinary user.

I rebooted and it booted into gdm, but after clicking on the user it would return straight back to the name. I logged into another console as root and looked at the messages log. I also took a copy of the X log (attachment 327303).

I rebooted and it displayed segfaults for each shutdown script.

I rebooted into runlevel 3 successfully with no sign of segfaults or corruption. I installed the latest ati driver and did a startx and logged out, rebooted successfully, logged in and out, rebooted and am now using it to enter this report.

Note that I did not need to reinstall. The filesystem was not corrupted this time.
Comment 31 Dave Airlie 2008-12-18 05:32:50 EST
Weird. On the r600 card you have, I can't see how different versions of -ati should matter. These cards don't have memory management or any accelerated rendering. They literally just memcpy the screen into the framebuffer.

But if -62 is stable, then it must have been something in there.
Comment 32 Wellington Uemura 2008-12-18 06:32:18 EST
How did you manage to make HAL use the radeonhd without creating the xorg.conf?

Anyway, I've removed gnome-desktop and xorg, and the kernel is still crashing under the same condition: copying a big file from one partition to another.

I tried to use the 64-bit F10, but this version is broken: the keyboard layout for Brazilian ABNT-2 is wrong, and the network configuration keeps setting the netmask to 192.168.1.1 (my gateway) instead of 255.255.255.0. Even if I configure this by hand in "ifcfg-eth0", something keeps pushing 192.168.1.1 as the netmask.

Not to mention that I can't use "vga=794": after the Fedora animation I see a black screen. This only works in the 32-bit version.

To make sure, I did a memcheck and there are no errors.
Comment 33 Darryl Bond 2008-12-18 16:17:02 EST
I did use an xorg.conf. After installing the radeonhd driver I issued
# Xorg :1 -configure
and the file it created was already for radeonhd (??), so I just copied it to /etc/X11
Comment 34 Eric Sandeen 2008-12-18 18:56:20 EST
Could you try running the debug kernel variant? "yum install kernel-debug" should get it for you.

-Eric
Comment 35 Wellington Uemura 2008-12-18 19:39:41 EST
I will Eric. ;)
Comment 36 Darryl Bond 2008-12-19 05:47:37 EST
Arghh,
xorg-x11-drv-ati bad, xorg-x11-drv-radeonhd good.

I reinstalled F10 with a view to making it the default. I did a clean install over F9.

I rebooted for the first time into RL3 and did a yum update, assuming that it would all be good based on my previous tests: stupid assumption.
The first boot was broken with:
Dec 19 19:36:08 localhost kernel: pci 0000:01:05.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
Dec 19 19:36:09 localhost kernel: sh[2210]: segfault at 0 ip 0000000000000000 sp 00007fff94dd1788 error 14 in ld-2.9.so[110000+20000]
Dec 19 19:36:09 localhost kernel: sh[2211]: segfault at 0 ip 0000000000000000 sp 00007fff87eeb898 error 14 in ld-2.9.so[110000+20000]
Dec 19 19:36:09 localhost kernel: sh[2212]: segfault at 0 ip 0000000000000000 sp 00007fffd95eaf98 error 14 in ld-2.9.so[110000+20000]
Dec 19 19:36:09 localhost kernel: sh[2213]: segfault at 0 ip 0000000000000000 sp 00007fff84a66418 error 14 in ld-2.9.so[110000+20000]
Dec 19 19:36:10 localhost kernel: Default[2215]: segfault at 7fff47c3c990 ip 00007fff47c3c990 sp 00007fff47c3c8b8 error 15
Dec 19 20:01:01 localhost kernel: bash[2416]: segfault at 0 ip 0000000000000000 sp 00007fff3d90df38 error 14 in ld-2.9.so[110000+20000]
Dec 19 20:12:16 localhost gdm-session-worker[2461]: WARNING: unable to log session
Dec 19 20:12:16 localhost kernel: Default[2462]: segfault at 7fffac02fd80 ip 00007fffac02fd80 sp 00007fffac02fca8 error 15
Dec 19 20:12:16 localhost kernel: Xsession[2461]: segfault at 7fff5259e1c0 ip 00007fff5259e1c0 sp 00007fff5259e0e8 error 15
Dec 19 20:12:16 localhost kernel: SELinux:  Context system_u:object_r:user_gnome_home_t:s0 is not valid (left unmapped).
Dec 19 20:12:16 localhost acpid: client connected from 2473[0:0]
Dec 19 20:12:17 localhost kernel: sh[2508]: segfault at 0 ip 0000000000000000 sp 00007fffd8471e28 error 14 in ld-2.9.so[110000+20000]
Dec 19 20:12:17 localhost kernel: sh[2509]: segfault at 0 ip 0000000000000000 sp 00007fff8c5cef88 error 14 in ld-2.9.so[110000+20000]
Dec 19 20:12:17 localhost kernel: sh[2510]: segfault at 0 ip 0000000000000000 sp 00007fff598b8268 error 14 in ld-2.9.so[110000+20000]
Dec 19 20:12:17 localhost kernel: sh[2511]: segfault at 0 ip 0000000000000000 sp 00007fff3ef0d8b8 error 14 in ld-2.9.so[110000+20000]
Dec 19 20:12:17 localhost kernel: sh[2512]: segfault at 0 ip 0000000000000000 sp 00007fff317040b8 error 14 in ld-2.9.so[110000+20000]
Dec 19 20:12:17 localhost kernel: Default[2514]: segfault at 7fff3583b590 ip 00007fff3583b590 sp 00007fff3583b4b8 error 15
Dec 19 20:12:27 localhost kernel: Default[2670]: segfault at 7fffc3cfba40 ip 00007fffc3cfba40 sp 00007fffc3cfb968 error 15
Dec 19 20:12:27 localhost kernel: Xsession[2669]: segfault at 7fff85509120 ip 00007fff85509120 sp 00007fff85509048 error 15
Dec 19 20:12:27 localhost acpid: client connected from 2682[0:0]
Dec 19 20:12:28 localhost kernel: sh[2715]: segfault at 0 ip 0000000000000000 sp 00007fffcf396d48 error 14 in ld-2.9.so[110000+20000]
Dec 19 20:12:28 localhost kernel: sh[2716]: segfault at 0 ip 0000000000000000 sp 00007fff45b5c508 error 14 in ld-2.9.so[110000+20000]
Dec 19 20:12:28 localhost kernel: sh[2717]: segfault at 0 ip 0000000000000000 sp 00007fff116de088 error 14 in ld-2.9.so[110000+20000]
Dec 19 20:12:28 localhost kernel: sh[2718]: segfault at 0 ip 0000000000000000 sp 00007fff63f9f948 error 14 in ld-2.9.so[110000+20000]
Dec 19 20:12:28 localhost kernel: sh[2719]: segfault at 0 ip 0000000000000000 sp 00007fff0abc9578 error 14 in ld-2.9.so[110000+20000]
Dec 19 20:12:28 localhost kernel: Default[2721]: segfault at 7fff10c4a9a0 ip 00007fff10c4a9a0 sp 00007fff10c4a8c8 error 15
Dec 19 20:12:44 localhost console-kit-daemon[1698]: GLib-GObject-WARNING: IA__g_object_get_valist: value location for `gchararray' passed as NULL
Dec 19 20:12:44 localhost kernel: ck-system-resta[2855]: segfault at 7fff02227fd0 ip 00007fff02227fd0 sp 00007fff02227ef8 error 15
Dec 19 20:12:47 localhost console-kit-daemon[1698]: GLib-GObject-WARNING: IA__g_object_get_valist: value location for `gchararray' passed as NULL
Dec 19 20:12:47 localhost kernel: ck-system-resta[2858]: segfault at 7fff8ee67c10 ip 00007fff8ee67c10 sp 00007fff8ee67b38 error 15
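
The segfault lines above follow a fixed pattern, so they can be tallied and decoded mechanically. The following is only a sketch, not part of the original report: it assumes the kernel prints the trailing "error" value in hexadecimal and that its bits follow the usual x86 page-fault error code layout (bit 0 protection violation, bit 1 write, bit 2 user mode, bit 4 instruction fetch). The names `decode_error`, `summarize` and `SAMPLE` are made up for illustration.

```python
import re
from collections import Counter

# Three lines copied from the log above; the last one is not a segfault
# and should simply be ignored by the parser.
SAMPLE = [
    "Dec 19 19:36:09 localhost kernel: sh[2210]: segfault at 0 ip 0000000000000000 "
    "sp 00007fff94dd1788 error 14 in ld-2.9.so[110000+20000]",
    "Dec 19 19:36:10 localhost kernel: Default[2215]: segfault at 7fff47c3c990 "
    "ip 00007fff47c3c990 sp 00007fff47c3c8b8 error 15",
    "Dec 19 20:12:16 localhost acpid: client connected from 2473[0:0]",
]

SEGFAULT_RE = re.compile(
    r"kernel: (?P<proc>[^\[]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_error(code_hex):
    """Decode a page-fault error code (assumed hex, assumed x86 bit layout)."""
    code = int(code_hex, 16)
    cause = "protection violation" if code & 0x01 else "page not present"
    if code & 0x10:
        access = "instruction fetch"
    elif code & 0x02:
        access = "write"
    else:
        access = "read"
    mode = "user mode" if code & 0x04 else "kernel mode"
    return [cause, access, mode]


def summarize(lines):
    """Count segfault lines per (process name, error code) pair."""
    counts = Counter()
    for line in lines:
        match = SEGFAULT_RE.search(line)
        if match:
            counts[(match["proc"], match["err"])] += 1
    return counts


if __name__ == "__main__":
    for (proc, err), n in sorted(summarize(SAMPLE).items()):
        print(proc, err, n, decode_error(err))
```

Under those assumptions, "error 14" with ip 0 would read as a user-mode instruction fetch from an unmapped page at address 0, and "error 15" as an instruction fetch from a mapped but non-executable address (here a stack address). That would be consistent with processes jumping through bad pointers, but it does not by itself say where the bad data came from.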

I could not log in with ssh or change consoles. gdm displayed, but I could not log in there either.
I reset and installed radeonhd and cold rebooted. No problems.

I moved my xorg.conf and cold rebooted on the ati driver. Same result as before: segfaults and no login or console change.

I reinstalled the xorg.conf for radeonhd: all wonderful. Note that I still haven't got a busted filesystem. I can't explain why.
At least that saves reinstalling, although I am getting very good at it.

Should I try the debug kernel?
Comment 37 Eric Sandeen 2008-12-19 10:11:38 EST
If you don't mind the pain of a few more crashes, the debug kernel may give more hints as to what's going wrong, especially if you have a method of capturing the console output.

Thanks,
-Eric
Comment 38 Wellington Uemura 2008-12-19 12:01:54 EST
I recommend that you install the debug kernel, then install:

yum install kernel-debug kdump system-config-kdump

Use the kdump configurator under System and enable it, restart the system.

This will give you much more useful information to pass to the developers.

So far, the kernel hasn't crashed with kdump enabled.
Comment 39 Darryl Bond 2008-12-21 21:07:31 EST
I used the debug kernel and ati driver for a day with no fault occurring. I thought I would go back and test the standard kernel with the ati driver. This also had no problem???
I noticed that the standard kernel had a bit less RAM than it should and thought that kdump would have taken that. I checked the log and saw that the kernel had the crashkernel=128M option. I checked grub.conf and found that the debug kernel did not have the option, while the standard kernel did.

I must have turned on kdump with the standard kernel. I moved the option to the  debug kernel and rebooted (still with the ATI driver).
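
Which boot entries carry the crashkernel= option can be checked mechanically rather than by eye. The following is a minimal sketch, not part of the original report, assuming a GRUB legacy style grub.conf with "title" and "kernel" lines. The sample file contents, titles and paths are hypothetical, and `crashkernel_by_entry` is an illustrative name.

```python
# Hypothetical grub.conf contents, made up for illustration only.
SAMPLE_GRUB = """\
title Fedora standard (hypothetical)
    root (hd0,0)
    kernel /vmlinuz-standard ro root=/dev/sda2 crashkernel=128M
    initrd /initrd-standard.img
title Fedora debug (hypothetical)
    root (hd0,0)
    kernel /vmlinuz-debug ro root=/dev/sda2
    initrd /initrd-debug.img
"""


def crashkernel_by_entry(grub_conf_text):
    """Map each boot entry title to its crashkernel= value (None if absent)."""
    entries = {}
    title = None
    for raw in grub_conf_text.splitlines():
        tokens = raw.split()
        if not tokens or tokens[0].startswith("#"):
            continue  # skip blank lines and comments
        if tokens[0] == "title":
            title = " ".join(tokens[1:])
        elif tokens[0] == "kernel" and title is not None:
            value = None
            # tokens[1] is the kernel image path; the rest are boot arguments
            for arg in tokens[2:]:
                if arg.startswith("crashkernel="):
                    value = arg.split("=", 1)[1]
            entries[title] = value
    return entries


if __name__ == "__main__":
    for entry, value in crashkernel_by_entry(SAMPLE_GRUB).items():
        print(entry, "->", value)
```

Recording this mapping alongside each test run would make it easier to tell afterwards which kernel was actually booted with memory reserved for kdump.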

It did not start X; a reset showed that the filesystem was busted again!!!! I rebooted and it showed /usr/libexec/gdm-simple-greeter as a stale NFS file handle.



I just booted my other install of F10 and fsck'ed the partition. Damage everywhere.
Entry 'gdm-simple-greeter' in /usr/libexec (130345) has deleted/unused inode 97751.  Clear? no

Entry 'gdm-simple-greeter' in /usr/libexec (130345) has an incorrect filetype (was 1, should be 0).
Fix? no

So, I cannot seem to debug the fault:
* Serial console does not trigger it
* Debug kernel does not trigger it
* Standard kernel with a crashkernel option does not trigger it

I keep coming back to the console and the ATI X server trying to start on the same tty as the common factor.
Comment 40 Darryl Bond 2008-12-30 02:42:12 EST
There were new versions of the kernel and ATI driver, so I installed them into a clean test install (nothing else).
* kernel-2.6.27.9-159.fc10.x86_64
* xorg-x11-drv-ati-6.9.0-63.fc10.x86_64

I used it for 3 days without a problem!

I booted the kernel-2.6.27.7-134.fc10.x86_64 and it is still stable.

Perhaps 6.9.0-63 is the answer. 

I shall downgrade the ati driver with the latest kernel and see how that goes.
Comment 41 Darryl Bond 2009-01-10 05:15:15 EST
Nope, it broke after a few hours of use. I have just done a fresh install.

A friend bought the latest version of the Mobo to build a mythtv box. He installed F10 today. The filesystem was corrupt after the second reboot. He had created an account and logged in and used it. He had not done an update.



The fix is definitely the radeonhd Xorg driver. I have no problems at all if the radeonhd is used.
Comment 42 François Cami 2009-02-02 17:09:29 EST
Darryl, could you confirm the system is still stable with radeonhd ?

Switching component to xorg-x11-drivers and reassigning to xgl-maint@redhat.com.
Comment 43 Darryl Bond 2009-02-02 18:02:58 EST
It is rock solid with radeonhd.

I upgraded my second install on both machines to 2.6.27.12-170.2.5.fc10.x86_64 and tried the ati driver on both. I have not seen any corruption so far. They have been in use for a couple of hours each, doing stuff that would have broken them in the past.
I am not yet ready to declare it a fix, as this has caught me out before.
Comment 44 François Cami 2009-02-02 18:24:50 EST
OK, please return in a few days to confirm if radeonhd fixes the problem once and for all.

Switching to xorg-x11-drv-ati and assigned. 
Xorg logs are in comments #28 (ati) and #29 (radeonhd).
Comment 45 François Cami 2009-02-02 18:40:15 EST
Darryl,
If at all possible, could you please try the latest xorg-x11-drv-ati build in Koji: http://koji.fedoraproject.org/koji/buildinfo?buildID=80819
It contains a possible fix.
Comment 46 Darryl Bond 2009-02-02 18:45:23 EST
Sorry if I wasn't clear.
* Radeonhd definitely fixes the problem. I had no corruption problems.
* ATI driver with the latest kernel (2.6.27.12-170.2.5.fc10.x86_64) seems to be OK so far. I do not want to say it is a fix yet, as this has happened to me before. Last time, it broke the filesystem after about 4 hours of use.

I will test the Koji ati driver tonight.
Comment 47 bob 2009-02-06 00:47:14 EST
Hi, hope this helps someone,

I have a pavilion dv7 laptop with Radeon HD3200 video, Turion X2.

The system was freezing up if I worked with big tar files or did large copy operations.

The driver in #45 above has stopped my freeze-ups. Thank you.

This is a new Fedora 10 32-bit install with all updates installed as of today.

Prior to installing F10 32-bit I had installed 64-bit and did not seem to get the crashes but did not spend much time with 64-bit for other reasons.
Comment 48 Darryl Bond 2009-02-06 23:06:36 EST
The 2.6.27.12-170.2.5.fc10.x86_64 kernel did not help. One of the boxes broke before I installed the Koji xorg-x11-drv-ati-6.10.0-2.fc10.x86_64.

I reinstalled and put xorg-x11-drv-ati-6.10.0-2.fc10.x86_64 on both the boxes.
I have been using it for 4 days on the 2 boxes without a problem.

I am reasonably confident that it would have broken by now on at least one of them.
Comment 49 François Cami 2009-02-07 08:07:39 EST
Thank you bob and Darryl for the testing and information.
Comment 50 Peter Janes 2009-02-09 00:04:42 EST
Having similar problems as those reported here.  (Typing quickly/briefly as Firefox is unstable due to this issue and has crashed 3 times when I've gotten too verbose.)  Neither the kernel nor driver updates have made a difference; neither did switching to the xorg-x11-drv-vesa driver.  Video card is a Radeon X300 (PCIE), so I don't think radeonhd is an option.  Corruption most often occurs when dealing with large files (BitTorrent and rsync report checksum errors; copies onto the server via NFS also fail md5sum) or RPM updates (when attempting to undo corrupt files like /bin/ls and libs in /usr/lib64 with "yum reinstall" or "rpm -Uvh --force").
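
The large-file corruption described here can be turned into a repeatable check by copying a file and comparing checksums of source and copy. The following is a minimal sketch, not part of the original report; `file_md5` and `copy_and_verify` are illustrative names. One caveat, stated as an assumption: reading the copy back straight away may be served from the page cache rather than from disk, so a match here does not prove the on-disk data is intact.

```python
import hashlib
import shutil


def file_md5(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src, dst):
    """Copy src to dst and report whether both files have the same MD5."""
    shutil.copyfile(src, dst)
    return file_md5(src) == file_md5(dst)
```

Running such a check in a loop over a multi-gigabyte file, and logging the first mismatch with a timestamp, would give a more precise failure report than "it broke after a few hours".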
Comment 51 Bug Zapper 2009-11-18 03:00:28 EST
This message is a reminder that Fedora 10 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 10.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '10'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 10's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 10 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 52 Bug Zapper 2009-12-18 02:02:44 EST
Fedora 10 changed to end-of-life (EOL) status on 2009-12-17. Fedora 10 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.
