Bug 451100

Summary: system constantly freezing due to (pointless) disk activity
Product: [Fedora] Fedora
Reporter: Sean Middleditch <sean>
Component: firefox
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED DUPLICATE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: low
Version: 9
CC: robatino
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2008-06-13 14:15:58 EDT Type: ---

Description Sean Middleditch 2008-06-12 15:24:51 EDT
Note: I don't think this is a kernel problem at all, but Bugzilla forces me to
choose one of 10,000 "components" when I don't have a freaking clue what is
causing this.

My disk periodically starts doing heavy activity.  This usually lasts for 5-8
seconds, and then stops.  It'll usually start up again a few seconds later. 
It'll do this several times in a row, and then stop for a while.... that while
may be 10 seconds, it may be 10 minutes.

During this time, half my desktop becomes unusable.  Firefox and Pidgin both
completely stop responding to any UI inputs.  (Firefox possibly because of the
fsync issue, not sure about Pidgin... I have logging turned off so I can't
imagine what reason it would even need to access the disk.)  Evolution stops
working if I want to move or open new messages.  (Over IMAP, and I don't have
Offline support on, so again, no clue why.)

There is nothing in my logs that indicates any kind of disk/driver issue like I'd
expect from a bad disk or bad controller.

This is relatively recent.  It started a little over a week ago.  I have not made
any changes to the system, other than removing an IDE DVD drive.

I can't for the life of me find any tool that will tell me WHY the disk is under
such heavy use, all the time, or WHY applications like Pidgin feel the need to
lock up when this is happening.  Powertop just tells me that the sata_nv driver
is responsible for a ton of wakeups, but that's not much of a surprise.

I can hear the disk actually spinning.  The activity light on the computer is
on.  Top does not at all indicate that any application is doing anything,
although it does show Firefox and Pidgin and a couple other apps in (D)iskwait mode.
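
One way to list the processes stuck in the D (uninterruptible sleep) state that top shows is to read the state field from /proc/<pid>/stat directly. The following is only a minimal sketch, assuming a Linux /proc filesystem; the helper names parse_stat and blocked_processes are made up for illustration.

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left])
    comm = text[left + 1:right]
    state = text[right + 1:].split()[0]
    return pid, comm, state


def blocked_processes(proc_root="/proc"):
    """List (pid, comm) for processes currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # process exited or entry could not be read/parsed
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in blocked_processes():
        print(pid, comm)
```

Running it in a loop during one of the freezes would show which applications are blocked on I/O at that moment, though not which process is generating the I/O.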

Most other apps seem to continue working just fine.  I can open up terminals to
run top for example.  Other apps that need to write to the disk usually lock up too
(Vim sessions become unusable, likely due to writing their swap files).

I really cannot take any more of this - my computer is barely usable at times
that last for well over 30 minutes, because I get occasional fits of
responsive applications lasting only several seconds intermixed with long, long
pauses.

I doubt you can fix this with what I've given you, but if you can tell me what
tools to use to try to debug this, I'd be most happy.  I can't find anything
useful for Linux using Google.  This is where DTrace would come in handy, hmm? :)

Comment 1 Andre Robatino 2008-06-13 04:11:24 EDT
Try installing iotop (yum install iotop) to see what's causing the disk
activity.  There's a good chance it's firefox, I see the same thing, though it's
not so bad on my system.  It's been reported as bug #439908.

Comment 2 Sean Middleditch 2008-06-13 14:15:58 EDT
From iotop, something very close to this was always in the top three lines
during the last 85 seconds that my hard disk has been constantly churning.

 1432 root           0 B/s    7.60 K/s  0.00 % 99.99 % [kjournald]
 3056 elanthis  212.92 K/s   91.25 K/s  0.00 % 94.53 % firefox
  235 root           0 B/s       0 B/s  0.00 % 38.02 % [pdflush]

Not a lot is being written if I understand that right, but I guess the journal
is being flushed or something, which does sound like the Firefox fsync bug.
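
To make the "not a lot is being written" reading concrete, the three iotop lines above can be parsed into numbers. This is a rough sketch that assumes the columns are PID, USER, DISK READ, DISK WRITE, SWAPIN, IO> and COMMAND, as in iotop's batch layout without a priority column; the regular expression and the parse_line helper are illustrative, not part of iotop.

```python
import re

# The three lines pasted from iotop above.
SAMPLE = """\
 1432 root           0 B/s    7.60 K/s  0.00 % 99.99 % [kjournald]
 3056 elanthis  212.92 K/s   91.25 K/s  0.00 % 94.53 % firefox
  235 root           0 B/s       0 B/s  0.00 % 38.02 % [pdflush]
"""

# Assumed column order: PID USER DISK-READ DISK-WRITE SWAPIN IO> COMMAND
LINE = re.compile(
    r"^\s*(?P<pid>\d+)\s+(?P<user>\S+)\s+"
    r"(?P<read>[\d.]+) (?P<read_unit>[BKMG])/s\s+"
    r"(?P<write>[\d.]+) (?P<write_unit>[BKMG])/s\s+"
    r"(?P<swapin>[\d.]+) %\s+(?P<io>[\d.]+) %\s+(?P<command>.+)$"
)

SCALE = {"B": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_line(line):
    """Parse one iotop line into a dict, or return None if it does not match."""
    m = LINE.match(line)
    if m is None:
        return None
    return {
        "pid": int(m.group("pid")),
        "user": m.group("user"),
        "read_bps": float(m.group("read")) * SCALE[m.group("read_unit")],
        "write_bps": float(m.group("write")) * SCALE[m.group("write_unit")],
        "io_pct": float(m.group("io")),
        "command": m.group("command").strip(),
    }


if __name__ == "__main__":
    for line in SAMPLE.splitlines():
        row = parse_line(line)
        if row is not None:
            print(row["command"], row["write_bps"], row["io_pct"])
```

Read this way, kjournald is writing only a few kilobytes per second while spending nearly all of its time waiting on I/O, which fits a pattern of many small forced flushes rather than bulk writes.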

I also looked up Pidgin and apparently it is well known that it also has
excessive disk writing and fsyncing, although not nearly as bad as Firefox,
which explains why it seems to contribute to this.  There are already upstream
bugs for that one.
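
A rough way to see how expensive each fsync call is on a given filesystem is to time a series of small write-then-fsync cycles. The sketch below is only an illustration under stated assumptions: the function name time_fsyncs, the 512-byte payload and the count of 20 are arbitrary choices, and the default temporary directory may live on tmpfs, where fsync is close to free, so pass a directory on the disk of interest.

```python
import os
import tempfile
import time


def time_fsyncs(directory=None, count=20, payload=b"x" * 512):
    """Write a small payload and fsync it `count` times; return each fsync's duration in seconds."""
    durations = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(count):
            os.write(fd, payload)
            start = time.perf_counter()
            os.fsync(fd)  # block until the data has been pushed to the device
            durations.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.unlink(path)
    return durations


if __name__ == "__main__":
    # Hypothetical usage: point this at a directory on the affected disk.
    times = time_fsyncs(directory=os.path.expanduser("~"))
    print("max fsync: %.3f s, mean: %.3f s" % (max(times), sum(times) / len(times)))
```

Running this while the disk is churning and again while it is idle would give a crude comparison of how long an application blocks on each sync.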

Since I seem to have pretty much just confirmed that it's the good ol' firefox
fsync bug and that I'm just getting hit by it way harder than most, I'll close
this as a duplicate.  Sorry for the noise.

*** This bug has been marked as a duplicate of 439908 ***