Bug 156442 - readahead improvements.
Summary: readahead improvements.
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: readahead
Version: rawhide
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Karel Zak
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2005-04-30 09:38 UTC by Reuben Farrelly
Modified: 2008-07-08 08:19 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-07-08 08:19:02 UTC
Type: ---
Embargoed:



Description Reuben Farrelly 2005-04-30 09:38:35 UTC
'readahead' is quite good in terms of its simplicity; however, an alternative
has appeared recently called 'readahead-list' which may offer an improvement,
and seems to be actively maintained.

Description of the package: 

"readahead-list allows users to load files into the page cache before they are
needed, to accelerate program loading. It improves on the existing readahead by
taking the name of a file with the items to load, instead of requiring the
arguments being passed as parameters. Additionally, it contains a tool
(filelist-order) to optimize the order of the file list in several possible ways."

However the package is a bit bigger at 100k (mostly due to the auto* cruft).

The ability to optimize the order is interesting...and I think at this stage is
the most compelling reason to look at this software.

Package can be found at http://freshmeat.net/projects/readahead-list/

Comment 1 Dave Jones 2005-05-01 09:11:17 UTC
interesting. The key piece of the puzzle still missing however is the dynamic
generation of the readahead file lists.

There are some interesting comments at
http://blog.drinsama.de/erich/en/linux/2004122502-readahead4.html about using
the audit subsystem to do this, which could prove to be a worthy experiment.

Until we have something that can do this, replacing readahead with
readahead-list probably won't buy us a great deal. The multiple methods of
sorting are interesting, but there are no numbers backing up how useful any of
them are.  Lots of experimentation needed, which probably puts this off as an
FC5 feature.

Longterm, I agree that replacing readahead with this new variant is a good idea,
as the version we currently ship isn't the most maintained of packages.


Comment 2 Dave Jones 2005-05-01 09:12:45 UTC
Adding auditd maintainer to Cc. Steve, does this sound feasible?


Comment 3 Steve Grubb 2005-05-01 12:41:33 UTC
>does this sound feasible?

Certainly. The audit system is way better than 5 months ago when the blogger
wrote that up. In the last week I got the ausearch utility working that allows
searches to be performed against the logs.

To get the list of open syscalls, stop the audit daemon. Delete
/var/log/audit/*, then edit /etc/audit.rules and add:

-a entry,always -S open

Then reboot the machine. When you get to a prompt, "auditctl -D" to get rid of
auditing rules, then stop the audit daemon.

ausearch -m KERNEL | grep name | awk '{ print $4 }' | sort | uniq | less 

should let you view files that were opened during boot. Delete the rule from
/etc/audit.rules.
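
The grep/awk/sort/uniq pipeline above can also be sketched in Python. This is only an illustration: it assumes each audit record is a single line of key=value pairs with the path in a quoted name="..." field. The sample records are made up for the example and are not real ausearch output.

```python
import re

# Assumption: the opened path appears as name="..." in each record.
# \b keeps us from matching longer keys that merely end in "name".
NAME_FIELD = re.compile(r'\bname="([^"]+)"')


def opened_paths(lines):
    """Return the sorted, de-duplicated paths found in audit record lines."""
    paths = set()
    for line in lines:
        match = NAME_FIELD.search(line)
        if match:
            paths.add(match.group(1))
    return sorted(paths)


# Hypothetical records, shaped only to exercise the parser.
sample = [
    'type=KERNEL msg=audit(1114951293.123:42): item=0 name="/etc/fstab" inode=1234',
    'type=KERNEL msg=audit(1114951293.456:43): item=0 name="/bin/bash" inode=5678',
    'type=KERNEL msg=audit(1114951293.789:44): item=0 name="/etc/fstab" inode=1234',
    'type=KERNEL msg=audit(1114951294.000:45): syscall=5 exit=3',
]

for path in opened_paths(sample):
    print(path)
```

Matching on the field name rather than on a fixed column position (as `awk '{ print $4 }'` does) should be less fragile if the record layout differs between audit versions.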

Comment 4 David Zeuthen 2005-05-01 16:36:03 UTC
In my experiments here

 http://www.redhat.com/archives/fedora-devel-list/2004-November/msg01374.html

from Nov 2004 I proposed using the kernel events layer (a couple of one-liner
patches maybe) but I suppose that using the audit subsystem is an equally good
idea, if not a better one.

However, I think I also showed back then (and I expect this to be true for the
current Fedora devel tree) that readahead isn't really giving any performance
boost UNLESS we actually take the list of readahead files and "optimize the
disk" to get near-peak throughput. With "optimize the disk" I mean either 

 a. Rearranging blocks on disk
    (Windows XP does this, Google for it)

 b. Create a hot cache area in spare partition / reserved area of the disk
    (Mac OS X does this, Google for it)

or some third option :-). However, I think it's clear that any approach must
require filesystem / kernel features currently not present. With "optimize the
disk", however, doing quick boot is simply

 1. readahead list of files at earliest possibility

 2. launch daemon to catch list of open files on boot; boot

 3. tell said daemon, say, two minutes after boot completion
     - stop monitoring open files
     - write readahead file list
     - "tweak the disk"

Also, said daemon would take the constraints of the system into account, e.g.
for a 256MB system maybe only a total of 160MB worth of files would be in the
readahead set (we'd need some data to select a ratio). Unfortunately I think
doing all this work is kind of pointless until the "optimize the disk" bits are
on the horizon :-/.

Cheers,
David
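
The memory constraint described in the comment above (for example, 160MB of readahead on a 256MB system) could be sketched roughly as follows. The function name, the skip-and-continue policy, and the 0.625 ratio (simply 160/256) are assumptions for illustration; the comment itself notes that real data would be needed to pick a ratio.

```python
MB = 1024 * 1024


def select_within_budget(files, ram_bytes, ratio):
    """Pick files, in list order, whose total size fits in ram_bytes * ratio.

    files is an iterable of (path, size_in_bytes) pairs. A file that would
    overflow the budget is skipped, but later, smaller files may still fit.
    Returns (chosen_paths, bytes_used).
    """
    budget = int(ram_bytes * ratio)
    chosen = []
    used = 0
    for path, size in files:
        if used + size > budget:
            continue
        chosen.append(path)
        used += size
    return chosen, used


# Hypothetical file list; the sizes are invented for the example.
candidates = [
    ("/a", 100 * MB),
    ("/b", 70 * MB),
    ("/c", 50 * MB),
    ("/d", 10 * MB),
]

chosen, used = select_within_budget(candidates, 256 * MB, 0.625)
print(chosen, used // MB)
```

Keeping the list order means files opened earliest in boot win when the budget is tight, which seems the natural priority if the list is recorded in open order.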

Comment 5 David Zeuthen 2005-05-01 16:49:10 UTC
Btw, I also think it's much more efficient and simpler to do this on the block
level rather than file level: e.g. let the kernel emit what physical blocks it
reads off the disk and work on prereading these or juggling them around.

Comment 6 Ziga Mahkovec 2005-05-01 21:25:41 UTC
I was investigating both readahead performance and dynamic generation, so I
wanted to add some comments here.

About readahead-list: when I was profiling readahead, I noticed that bash (i.e.
argument parsing) accounts for about 0.1% of the total time.  While it
definitely makes sense to read the list from a file (due to bash's limits), I
wouldn't exactly state that as a performance improvement.

Having the ability to sort the list of files in a bunch of different ways also
doesn't seem very useful.  There is a single criterion that makes sense in this
case and that's sorting by disk blocks.
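
As a minimal sketch of that single criterion, the snippet below orders a file list by first disk block. How the block number is obtained is deliberately left to a caller-supplied function (on Linux the FIBMAP ioctl is one possibility, and it needs root); the block numbers in the example are made up.

```python
def sort_by_first_block(paths, first_block):
    """Order paths by the first disk block each file occupies.

    first_block is a caller-supplied function mapping a path to a block
    number, or to None when the block is unknown. Files with unknown
    blocks are placed last, in their original order.
    """
    known = []
    unknown = []
    for path in paths:
        block = first_block(path)
        if block is None:
            unknown.append(path)
        else:
            known.append((block, path))
    known.sort()
    return [path for _, path in known] + unknown


# Hypothetical block numbers standing in for a real lookup.
fake_blocks = {"/etc/fstab": 9000, "/bin/bash": 120, "/sbin/init": 4500}

ordered = sort_by_first_block(
    ["/etc/fstab", "/bin/bash", "/tmp/missing", "/sbin/init"],
    fake_blocks.get,
)
print(ordered)
```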

I was trying to improve readahead performance using ext2-specific optimizations
-- basically, preloading inode tables to improve stat(2) performance.  If anyone
is interested, the source code is available here:

    http://bootchart.org/misc/readahead/

I compared the performance of the current readahead to my e2fs-optimized version
on a list of 14307 files (amounting to 285 MB).  I tested on both the unsorted
list and sorted by first disk block.  Here are the times:

                |  unsorted list  |  sorted list
--------------------------------------------------
fc4-readahead   |      1m19.215s  |    0m47.868s
e2fs-readahead  |      0m41.118s  |    0m40.861s

It's not exactly the improvement I expected (I got about 10MB/s throughput,
hdparm reports 26MB/s).  But it does have the advantage that it works equally
well on both unsorted and sorted lists.

Like David said: unless there's a way to rearrange the file blocks, these
improvements won't get near the maximum throughput.  Disk seeks are just too
expensive.


As for using auditd for building the readahead list: I wrote a script that does
that.  It's not as simple as Steve's approach though, because I wanted to start
logging as early as possible -- by the time the audit service gets started, lots
of files are opened already.  If anyone wants to give it a try, you can get the
script (filemond) here:

    http://bootchart.org/misc/filemon/

Copy filemond to /sbin/, make it executable, disable the audit service
(chkconfig audit off) and then boot with:

    init=/sbin/filemond

The script will start a foreground auditd session (logging to a tmpfs) and then
exec init(1).  The rule I used is:

  -a exit,always -F success=0 -S open -S stat -S lstat -S access -S execve

The file list will be copied to /var/cache/readahead/boot.files, 15 seconds
after gnome-panel is started.  There's a sample list at the address above.


Comment 7 Dave Jones 2005-05-01 21:36:51 UTC
I think it's worth repurposing this bug as a general bug for tracking all of this
related work.

Adding sct to cc for ext3-zen.
Stephen, I know akpm wrote some patch circa 2.4.18 that did this, but ext3 has
moved on quite a bit since then.  Any insight?


Comment 8 Karel Zak 2006-11-13 15:58:31 UTC
From fedora-devel list:

On Mon, Nov 13, 2006 at 11:07:41AM +0100, Arjan van de Ven wrote:
> On Mon, 2006-11-13 at 10:48 +0100, Ralf Ertzinger wrote:
> > Hi.
> >
> > On Mon, 13 Nov 2006 09:06:20 +0100, Arjan van de Ven wrote:
> >
> > > the downside is that you now add extra costs (seeks :) which.. well
> > > readahead was trying to avoid. So if this is going to get used
> > > massively it's less certain things will gain as much as before...
> >
> > Maybe run a script on system shutdown which reads all the files under
> > /etc/readahead.d and creates a single config file for readahead to
> > read on the next boot?
>
> I really like this idea; it's a simple "cat" and it can be done at a
> time where latency doesn't matter... (even in cron.daily)
>
> Oh... this opens more options. This also allows the "sort by
> blocknumber" to be done at this point and taken out of the critical
> latency part......
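
The merge step proposed in the quoted thread (the "simple cat" over /etc/readahead.d) might look something like the sketch below. It assumes each file in the directory holds one path per line; the file names used in the demonstration are hypothetical, and the demo runs against a temporary directory rather than the real /etc/readahead.d.

```python
import os
import tempfile


def merge_lists(list_dir):
    """Concatenate every list file in list_dir into one de-duplicated list.

    Files are read in name order. Blank lines are dropped, and the first
    occurrence of each path is kept.
    """
    seen = set()
    merged = []
    for name in sorted(os.listdir(list_dir)):
        full = os.path.join(list_dir, name)
        if not os.path.isfile(full):
            continue
        with open(full) as handle:
            for line in handle:
                path = line.strip()
                if path and path not in seen:
                    seen.add(path)
                    merged.append(path)
    return merged


# Demonstration with two made-up list files.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "10-base"), "w") as handle:
        handle.write("/bin/bash\n/etc/fstab\n")
    with open(os.path.join(tmp, "20-desktop"), "w") as handle:
        handle.write("/etc/fstab\n\n/usr/bin/gnome-panel\n")
    merged = merge_lists(tmp)

print(merged)
```

Because this would run at shutdown or from cron.daily, a sort by block number could be applied to the merged list at the same point, off the boot-time critical path.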


Comment 9 Karel Zak 2006-11-23 12:55:58 UTC
Well, I've created a wiki page for readahead:

   https://hosted.fedoraproject.org/projects/readahead/

Comment 10 John Poelstra 2008-07-08 03:44:41 UTC
Given the state of readahead today, is this bug still relevant?

