496912 – readahead is reading an 8GB libvirt image on system startup, delaying boot

Bug 496912 - readahead is reading an 8GB libvirt image on system startup, delaying boot

Summary: readahead is reading an 8GB libvirt image on system startup, delaying boot

Keywords:
Status:	CLOSED DUPLICATE of bug 470680
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	readahead
Sub Component:
Version:	11
Hardware:	All
OS:	Linux
Priority:	low
Severity:	medium
Target Milestone:	---
Assignee:	Harald Hoyer
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+	depends on / blocked

Reported:	2009-04-21 16:02 UTC by Tom London
Modified:	2009-08-13 11:48 UTC (History)
CC List:	5 users (show)
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2009-08-13 11:48:01 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments	(Terms of Use)
bootchart of booting with 30 second pause during "fsck.ext3 /boot" (214.80 KB, image/png) 2009-04-21 16:02 UTC, Tom London	no flags	Details
output of bootchart booting with readahead disabled.... (WOW).... (231.10 KB, image/png) 2009-04-21 17:26 UTC, Tom London	no flags	Details
bootchart booting with "fixed" /var/lib/readahead/early.sorted (184.21 KB, image/png) 2009-04-21 18:10 UTC, Tom London	no flags	Details
View All

Description Tom London 2009-04-21 16:02:33 UTC

Created attachment 340565 [details]
bootchart of booting with 30 second pause during "fsck.ext3 /boot"

Description of problem:
I observe that on most boots, the system appears to pause for about 30 seconds.  This means I get lots of Spinfinity spins and very sluggish boots.

If I boot and either hit ESC (to see boot messages) or if I boot without "quiet rhgb", I see it pausing after it prints the output of "fsck.ext4" for / (/dev/sda3 in my case).

It will hang there, and after about 30 seconds it will quickly print out the remainder of the boot messages and complete the boot.

I installed bootchart and got the attached bootchart.png.  It shows "nothing running" after about 15 seconds until about 45 seconds except for "fsck.ext3" and "readahead".

Now /boot is pretty small (200MB), so how could this take that long....

Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/vg_tlondon-lv_root
                     213188880  92891064 109468404  46% /
/dev/sda2               198337     30677    157420  17% /boot
tmpfs                  1955052      1212   1953840   1% /dev/shm
/dev/sdb1              2000512      1248   1999264   1% /media/E0FD-1813
[root@tlondon ~]# 

This does not happen on every boot.  I'll try to get a bootchart.png for a "non pausing boot" for comparison.

I have no idea what is causing this.....  Sorry if e2fsprogs is the wrong component.

System is Thinkpad X00 running rawhide + a few packages from koji.

This has been happening with recent rawhide/koji kernels.....

Version-Release number of selected component (if applicable):
kernel-2.6.29.1-102.fc11.x86_64
e2fsprogs-1.41.4-6.fc11.x86_64
initscripts-8.94-1.x86_64


How reproducible:
about 50% of the time....

Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 1 Eric Sandeen 2009-04-21 16:18:16 UTC

If you unmount /boot on the running system and point fsck at it, do you see the same issue?

Does this happen after nice clean shutdowns?

Try dropping caches before you re-run fsck in this test too:

# echo 3 > /proc/sys/vm/drop-caches
# time e2fsck /dev/sda2

rinse, repeat, etc.

Thanks,
-Eric

Comment 2 Eric Sandeen 2009-04-21 16:20:13 UTC

If you turn off readahead, do you still see the same problem?

Comment 3 Tom London 2009-04-21 17:15:44 UTC

(In reply to comment #1)
> If you unmount /boot on the running system and point fsck at it, do you see the
> same issue?
> 
> Does this happen after nice clean shutdowns?
> 
> Try dropping caches before you re-run fsck in this test too:
> 
> # echo 3 > /proc/sys/vm/drop-caches
> # time e2fsck /dev/sda2
> 
> rinse, repeat, etc.
> 
> Thanks,
> -Eric  

I believe I get these after "clean shutdowns".... at least they appear clean to me: I do System->Shutdown and see nothing unusual on the monitor.

OK, rebooted and got this:

[root@tlondon ~]# time fsck.ext3 /dev/sda2
e2fsck 1.41.4 (27-Jan-2009)
/dev/sda2: clean, 44/51200 files, 37140/204800 blocks

real	0m0.098s
user	0m0.002s
sys	0m0.005s
[root@tlondon ~]# time fsck.ext3 /dev/sda2
e2fsck 1.41.4 (27-Jan-2009)
/dev/sda2: clean, 44/51200 files, 37140/204800 blocks

real	0m0.015s
user	0m0.001s
sys	0m0.004s
[root@tlondon ~]# time fsck.ext3 /dev/sda2
e2fsck 1.41.4 (27-Jan-2009)
/dev/sda2: clean, 44/51200 files, 37140/204800 blocks

real	0m0.019s
user	0m0.000s
sys	0m0.006s
[root@tlondon ~]# echo 3 >/proc/sys/vm/drop_caches
[root@tlondon ~]# time e2fsck /dev/sda2
e2fsck 1.41.4 (27-Jan-2009)
/dev/sda2: clean, 44/51200 files, 37140/204800 blocks

real	0m1.037s
user	0m0.001s
sys	0m0.017s
[root@tlondon ~]# time e2fsck /dev/sda2
e2fsck 1.41.4 (27-Jan-2009)
/dev/sda2: clean, 44/51200 files, 37140/204800 blocks

real	0m0.104s
user	0m0.003s
sys	0m0.005s
[root@tlondon ~]#

Comment 4 Tom London 2009-04-21 17:26:30 UTC

Created attachment 340590 [details]
output of bootchart booting with readahead disabled....  (WOW)....

Well, disabling readahead, by changing /etc/sysconfig/readahead:

# enable readahead at system startup
#READAHEAD="yes"
READAHEAD="no"

certainly appears to have "fixed this for me"......

I restarted twice, and both times it booted like a champ....

Comment 5 Tom London 2009-04-21 17:30:05 UTC

BTW, is it reasonable for all the instances of "vol_id" and "devkit_disks_pa"?  I count blocks of 24 of each of them.....

Comment 6 Tom London 2009-04-21 17:31:06 UTC

And I forgot to add version of readahead:

readahead-1.4.9-1.fc11.x86_64

Comment 7 Tom London 2009-04-21 17:54:45 UTC

Looking at /var/lib/readahead/early.sorted, I see as the last few lines:

<<<<SNIP>>>>
2323 /etc/rc.d/init.d/libvirtd
1161 /etc/sasl2/libvirt.conf
275 /etc/sysconfig/libvirtd
1608 /usr/share/PolicyKit/policy/org.libvirt.unix.policy
8589934592 /var/lib/libvirt/images/F11_Spin1.img


Can't be "a good thing" to read in an 8GB file into my 4GB of RAM, can it?

Comment 8 Tom London 2009-04-21 18:10:05 UTC

Created attachment 340598 [details]
bootchart booting with "fixed" /var/lib/readahead/early.sorted

After deleting the 8GB virtual image file from /var/lib/readahead/early.sorted and re-enabling readahead, I get this boot chart.

Pretty snappy.

So should readahead (readahead-collector?) filter out such beasts?

Comment 9 Eric Sandeen 2009-04-21 18:16:55 UTC

Haha, neat.  Ok, I thought this looked like contention between fsck and readahead.  I'm going to punt this one over to readahead, thanks for digging into it!

8GB in 30s isn't too bad though ;)

-Eric

Comment 10 Tom London 2009-04-21 19:52:04 UTC

Hmmm... the Giga-beast got added to /var/lib/readahead/early.sorted again:

[root@tlondon readahead]# tail early.sorted
86 /usr/share/dbus-1/system-services/org.freedesktop.PackageKit.service
105 /usr/share/dbus-1/system-services/org.freedesktop.PackageKitAptBackend.service
110 /usr/share/dbus-1/system-services/org.freedesktop.PackageKitTestBackend.service
9738 /etc/libvirt/libvirtd.conf
3071 /etc/libvirt/qemu.conf
2323 /etc/rc.d/init.d/libvirtd
1161 /etc/sasl2/libvirt.conf
275 /etc/sysconfig/libvirtd
1608 /usr/share/PolicyKit/policy/org.libvirt.unix.policy
8589934592 /var/lib/libvirt/images/F11_Spin1.img
[root@tlondon readahead]# ls -l
total 140
-rw-r--r--. 1 root root 41969 2009-04-21 07:09 custom.early
-rw-r--r--. 1 root root 48033 2009-04-21 12:25 early.sorted
-rw-r--r--. 1 root root 48033 2009-04-21 10:55 early.sorted.save
[root@tlondon readahead]# uptime
 12:46:31 up  1:45,  3 users,  load average: 2.13, 1.34, 1.01
[root@tlondon readahead]# 
[root@tlondon readahead]# 

So about 20 minutes ago, it got put back!.

Not sure this is a good fix, but I'm adding /var/lib/libvirt/images to the excluded dirs in /etc/readahead.conf:

--- readahead.conf.old	2009-04-21 12:48:46.255661882 -0700
+++ readahead.conf	2009-04-21 12:50:01.160785556 -0700
@@ -21,7 +21,7 @@
 RAC_EXECIGN="/sbin/readahead"
 
 # Exclude dirs
-RAC_EXCLUDE="/proc /sys /dev /var/log /var/run /var/lock /home /tmp /var/tmp /media /var/lib/gdm /selinux /var/cache/yum /var/lib/rpm /mnt /var/lib/dkms /usr/src /usr/include /lib/modules /usr/lib/gcc /usr/lib64/gcc  /usr/lib/locale /usr/libexec/gcc"
+RAC_EXCLUDE="/proc /sys /dev /var/log /var/run /var/lock /home /tmp /var/tmp /media /var/lib/gdm /selinux /var/cache/yum /var/lib/libvirt/images /var/lib/rpm /mnt /var/lib/dkms /usr/src /usr/include /lib/modules /usr/lib/gcc /usr/lib64/gcc  /usr/lib/locale /usr/libexec/gcc"
 
 # A first filename in the list
 # RAC_LISTBEGIN="/etc/rc.d/init.d/readahead_early"

Comment 11 Tom London 2009-04-21 19:59:04 UTC

And it popped up in /var/lib/readahead/custom.early as well.

I manually removed it from that list as well.....

--- custom.early.save	2009-04-21 12:57:29.003785312 -0700
+++ custom.early	2009-04-21 12:57:34.308695210 -0700
@@ -1169,7 +1169,6 @@
 /var/lib/dbus/machine-id
 /var/lib/dhclient/dhclient-eth0.lease
 /var/lib/dnsmasq/dnsmasq.leases
-/var/lib/libvirt/images/F11_Spin1.img
 /var/lib/libvirt/network/default.xml
 /var/lib/ntp/drift
 /var/lib/random-seed
[root@tlondon readahead]#

Comment 12 Tom London 2009-04-22 00:09:38 UTC

Argh.... this is a persistent bugger, readahead seems to be insistent on adding this file (/var/lib/libvirt/images/F11_Spin1.img) back into /var/lib/readahead/custom.early and early.sorted.

[root@tlondon ~]# grep .img /var/lib/readahead/*
/var/lib/readahead/custom.early:/var/lib/libvirt/images/F11_Spin1.img
/var/lib/readahead/custom.early.save:/var/lib/libvirt/images/F11_Spin1.img
/var/lib/readahead/early.sorted:8589934592 /var/lib/libvirt/images/F11_Spin1.img
/var/lib/readahead/early.sorted.save:8589934592 /var/lib/libvirt/images/F11_Spin1.img
[root@tlondon ~]# 

My "hack" at updating /etc/readahead.conf doesn't appear to "do the right thing".....

I've manually edited the /var/lib/readahead files again, and set READAHEAD_COLLECT="no" in /etc/sysconfig/readahead

Comment 13 Bug Zapper 2009-06-09 14:20:00 UTC

This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle.
Changing version to '11'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 14 Harald Hoyer 2009-08-13 11:48:01 UTC


*** This bug has been marked as a duplicate of bug 470680 ***

Note You need to log in before you can comment on or make changes to this bug.