Created attachment 340565 [details] bootchart of booting with 30 second pause during "fsck.ext3 /boot" Description of problem: I observe that on most boots, the system appears to pause for about 30 seconds. This means I get lots of Spinfinity spins and very sluggish boots. If I boot and either hit ESC (to see boot messages) or if I boot without "quiet rhgb", I see it pausing after it prints the output of "fsck.ext4" for / (/dev/sda3 in my case). It will hang there, and after about 30 seconds it will quickly print out the remainder of the boot messages and complete the boot. I installed bootchart and got the attached bootchart.png. It shows "nothing running" after about 15 seconds until about 45 seconds except for "fsck.ext3" and "readahead". Now /boot is pretty small (200MB), so how could this take that long.... Filesystem 1K-blocks Used Available Use% Mounted on /dev/mapper/vg_tlondon-lv_root 213188880 92891064 109468404 46% / /dev/sda2 198337 30677 157420 17% /boot tmpfs 1955052 1212 1953840 1% /dev/shm /dev/sdb1 2000512 1248 1999264 1% /media/E0FD-1813 [root@tlondon ~]# This does not happen on every boot. I'll try to get a bootchart.png for a "non pausing boot" for comparison. I have no idea what is causing this..... Sorry if e2fsprogs is the wrong component. System is Thinkpad X00 running rawhide + a few packages from koji. This has been happening with recent rawhide/koji kernels..... Version-Release number of selected component (if applicable): kernel-2.6.29.1-102.fc11.x86_64 e2fsprogs-1.41.4-6.fc11.x86_64 initscripts-8.94-1.x86_64 How reproducible: about 50% of the time.... Steps to Reproduce: 1. 2. 3. Actual results: Expected results: Additional info:
If you unmount /boot on the running system and point fsck at it, do you see the same issue? Does this happen after nice clean shutdowns? Try dropping caches before you re-run fsck in this test too: # echo 3 > /proc/sys/vm/drop-caches # time e2fsck /dev/sda2 rinse, repeat, etc. Thanks, -Eric
If you turn off readahead, do you still see the same problem?
(In reply to comment #1) > If you unmount /boot on the running system and point fsck at it, do you see the > same issue? > > Does this happen after nice clean shutdowns? > > Try dropping caches before you re-run fsck in this test too: > > # echo 3 > /proc/sys/vm/drop-caches > # time e2fsck /dev/sda2 > > rinse, repeat, etc. > > Thanks, > -Eric I believe I get these after "clean shutdowns".... at least they appear clean to me: I do System->Shutdown and see nothing unusual on the monitor. OK, rebooted and got this: [root@tlondon ~]# time fsck.ext3 /dev/sda2 e2fsck 1.41.4 (27-Jan-2009) /dev/sda2: clean, 44/51200 files, 37140/204800 blocks real 0m0.098s user 0m0.002s sys 0m0.005s [root@tlondon ~]# time fsck.ext3 /dev/sda2 e2fsck 1.41.4 (27-Jan-2009) /dev/sda2: clean, 44/51200 files, 37140/204800 blocks real 0m0.015s user 0m0.001s sys 0m0.004s [root@tlondon ~]# time fsck.ext3 /dev/sda2 e2fsck 1.41.4 (27-Jan-2009) /dev/sda2: clean, 44/51200 files, 37140/204800 blocks real 0m0.019s user 0m0.000s sys 0m0.006s [root@tlondon ~]# echo 3 >/proc/sys/vm/drop_caches [root@tlondon ~]# time e2fsck /dev/sda2 e2fsck 1.41.4 (27-Jan-2009) /dev/sda2: clean, 44/51200 files, 37140/204800 blocks real 0m1.037s user 0m0.001s sys 0m0.017s [root@tlondon ~]# time e2fsck /dev/sda2 e2fsck 1.41.4 (27-Jan-2009) /dev/sda2: clean, 44/51200 files, 37140/204800 blocks real 0m0.104s user 0m0.003s sys 0m0.005s [root@tlondon ~]#
Created attachment 340590 [details] output of bootchart booting with readahead disabled.... (WOW).... Well, disabling readahead, by changing /etc/sysconfig/readahead: # enable readahead at system startup #READAHEAD="yes" READAHEAD="no" certainly appears to have "fixed this for me"...... I restarted twice, and both times it booted like a champ....
BTW, is it reasonable for all the instances of "vol_id" and "devkit_disks_pa"? I count blocks of 24 of each of them.....
And I forgot to add version of readahead: readahead-1.4.9-1.fc11.x86_64
Looking at /var/lib/readahead/early.sorted, I see as the last few lines: <<<<SNIP>>>> 2323 /etc/rc.d/init.d/libvirtd 1161 /etc/sasl2/libvirt.conf 275 /etc/sysconfig/libvirtd 1608 /usr/share/PolicyKit/policy/org.libvirt.unix.policy 8589934592 /var/lib/libvirt/images/F11_Spin1.img Can't be "a good thing" to read in an 8GB file into my 4GB of RAM, can it?
Created attachment 340598 [details] bootchart booting with "fixed" /var/lib/readahead/early.sorted After deleting the 8GB virtual image file from /var/lib/readahead/early.sorted and re-enabling readahead, I get this boot chart. Pretty snappy. So should readahead (readahead-collector?) filter out such beasts?
Haha, neat. Ok, I thought this looked like contention between fsck and readahead. I'm going to punt this one over to readahead, thanks for digging into it! 8GB in 30s isn't too bad though ;) -Eric
Hmmm... the Giga-beast got added to /var/lib/readahead/early.sorted again: [root@tlondon readahead]# tail early.sorted 86 /usr/share/dbus-1/system-services/org.freedesktop.PackageKit.service 105 /usr/share/dbus-1/system-services/org.freedesktop.PackageKitAptBackend.service 110 /usr/share/dbus-1/system-services/org.freedesktop.PackageKitTestBackend.service 9738 /etc/libvirt/libvirtd.conf 3071 /etc/libvirt/qemu.conf 2323 /etc/rc.d/init.d/libvirtd 1161 /etc/sasl2/libvirt.conf 275 /etc/sysconfig/libvirtd 1608 /usr/share/PolicyKit/policy/org.libvirt.unix.policy 8589934592 /var/lib/libvirt/images/F11_Spin1.img [root@tlondon readahead]# ls -l total 140 -rw-r--r--. 1 root root 41969 2009-04-21 07:09 custom.early -rw-r--r--. 1 root root 48033 2009-04-21 12:25 early.sorted -rw-r--r--. 1 root root 48033 2009-04-21 10:55 early.sorted.save [root@tlondon readahead]# uptime 12:46:31 up 1:45, 3 users, load average: 2.13, 1.34, 1.01 [root@tlondon readahead]# [root@tlondon readahead]# So about 20 minutes ago, it got put back!. Not sure this is a good fix, but I'm adding /var/lib/libvirt/images to the excluded dirs in /etc/readahead.conf: --- readahead.conf.old 2009-04-21 12:48:46.255661882 -0700 +++ readahead.conf 2009-04-21 12:50:01.160785556 -0700 @@ -21,7 +21,7 @@ RAC_EXECIGN="/sbin/readahead" # Exclude dirs -RAC_EXCLUDE="/proc /sys /dev /var/log /var/run /var/lock /home /tmp /var/tmp /media /var/lib/gdm /selinux /var/cache/yum /var/lib/rpm /mnt /var/lib/dkms /usr/src /usr/include /lib/modules /usr/lib/gcc /usr/lib64/gcc /usr/lib/locale /usr/libexec/gcc" +RAC_EXCLUDE="/proc /sys /dev /var/log /var/run /var/lock /home /tmp /var/tmp /media /var/lib/gdm /selinux /var/cache/yum /var/lib/libvirt/images /var/lib/rpm /mnt /var/lib/dkms /usr/src /usr/include /lib/modules /usr/lib/gcc /usr/lib64/gcc /usr/lib/locale /usr/libexec/gcc" # A first filename in the list # RAC_LISTBEGIN="/etc/rc.d/init.d/readahead_early"
And it popped up in /var/lib/readahead/custom.early as well. I manually removed it from that list as well..... --- custom.early.save 2009-04-21 12:57:29.003785312 -0700 +++ custom.early 2009-04-21 12:57:34.308695210 -0700 @@ -1169,7 +1169,6 @@ /var/lib/dbus/machine-id /var/lib/dhclient/dhclient-eth0.lease /var/lib/dnsmasq/dnsmasq.leases -/var/lib/libvirt/images/F11_Spin1.img /var/lib/libvirt/network/default.xml /var/lib/ntp/drift /var/lib/random-seed [root@tlondon readahead]#
Argh.... this is a persistent bugger, readahead seems to be insistent on adding this file (/var/lib/libvirt/images/F11_Spin1.img) back into /var/lib/readahead/custom.early and early.sorted. [root@tlondon ~]# grep .img /var/lib/readahead/* /var/lib/readahead/custom.early:/var/lib/libvirt/images/F11_Spin1.img /var/lib/readahead/custom.early.save:/var/lib/libvirt/images/F11_Spin1.img /var/lib/readahead/early.sorted:8589934592 /var/lib/libvirt/images/F11_Spin1.img /var/lib/readahead/early.sorted.save:8589934592 /var/lib/libvirt/images/F11_Spin1.img [root@tlondon ~]# My "hack" at updating /etc/readahead.conf doesn't appear to "do the right thing"..... I've manually edited the /var/lib/readahead files again, and set READAHEAD_COLLECT="no" in /etc/sysconfig/readahead
This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle. Changing version to '11'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
*** This bug has been marked as a duplicate of bug 470680 ***