Bug 847546
Summary: | seabios sits for 3 seconds doing nothing during boot | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Richard W.M. Jones <rjones> |
Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
Status: | CLOSED NOTABUG | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | rawhide | CC: | amit.shah, berrange, cfergeau, dwmw2, gansalmon, itamar, jonathan, kernel-maint, knoel, madhu.chinakonda, pbonzini, rjones, scottt.tw, virt-maint |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2012-08-13 16:33:39 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Richard W.M. Jones
2012-08-12 21:26:36 UTC
Also tested with qemu 2:1.2-0.1.20120806git3e430569.fc18. This also suffers from the same problem -- 3 second delay in sgabios. It also contains some sort of bug in virtio-scsi emulation so that the kernel crashes when the guest does 'insmod virtio_scsi.ko'. I'll open a separate bug about that. Sorry, wrong component. The bios is located in the sgabios package: sgabios-bin-0-0.20110622SVN.fc17.noarch (I tried the later sgabios, but it is completely broken. I will file a separate bug ...) ... SeaBIOS ... Even enabling full debugging in seabios didn't make the problem obvious, so I took a Fedora 17 (baremetal) host and updated it component by component to Fedora 18 until the bug manifested itself. The component to blame is the *host kernel* (NB: not the guest kernel). Updating from 3.4.0-1.fc17.x86_64 -> 3.6.0-0.rc1.git4.1.fc18.x86_64 is what triggers the problem. So I guess this is some sort of regression in the KVM module, perhaps causing guest memory or I/O access to be slower. I tracked this down using gdb, and it does seem to be a regression (severe slow down) in the 'rep insb' instruction. (gdb) info registers eax 0x500011 5242897 ecx 0x354990 3492240 edx 0x511 1297 ebx 0x0 0 esp 0x6f40 0x6f40 ebp 0x6f40 0x6f40 esi 0x0 0 edi 0x2b1400 2823168 eip 0x15e 0x15e eflags 0x86 [ PF SF ] cs 0xc380 50048 ss 0x0 0 ds 0xc380 50048 es 0x10 16 fs 0x0 0 gs 0x0 0 (gdb) x/5i ($cs*16)+$eip 0xc395e: rep insb (%dx),%es:(%edi) <--- here 0xc3961: mov $0xa,%ax 0xc3964: mov $0x510,%dx 0xc3967: out %ax,(%dx) 0xc3968: mov $0x511,%dx Port 0x511 is PORT_QEMU_CFG_DATA, ie. fw_cfg data. I don't know the full size of the read or the destination, because by the time I hit ^C it had already copied maybe half of the data, so %ecx and %edi have already decremented/ incremented from the initial values. Lots of places in the seabios code call qemu_cfg_read and it could be any one of them, so still debugging this. (In reply to comment #4) > Even enabling full debugging in seabios didn't make the > problem obvious, so I took a Fedora 17 (baremetal) host > and updated it component by component to Fedora 18 until > the bug manifested itself. > > The component to blame is the *host kernel* (NB: not the > guest kernel). > > Updating from 3.4.0-1.fc17.x86_64 -> 3.6.0-0.rc1.git4.1.fc18.x86_64 > is what triggers the problem. > > So I guess this is some sort of regression in the KVM module, > perhaps causing guest memory or I/O access to be slower. That has all of the various kernel debugging options enabled. They are known to slow things down. You might want to test a non-debug kernel. Grab http://koji.fedoraproject.org/koji/buildinfo?buildID=345823 or wait for the first -rc2 kernel to be built if that one has bugs that hold you up. I don't know why there is a regression. However I do have a fix for this, which is to apply Gleb's (non-upstream) patch series to improve rep ins performance: http://comments.gmane.org/gmane.comp.emulators.kvm.devel/95259 I applied this series on top of 3.6.0-0.rc1.git3.2 and essentially the pause goes away. Going from "Booting the ROM" to the first kernel message is about 0.8 seconds (versus 3 before). (In reply to comment #6) > That has all of the various kernel debugging options enabled. They are > known to slow things down. You might want to test a non-debug kernel. Grab How can I find out if a kernel has debugging options enabled? (In reply to comment #7) > http://comments.gmane.org/gmane.comp.emulators.kvm.devel/95259 > > I applied this series on top of 3.6.0-0.rc1.git3.2 and > essentially the pause goes away. Going from "Booting the ROM" > to the first kernel message is about 0.8 seconds (versus 3 before). Just to be clearer about this. I tested two identical host kernels except that one had Gleb's patch series applied. So whether or not debugging is enabled in this kernel, the patch series still makes a big difference. (In reply to comment #8) > (In reply to comment #6) > > That has all of the various kernel debugging options enabled. They are > > known to slow things down. You might want to test a non-debug kernel. Grab > > How can I find out if a kernel has debugging options enabled? If you're using rawhide, we build one kernel without debug set per RC/release. So the first build of rc1, rc2, etc will have them off. Otherwise, they're always enabled. Anything with -gitX will have debug options enabled. (In reply to comment #6) > That has all of the various kernel debugging options enabled. They are > known to slow things down. You might want to test a non-debug kernel. What I did was I removed Gleb's patches and cherry picked d1027f01240c3426211a80146f8c36c2b5736865 (which disables debug options). So now I have a kernel which has debugging off, but is otherwise completely identical to the one I was testing before (eg. in comment 4). I rebooted the host with this kernel and ran the test again. The delay is now ~ 1.2 seconds, so it does seem likely that the regression was caused by adding debugging options. Nevertheless Gleb's patches are a clear benefit (even with debugging on). OK I'm going to close this because I'm satisfied with the boot times I'm getting without debug enabled. |