My x86_64 RHEL-6.4 virtual machine on Fedora 17 stopped booting with recent RHEL kernels. Grub clears the screen just before loading the kernel, I get a cursor in the top left corner, and that's all. The last kernel known to boot is 2.6.32-328, although even it sometimes takes a significant time to boot. Kernels 2.6.32-338 and 2.6.32-343 do not boot at all (or the delay is so long that I have never waited long enough).
I start the virtual machine with this command:
qemu-kvm -hda /home/petr/virtual/rhel-6.4.disk -boot c -net nic,macaddr=00:50:54:00:0a:64 -net tap,ifname=taprhel6_4x86,script=no -vga std -m 1024 -smp 2
I have default RHEL-6 kernel arguments and grub settings.
My host runs qemu-kvm-1.0.1-2.fc17.x86_64.
Maybe it's a bug in Fedora KVM or seabios-1.7.1-1.fc17.x86_64. Maybe it's a bug in the RHEL kernel.
Does this reproduce with an updated Fedora 17 and RHEL 6.4?
I had to move the virtual machine to a different physical machine where I still have Fedora 17 to test it. I removed the network configuration and kept the SDL output.
The result is that RHEL-6 2.6.32-358 sometimes boots. It never hangs, but it sometimes resets while loading/unpacking (?) the kernel. Once it resets, it keeps resetting until I quit and start qemu again. Sometimes I get an error message about corruption while decompressing, saying that the system was halted.
I have the same problem in Fedora 18.
It usually boots the first time, but the probability of failure rises on the next boot, especially if I manage to interrupt Grub and select the kernel manually from the Grub menu.
Just to be sure I understand this correctly, I'll rewrite it in my own words:
You had a working rhel6 guest running over an f17 host. You didn't update anything on the host (at least not any virt-related s/w; kernel and qemu), you also didn't change anything about how you start your guest, i.e. the qemu command
> qemu-kvm -hda /home/petr/virtual/rhel-6.4.disk -boot c -net nic,macaddr=00:50:54:00:0a:64 -net tap,ifname=taprhel6_4x86,script=no -vga std -m 1024 -smp 2
is still the same as it was before. And, finally, the kernel command line in the guest is also the same as before.
If all that is correct, then the only variable is the guest kernel version (and thus this BZ should be in the kernel component). That said, I'd prefer switching to a serial console on the guest, in order to eliminate even more variables and possibly get some useful information. Can you add console=ttyS0,115200n8 to your guest kernel command line and also change your qemu command line to use serial instead?
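A minimal sketch of what that serial setup could look like. It only prints the command rather than starting a guest. The names DISK, GUEST_CONSOLE and serial_qemu_cmd are illustrative, and the network options from the original command are left out for brevity. It assumes the -nographic option, which disables the graphical window and redirects the emulated serial port to the terminal.

```shell
#!/bin/sh
# Sketch only: prints the command instead of running it.
# DISK, GUEST_CONSOLE and serial_qemu_cmd are illustrative names.
DISK=/home/petr/virtual/rhel-6.4.disk
GUEST_CONSOLE='console=ttyS0,115200n8'

serial_qemu_cmd() {
    # -nographic turns off the graphical window and sends the emulated
    # serial port to the terminal that qemu was started from.
    printf '%s\n' "qemu-kvm -hda $1 -boot c -m 1024 -smp 2 -nographic"
}

echo "Append to the guest kernel line: $GUEST_CONSOLE"
serial_qemu_cmd "$DISK"
```

The kernel argument goes on the kernel line of the guest's Grub entry; the printed qemu command is then run by hand.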
If the problem persists, then we should be able to bisect it.
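As a rough illustration of the bisection idea, the sketch below does a binary search between a known-good and a known-bad build number. The boots function is a hypothetical stand-in for an actual boot test, and FIRST_BAD exists only to make the sketch self-contained. It assumes that every intermediate build number is available to install and that failures are deterministic; with intermittent failures each step would need several boot attempts.

```shell
#!/bin/sh
# Hypothetical predicate: succeeds if the given build boots.
# Replace the body with a real (manual or scripted) boot test.
boots() {
    [ "$1" -lt "${FIRST_BAD:-335}" ]
}

# Binary search for the first failing build number in (good, bad].
# Assumes all builds before the culprit boot and all later ones fail.
first_bad_build() {
    good=$1
    bad=$2
    while [ $((bad - good)) -gt 1 ]; do
        mid=$(( (good + bad) / 2 ))
        if boots "$mid"; then
            good=$mid
        else
            bad=$mid
        fi
    done
    echo "$bad"
}

first_bad_build 328 338
```

Each iteration halves the range, so the ten builds between 328 and 338 need at most four boot tests.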
In my opinion something is wrong with the qemu.
Because now I cannot boot any RHEL-5. Even any installation image from corporate PXE. The RHEL-5 stops somewhere between Grub and Linux.
RHEL-6 stops booting when I interrupt loading in Grub and then try to load a kernel. The first boot works because I'm not able to interrupt Grub (probably too short a delay). The RHEL-6 issue is that it resets just when Grub tries to load a kernel. If I add console=ttyS0 to the kernel, it hangs in the same way as RHEL-5.
When loading kernel hangs, the qemu process consumes all host CPU cycles.
This is a description from current F18.
Regarding the F17 host, of course it's updated. But it's F17.
It's hard to pass a console argument to a kernel if Grub does not have the serial console enabled. Thus I cannot eliminate the SDL output.
(In reply to Petr Pisar from comment #6)
> In my opinion something is wrong with the qemu.
> Because now I cannot boot any RHEL-5. Even any installation image from
> corporate PXE. The RHEL-5 stops somewhere between Grub and Linux.
Sounds like a dup of bug 967652, but that would imply the problems started after you updated the host kernel (kvm), and I understood that *only* the guest kernel had been updated between the working config and the non-working config.
To confirm whether it's a dup of that bug, try the workaround that Paolo suggests:
modprobe kvm_intel emulate_invalid_guest_state=0
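The option only takes effect when the module is loaded, so kvm_intel has to be unloaded first if it is already in use. The sketch below prints the steps instead of performing them, so it is safe to run as a normal user. The function name modprobe_option_line and the file name /etc/modprobe.d/kvm.conf are examples, not something the bug mandates.

```shell
#!/bin/sh
# Sketch only: prints what would be done; nothing is loaded or written.
PARAM=/sys/module/kvm_intel/parameters/emulate_invalid_guest_state

# Line for a modprobe.d file (the file name used below is an example).
modprobe_option_line() {
    printf 'options kvm_intel emulate_invalid_guest_state=%s\n' "$1"
}

# Report the current value if the module is loaded on this host.
if [ -r "$PARAM" ]; then
    echo "current value: $(cat "$PARAM")"
else
    echo "kvm_intel is not loaded (or this is not an Intel host)"
fi

echo "To reload by hand (as root, with no guests running):"
echo "  modprobe -r kvm_intel"
echo "  modprobe kvm_intel emulate_invalid_guest_state=0"
echo "To persist across reboots, put this line in /etc/modprobe.d/kvm.conf:"
echo "  $(modprobe_option_line 0)"
```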
The kvm_intel emulate_invalid_guest_state=0 workaround helped with RHEL-5. Thanks for the hint.
The problem with RHEL-6 remains. I will try to change Grub to use the serial port to check whether the bug can be reproduced without SDL output.
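For reference, a sketch of the GRUB Legacy directives that are normally used for this, assuming the guest uses the stock RHEL-6 Grub (0.97) and that the first serial port at 115200 baud is wanted. The function name grub_serial_stanza is illustrative; the script only prints the lines and does not touch any file.

```shell
#!/bin/sh
# Prints the two GRUB Legacy directives; does not modify grub.conf.
grub_serial_stanza() {
    cat <<'EOF'
serial --unit=0 --speed=115200
terminal --timeout=5 serial console
EOF
}

grub_serial_stanza
```

The lines would go near the top of /boot/grub/grub.conf in the guest, before the first title entry. A splashimage line, if present, usually has to be commented out for the serial terminal to be used.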
Petr, 3.9.5 in Fedora 18 should have some improvements:
Can you update to that and see if the RHEL6 issue persists?
No improvement. Even the RHEL-5 issue remains. (host: kernel-3.9.9-201.fc18.x86_64, qemu-kvm-1.2.2-13.fc18.x86_64).
Petr, does passing -no-kvm to qemu-kvm help booting RHEL5 or RHEL6?
No response for 6 weeks, closing. If anyone can still reproduce this, please reopen and try the suggestion in Comment #11.
-no-kvm is so slow it's not possible to interrupt the grub by a keyboard.
I don't have time playing with your try-and-report tips because of my own business. I'm just amazed you have never tried to reproduce the issue yourself and catch the bug yourself.
(In reply to Petr Pisar from comment #14)
> -no-kvm is so slow it's not possible to interrupt the grub by a keyboard.
> I don't have time playing with your try-and-report tips because of my own
> business. I'm just amazed you have never tried to reproduce the issue
> yourself and catch the bug yourself.
Apologies that I didn't mention it, but I had no problems booting RHEL5, RHEL6, F16, F17, and F18 guests on my Intel F18 machine. All your reports basically indicate that it was something relatively specific to your hardware, so there is not much more that can be done in this case besides 'try and report'. Really, if this was broken for everyone, there would be a lot more commotion on this bug. Some people did report similar issues, but emulate_invalid_guest_state=0 fixed things for them, and then, as work was done upstream, that became unnecessary IIRC.