Bug 883311 - kernels > 2.6.32-328 does not boot in qemu-kvm-1.0.1-2.fc17.x86_64
Summary: kernels > 2.6.32-328 does not boot in qemu-kvm-1.0.1-2.fc17.x86_64
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: qemu
Version: 18
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Fedora Virtualization Maintainers
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-12-04 09:59 UTC by Petr Pisar
Modified: 2013-09-02 16:26 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2013-08-31 14:39:33 UTC
Type: Bug
Embargoed:



Description Petr Pisar 2012-12-04 09:59:33 UTC
My x86_64 RHEL-6.4 virtual machine on Fedora 17 stopped booting with recent RHEL kernels. Grub clears the screen just before loading the kernel, I get a cursor in the top left corner, and that's all. The last kernel known to boot is 2.6.32-328, though even that one sometimes takes a significant time to boot. Kernels such as 2.6.32-338 or 2.6.32-343 do not boot at all (or the delay is so long that I have never waited long enough).

I start the virtual machine with this command:

qemu-kvm -hda /home/petr/virtual/rhel-6.4.disk -boot c -net nic,macaddr=00:50:54:00:0a:64 -net tap,ifname=taprhel6_4x86,script=no -vga std -m 1024 -smp 2

The guest uses the default RHEL-6 kernel arguments and Grub settings.

The host runs qemu-kvm-1.0.1-2.fc17.x86_64.

Maybe it's a bug in Fedora KVM or in seabios-1.7.1-1.fc17.x86_64. Maybe it's a bug in the RHEL kernel.

Comment 2 RHEL Program Management 2012-12-14 08:22:17 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 3 Cole Robinson 2013-05-24 18:32:40 UTC
Does this still reproduce with a fully updated Fedora 17 and RHEL 6.4?

Comment 4 Petr Pisar 2013-05-27 06:44:38 UTC
I had to move the virtual machine to a different physical machine where I still have Fedora 17 to test it. I removed the network configuration and kept the SDL output.

The result is that RHEL-6 2.6.32-358 sometimes boots. It never hangs, but it sometimes resets while loading or unpacking (?) the kernel. Once it resets, it keeps resetting until I quit and start qemu again. Sometimes I get an error message about corruption while decompressing, saying that the system was halted.

I have the same problem in Fedora 18.

It usually boots the first time, but the probability of failure rises on subsequent boots, especially if I manage to interrupt Grub and select the kernel manually from the Grub menu.

Comment 5 Andrew Jones 2013-05-27 11:32:11 UTC
Just to be sure I understand this correctly, I'll rewrite it in my own words:

You had a working rhel6 guest running over an f17 host. You didn't update anything on the host (at least not any virt-related s/w; kernel and qemu), you also didn't change anything about how you start your guest, i.e. the qemu command

> qemu-kvm -hda /home/petr/virtual/rhel-6.4.disk -boot c -net nic,macaddr=00:50:54:00:0a:64 -net tap,ifname=taprhel6_4x86,script=no -vga std -m 1024 -smp 2

is still the same as it was before. And, finally, the kernel command line in the guest is also the same as before.

If all that is correct, then the only variable is the guest kernel version (and thus this BZ should be in the kernel component). However, I'd prefer switching to a serial console on the guest, to eliminate even more variables and possibly get some useful information. Can you add console=ttyS0,115200n8 to your guest kernel command line and also change your qemu command line to use serial instead?

If the problem persists, then we should be able to bisect it.
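
A minimal sketch of that setup, assuming "use serial" means connecting the guest's first serial port to the terminal that starts qemu (-serial stdio). The disk path and sizing flags are copied from the description; the network flags are left out here. The script only prints the pieces and launches nothing.

```shell
# Values copied from the report; adjust for your own setup.
DISK=/home/petr/virtual/rhel-6.4.disk

# Guest side: append this to the kernel line in the guest's grub.conf.
GUEST_CONSOLE_ARG="console=ttyS0,115200n8"

# Host side: -serial stdio (an assumption about the intended serial setup)
# attaches the guest's first serial port to this terminal.
QEMU_CMD="qemu-kvm -hda $DISK -boot c -vga std -m 1024 -smp 2 -serial stdio"

# Print both pieces for review; nothing is executed.
echo "guest kernel argument: $GUEST_CONSOLE_ARG"
echo "host command:          $QEMU_CMD"
```

Keeping -vga std means the Grub menu still appears in the SDL window, while kernel messages go to the terminal, where they can be captured with a pipe or tee.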

Comment 6 Petr Pisar 2013-06-14 07:15:21 UTC
In my opinion something is wrong with the qemu.

Now I cannot boot any RHEL-5 guest, not even an installation image from the corporate PXE server. RHEL-5 stops somewhere between Grub and Linux.

RHEL-6 stops booting when I interrupt loading in Grub and then try to load a kernel. The first boot works because I'm not able to interrupt Grub (the delay is probably too short). The RHEL-6 issue is that it resets just when Grub tries to load a kernel. If I add console=ttyS0 to the kernel arguments, it hangs in the same way as RHEL-5.

When kernel loading hangs, the qemu process consumes all host CPU cycles.

This description is from the current F18.

Regarding the F17 host, of course it's updated. But it's F17.

It's hard to pass a console argument to the kernel if Grub does not have the serial console enabled. Thus I cannot eliminate the SDL output.

Comment 7 Andrew Jones 2013-06-14 10:56:34 UTC
(In reply to Petr Pisar from comment #6)
> In my opinion something is wrong with the qemu.
> 
> Because now I cannot boot any RHEL-5. Even any installation image from
> corporate PXE. The RHEL-5 stops somewhere between Grub and Linux.

Sounds like a dup of bug 967652, but that would imply the problems started after you updated the host kernel (kvm), and I understood that *only* the guest kernel had been updated between having a working config and a non-working one.

To confirm whether it's a dup of that bug, try the workaround that Paolo suggests:

modprobe kvm_intel emulate_invalid_guest_state=0
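
A hedged sketch of how that workaround is usually applied: the parameter only takes effect when the module is loaded, so kvm_intel has to be unloaded first (with no guests running). The commands are printed rather than executed because they need root. The file name kvm_intel.conf is an assumption; any *.conf file under /etc/modprobe.d is read.

```shell
# Dry run: commands are collected and printed, not executed.
PARAM=emulate_invalid_guest_state

# Unload the module, reload it with the parameter, then read the value back.
RELOAD_CMDS="modprobe -r kvm_intel
modprobe kvm_intel $PARAM=0
cat /sys/module/kvm_intel/parameters/$PARAM"

# Optional: keep the setting across reboots (file name is an assumption).
PERSIST_LINE="options kvm_intel $PARAM=0"

printf '%s\n' "$RELOAD_CMDS"
printf 'echo "%s" > /etc/modprobe.d/kvm_intel.conf\n' "$PERSIST_LINE"
```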

Comment 8 Petr Pisar 2013-06-14 12:02:49 UTC
The kvm_intel emulate_invalid_guest_state=0 helped with RHEL-5. Thanks for the hint.

The problem with RHEL-6 remains. I will try to change Grub to use the serial port to check whether the bug can be reproduced without SDL output.
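
A minimal sketch of what such a Grub change might look like, assuming the guest uses GRUB legacy with /boot/grub/grub.conf (the RHEL-6 default). The script writes the candidate lines to a scratch file for review and does not touch the real configuration; the scratch file name is hypothetical.

```shell
# Scratch file for review only; /boot/grub/grub.conf is not modified.
FRAGMENT=./grub-serial.fragment

cat > "$FRAGMENT" <<'EOF'
# Place above the first "title" entry in the guest's /boot/grub/grub.conf.
serial --unit=0 --speed=115200
terminal --timeout=10 serial console
EOF

# Show the fragment so it can be copied into the guest by hand.
cat "$FRAGMENT"
```

With these lines Grub shows its menu on both the serial port and the normal console. A splashimage line, if present, typically has to be commented out for the serial terminal to work.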

Comment 9 Fedora End Of Life 2013-07-04 04:14:14 UTC
This message is a reminder that Fedora 17 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 17. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '17'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 17's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 17 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora, you are encouraged to change the
'version' to a later Fedora version prior to Fedora 17's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 10 Cole Robinson 2013-07-11 19:43:33 UTC
Petr, 3.9.5 in Fedora 18 should have some improvements:

https://bugzilla.redhat.com/show_bug.cgi?id=967652#c11

Can you update to that and see if the RHEL6 issue persists?

Comment 11 Petr Pisar 2013-07-12 06:34:23 UTC
No improvement. Even the RHEL-5 issue remains. (host: kernel-3.9.9-201.fc18.x86_64, qemu-kvm-1.2.2-13.fc18.x86_64).

Comment 12 Cole Robinson 2013-07-12 12:48:01 UTC
Petr, does passing -no-kvm to qemu-kvm help booting RHEL5 or RHEL6?
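
A short sketch of that comparison, assuming the same base command as in the description with the network flags dropped. It prints the two command lines side by side rather than launching anything.

```shell
# Base command copied from the report (network flags omitted).
DISK=/home/petr/virtual/rhel-6.4.disk
BASE="qemu-kvm -hda $DISK -boot c -vga std -m 1024 -smp 2"

# -no-kvm makes qemu-kvm fall back to pure software emulation (TCG),
# which takes the KVM kernel module out of the picture.
echo "with KVM:    $BASE"
echo "without KVM: $BASE -no-kvm"
```

If the guest boots reliably only in the second form, the problem is more likely on the KVM side than in qemu's device emulation or the guest kernel.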

Comment 13 Cole Robinson 2013-08-31 14:39:33 UTC
No response for 6 weeks, closing. If anyone can still reproduce this, please reopen and try the suggestion in comment #12.

Comment 14 Petr Pisar 2013-09-02 05:29:08 UTC
With -no-kvm the guest is so slow that it's not possible to interrupt Grub from the keyboard.

I don't have time to play with your try-and-report tips because of my own work. I'm just amazed you have never tried to reproduce the issue and catch the bug yourself.

Comment 15 Cole Robinson 2013-09-02 16:26:21 UTC
(In reply to Petr Pisar from comment #14)
> -no-kvm is so slow it's not possible to interrupt the grub by a keyboard.
> 
> I don't have time playing with your try-and-report tips because of my own
> business. I'm just amazed you have never tried to reproduce the issue
> yourself and catch the bug yourself.

Apologies that I didn't mention it, but I had no problems booting RHEL5, RHEL6, F16, F17, and F18 guests on my Intel F18 machine. All your reports basically indicate that it was something relatively specific to your hardware, so there is not much more that can be done in this case besides 'try and report'. Really, if this was broken for everyone, there would be a lot more commotion on this bug. Some people did report similar issues, but emulate_invalid_guest_state=0 fixed things for them, and then, as work was done upstream, that became unnecessary IIRC.
