Description of problem:
RHEL 4.9 32-bit kernel 2.6.9-103.EL running as guest
hangs/spins on boot. Works fine with upstream
Version-Release number of selected component (if applicable):
RHEL 6.3 distro as-of report date
64-bit kernel 2.6.32-279.el6
Start VM with 'virsh start guest'.
Boot sequence proceeds normally at first, but
hangs with blank screen after video display size change.
Boot parameters are
hangs when console size changes per vga=791;
'top' shows guest consuming 100% of CPU core
(one virtual CPU is configured)
works fine under 64-bit 3.1.8 upstream kernel
Guest kernel slightly modified to inhibit timer fallback.
However suspect the video size change from vga=791
is causing the problem and the timer tweak is
Last few years RH appears indifferent to bug reports
that don't originate with paying customers.
Therefore not wasting any effort to include
supporting materials unless requested.
Willing to provide configuration files and guest
kernel patch, but not willing to spend time
on activities that would take the system offline.
It is a stable production system. Cannot afford to
have it down so if RH is interested in fixing this
it will have to be reproduced at RH first.
Willing to support an effort to reproduce
the problem. Might be the trivial item of
adding "vga=791" to the boot line of a 32-bit
RHEL 4 guest.
firstname.lastname@example.org, thanks for reporting it. Did you ever test it without your patch?
Anyway, can someone from QE please try to reproduce this with a pristine, up-to-date RHEL4 32bit guest?
Testcase should be: adding "vga=791" to the boot line of a RHEL4 32-bit guest and checking if it boots normally. A test on RHEL6 is also welcome. Thanks.
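For anyone setting up that testcase, here is a minimal sketch of adding the argument to a legacy GRUB config. It is not taken from the report: the helper name append_kernel_arg is hypothetical, and it assumes a grub.conf-style file in which each boot entry has a line starting with "kernel". It writes the result to stdout rather than editing in place, so it can be tried on a copy first.

```shell
# Hypothetical helper (not from the report): append one argument to every
# "kernel" line of a legacy GRUB config, skipping lines that already have it.
# Prints the modified config on stdout; the input file is left untouched.
append_kernel_arg() {
    conf="$1"   # path to a copy of grub.conf
    arg="$2"    # argument to add, e.g. vga=791
    awk -v arg="$arg" '
        /^[ \t]*kernel[ \t]/ {
            present = 0
            for (i = 1; i <= NF; i++) if ($i == arg) present = 1
            if (!present) $0 = $0 " " arg
        }
        { print }
    ' "$conf"
}
```

Possible usage inside the guest, assuming a copy of the config at a path of your choosing: `append_kernel_arg ./grub.conf.copy vga=791 > ./grub.conf.new`, then review the result before installing it.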
(In reply to comment #0)
(In reply to comment #2)
> Testcase should be: adding "vga=791" to the boot line of a RHEL4 32-bit
> guest and checking if it boots normally. A test on RHEL6 is also welcome.
I could not reproduce this issue: both the rhel6.3_64bit and rhel4.9_32bit guests booted successfully with 'vga=791' appended to the guest kernel line on my rhel6.3_64bit host, and neither guest showed any problem during boot. Please correct me if I missed something.
email@example.com, could you paste your qemu-kvm command line and guest kernel line here? They are very important for me to reproduce this issue. Thanks.
# /usr/libexec/qemu-kvm -M rhel6.3.0 -cpu host -enable-kvm -m 2048 -smp 2 -usb -name sluo-action -uuid `uuidgen` -drive file=RHEL-4.9-32-virtio.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,werror=stop,rerror=stop -device virtio-blk-pci,bus=pci.0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,id=hostnet0,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=08:2E:5F:0A:0D:B3 -vnc :1 -device sga -serial stdio -monitor unix:/tmp/monitor1,server,nowait
Created attachment 599783 [details]
qemu start line for VM that hangs
Can provide libvirt definition and other materials as needed.
It's been a while since I tweaked the kernel, so I forget the exact details, but I wanted this VM to run with an alternate timer mode that results in less guest-clock drift. The regular RHEL 4 kernel decided the alternate timer is unstable and would fall back to a default. The patch comments out the fall-back logic. Would have to look at it to remember which of the three or four timer flavors was desired. Can provide the patch if needed--it's simple.
Created attachment 599787 [details]
libvirt XML definition for VM
Here's the XML definition in case it's useful.
I reproduced this issue on a rhel6.3 64-bit host (kernel-2.6.32-284.el6.x86_64) with a rhel4.9 32-bit guest (kernel-2.6.9-100.EL) configured with one virtual CPU. The boot sequence proceeds normally at first, but hangs with a blank screen after the video display size change when 'clock=pmtmr hpet=disable divider=20 vga=791 hdc=ide-scsi' is appended to the guest kernel line. Running 'top' on the host shows the guest consuming 100% CPU.
I also booted the rhel6.3 64-bit guest more than 30 times; it booted correctly every time.
Version-Release number of selected component (if applicable):
# uname -r && rpm -q qemu-kvm
guest name: RHEL-4.9-32-virtio.qcow2
# cat /boot/grub/grub.conf
title Red Hat Enterprise Linux AS-up (2.6.9-100.EL)
kernel /vmlinuz-2.6.9-100.EL ro root=/dev/VolGroup00/LogVol00 clock=pmtmr hpet=disable divider=20 vga=791 hdc=ide-scsi
almost always, but not 100% of the time
Steps to Reproduce:
1. Boot a rhel4.9 32-bit guest with one virtual CPU, appending 'clock=pmtmr hpet=disable divider=20 vga=791 hdc=ide-scsi' to the guest kernel line.
eg: # /usr/libexec/qemu-kvm -M rhel6.3.0 -cpu host -enable-kvm -m 1024 -smp 1,sockets=1,cores=1,threads=1 -usb -name sluo-action -uuid `uuidgen` -drive file=RHEL-4.9-32-virtio.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,werror=stop,rerror=stop -device virtio-blk-pci,bus=pci.0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,id=hostnet0,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=08:2E:5F:0A:0D:B3,bootindex=2 -vnc 0.0.0.0:1 -vga cirrus -device sga -serial stdio -monitor unix:/tmp/monitor1,server,nowait
2. Run 'top' on the host to see how much CPU the guest is consuming.
After step 2, the guest hangs with a blank screen and consumes 100% CPU.
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5516 root 20 0 1664m 134m 4444 S 100.1 1.7 6:52.77 qemu-kvm
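The check above can also be scripted. This is a rough sketch only: the helper name busy_qemu_pids and the default threshold of 95 are assumptions, and it relies on the column layout shown in the 'top' output above (%CPU in column 9, command name last). It reads 'top'-style lines on stdin and prints the PID of each qemu-kvm process at or above the threshold.

```shell
# Hypothetical helper (not from the report): filter `top`-style process lines
# and print the PID of every qemu-kvm process whose %CPU is >= a threshold.
# Assumes the default column order: PID in column 1, %CPU in column 9,
# and the command name as the last field.
busy_qemu_pids() {
    threshold="${1:-95}"   # percent of one core; 95 is an arbitrary default
    awk -v t="$threshold" '$NF == "qemu-kvm" && ($9 + 0) >= t { print $1 }'
}
```

For example, `top -b -n 1 | busy_qemu_pids 95` on the host. A single sample is only a hint; a guest that stays at about 100% across repeated samples while its screen is blank matches the hang described here.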
Best wishes & thanks.
firstname.lastname@example.org, thank you for taking the time to enter this bug and give us the detailed information. We appreciate the feedback and look to use reports such as this to guide our efforts at improving our products.
That being said, this bug is a corner case under an old RHEL release and therefore is of very low priority for us. Since we have a full queue of bugs in RHEL6 and RHEL7 and we can't reproduce with a RHEL6.3 guest, I'm closing it for now.
If this issue is critical for you, please raise a ticket through your regular Red Hat support channels to make certain it receives the proper attention and prioritization to assure a timely resolution.
For information on how to contact the Red Hat production support team, please visit: https://www.redhat.com/support/process/production/#howto
I'm giving up on reporting future bugs--this was
the last. Complete waste of time. Long ago RH
viewed fixing bugs as good policy, that overall
quality mattered. Haven't seen that attitude in
years. Of late upstream code quality is better,
particularly with the kernel, and the developers
are vastly more responsive to good reports.
RH should consider advancing kernel versions with
"dot" releases instead of trying to backport new
features. The kernel is much more complex than it
was and backporting is labor intensive and
suboptimal from a quality perspective. Add the
resources to kernel.org instead. Have been
running upstream 3.1 from the get-go
since the RH variant never worked properly.
Next time I touch this system it will be when
an important CVE arrives, at which point I'll
take a newer kernel.org branch that's not EOL.