Bug 840990 - KVM 32-bit RHEL 4 guest hangs after screen size change from vga=791 parameter
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: Virtualization Maintenance
QA Contact: Virtualization Bugs

Reported: 2012-07-17 15:42 EDT by starlight
Modified: 2012-07-24 17:01 EDT (History)
Doc Type: Bug Fix
Last Closed: 2012-07-24 16:37:18 EDT
Type: Bug

Attachments:
qemu start line for VM that hangs (1.20 KB, text/plain)
2012-07-23 09:52 EDT, starlight
libvirt XML definition for VM (2.36 KB, text/plain)
2012-07-23 10:05 EDT, starlight

Description starlight 2012-07-17 15:42:24 EDT
Description of problem:

   RHEL 4.9 32-bit kernel 2.6.9-103.EL running as guest
   hangs/spins on boot.  Works fine with upstream
   3.1.8 kernel.

Version-Release number of selected component (if applicable):

   RHEL 6.3 distro as-of report date
   64-bit kernel 2.6.32-279.el6

How reproducible:

   Start VM with 'virsh start guest'.

   Boot sequence proceeds normally at first, but
   hangs with blank screen after video display size change.

   Boot parameters are

      ro
      root=/dev/vg00/lv00
      clock=pmtmr
      hpet=disable
      divider=20
      vga=791
      hdc=ide-scsi
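
For readers unfamiliar with the vga= boot parameter: the value is a VESA mode number with 0x200 added, and 791 is commonly documented as the 1024x768, 16 bits-per-pixel framebuffer mode. The Python sketch below decodes a value that way. The small lookup table and the function name describe_vga_param are illustrative only, and the table covers just a few well-known modes, not everything a given video BIOS may offer.

```python
# Commonly documented Linux "vga=" values for VESA framebuffer modes.
# Keys are the decimal values passed on the kernel command line.
# This table is a small illustrative subset, not an exhaustive list.
VESA_MODES = {
    785: (640, 480, 16),
    788: (800, 600, 16),
    791: (1024, 768, 16),
    794: (1280, 1024, 16),
}


def describe_vga_param(value: int) -> str:
    """Describe a decimal vga= value as a VBE mode number and, if known, a resolution."""
    vbe_mode = value - 0x200  # the kernel parameter is the VBE mode number plus 0x200
    text = f"vga={value} = {value:#x}, VBE mode {vbe_mode:#x}"
    if value in VESA_MODES:
        width, height, bpp = VESA_MODES[value]
        text += f" ({width}x{height}, {bpp} bpp)"
    return text


print(describe_vga_param(791))
```

So the "screen size change" in this report is the switch from text mode to a 1024x768 graphical framebuffer console during early boot.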

Actual results:

   hangs when console size changes per vga=791;
   'top' shows guest consuming 100% of CPU core
   (one virtual CPU is configured)

Expected results:

   works fine under 64-bit 3.1.8 upstream kernel

Additional info:

   Guest kernel slightly modified to inhibit timer fallback.

   However suspect the video size change from vga=791
   is causing the problem and the timer tweak is
   irrelevant.

   Last few years RH appears indifferent to bug reports
   that don't originate with paying customers.
   Therefore not wasting any effort to include
   supporting materials unless requested.
   Willing to provide configuration files and guest
   kernel patch, but not willing to spend time
   on activities that would take the system offline.
   It is a stable production system.  Cannot afford to
   have it down so if RH is interested in fixing this
   it will have to be reproduced at RH first.
   Willing to support an effort to reproduce
   the problem.  Might be the trivial item of
   adding "vga=791" to the boot line of a 32-bit
   RHEL 4 guest.
Comment 2 Ademar Reis 2012-07-19 20:18:09 EDT
starlight@binnacle.cx, thanks for reporting it. Did you ever test it without your patch?

Anyway, can someone from QE please try to reproduce this with a pristine, up-to-date RHEL4 32bit guest?

Testcase should be: adding "vga=791" to the boot line of a RHEL4 32-bit guest and checking if it boots normally. A test on RHEL6 is also welcome. Thanks.
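
The proposed test case amounts to appending one parameter to the guest's kernel line in grub.conf. A minimal Python sketch of that edit follows, assuming a legacy GRUB grub.conf layout like the one quoted later in this report. The function name append_kernel_param is hypothetical, and the sketch only transforms text; it does not touch any file on disk.

```python
def append_kernel_param(grub_conf: str, param: str) -> str:
    """Return grub.conf text with `param` appended to every kernel line lacking it."""
    out = []
    for line in grub_conf.splitlines():
        fields = line.split()
        # Legacy GRUB stanzas carry the boot parameters on the "kernel" line.
        if fields and fields[0] == "kernel" and param not in fields[1:]:
            line = line.rstrip() + " " + param
        out.append(line)
    return "\n".join(out) + "\n"


# Sample stanza modeled on the guest configuration quoted in this report.
sample = (
    "title Red Hat Enterprise Linux AS-up (2.6.9-100.EL)\n"
    "\troot (hd0,0)\n"
    "\tkernel /vmlinuz-2.6.9-100.EL ro root=/dev/VolGroup00/LogVol00\n"
    "\tinitrd /initrd-2.6.9-100.EL.img\n"
)
print(append_kernel_param(sample, "vga=791"))
```

Running the function a second time on its own output leaves the text unchanged, so the parameter is never duplicated.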
Comment 3 Sibiao Luo 2012-07-23 02:07:50 EDT
(In reply to comment #2)
> Testcase should be: adding "vga=791" to the boot line of a RHEL4 32-bit
> guest and checking if it boots normally. A test on RHEL6 is also welcome.
> Thanks.

Hi all,
  
   I could not reproduce this issue. Both the RHEL 6.3 64-bit and RHEL 4.9 32-bit guests booted successfully with 'vga=791' appended to the guest kernel line on my RHEL 6.3 64-bit host, and neither guest showed any problem during boot. Please correct me if I missed something.
 
   starlight@binnacle.cx, could you paste your qemu-kvm command line and guest kernel line here? They are very important for reproducing this issue. Thanks.

My qemu-kvm command line:
# /usr/libexec/qemu-kvm -M rhel6.3.0 -cpu host -enable-kvm -m 2048 -smp 2 -usb \
    -name sluo-action -uuid `uuidgen` \
    -drive file=RHEL-4.9-32-virtio.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,werror=stop,rerror=stop \
    -device virtio-blk-pci,bus=pci.0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
    -netdev tap,id=hostnet0,script=/etc/qemu-ifup \
    -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=08:2E:5F:0A:0D:B3 \
    -vnc :1 -device sga -serial stdio -monitor unix:/tmp/monitor1,server,nowait

Best wishes,
sluo
Comment 4 starlight 2012-07-23 09:52:38 EDT
Created attachment 599783 [details]
qemu start line for VM that hangs

Can provide libvirt definition and other materials as needed.

It's been a while since I tweaked the kernel, so I forget the exact details, but I wanted this VM to run with an alternate timer mode that results in less guest-clock drift.  The regular RHEL 4 kernel decides the alternate timer is unstable and falls back to a default.  The patch comments out the fall-back logic.  I would have to look at it to remember which of the three or four timer flavors I wanted.  I can provide the patch if needed; it's simple.
Comment 5 starlight 2012-07-23 10:05:14 EDT
Created attachment 599787 [details]
libvirt XML definition for VM

Here's the XML definition in case it's useful.
Comment 6 Sibiao Luo 2012-07-23 23:22:57 EDT
Hi all,

   I reproduced this issue on a RHEL 6.3 64-bit host (kernel-2.6.32-284.el6.x86_64) with a RHEL 4.9 32-bit guest (kernel-2.6.9-100.EL) configured with one virtual CPU. The boot sequence proceeds normally at first, but hangs with a blank screen after the video display size change when 'clock=pmtmr hpet=disable divider=20 vga=791 hdc=ide-scsi' is appended to the guest kernel line. Running 'top' on the host shows the guest consuming 100% CPU.
  
   I also booted the RHEL 6.3 64-bit guest more than 30 times; it booted correctly every time.

Version-Release number of selected component (if applicable):
Host info:
# uname -r && rpm -q qemu-kvm
2.6.32-284.el6.x86_64
qemu-kvm-0.12.1.2-2.295.el6.x86_64
Guest info:
guest name: RHEL-4.9-32-virtio.qcow2
# cat /boot/grub/grub.conf 
...
title Red Hat Enterprise Linux AS-up (2.6.9-100.EL)
	root (hd0,0)
	kernel /vmlinuz-2.6.9-100.EL ro root=/dev/VolGroup00/LogVol00 clock=pmtmr hpet=disable divider=20 vga=791 hdc=ide-scsi
	initrd /initrd-2.6.9-100.EL.img

How reproducible:
Nearly always, but not 100% of the time.

Steps to Reproduce:
1. Boot a RHEL 4.9 32-bit guest with one virtual CPU, appending 'clock=pmtmr hpet=disable divider=20 vga=791 hdc=ide-scsi' to the guest kernel line, e.g.:
# /usr/libexec/qemu-kvm -M rhel6.3.0 -cpu host -enable-kvm -m 1024 \
    -smp 1,sockets=1,cores=1,threads=1 -usb -name sluo-action -uuid `uuidgen` \
    -drive file=RHEL-4.9-32-virtio.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,werror=stop,rerror=stop \
    -device virtio-blk-pci,bus=pci.0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
    -netdev tap,id=hostnet0,script=/etc/qemu-ifup \
    -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=08:2E:5F:0A:0D:B3,bootindex=2 \
    -vnc 0.0.0.0:1 -vga cirrus -device sga -serial stdio -monitor unix:/tmp/monitor1,server,nowait
2. Run 'top' on the host to check the guest's CPU consumption.
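
The command line in step 1 is not identical to the one from comment 3, where the hang did not occur. The Python sketch below is one possible way to list the option differences between two qemu-kvm invocations. The helper names parse_qemu_args and diff_options are hypothetical, the two strings are abbreviated versions of the command lines in this report, and the simple "option followed by optional value" parsing is an assumption that fits these lines rather than a full parser for qemu's syntax.

```python
import shlex


def parse_qemu_args(cmdline: str) -> dict:
    """Map each option to the list of values it was given ('' for bare flags)."""
    tokens = shlex.split(cmdline)[1:]  # drop the binary path
    opts = {}
    i = 0
    while i < len(tokens):
        name = tokens[i]
        value = ""
        # Treat the next token as this option's value unless it is another option.
        if i + 1 < len(tokens) and not tokens[i + 1].startswith("-"):
            value = tokens[i + 1]
            i += 1
        opts.setdefault(name, []).append(value)
        i += 1
    return opts


def diff_options(a: str, b: str) -> dict:
    """Return {option: (values_in_a, values_in_b)} for options that differ."""
    pa, pb = parse_qemu_args(a), parse_qemu_args(b)
    return {
        name: (pa.get(name), pb.get(name))
        for name in sorted(set(pa) | set(pb))
        if pa.get(name) != pb.get(name)
    }


# Abbreviated versions of the two command lines quoted in this report.
no_hang = "/usr/libexec/qemu-kvm -M rhel6.3.0 -cpu host -enable-kvm -m 2048 -smp 2 -vnc :1"
hang = (
    "/usr/libexec/qemu-kvm -M rhel6.3.0 -cpu host -enable-kvm -m 1024 "
    "-smp 1,sockets=1,cores=1,threads=1 -vnc 0.0.0.0:1 -vga cirrus"
)

for name, (before, after) in diff_options(no_hang, hang).items():
    print(name, before, "->", after)
```

Note that the guest kernel lines also differed between the two runs (comment 3 appended only 'vga=791', while this run appended the full parameter set), so a difference in qemu options is not necessarily what triggers the hang.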

Test results:
After step 2, the guest has hung with a blank screen and is consuming 100% CPU.
# top
...
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                        
 5516 root      20   0 1664m 134m 4444 S 100.1  1.7   6:52.77 qemu-kvm
...
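
The check in step 2 can also be scripted. The following Python sketch scans captured 'top' output for a qemu-kvm process at or near 100% CPU. It assumes the 12-column layout shown above; the function name busy_processes and the 95% threshold are arbitrary choices for illustration.

```python
def busy_processes(top_output: str, command: str = "qemu-kvm", threshold: float = 95.0):
    """Return (pid, cpu_percent) pairs for `command` rows at or above `threshold`."""
    hits = []
    for line in top_output.splitlines():
        fields = line.split()
        # A process row in this layout has 12 columns:
        # PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
        if len(fields) < 12 or not fields[0].isdigit():
            continue  # skip the header, "..." markers, and blank lines
        pid, cpu, cmd = int(fields[0]), float(fields[8]), fields[11]
        if cmd == command and cpu >= threshold:
            hits.append((pid, cpu))
    return hits


# The rows quoted in this comment.
sample = """\
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 5516 root      20   0 1664m 134m 4444 S 100.1  1.7   6:52.77 qemu-kvm
"""
print(busy_processes(sample))
```

A single reading near 100% only shows the process is busy at that instant; a hang like the one described here would keep showing up across repeated samples.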

Best wishes and thanks,
sluo
Comment 7 Ademar Reis 2012-07-24 16:37:18 EDT
starlight@binnacle.cx, thank you for taking the time to enter this bug and give us the detailed information. We appreciate the feedback and look to use reports such as this to guide our efforts at improving our products.

That being said, this bug is a corner case under an old RHEL release and therefore is of very low priority for us. Since we have a full queue of bugs in RHEL6 and RHEL7 and we can't reproduce with a RHEL6.3 guest, I'm closing it for now.

If this issue is critical for you, please raise a ticket through your regular Red Hat support channels to make certain it receives the proper attention and prioritization to assure a timely resolution.

    For information on how to contact the Red Hat production support team, please visit: https://www.redhat.com/support/process/production/#howto
Comment 8 starlight 2012-07-24 17:01:51 EDT
I'm giving up on reporting future bugs--this was
the last.  Complete waste of time.  Long ago RH
viewed fixing bugs as good policy, that overall
quality mattered.  Haven't seen that attitude in
years.  Of late upstream code quality is better,
particularly with the kernel, and the developers
are vastly more responsive to good reports.

RH should consider advancing kernel versions with
"dot" releases instead of trying to backport new
features.  The kernel is much more complex than it
was and backporting is labor intensive and
suboptimal from a quality perspective.  Add the
resources to kernel.org instead.  Have been
running upstream 3.1 from the get-go
since the RH variant never worked properly.
Next time I touch this system it will be when
an important CVE arrives, at which point I'll
take a newer kernel.org branch that's not EOL.
