|Summary:||KVM 32-bit RHEL 4 guest hangs after screen size change from vga=791 parameter|
|Product:||Red Hat Enterprise Linux 6||Reporter:||starlight|
|Component:||qemu-kvm||Assignee:||Virtualization Maintenance <virt-maint>|
|Status:||CLOSED CURRENTRELEASE||QA Contact:||Virtualization Bugs <virt-bugs>|
|Version:||6.3||CC:||acathrow, areis, bsarathy, dyasny, juzhang, mkenneth, sluo, virt-maint|
|Fixed In Version:||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2012-07-24 20:37:18 UTC||Type:||Bug|
Description starlight 2012-07-17 19:42:24 UTC
Description of problem:
RHEL 4.9 32-bit kernel 2.6.9-103.EL running as guest hangs/spins on boot. Works fine with upstream 3.1.8 kernel.

Version-Release number of selected component (if applicable):
RHEL 6.3 distro as-of report date
64-bit kernel 2.6.32-279.el6

How reproducible:
Start VM with 'virsh start guest'. Boot sequence proceeds normally at first, but hangs with blank screen after video display size change. Boot parameters are

ro root=/dev/vg00/lv00 clock=pmtmr hpet=disable divider=20 vga=791 hdc=ide-scsi

Actual results:
hangs when console size changes per vga=791; 'top' shows guest consuming 100% of CPU core (one virtual CPU is configured)

Expected results:
works fine under 64-bit 3.1.8 upstream kernel

Additional info:
Guest kernel slightly modified to inhibit timer fallback. However suspect the video size change from vga=791 is causing the problem and the timer tweak is irrelevant.

Last few years RH appears indifferent to bug reports that don't originate with paying customers. Therefore not wasting any effort to include supporting materials unless requested. Willing to provide configuration files and guest kernel patch, but not willing to spend time on activities that would take the system offline. It is a stable production system. Cannot afford to have it down so if RH is interested in fixing this it will have to be reproduced at RH first. Willing to support an effort to reproduce the problem. Might be the trivial item of adding "vga=791" to the boot line of a 32-bit RHEL 4 guest.
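For readers unfamiliar with the vga= parameter: the Linux kernel's numeric video mode for a VESA framebuffer is the VESA mode number plus 0x200, so vga=791 (0x317) selects VESA mode 0x117, which the kernel's vesafb documentation lists as 1024x768 with 64k colors. A minimal sketch of that decoding follows; the function name and the small lookup table (1024x768 row only) are illustrative and not part of this report.

```python
# Subset of the VESA mode table from the Linux vesafb documentation
# (1024x768 row only); values are (width, height, bits per pixel).
VESA_MODES = {
    0x105: (1024, 768, 8),
    0x116: (1024, 768, 15),
    0x117: (1024, 768, 16),
    0x118: (1024, 768, 24),
}


def decode_vga_param(value):
    """Decode a numeric vga= boot parameter value.

    Returns (vesa_mode, geometry), where geometry is None if the mode
    is not in the small illustrative table above.
    """
    linux_mode = int(value, 0)      # accepts "791" as well as "0x317"
    vesa_mode = linux_mode - 0x200  # Linux mode number = VESA mode + 0x200
    return vesa_mode, VESA_MODES.get(vesa_mode)


vesa_mode, geometry = decode_vga_param("791")
print(hex(vesa_mode), geometry)
```

This only explains what the parameter requests. It does not say anything about why the guest spins once the mode switch happens.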
Comment 2 Ademar Reis 2012-07-20 00:18:09 UTC
email@example.com, thanks for reporting it. Did you ever test it without your patch? Anyway, can someone from QE please try to reproduce this with a pristine, up-to-date RHEL4 32bit guest? Testcase should be: adding "vga=791" to the boot line of a RHEL4 32-bit guest and checking if it boots normally. A test on RHEL6 is also welcome. Thanks.
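The proposed test case amounts to appending one parameter to the guest's grub kernel line. A minimal sketch of doing that idempotently is shown below; the helper name is hypothetical, and the sample kernel line is only an assumed example of what a stock RHEL4 entry might look like.

```python
def add_kernel_param(kernel_line, param):
    """Append param (e.g. 'vga=791') to a grub 'kernel' line.

    If a parameter with the same key is already present, the line is
    returned unchanged so the edit can be applied repeatedly.
    """
    key = param.split("=", 1)[0]
    for token in kernel_line.split():
        if token.split("=", 1)[0] == key:
            return kernel_line
    return kernel_line.rstrip() + " " + param


# Assumed example line, not taken from the reporter's system.
line = "kernel /vmlinuz-2.6.9-100.EL ro root=/dev/VolGroup00/LogVol00"
line = add_kernel_param(line, "vga=791")
print(line)
```

Applying the helper a second time leaves the line untouched, which avoids duplicate vga= entries when a test is re-run.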
Comment 3 Sibiao Luo 2012-07-23 06:07:50 UTC
(In reply to comment #0)
> (In reply to comment #2)
> > Testcase should be: adding "vga=791" to the boot line of a RHEL4 32-bit
> > guest and checking if it boots normally. A test on RHEL6 is also welcome.
> > Thanks.

Hi all,

I could not reproduce this issue. Both the rhel6.3_64bit and rhel4.9_32bit guests booted successfully with 'vga=791' appended to the guest kernel line on my rhel6.3_64bit host, and neither guest showed any issue during the boot process. Please correct me if I got anything wrong.

firstname.lastname@example.org, could you paste your qemu-kvm command line and guest kernel line here? That's very important for me to reproduce this issue. Thx.

my qemu-kvm command line:
# /usr/libexec/qemu-kvm -M rhel6.3.0 -cpu host -enable-kvm -m 2048 -smp 2 \
    -usb -name sluo-action -uuid `uuidgen` \
    -drive file=RHEL-4.9-32-virtio.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,werror=stop,rerror=stop \
    -device virtio-blk-pci,bus=pci.0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
    -netdev tap,id=hostnet0,script=/etc/qemu-ifup \
    -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=08:2E:5F:0A:0D:B3 \
    -vnc :1 -device sga -serial stdio \
    -monitor unix:/tmp/monitor1,server,nowait

Best wishes.
sluo
Comment 4 starlight 2012-07-23 13:52:38 UTC
Created attachment 599783
qemu start line for VM that hangs

Can provide libvirt definition and other materials as needed.

It's been a while since I tweaked the kernel, so I forget the exact details, but I wanted this VM to run with an alternate timer mode that results in less guest-clock drift. The regular RHEL 4 kernel decided the alternate timer is unstable and would fall back to a default. The patch comments out the fall-back logic. Would have to look at it to remember which of the three or four timer flavors was desired. Can provide the patch if needed--it's simple.
Comment 5 starlight 2012-07-23 14:05:14 UTC
Created attachment 599787
libvirt XML definition for VM

Here's the XML definition in case it's useful.
Comment 6 Sibiao Luo 2012-07-24 03:22:57 UTC
Hi all,

I reproduced this issue on a rhel6.3 64-bit kernel-2.6.32-284.el6.x86_64 host with a rhel4.9 32-bit kernel-2.6.9-100.EL guest with one virtual CPU. The boot sequence proceeds normally at first, but hangs with a blank screen after the video display size change if 'clock=pmtmr hpet=disable divider=20 vga=791 hdc=ide-scsi' is appended to the guest kernel line. Running 'top' on the host shows the guest consuming 100% CPU. I also tested the rhel6.3 64-bit guest more than 30 times, and it booted correctly without any problem.

Version-Release number of selected component (if applicable):
Host info:
# uname -r && rpm -q qemu-kvm
2.6.32-284.el6.x86_64
qemu-kvm-0.12.1.2-2.295.el6.x86_64
Guest info:
guest name: RHEL-4.9-32-virtio.qcow2
# cat /boot/grub/grub.conf
...
title Red Hat Enterprise Linux AS-up (2.6.9-100.EL)
        root (hd0,0)
        kernel /vmlinuz-2.6.9-100.EL ro root=/dev/VolGroup00/LogVol00 clock=pmtmr hpet=disable divider=20 vga=791 hdc=ide-scsi
        initrd /initrd-2.6.9-100.EL.img

How reproducible:
almost always, but not 100%

Steps to Reproduce:
1. Boot a rhel4.9 32-bit guest with one virtual CPU, appending 'clock=pmtmr hpet=disable divider=20 vga=791 hdc=ide-scsi' to the guest kernel line.
eg:
# /usr/libexec/qemu-kvm -M rhel6.3.0 -cpu host -enable-kvm -m 1024 \
    -smp 1,sockets=1,cores=1,threads=1 \
    -usb -name sluo-action -uuid `uuidgen` \
    -drive file=RHEL-4.9-32-virtio.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,werror=stop,rerror=stop \
    -device virtio-blk-pci,bus=pci.0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
    -netdev tap,id=hostnet0,script=/etc/qemu-ifup \
    -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=08:2E:5F:0A:0D:B3,bootindex=2 \
    -vnc 0.0.0.0:1 -vga cirrus -device sga -serial stdio \
    -monitor unix:/tmp/monitor1,server,nowait
2. Run 'top' on the host to check the guest's CPU consumption.

Test results:
After step 2, the guest hangs with a blank screen and consumes 100% CPU.
# top
...
  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
 5516 root      20   0 1664m 134m 4444 S 100.1  1.7   6:52.77 qemu-kvm
...

Best wishes & thx.
sluo
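The 'top' check in the steps above can be sketched as a small parser that flags a qemu-kvm process pinned near 100% of one core. This is only an illustration: the column list matches the header shown in this comment, while the function names and the 95% threshold are assumptions. High CPU use alone does not prove the guest is hung; it is one symptom to combine with the blank console.

```python
# Column order as printed in the 'top' header quoted in this comment.
TOP_COLUMNS = ["PID", "USER", "PR", "NI", "VIRT", "RES", "SHR", "S",
               "%CPU", "%MEM", "TIME+", "COMMAND"]


def parse_top_line(line):
    """Split one process row of 'top' output into a dict keyed by column."""
    # Limit the split so a COMMAND containing spaces stays in one field.
    fields = line.split(None, len(TOP_COLUMNS) - 1)
    return dict(zip(TOP_COLUMNS, fields))


def is_spinning(row, threshold=95.0):
    """Return True if the row is a qemu-kvm process at or above threshold %CPU.

    The threshold is a hypothetical cut-off for a single-vCPU guest.
    """
    return (row["COMMAND"].startswith("qemu-kvm")
            and float(row["%CPU"]) >= threshold)


sample = "5516 root 20 0 1664m 134m 4444 S 100.1 1.7 6:52.77 qemu-kvm"
row = parse_top_line(sample)
print(row["PID"], row["%CPU"], is_spinning(row))
```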
Comment 7 Ademar Reis 2012-07-24 20:37:18 UTC
email@example.com, thank you for taking the time to enter this bug and give us the detailed information. We appreciate the feedback and look to use reports such as this to guide our efforts at improving our products. That being said, this bug is a corner case under an old RHEL release and therefore is of very low priority for us. Since we have a full queue of bugs in RHEL6 and RHEL7 and we can't reproduce with a RHEL6.3 guest, I'm closing it for now. If this issue is critical for you, please raise a ticket through your regular Red Hat support channels to make certain it receives the proper attention and prioritization to assure a timely resolution. For information on how to contact the Red Hat production support team, please visit: https://www.redhat.com/support/process/production/#howto
Comment 8 starlight 2012-07-24 21:01:51 UTC
I'm giving up on reporting future bugs--this was the last. Complete waste of time. Long ago RH viewed fixing bugs as good policy, that overall quality mattered. Haven't seen that attitude in years. Of late upstream code quality is better, particularly with the kernel, and the developers are vastly more responsive to good reports. RH should consider advancing kernel versions with "dot" releases instead of trying to backport new features. The kernel is much more complex than it was and backporting is labor intensive and suboptimal from a quality perspective. Add the resources to kernel.org instead. Have been running upstream 3.1 from the get-go since the RH variant never worked properly. Next time I touch this system it will be when an important CVE arrives, at which point I'll take a newer kernel.org branch that's not EOL.