Description of problem:
The driver/device pair does not have a reset option to call when the driver is unloaded or the system is rebooted. As a result, there may be cases where the IRQ line stays asserted and the system consumes 100% CPU.

How reproducible:
Run qemu with -vmchannel di:0200,tcp:0:4444,server and connect a "telnet 0 4444" client with input redirected from a huge file. Then run winxp with the hypercall driver and reboot the system.

Actual results:
The system won't be able to power down on the reboot.

Expected results:
Successful reboot

Additional info:
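To illustrate the failure mode described above, here is a toy Python model of a level-triggered interrupt line. It is only a sketch, not the qemu or driver code: the class and function names are hypothetical, and it assumes that the line stays asserted while data is pending and that only a driver read or a device reset lowers it.

```python
class ToyDevice:
    """Hypothetical level-triggered device; NOT the real hypercall device."""

    def __init__(self):
        self.pending = 0           # bytes waiting for the guest driver
        self.irq_asserted = False  # state of the (level-triggered) IRQ line

    def host_write(self, nbytes):
        # Host side pushes data; the device raises its IRQ line.
        self.pending += nbytes
        self.irq_asserted = True

    def driver_read(self):
        # A loaded driver drains the data, which lowers the line.
        self.pending = 0
        self.irq_asserted = False

    def reset(self):
        # The piece this bug asks for: a reset hook that drops device
        # state and deasserts the line on driver unload / guest reboot.
        self.pending = 0
        self.irq_asserted = False


def reboot(dev, has_reset_hook):
    """Simulate a reboot: the driver is gone, so nobody calls driver_read()."""
    if has_reset_hook:
        dev.reset()
    return dev.irq_asserted  # True means the line is stuck asserted


def demo():
    stuck = ToyDevice()
    stuck.host_write(4096)
    fixed = ToyDevice()
    fixed.host_write(4096)
    return reboot(stuck, has_reset_hook=False), reboot(fixed, has_reset_hook=True)
```

In this model, demo() shows the line still asserted after a reboot without the reset hook and cleared with it, which is the behaviour the report expects from a fix.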
QA_ACK for 5.5. Probably should be cloned for 5.4.z, if we want it there as well.
*** Bug 536835 has been marked as a duplicate of this bug. ***
http://post-office.corp.redhat.com/archives/virtualist/2009-December/msg00640.html
(In reply to comment #0)
> Description of problem:
> The driver/device pair does not have a reset option to call when the driver is
> unloaded / system is rebooted.
> As a result, there might be cases where the irq line will stay asserted and the
> system will consume 100% cpu
>
> How reproducible:
>
> run qemu with -vmchannel di:0200,tcp:0:4444,server and connect a telnet 0 4444
> client with input redirection of a huge file .

Does it mean "#telnet 0 4444 < hugefile"?
Yep
(In reply to comment #0)
...
> How reproducible:
>
> run qemu with -vmchannel di:0200,tcp:0:4444,server and connect a telnet 0 4444
> client with input redirection of a huge file .
> Then, run winxp with the hypercall driver, reboot the system
>

Can't reproduce; I need to confirm whether my steps are right.

1. Boot a winXP guest (Red Hat Hypercall Device already installed) with the vmchannel option.
#/usr/libexec/qemu-kvm -m 2048 -smp 2 -drive file=winXP-32.raw,if=ide,cache=off,boot=on -net nic,model=rtl8139,vlan=1,macaddr=DE:AD:BE:EF:17:27 -net tap,vlan=1,script=/etc/qemu-ifup -boot c -uuid c9bdbfde-dd54-4e9d-8fbe-17228cd33a08 -usbdevice tablet -no-hpet -rtc-td-hack -no-kvm-pit-reinjection -monitor stdio -notify all -cpu qemu64,+sse2 -balloon none -vnc :1 -vmchannel di:0200,tcp:0:4444,server

2. Connect through telnet.
$telnet $hostIP 4444 < RHEL5.4-Server-20090819.0-i386-DVD.iso
(also tried "yes 'abc' | telnet $hostIP 4444" instead)

3. Make sure the Red Hat Hypercall Device is installed and enabled (check "Device Manager" > "System Devices" inside winXP).

4. Restart winXP by clicking buttons inside winXP.

After the above steps, Windows can be restarted normally on kvm-83-105.el5_4.13 and kvm-83-105.el5_4.9.
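The telnet step in the reproduction above can also be driven from a script, which makes repeated runs easier. The following is a minimal sketch, assuming the vmchannel is exposed as a plain TCP listener as in the -vmchannel option shown; the function name flood_vmchannel and its parameters are hypothetical, and the payload mirrors the "yes 'abc'" variant.

```python
import socket


def flood_vmchannel(host, port, payload=b"abc\n", total_bytes=1 << 20):
    """Send about total_bytes of repeated payload to host:port.

    Returns the number of bytes handed to the socket. This only plays
    the role of the telnet client; it does not speak any vmchannel
    protocol of its own.
    """
    sent = 0
    with socket.create_connection((host, port)) as sock:
        while sent < total_bytes:
            sock.sendall(payload)  # blocks until the whole payload is queued
            sent += len(payload)
    return sent
```

For example, flood_vmchannel(hostIP, 4444) would stand in for step 2, where hostIP is the address of the host running qemu-kvm.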
Additional info accompanying comment 14: lines like the following were shown in the qemu console on the host after executing step 2 in comment 14:

vmchannel_read: error: got read during interrupt disabled
vmchannel_read: error: got read during interrupt disabled
vmchannel_read: error: got read during interrupt disabled
...

The message always appeared during guest start-up and shutdown.
It shouldn't reboot on kvm-83-105.el5_4.13. Do you have a userspace daemon using the driver in the guest? It is installed when you install the driver using the msi installer. Once you write into the vmchannel, it should consume lots of CPU in the guest.
(In reply to comment #16)
> It shouldn't reboot on kvm-83-105.el5_4.13. Do you have a userspace daemon
> using the driver in the guest? It is installed when you install the driver
> using the msi installer. Once you write into the vmchannel it should consume
> lots of cpu in the guest

I believe I have. In the "Windows Task Manager" I see a guestVdsAgentService.exe (SYSTEM process) which uses around 20% of the CPU when writing into the vmchannel, and around 0% when I terminate the telnet client.
Also checked on the RHEL host: the qemu-kvm process running the winXP guest used about 80% of CPU when writing into the vmchannel, just as you described in comment 16. BTW, did you notice comment 15? Is it something unusual? Thanks
I saw comment #15; it was common in the original code, and the fix does not have it. Again, to be clear: this is fixed in kvm-83-105.el5_4.15; before that version the guest should fail to reboot and get stuck.
Retested many times (20 reboots) with khong's steps. Still cannot reproduce the bug in kvm-83-105.el5_4.13. Also tested in kvm-83-105.el5_4.19. The only difference I have observed is the amount of debug messages (vmchannel_read: error: got read during interrupt disabled):

On kvm-83-105.el5_4.13, tons of debug messages are printed when booting the guest or doing operations in the guest.
On kvm-83-105.el5_4.19, I rebooted the guest 3 times and loaded the guest with iometer and cpuhog; the number of debug messages was less than 10.

Dor, can we mark the bug as verified based on the current testing? (Cannot reproduce in 20 reboots.)
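The comparison above rests on how often the debug message shows up in the qemu console. If the console output is captured to text, the count can be taken mechanically instead of by eye. A small sketch, assuming the console output has been saved as a string; the helper name count_vmchannel_errors is hypothetical, and the message text is the one quoted in this report.

```python
MESSAGE = "vmchannel_read: error: got read during interrupt disabled"


def count_vmchannel_errors(console_text):
    """Return how many times the vmchannel_read debug message appears."""
    return console_text.count(MESSAGE)


def fewer_errors(before_text, after_text):
    """True if the 'after' capture contains fewer messages than 'before'."""
    return count_vmchannel_errors(after_text) < count_vmchannel_errors(before_text)
```

Running it on captures from two builds gives numbers that can be pasted into the bug instead of "tons" versus "less than 10".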
It is probably a very rare thing; together with https://bugzilla.redhat.com/show_bug.cgi?id=553249 it was easy to reproduce because of the latter bug. I guess you shouldn't invest any more in reproducing it.
Created attachment 386084 [details]
screen shot

(In reply to comment #21)
> It is probably very rare thing, together with
> https://bugzilla.redhat.com/show_bug.cgi?id=553249 it was easy to reproduce
> because of the later bug.
> I guess you shouldn't invest no more in reproducing it.

Can reproduce in kvm-83-105.el5_4.13 with spice CLI:

[root@t199 t199]# ps -aef | grep kvm
root 29257 8655 94 22:01 pts/5 00:29:35 /usr/libexec/qemu-kvm -m 1024 -smp 1 -drive file=winXP-32-1.raw,if=ide -net nic,macaddr=DE:AD:BE:EF:99:01 -net tap -boot c -uuid 2f5bde3c-9721-4ff7-a41e-0ab4a28cea8c -usbdevice tablet -rtc-td-hack -no-kvm-pit-reinjection -monitor stdio -notify all -balloon none -vmchannel di:0200,tcp:0:7891,server,nowait -drive file=/data/t199/test1,if=ide -name t99.2 -spice host=0,ic=on,port=5911,disable-ticketing -qxl 1 -soundhw ac97 -drive file=rhevm-guest-tools-2.1-39917.iso,media=cdrom

top - 22:32:55 up 2 days, 22:26, 11 users, load average: 1.80, 1.75, 2.37
Tasks: 166 total, 2 running, 164 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 44.0%sy, 0.0%ni, 55.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 7912828k total, 4141532k used, 3771296k free, 188764k buffers
Swap: 10482404k total, 168k used, 10482236k free, 3429208k cached

  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM     TIME+ COMMAND
29257 root      15   0 1366m 1.1g 899m S 100.0 14.4  29:31.04 qemu-kvm
 1885 root      11  -5     0    0    0 R  76.0  0.0   2689:39 kksmd
  494 root      10  -5     0    0    0 S   0.0  0.0   0:52.16 scsi_eh_1
So is it verified?
Yes. Also ran the above tests on kvm-83-147.el5: PASS.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0271.html