Bug 1393043

Summary: system_reset should clear pending request for error (IDE)
Product: Red Hat Enterprise Linux 7 Reporter: Marcel Kolaja <mkolaja>
Component: qemu-kvm-rhev Assignee: John Snow <jsnow>
Status: CLOSED CURRENTRELEASE QA Contact: aihua liang <aliang>
Severity: medium Docs Contact:
Priority: high    
Version: 7.2 CC: ailan, aliang, areis, armbru, chayang, coli, huding, jsnow, juzhang, knoel, michen, mkenneth, mrezanin, qzhang, rbalakri, virt-bugs, virt-maint, xuwei, zhguo
Target Milestone: rc Keywords: ZStream
Target Release: ---   
Hardware: x86_64   
OS: Windows   
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.6.0-28.el7_3.1 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1299876 Environment:
Last Closed: 2017-02-09 10:59:57 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1299876    
Bug Blocks:    

Description Marcel Kolaja 2016-11-08 17:49:34 UTC
This bug has been copied from bug #1299876 and has been proposed
to be backported to 7.3 z-stream (EUS).

Comment 3 Miroslav Rezanina 2016-11-30 10:42:55 UTC
Fix included in qemu-kvm-rhev-2.6.0-28.el7_3.1

Comment 5 aihua liang 2016-12-23 10:44:30 UTC
I have verified it, but there is still a problem. Details below:

Version-Release number:
  kernel version:3.10.0-514.6.1.el7.x86_64
  qemu-kvm-rhev version:qemu-kvm-rhev-2.6.0-28.el7_3.2.x86_64


Test Steps:
 1.Create a 25G qcow2 image and install Windows Server 2012 R2 on it.

 2.Fill the guest image disk until no space is left on the device.

 3.Start the guest with this qemu command line:
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1'  \
-sandbox off  \
-machine pc  \
-nodefaults  \
-vga std  \
-chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20161219-042734-6fVMWCMz,server,nowait \
-mon chardev=qmp_id_qmpmonitor1,mode=control  \
-chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20161219-042734-6fVMWCMz,server,nowait \
-mon chardev=qmp_id_catch_monitor,mode=control \
-drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/usr/share/avocado/data/avocado-vt/images/win2012.qcow2 \
-device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,addr=03 \
-device virtio-net-pci,mac=9a:f2:f3:f4:f5:f6,id=id30uvBS,vectors=4,netdev=idADyVP5,bus=pci.0,addr=04  \
-netdev tap,id=idADyVP5,vhost=on \
-m 2048  \
-smp 16,maxcpus=16,cores=8,threads=1,sockets=2  \
-cpu 'Opteron_G3',+kvm_pv_unhalt,hv_spinlocks=0x1fff,hv_vapic,hv_time \
-vnc :0  \
-rtc base=localtime,clock=host,driftfix=slew  \
-boot order=cdn,once=d,menu=off,strict=off \
-enable-kvm \
-spice port=3000,ipv4,disable-ticketing \
-monitor stdio

 4.Start some applications in the guest until it hangs.

 5.Check the VM status:
   (qemu)info status     -------> VM status:paused(io-error)

 6.Reset the VM:
   (qemu)system_reset    -------> The VM doesn't reboot immediately.
 
 7.Check the VM status:
   (qemu)info status    --------> VM status:paused(prelaunch)

 8.Continue the VM:
   (qemu)cont           --------> The VM restarts but hangs while loading the OS.

 9.Check the VM status:
   (qemu)info status    --------> VM status:paused(io-error)

Actual Result:
  After "system_reset", the guest fails to start because it cannot recover from the io-error.

Expected Result:
  The VM recovers from "io-error" after system_reset and starts successfully.
 

Note:
  My CPU is "Opteron_G3", and qemu reported no segmentation fault during the test.
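
The status checks in steps 5, 7 and 9 could be scripted rather than read by eye. The following is only a minimal sketch, assuming the monitor prints a line of the form "VM status: paused (io-error)" with optional spacing, as in the transcript above. The helper names parse_vm_status and is_stuck_on_io_error are hypothetical and are not part of qemu.

```python
import re

# Accepts both "VM status: paused (io-error)" and the compact
# "VM status:paused(io-error)" spelling used in this report.
_STATUS_RE = re.compile(r"VM status:\s*(\w+)(?:\s*\(([\w-]+)\))?")


def parse_vm_status(line):
    """Return (state, reason) from an 'info status' line; reason may be None."""
    match = _STATUS_RE.search(line)
    if match is None:
        raise ValueError("not a status line: %r" % (line,))
    return match.group(1), match.group(2)


def is_stuck_on_io_error(line):
    """True when the line reports a VM paused because of an I/O error."""
    state, reason = parse_vm_status(line)
    return state == "paused" and reason == "io-error"
```

With such a helper, the failure in this comment is the case where is_stuck_on_io_error() is true both before system_reset (step 5) and again after cont (step 9).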

Comment 7 aihua liang 2016-12-27 10:45:47 UTC
Retested with an IDE drive; it has the same problem:

Version-Release number:
  kernel version:3.10.0-514.6.1.el7.x86_64
  qemu-kvm-rhev version:qemu-kvm-rhev-2.6.0-28.el7_3.2.x86_64


Test Steps:
 1.Create a 25G qcow2 image and install Windows Server 2012 R2 on it.

 2.Fill the guest image disk until no space is left on the device.

 3.Start the guest with this qemu command line:
MALLOC_PERTURB_=1  /usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1'  \
-sandbox off  \
-machine pc  \
-nodefaults  \
-vga std  \
-chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20161227-001116-PD2k1uXB,server,nowait \
-mon chardev=qmp_id_qmpmonitor1,mode=control  \
-chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20161227-001116-PD2k1uXB,server,nowait \
-mon chardev=qmp_id_catch_monitor,mode=control \
-device pvpanic,ioport=0x505,id=id95e1vw  \
-chardev socket,id=serial_id_serial0,path=/var/tmp/serial-serial0-20161227-001116-PD2k1uXB,server,nowait \
-device isa-serial,chardev=serial_id_serial0  \
-chardev socket,id=seabioslog_id_20161227-001116-PD2k1uXB,path=/var/tmp/seabios-20161227-001116-PD2k1uXB,server,nowait \
-device isa-debugcon,chardev=seabioslog_id_20161227-001116-PD2k1uXB,iobase=0x402 \
-device ich9-usb-ehci1,id=usb1,addr=1d.7,multifunction=on,bus=pci.0 \
-device ich9-usb-uhci1,id=usb1.0,multifunction=on,masterbus=usb1.0,addr=1d.0,firstport=0,bus=pci.0 \
-device ich9-usb-uhci2,id=usb1.1,multifunction=on,masterbus=usb1.0,addr=1d.2,firstport=2,bus=pci.0 \
-device ich9-usb-uhci3,id=usb1.2,multifunction=on,masterbus=usb1.0,addr=1d.4,firstport=4,bus=pci.0 \
-drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/usr/share/avocado/data/avocado-vt/images/win2012-64r2-virtio.qcow2 \
-device ide-hd,id=image1,drive=drive_image1,bus=ide.0,unit=0 \
-device virtio-net-pci,mac=9a:d7:d8:d9:da:db,id=idoYMY7R,vectors=4,netdev=iddvjhTd,bus=pci.0,addr=03  \
-netdev tap,id=iddvjhTd,vhost=on \
-m 4096  \
-smp 8,maxcpus=8,cores=4,threads=1,sockets=2  \
-cpu 'Westmere',+kvm_pv_unhalt,hv_spinlocks=0x1fff,hv_vapic,hv_time \
-drive id=drive_cd1,if=none,snapshot=off,aio=native,cache=none,media=cdrom,file=/usr/share/avocado/data/avocado-vt/isos/ISO/Win2012R2/en_windows_server_2012_r2_with_update_x64_dvd_6052708.iso \
-device ide-cd,id=cd1,drive=drive_cd1,bus=ide.0,unit=1 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=04 \
-drive id=drive_winutils,if=none,snapshot=off,aio=native,cache=none,media=cdrom,file=/usr/share/avocado/data/avocado-vt/isos/windows/winutils.iso \
-device scsi-cd,id=winutils,drive=drive_winutils \
-vnc :0  \
-rtc base=localtime,clock=host,driftfix=slew  \
-boot order=cdn,menu=off,strict=off \
-enable-kvm \
-monitor stdio \
-spice port=3000,ipv4,disable-ticketing

 4.Start some applications in the guest until it hangs.

 5.Check the VM status:
   (qemu)info status     -------> VM status:paused(io-error)

 6.Reset the VM:
   (qemu)system_reset    -------> The VM doesn't reboot immediately.
 
 7.Check the VM status:
   (qemu)info status    --------> VM status:paused(prelaunch)

 8.Continue the VM:
   (qemu)cont           --------> The VM restarts but hangs while loading the OS.

 9.Check the VM status:
   (qemu)info status    --------> VM status:paused(io-error)

Actual Result:
  After "system_reset", the guest fails to start because it cannot recover from the io-error.

Expected Result:
  The VM recovers from "io-error" after system_reset and starts successfully.
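
The command line in step 3 also opens QMP monitor sockets (-mon ...,mode=control), so the reset sequence in steps 5 to 9 could be driven from a script instead of the stdio monitor. The sketch below is an assumption-laden illustration, not the procedure used in this test: it relies on the QMP commands qmp_capabilities, query-status, system_reset and cont, and the helper names (qmp_message, read_reply, reset_and_continue) are hypothetical.

```python
import json
import socket


def qmp_message(command, arguments=None):
    """Encode one QMP command as a JSON line."""
    msg = {"execute": command}
    if arguments:
        msg["arguments"] = arguments
    return (json.dumps(msg) + "\r\n").encode("utf-8")


def read_reply(reader):
    """Return the next command reply, skipping asynchronous events."""
    while True:
        line = reader.readline()
        if not line:
            raise ConnectionError("QMP connection closed")
        obj = json.loads(line)
        if "event" in obj:
            continue
        if "error" in obj:
            raise RuntimeError("QMP error: %r" % (obj["error"],))
        return obj


def reset_and_continue(socket_path):
    """Run query-status, system_reset, cont, query-status; return both statuses."""
    statuses = []
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        reader = sock.makefile("r", encoding="utf-8")
        reader.readline()  # discard the greeting banner sent on connect
        for command in ("qmp_capabilities", "query-status",
                        "system_reset", "cont", "query-status"):
            sock.sendall(qmp_message(command))
            reply = read_reply(reader)
            if command == "query-status":
                statuses.append(reply["return"]["status"])
    return statuses
```

Because the guest only hits the error again while loading the OS, a status read immediately after cont may still say "running"; a real check would poll query-status for some time before drawing a conclusion.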

Comment 9 aihua liang 2017-01-11 12:12:56 UTC
Following Fam's suggestion in https://bugzilla.redhat.com/show_bug.cgi?id=1393041 comment 8, I retested the bug. The VM resets successfully after disk space is released. Details below:

Test Steps:
   1.Create a 25G qcow2 image and install Windows Server 2012 R2 on it.

   2.Fill the guest image disk until no space is left on the device.

   3.Start the guest with the qemu command line from comment 7.

   4.Start some applications in the guest until it hangs.

   5.Check the VM status:
     (qemu)info status     -------> VM status:paused(io-error)

   6.Reset the VM:
     (qemu)system_reset    -------> The VM doesn't reboot immediately.
 
   7.Check the VM status:
     (qemu)info status    --------> VM status:paused(prelaunch)

   8.Continue the VM:
     (qemu)cont           --------> The VM restarts but hangs while loading the OS.

   9.Check the VM status:
     (qemu)info status    --------> VM status:paused(io-error)
 
   10.Release some disk space, then repeat steps 6 to 9.

 Test Result:
  After step 10, the VM boots and works normally with status "running".

After releasing disk space and resetting the VM, the io-error clears and the VM works normally, so I am changing the bug status to "Verified".
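
Since the reset only helps once space has been released (step 10), a test script could check free space before repeating steps 6 to 9. This is a rough sketch under one assumption that the report does not state explicitly: that the io-error is caused by the filesystem holding the qcow2 image running out of space. The function names and the 1 GiB default threshold are hypothetical.

```python
import os
import shutil


def free_bytes_for(image_path):
    """Free bytes on the filesystem that contains image_path's directory."""
    directory = os.path.dirname(os.path.abspath(image_path))
    return shutil.disk_usage(directory).free


def safe_to_retry(image_path, min_free_bytes=1 << 30):
    """True when at least min_free_bytes are free (arbitrary 1 GiB default)."""
    return free_bytes_for(image_path) >= min_free_bytes
```

If safe_to_retry() returns False, repeating system_reset and cont would be expected to end in paused(io-error) again, as in comments 5 and 7.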

Comment 10 John Snow 2017-01-12 23:01:00 UTC
OK, removing needinfo for now. If there are still problems, please let me know.