Bug 876601

Summary: device_del cannot delete a virtio disk that is in use
Product: Red Hat Enterprise Linux 6 Reporter: FuXiangChun <xfu>
Component: qemu-kvm Assignee: Asias He <asias>
Status: CLOSED DUPLICATE QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: medium    
Version: 6.4 CC: acathrow, areis, asias, bsarathy, chayang, dyasny, flang, juzhang, michen, mkenneth, qzhang, rhod, sluo, unicell, virt-maint
Target Milestone: rc Keywords: Regression
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2013-01-22 04:28:54 UTC Type: Bug
Attachments:
Description                  Flags
guest call trace message     none
attached call trace again    none

Description FuXiangChun 2012-11-14 14:56:00 UTC
Description of problem:
Removing a virtio disk device that is in use, via the monitor or QMP, fails. RHEL 6 and Windows 2008 R2 guests show the same issue.

This is a regression:
1. With qemu-kvm-rhev-0.12.1.2-2.334
  the disk cannot be removed.

2. With qemu-kvm-rhev-0.12.1.2-2.209
  the disk can be removed.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-0.12.1.2-2.334.el6.x86_64
# uname -r
2.6.32-340.el6.x86_64

guest:
rhel6 and windows2k8

How reproducible:
100%

Steps to Reproduce:
1. Start the guest:
/usr/libexec/qemu-kvm -enable-kvm -m 2G -smp 4 -name rhel6 -uuid ddcbfb49-3411-1701-3c36-6bdbc00bedb9 -rtc base=utc,clock=host,driftfix=slew -boot c -drive file=/home/rhel6.4.qcow2,if=none,id=drive-virtio-0-1,format=qcow2,cache=none,werror=report,rerror=report -device virtio-blk-pci,drive=drive-virtio-0-1,id=virt0-0-1 -netdev tap,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:50:a4:c2:c5  -vnc :1 -device virtio-balloon-pci,id=ballooning -monitor stdio  -qmp tcp:0:4455,server,nowait  -serial unix:/home/error-message,server,nowait -drive file=/root/live-block-copy/5g.qcow2,format=qcow2,if=none,id=drive-disk,cache=none,werror=ignore,rerror=ignore -device virtio-blk-pci,scsi=off,drive=drive-disk,id=image

2. Copy a big file to the second disk.

3. Run device_del image in the monitor.

4. Run info block in the monitor:
info block
drive-virtio-0-1: removable=0 file=/home/rhel6.4.qcow2 ro=0 drv=qcow2 encrypted=0
drive-disk: removable=0 file=/root/live-block-copy/5g.qcow2 ro=0 drv=qcow2 encrypted=0
ide1-cd0: removable=1 locked=0 tray-open=0 io-status=ok [not inserted]
floppy0: removable=1 locked=0 tray-open=0 [not inserted]
sd0: removable=1 locked=0 tray-open=0 [not inserted]

5. Run info pci in the monitor:
Bus  0, device   6, function 0:
    SCSI controller: PCI device 1af4:1001
      IRQ 10.
      BAR0: I/O at 0xc0c0 [0xc0ff].
      BAR1: 32 bit memory at 0xf2040000 [0xf2040fff].
      id "image"
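
Step 3 can also be issued over the QMP socket that the command line in step 1 opens with -qmp tcp:0:4455,server,nowait. The following is a minimal Python sketch and not part of the original report: the helper names qmp_command, qmp_execute and request_unplug are hypothetical, and the host 127.0.0.1 is an assumption. It relies only on the QMP handshake (greeting line, then qmp_capabilities) and on the device_del command taking an id argument.

```python
import json
import socket


def qmp_command(name, **arguments):
    """Serialize one QMP command as a newline-terminated JSON line."""
    msg = {"execute": name}
    if arguments:
        msg["arguments"] = arguments
    return (json.dumps(msg) + "\n").encode("utf-8")


def qmp_execute(stream, name, **arguments):
    """Send a command and return the first reply that is not an async event."""
    stream.write(qmp_command(name, **arguments))
    stream.flush()
    while True:
        line = stream.readline()
        if not line:
            raise ConnectionError("QMP connection closed")
        reply = json.loads(line)
        if "event" not in reply:  # asynchronous events are skipped
            return reply


def request_unplug(host="127.0.0.1", port=4455, device_id="image"):
    """Ask QEMU to unplug device_id; host and port are assumptions."""
    with socket.create_connection((host, port)) as sock:
        with sock.makefile("rwb") as stream:
            json.loads(stream.readline())            # greeting banner
            qmp_execute(stream, "qmp_capabilities")  # leave negotiation mode
            return qmp_execute(stream, "device_del", id=device_id)
```

A reply without an error only means the unplug request was accepted; whether the device actually goes away still has to be checked with info pci and info block, as in steps 4 and 5.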

  
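Actual results:
The device is not removed; drive-disk and the PCI device with id "image" are still listed in the info block and info pci output above.

Expected results:
The device should be deleted.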
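
The check for the expected result can be scripted against the monitor text shown in steps 4 and 5. This is a rough sketch that assumes the info pci and info block output keeps the layout pasted in this report; the function names are hypothetical.

```python
import re


def pci_device_ids(info_pci_text):
    """Return the set of device ids that appear as `id "..."` lines."""
    return set(re.findall(r'^\s*id "([^"]*)"\s*$', info_pci_text, re.MULTILINE))


def block_drive_names(info_block_text):
    """Return the drive names that start each `name: key=value ...` line."""
    names = []
    for line in info_block_text.splitlines():
        name, sep, _rest = line.partition(": ")
        if sep and " " not in name:
            names.append(name)
    return names


def still_attached(info_pci_text, info_block_text, device_id, drive_id):
    """True while either the PCI device or its backing drive is listed."""
    return (device_id in pci_device_ids(info_pci_text)
            or drive_id in block_drive_names(info_block_text))
```

With the output above, still_attached(pci_text, block_text, "image", "drive-disk") stays true after device_del, which is the failure being reported.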

Additional info:

Comment 3 juzhang 2012-11-15 02:20:38 UTC
FYI
Bug 734051 - rhel6.1 guest hang when unplug is using virtio disk from monitor

Comment 4 FuXiangChun 2012-11-15 04:58:09 UTC
Tested three scenarios. Scenario 1 also verifies bug 734051.

1. qemu-kvm-0.12.1.2-2.209
   guest kernel-2.6.32-296.el6
   result:
   secondary disk can be removed successfully via device_del

2. qemu-kvm-rhev-0.12.1.2-2.334.el6.x86_64.rpm
   guest kernel-2.6.32-296.el6
   result:
   secondary disk cannot be removed via device_del (checked in the guest and with info pci/block)

3. qemu-kvm-rhev-0.12.1.2-2.334.el6.x86_64.rpm
   guest kernel-2.6.32-340.el6.x86_64
   
   result:
   secondary disk cannot be removed via device_del (checked in the guest and with info pci/block)

So kernel-296 with qemu-kvm-209 supports removing the device now that bug 734051 is fixed. For this bug, with either kernel 296 or 340, the latest qemu-kvm does not support removing the device.

Comment 5 FuXiangChun 2012-11-19 08:55:56 UTC
Another scenario: tested this issue with qemu-kvm-0.12.1.2-2.334.el6.x86_64 and kernel 2.6.32-342.el6.x86_64.

Result:

   device_del still cannot remove a device that is in use, but the guest shows a call trace and restarts automatically after a few minutes. The device is then removed automatically once the guest has rebooted. I attached the call trace message.

Comment 6 FuXiangChun 2012-11-19 08:56:47 UTC
Created attachment 647590
guest call trace message

Comment 15 FuXiangChun 2012-11-28 02:18:02 UTC
Created attachment 653223
attached call trace again

Comment 16 FuXiangChun 2012-11-28 02:49:23 UTC
Summary of test results with fix v2:

1. The guest does not panic.
2. The device is removed automatically after the I/O operation is done (device_del was run on "image" while it was in use).
3. The guest works well.
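
Since with the fix the removal only completes once outstanding I/O is done, a test has to wait for the device to disappear rather than check once. A minimal polling sketch follows; wait_until_removed and its list_device_ids callback are hypothetical names, and the default timeout and interval are arbitrary assumptions, not values taken from the testing above.

```python
import time


def wait_until_removed(list_device_ids, device_id, timeout=300.0, interval=1.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll until device_id is no longer reported, or until timeout seconds pass.

    list_device_ids is any callable that returns the ids currently visible,
    for example ids parsed from `info pci` output.
    Returns True if the device went away, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if device_id not in list_device_ids():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The sleep and clock parameters exist only so the loop can be exercised without real delays.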

Comment 17 Ronen Hod 2012-11-28 07:37:11 UTC
Asias has a working fix, but since this is not a regression, the fix has not been sent upstream yet, and we are late in 6.4, I prefer not to take the risk and to defer this to 6.5.
Asias will add a tech note explaining how to avoid it.

Comment 18 Asias He 2012-12-18 02:26:57 UTC
sluo ran a similar hot-unplug test (20 times) here:
https://bugzilla.redhat.com/show_bug.cgi?id=734051#c34

This also confirms we are ok with the hot-unplug process.

The real problem is tracked by this bz:
https://bugzilla.redhat.com/show_bug.cgi?id=870344

Another, more rigorous hot-unplug test (1000 times) is ongoing; I will close this bug when the test passes.
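
A repeated hot-unplug run of this kind can be sketched as a simple loop. This is only an illustration and not the actual QE test: run_hot_unplug_cycles and the plug, unplug and is_present callbacks are hypothetical, and the loop stops at the first failing cycle on the assumption that the guest state is unreliable after that.

```python
def run_hot_unplug_cycles(plug, unplug, is_present, cycles=1000):
    """Plug and unplug a device repeatedly.

    Returns the number of the first cycle in which the device did not
    appear after plug() or did not disappear after unplug(), or None
    if every cycle passed.
    """
    for cycle in range(1, cycles + 1):
        plug()
        if not is_present():
            return cycle
        unplug()
        if is_present():
            return cycle
    return None
```

Because the unplug may only complete after in-flight I/O finishes, is_present would in practice need to poll for a while before reporting that the device is still there.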

Comment 19 Asias He 2013-01-22 04:28:54 UTC
1. QE reported that the real problem is fixed in bz870344.
2. We have a new bz, https://bugzilla.redhat.com/show_bug.cgi?id=892067, to track the issue seen during the 1000-iteration test.

So I am closing this bug.

*** This bug has been marked as a duplicate of bug 870344 ***