Bug 725368

Summary: VDSM: During failed migration, VM stops responding for ~20 minutes
Product: Red Hat Enterprise Linux 5 Reporter: Daniel Paikov <dpaikov>
Component: kvm Assignee: Ronen Hod <rhod>
Status: CLOSED WONTFIX QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: high    
Version: 5.7 CC: abaron, bazulay, danken, hateya, iheim, knoel, mkenneth, tburke, virt-maint, ykaul
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-08-07 11:37:01 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On:    
Bug Blocks: 699402    
Attachments:
Description Flags
vdsm.log none

Description Daniel Paikov 2011-07-25 10:31:03 UTC
Created attachment 515007
vdsm.log

* Create VM with 4GB of RAM.
* Migrate VM to 2nd host.
* During migration, block connection to 2nd host using iptables.
* VM becomes stuck in Not Responding status for ~20 minutes. During this time the VM doesn't respond to ping or any other connection.
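
The report does not say which iptables rules were used to block the connection. As a rough sketch only, assuming plain DROP rules on the source host for traffic to and from the destination, the commands could be built like this (the helper names and the placeholder address 192.0.2.10 are made up for illustration):

```python
import subprocess


def iptables_block_cmds(peer_ip, action="-A"):
    """Build iptables argv lists that drop traffic to and from peer_ip.

    action is "-A" to append the rules or "-D" to delete them again.
    The choice of chains and of DROP is an assumption, not taken from the report.
    """
    if action not in ("-A", "-D"):
        raise ValueError("action must be -A or -D")
    return [
        ["iptables", action, "OUTPUT", "-d", peer_ip, "-j", "DROP"],
        ["iptables", action, "INPUT", "-s", peer_ip, "-j", "DROP"],
    ]


def apply_rules(cmds, dry_run=True):
    """Print the commands, or run them when dry_run is False (needs root)."""
    for argv in cmds:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.check_call(argv)


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only placeholder for the 2nd host.
    apply_rules(iptables_block_cmds("192.0.2.10"))
```

Calling iptables_block_cmds with "-D" instead of "-A" gives the matching commands to remove the rules once the test is over.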

Comment 1 Daniel Paikov 2011-07-25 10:53:18 UTC
Related to bug #690189.

Comment 3 Dan Kenigsberg 2011-07-25 20:14:00 UTC
Are you saying that the guest is hung for 20 minutes?

If you stop vdsm and contact qemu directly with

nc -U /var/vdsm/<vmId>.monitor.socket

is the qemu monitor responsive (e.g. to help or info status)? If not, it is a qemu bug.
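
The same check can be scripted with a timeout, so that a hung monitor gives a clear yes/no answer instead of a blocked terminal. This is only a sketch: the function names and the 5 second timeout are arbitrary, and it treats any bytes coming back from the socket as "responsive".

```python
import socket


def probe_monitor(sock, command="info status", timeout=5.0):
    """Send one command on an already-connected monitor socket.

    Returns the reply text, or None if nothing arrives within timeout.
    """
    sock.settimeout(timeout)
    try:
        sock.sendall(command.encode() + b"\n")
        data = sock.recv(4096)
    except OSError:
        # Covers timeouts as well as broken connections.
        return None
    return data.decode(errors="replace") if data else None


def monitor_responsive(path, timeout=5.0):
    """Connect to the UNIX socket at path and report whether anything answers."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
        return probe_monitor(sock, timeout=timeout) is not None
    except OSError:
        return False
    finally:
        sock.close()
```

With vdsm stopped, monitor_responsive would be called with the /var/vdsm/<vmId>.monitor.socket path of the affected VM; a False result during the 20 minutes would point at qemu rather than vdsm.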

Comment 4 Dor Laor 2011-07-28 10:40:05 UTC
You should be able to cancel the migration through the monitor interface.
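
For illustration, a minimal sketch of doing that from a script, assuming a socket already connected to the monitor. migrate_cancel and info migrate are the monitor commands; the function name and timeout are made up here, and nothing is assumed about the format of the reply.

```python
import socket


def cancel_migration(sock, timeout=5.0):
    """Send migrate_cancel followed by info migrate on a connected monitor socket.

    Returns whatever text the monitor sent back before going quiet,
    or None if it sent nothing at all within timeout.
    """
    sock.settimeout(timeout)
    chunks = []
    try:
        sock.sendall(b"migrate_cancel\ninfo migrate\n")
        while True:
            data = sock.recv(4096)
            if not data:
                break  # peer closed the connection
            chunks.append(data)
    except OSError:
        pass  # timeout or broken connection: stop reading
    return b"".join(chunks).decode(errors="replace") or None
```

A None result means the monitor never answered, in which case the cancel cannot be assumed to have taken effect.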

Comment 5 Daniel Paikov 2011-07-28 13:46:46 UTC
The monitor interface isn't responding during these 20 minutes.

Comment 8 Ronen Hod 2011-08-07 11:37:01 UTC
Closing, since we only fix urgent bugs in RHEL 5.8.
We seem to have lived with this behavior so far, and it was not reported by a customer. If I got it right, there is no corruption, and the source VM can be restarted and runs properly.
Ronen.