Bug 1683907

Summary: QEMU guest agent loses connection after postcopy migration if the postcopy phase lasts too long (more than 30s)
Product: Red Hat Enterprise Linux Advanced Virtualization Reporter: Fangge Jin <fjin>
Component: libvirt Assignee: Virtualization Maintenance <virt-maint>
Status: CLOSED WONTFIX QA Contact: Fangge Jin <fjin>
Severity: low Docs Contact:
Priority: unspecified    
Version: 8.0 CC: chhu, dyuan, jdenemar, lizhu, mprivozn, rbalakri, xuzhang, yafu, yalzhang, yanqzhan
Target Milestone: rc Keywords: Triaged
Target Release: 8.0 Flags: pm-rhel: mirror+
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-08-15 07:26:50 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1897025    
Attachments:
Description Flags
libvirtd log none

Description Fangge Jin 2019-02-28 04:51:20 UTC
Created attachment 1539358
libvirtd log

Description of problem:
The QEMU guest agent loses its connection after migration if the postcopy phase lasts too long (more than 30s).

Version-Release number of selected component (if applicable):
libvirt-5.0.0-4.el8.x86_64
qemu-kvm-3.1.0-18.module+el8+2834+fa8bb6e2.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Start a VM.

2. Migrate the VM to another host with postcopy enabled and a low postcopy bandwidth, then switch the migration to postcopy mode:
# date; virsh migrate avocado-vt-vm1 qemu+ssh://10.66.5.148/system --live --verbose --p2p  --persistent --postcopy --postcopy-bandwidth 5; date
Thu Feb 28 12:41:58 CST 2019
Migration: [100 %]
Thu Feb 28 12:42:53 CST 2019

# date; virsh migrate-postcopy avocado-vt-vm1 
Thu Feb 28 12:42:01 CST 2019

The postcopy phase lasts 52s (from 12:42:01 to 12:42:53) as shown above, while QEMU_JOB_WAIT_TIME is 30s.
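
For reference, the 52s figure can be re-derived from the two timestamps in the transcript. This is only a sketch of the arithmetic: the variable names are made up for illustration, the timezone is dropped because both timestamps are CST, and the 30s value is simply the QEMU_JOB_WAIT_TIME stated in this report.

```python
from datetime import datetime

# Timestamps copied from the step 2 transcript (timezone omitted; both are CST).
FMT = "%a %b %d %H:%M:%S %Y"
postcopy_start = datetime.strptime("Thu Feb 28 12:42:01 2019", FMT)  # virsh migrate-postcopy issued
migration_end = datetime.strptime("Thu Feb 28 12:42:53 2019", FMT)   # virsh migrate returned

# QEMU_JOB_WAIT_TIME as stated in this report, in seconds.
JOB_WAIT_SECONDS = 30

postcopy_seconds = int((migration_end - postcopy_start).total_seconds())
exceeds_wait = postcopy_seconds > JOB_WAIT_SECONDS

print(f"postcopy phase: {postcopy_seconds}s, exceeds job wait time: {exceeds_wait}")
```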

3. After the migration finishes, check the guest agent on the target host:
# virsh domtime avocado-vt-vm1 
error: Guest agent is not responding: QEMU guest agent is not connected
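
When re-testing, it may help to repeat the check a few times to tell an agent that never reconnects from one that is only slow to come back. The sketch below is a hypothetical helper, not part of libvirt: the function names, the retry count, and the delay are assumptions. It only relies on `virsh domtime <domain>` returning a non-zero exit status when the agent is not connected, as in the error above.

```python
import subprocess
import time


def agent_responds(domain, run=subprocess.run):
    """Return True if `virsh domtime` exits successfully for the domain."""
    result = run(["virsh", "domtime", domain], capture_output=True, text=True)
    return result.returncode == 0


def wait_for_agent(domain, attempts=5, delay=2.0, run=subprocess.run, sleep=time.sleep):
    """Retry the guest agent check up to `attempts` times (hypothetical helper).

    `run` and `sleep` are injectable so the logic can be exercised without a
    real hypervisor connection.
    """
    for attempt in range(attempts):
        if agent_responds(domain, run=run):
            return True
        # Do not sleep after the final failed attempt.
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

On the target host this would be called as `wait_for_agent("avocado-vt-vm1")`; a `False` result after all attempts matches the failure reported in this step.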


Actual results:
As in step 3.

Expected results:
The guest agent keeps working after migration.

Additional info:

Comment 3 Michal Privoznik 2021-06-28 14:40:02 UTC
I can't reproduce this anymore :( I've tried the latest libvirt (v7.5.0-rc1-3-g7c08141f90) and the latest qemu (v6.0.0-1995-g5d2d18ae39). Jin, do you still see this problem?

Comment 4 Fangge Jin 2021-06-29 11:19:50 UTC
I can reproduce with libvirt-client-7.4.0-1.module+el8.5.0+11218+83343022.x86_64 and qemu-kvm-6.0.0-21.module+el8.5.0+11555+e0ab0d09.x86_64

Comment 5 Fangge Jin 2021-06-29 11:20:20 UTC
I will try with the upstream version.

Comment 7 Fangge Jin 2021-06-29 13:13:42 UTC
I can reproduce with upstream libvirt: libvirt-7.5.0-1. But I think this scenario is not common; normally the postcopy phase finishes very quickly. I will change the severity to low.

Comment 8 RHEL Program Management 2021-08-15 07:26:50 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.