Bug 1982224

Summary: [RHEL9] qemu segfault after the 2nd postcopy live migration with vhost-user
Product: Red Hat Enterprise Linux 9
Component: qemu-kvm
qemu-kvm sub component: Live Migration
Status: CLOSED ERRATA
Severity: high
Priority: high
Version: 9.0
Target Milestone: beta
Hardware: x86_64
OS: Linux
Fixed In Version: qemu-kvm-6.2.0-1.el9
Reporter: Pei Zhang <pezhang>
Assignee: Juan Quintela <quintela>
QA Contact: Pei Zhang <pezhang>
CC: chayang, dgilbert, jinzhao, juzhang, maxime.coquelin, mrezanin, quintela, virt-maint
Clone Of: 1981782
Last Closed: 2022-05-17 12:23:27 UTC
Bug Depends On: 1981782, 2021976, 2021981, 2024981, 2025609

Comment 1 Pei Zhang 2021-07-14 13:16:34 UTC
Versions:
5.13.0-1.rt3.1.el9.x86_64
qemu-kvm-6.0.0-8.el9.x86_64
libvirt-7.4.0-1.el9.x86_64
openvswitch2.15-2.15.0-16.el9fdp.x86_64

Note:
Postcopy currently hits a known issue: Bug 1945420 - [RHEL9] Setup vm.unprivileged_userfaultfd for postcopy

Workaround: run the command below on both the source and destination hosts before starting postcopy live migration.

# echo 1 >/proc/sys/vm/unprivileged_userfaultfd
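
The workaround above can also be wrapped in a small check so that it is applied only when needed. This is a minimal sketch, not part of the original report: the function name ensure_userfaultfd is hypothetical, and the file path is passed as an argument only so that the logic can be exercised against a scratch file.

```shell
#!/bin/sh
# Sketch only: ensure_userfaultfd is a hypothetical helper name.
# It writes "1" to the given sysctl file unless it already reads "1".
ensure_userfaultfd() {
    file="$1"
    # The sysctl file may be absent on kernels without this knob.
    if [ ! -e "$file" ]; then
        echo "missing: $file" >&2
        return 2
    fi
    current=$(cat "$file")
    if [ "$current" = "1" ]; then
        echo "already enabled"
        return 0
    fi
    # Writing the real /proc file requires root.
    echo 1 > "$file" || return 1
    echo "enabled"
}
```

To apply it, run `ensure_userfaultfd /proc/sys/vm/unprivileged_userfaultfd` as root on both the source and destination hosts. The setting does not persist across reboots, so it has to be reapplied after each boot.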

Comment 2 John Ferlan 2021-07-22 18:45:50 UTC
Meirav - keeping this bug and the bug it was cloned from together.

Comment 3 Juan Quintela 2021-11-10 10:19:45 UTC
Hi Pei

This brew build should fix the issue:

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=41123115

The patch exists upstream, and a merge request has been sent.

Could you try it?

Thanks, Juan.

Comment 4 Pei Zhang 2021-11-11 06:19:50 UTC
(In reply to Juan Quintela from comment #3)
> Hi Pei
> 
> This brew should fix the issue:
> 
> https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=41123115
> 
> The patch exists upstream, and a merge request has been sent.
> 
> Could you try it?

Juan, thank you for providing this build.

This bug cannot be reproduced with this build; postcopy live migration works well and there are no errors any more.

Versions:
qemu-kvm-6.1.0-6.el9.quintela202111091652.x86_64

Tested:
Testcase: live_migration_nonrt_server_2Q_1G_iommu_ovs_postcopy
PASS

Testcase: live_migration_nonrt_server_1Q_1G_iommu_ovs_postcopy
PASS

Testcase: live_migration_nonrt_server_4Q_1G_iommu_ovs_postcopy
PASS

Thanks.

Best regards,

Pei



> 
> Thanks, Juan.

Comment 5 Pei Zhang 2021-12-17 04:11:23 UTC
Verification:

Versions:
qemu-kvm-6.2.0-1.el9.x86_64
libvirt-7.10.0-1.el9.x86_64
5.14.0-31.el9.x86_64
openvswitch2.15-2.15.0-33.el9fdp.x86_64
tuned-2.16.0-4.el9.noarch

vhost-user 1Q/2Q/4Q postcopy live migration works well: no qemu crash and no errors any more.


Testcase: live_migration_nonrt_server_2Q_1G_iommu_ovs_postcopy
PASS

Testcase: live_migration_nonrt_server_1Q_1G_iommu_ovs_postcopy
PASS

Testcase: live_migration_nonrt_server_4Q_1G_iommu_ovs_postcopy
PASS


So this bug has been fixed. I will move it to Verified once it is ON_QA.

Comment 6 Yanan Fu 2021-12-20 12:44:35 UTC
QE bot (pre-verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 tests pass.

Comment 11 errata-xmlrpc 2022-05-17 12:23:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (new packages: qemu-kvm), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:2307