Bug 1210715

Summary: migration/rdma: 7.1->7.2: RDMA ERROR: ram blocks mismatch #3!
Product: Red Hat Enterprise Linux 7 Reporter: Dr. David Alan Gilbert <dgilbert>
Component: qemu-kvm-rhev Assignee: Dr. David Alan Gilbert <dgilbert>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: unspecified Docs Contact:
Priority: urgent    
Version: 7.2 CC: dgilbert, hhuang, huding, juzhang, mrezanin, qzhang, virt-maint, xfu
Target Milestone: rc Keywords: ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.3.0-13.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1242032 1248382 Environment:
Last Closed: 2015-12-04 16:35:58 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1174920, 1242032, 1248382    
Attachments:
Description Flags
xml for broken vm none

Description Dr. David Alan Gilbert 2015-04-10 12:23:23 UTC
Created attachment 1013118
xml for broken vm

Description of problem:
This VM migrates fine using TCP, but with RDMA it fails.

Version-Release number of selected component (if applicable):

source: qemu-kvm-rhev-2.1.2-23.el7_1.1.x86_64
dest: qemu-kvm-rhev-2.3.0-0.el7.dgilbert20150410a.x86_64

How reproducible:
100%

Steps to Reproduce:
1. See xml attached; 7.1 machine type vm started on 7.1
2. migrate to 7.2 using RDMA - virsh migrate --live --migrateuri rdma://ibpair f20-414 qemu+ssh://ibpair/system --listen-address 192.168.99.13


Actual results:
Migration fails, error seen in source logs:
RDMA ERROR: ram blocks mismatch #3! Your QEMU command line parameters are probably not identical on both the source and destination.


Expected results:
A working migration

Additional info:

Comment 2 Dr. David Alan Gilbert 2015-04-10 14:33:42 UTC
Hmm, the RDMA code seems to be keying off the 'offset', which I think is the offset in RAMBlock address space; that is not in any way intended to stay the same across versions.
'mismatch #3' is when qemu_rdma_registration_stop fails to find a matching RAMBlock.
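
To illustrate why keying off the offset is fragile, here is a minimal sketch. It is not the QEMU code: the block names, offsets, and sizes are invented, and find_by_offset / find_by_name are hypothetical helpers. It only shows that two sides holding the same set of blocks at different offsets fail an offset lookup, while a lookup on a stable identifier such as the block name still pairs them up.

```python
from typing import List, NamedTuple, Optional


class RamBlock(NamedTuple):
    name: str    # block identifier (illustrative values only)
    offset: int  # position in the local RAMBlock address space
    length: int  # size of the block in bytes


def find_by_offset(blocks: List[RamBlock], offset: int, length: int) -> Optional[RamBlock]:
    """Look up a block by offset and length (the fragile approach)."""
    for b in blocks:
        if b.offset == offset and b.length == length:
            return b
    return None


def find_by_name(blocks: List[RamBlock], name: str, length: int) -> Optional[RamBlock]:
    """Look up a block by name and length (hypothetical alternative)."""
    for b in blocks:
        if b.name == name and b.length == length:
            return b
    return None


# Hypothetical layouts: same blocks on both sides, allocated in a different
# order, so the offsets of the smaller blocks differ.
source = [
    RamBlock("pc.ram", 0x00000000, 0x40000000),
    RamBlock("vga.vram", 0x40000000, 0x01000000),
    RamBlock("pc.bios", 0x41000000, 0x00040000),
]
dest = [
    RamBlock("pc.ram", 0x00000000, 0x40000000),
    RamBlock("pc.bios", 0x40000000, 0x00040000),
    RamBlock("vga.vram", 0x40040000, 0x01000000),
]

# Source blocks for which the destination has no counterpart.
unmatched_by_offset = [
    b.name for b in source if find_by_offset(dest, b.offset, b.length) is None
]
unmatched_by_name = [
    b.name for b in source if find_by_name(dest, b.name, b.length) is None
]

print("unmatched by offset:", unmatched_by_offset)
print("unmatched by name:  ", unmatched_by_name)
```

In this made-up layout, two of the three blocks cannot be found by offset even though both sides describe the same guest memory.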

Comment 3 Dr. David Alan Gilbert 2015-04-17 11:04:02 UTC
Just confirmed this also affects 7.1 by itself with hotplug: hotplug a device on the source but have it on the command line on the destination, and it fails in a similar way.

Comment 8 Miroslav Rezanina 2015-07-24 11:08:07 UTC
Fix included in qemu-kvm-rhev-2.3.0-13.el7

Comment 11 Shaolong Hu 2015-07-31 07:18:13 UTC
source host:
qemu-kvm-1.5.3-60.el7.x86_64


Reproduced with qemu-kvm-rhev-2.3.0-8.el7.x86_64 on target host:


(qemu) migrate -d x-rdma:192.168.0.21:6666
source_resolve_host RDMA Device opened: kernel name mlx4_0 uverbs device name uverbs0, infiniband_verbs class device path /sys/class/infiniband_verbs/uverbs0, infiniband class device path /sys/class/infiniband/mlx4_0, transport: (1) Infiniband
(qemu) 
(qemu) 
(qemu) 
(qemu) 
(qemu) info migrate
RDMA ERROR: ram blocks mismatch #1! Your QEMU command line parameters are probably not identical on both the source and destination.


Verification failed with qemu-kvm-rhev-2.3.0-13.el7.x86_64 on target host:

(qemu) migrate -d x-rdma:192.168.0.21:6666
source_resolve_host RDMA Device opened: kernel name mlx4_0 uverbs device name uverbs0, infiniband_verbs class device path /sys/class/infiniband_verbs/uverbs0, infiniband class device path /sys/class/infiniband/mlx4_0, transport: (1) Infiniband
(qemu) RDMA ERROR: ram blocks mismatch #1! Your QEMU command line parameters are probably not identical on both the source and destination.


David, could you have a look?

Comment 12 Shaolong Hu 2015-07-31 09:23:05 UTC
Retested with qemu-kvm-rhev-2.1.2-23.el7_1.6.x86_64 on the source host; the results are the same.

Comment 17 errata-xmlrpc 2015-12-04 16:35:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2546.html