Bug 1465799
Summary: | When do migration from RHEL7.4 host to RHEL7.3.Z host, dst host prompt "error while loading state for instance 0x0 of device 'spapr_pci'" | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | xianwang <xianwang> |
Component: | qemu-kvm-rhev | Assignee: | Laurent Vivier <lvivier> |
Status: | CLOSED ERRATA | QA Contact: | xianwang <xianwang> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 7.4 | CC: | knoel, lvivier, michen, mrezanin, mtessun, qzhang, virt-maint, xianwang |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | ppc64le | ||
OS: | Linux | ||
Fixed In Version: | qemu-kvm-rhev-2.10.0-1.el7 | Doc Type: | If docs needed, set a value |
Last Closed: | 2018-04-11 00:26:27 UTC | Type: | Bug |
Bug Blocks: | 1473046 |
Description
xianwang
2017-06-28 08:50:58 UTC
It fails in get_uint32_equal() for the dma_liobn field, because the received value is 0, while it should be 0x80000000. This part was reworked between 2.7 and 2.8: there is a special case, and the value is "forged" to be able to migrate to machines before 2.8 (before RHEL 7.4). I think the value is not initialized correctly. It works when the OS is started because, I guess, the OS has put the correct value in the field.

1. Spapr-pci is a ppc-only device, so this bug affects powerpc only, not x86_64.
2. When migrating with the same qemu cli as in the bug report from a rhel7.3.z host to a rhel7.4 host, the result is normal, i.e. the migration completes and the vm is running on the dst host.

(In reply to xianwang from comment #3)
> 1. Spapr-pci is a ppc-only device, so this bug affects powerpc only, not
> x86_64.
> 2. When migrating with the same qemu cli as in the bug report from a
> rhel7.3.z host to a rhel7.4 host, the result is normal, i.e. the migration
> completes and the vm is running on the dst host.

3. If the migration is started while the guest OS is running, the migration from the rhel7.4 host to the rhel7.3.z host works well.

The problem is related to the hack that allows migration from 2.8 and later to pre-2.8. In the pre_save function, the part copying the new fields to the migration fields is short-circuited because the function returns when the number of MSI devices is 0. Moving the copy of the fields before this part fixes the problem.

(In reply to Laurent Vivier from comment #4)
> (In reply to xianwang from comment #3)
> > 1. Spapr-pci is a ppc-only device, so this bug affects powerpc only, not
> > x86_64.
> > 2. When migrating with the same qemu cli as in the bug report from a
> > rhel7.3.z host to a rhel7.4 host, the result is normal, i.e. the migration
> > completes and the vm is running on the dst host.
>
> 3. If the migration is started while the guest OS is running, the migration
> from the rhel7.4 host to the rhel7.3.z host works well.

Hi Laurent,
Yes, you are right. I have just retested this scenario with the latest version: if the guest OS is running, the migration works well, but with the simple cli the migration fails as in the bug report. Details are below.

version:
host1(rhel7.4)
  3.10.0-690.el7.ppc64le
  qemu-kvm-rhev-2.9.0-14.el7.ppc64le
  SLOF-20170303-4.git66d250e.el7.noarch
host2(rhel7.3)
  3.10.0-514.27.1.el7.ppc64le
  qemu-kvm-rhev-2.6.0-28.el7_3.12.ppc64le
  SLOF-20160223-6.gitdbbfda4.el7.noarch

scenario I:
1. On the rhel7.4 host, boot the guest with this qemu cli:
   /usr/libexec/qemu-kvm -monitor stdio -M pseries-rhel7.3.0 -nodefaults -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=09 -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=unsafe,format=qcow2,file=/root/rhel74-ppc64le-virtio-scsi.qcow2 -device scsi-hd,id=image1,drive=drive_image1,bus=virtio_scsi_pci0.0,bootindex=0 -vnc :1 -vga std
2. On the rhel7.3 host, launch the same cli in listening mode with "-incoming tcp:0:5801".
3. Migrate the guest from the rhel7.4 host to the rhel7.3 host:
   (qemu) migrate -d tcp:10.16.42.48:5801
4. Check the status of the migration and reboot the guest.
   on src host(rhel7.4):
   (qemu) info migrate
   Migration status: completed
   on dst host(rhel7.3):
   (qemu) info status
   VM status: running
   (qemu) system_reset
   (qemu) KVM: Failed to create TCE table for liobn 0x80000001
   The vm works well despite this message.
5. Then migrate the guest from the rhel7.3 host back to the rhel7.4 host. The migration completes, and after "system_reset" no message is printed.

scenario II:
Use the simple qemu cli from the bug report; the result is the same as in the bug report.

Now I will test the build in comment 6.
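
The pre_save short-circuit described earlier in this thread can be illustrated with a small model. This is only a sketch in Python, not QEMU's C code: the class `PhbModel`, its field names, and the helper functions are hypothetical stand-ins chosen for illustration. The only facts taken from the thread are that the function returned early when the number of MSI devices was 0, that the copy of the new fields to the migration fields came after that return, and that the destination expected 0x80000000 but received 0.

```python
from dataclasses import dataclass, field

EXPECTED_LIOBN = 0x80000000  # value the destination expects, per the thread


@dataclass
class PhbModel:
    """Hypothetical, heavily simplified stand-in for the spapr_pci state."""
    dma_liobn: int = EXPECTED_LIOBN                  # "new" field used at runtime
    msi_devices: dict = field(default_factory=dict)  # empty until MSIs are set up
    mig_liobn: int = 0                               # "migration" field for pre-2.8 machines
    mig_msi: list = field(default_factory=list)


def pre_save_buggy(phb: PhbModel) -> None:
    # Early return when there are no MSI devices: the copy below never runs.
    if not phb.msi_devices:
        return
    phb.mig_msi = sorted(phb.msi_devices.items())
    phb.mig_liobn = phb.dma_liobn


def pre_save_fixed(phb: PhbModel) -> None:
    # Copy the new field to the migration field first, then handle MSI state.
    phb.mig_liobn = phb.dma_liobn
    if not phb.msi_devices:
        return
    phb.mig_msi = sorted(phb.msi_devices.items())


def load_on_destination(received_liobn: int) -> str:
    # Models an "equal" check: the incoming value must match the local one.
    if received_liobn != EXPECTED_LIOBN:
        return "error while loading state"
    return "completed"


def migrate(pre_save, phb: PhbModel) -> str:
    pre_save(phb)
    return load_on_destination(phb.mig_liobn)
```

In this model, `migrate(pre_save_buggy, PhbModel())` fails the load check because the migration field still holds 0, while `migrate(pre_save_fixed, PhbModel())` completes. Passing a `PhbModel` with a non-empty `msi_devices` to the buggy variant also completes, which is consistent with the observation that migration works once the guest OS is running, assuming the guest has configured at least one MSI device by then.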
Regarding the qemu message "(qemu) KVM: Failed to create TCE table for liobn 0x80000001" in comment 8: it is tracked by a separate bug, https://bugzilla.redhat.com/show_bug.cgi?id=1440619. That bug is fixed for rhel7.4 but not for rhel7.3.z, because it is triggered only when the guest memory is very small and it is not critical enough, so we should ignore it here.

Fixed in qemu 2.10.
The fix will allow migration from RHEL7.5.0 to RHEL7.3.z and earlier.

This bug is verified as passing on qemu-kvm-rhev-2.10.0-5.el7.ppc64le.

version:
host1(rhel7.5)
  3.10.0-776.el7.ppc64le
  qemu-kvm-rhev-2.10.0-5.el7.ppc64le
  SLOF-20170724-2.git89f519f.el7.noarch
host2(rhel7.3)
  3.10.0-776.el7.ppc64le
  qemu-kvm-rhev-2.6.0-28.el7_3.14.ppc64le
  SLOF-20170724-2.git89f519f.el7.noarch

steps:
1. On the src host (rhel7.5 host):
   /usr/libexec/qemu-kvm -monitor stdio -M pseries-rhel7.3.0 -nodefaults
2. On the dst host (rhel7.3 host):
   /usr/libexec/qemu-kvm -monitor stdio -M pseries-rhel7.3.0 -nodefaults -incoming tcp:0:5801
3. On the src host:
   (qemu) migrate -d tcp:10.16.42.46:5801

result:
The migration completes successfully.
src end:
(qemu) info migrate
Migration status: completed
(qemu) info status
VM status: paused (postmigrate)
dst end:
(qemu) info status
VM status: running

So, this bug is fixed.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1104