Bug 1372153
Summary: migration failed from rhel7.3 to rhel7.0 when guest with numa setting

| Field | Value | Field | Value |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | yafu <yafu> |
| Component: | libvirt | Assignee: | Martin Kletzander <mkletzan> |
| Status: | CLOSED WONTFIX | QA Contact: | zhe peng <zpeng> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 7.3 | CC: | dyuan, fjin, jsuchane, mzhan, rbalakri, xuzhang, yafu, yanqzhan, zpeng |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-05-12 14:49:28 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

Description
yafu
2016-09-01 04:04:40 UTC
Correcting the XML setting for the guest; the error was caused by numatune:

    #virsh dumpxml mig1
    ...
    <cpu>
      ...
      <numa>
        <cell id='0' cpus='0-1' memory='512000' unit='KiB'/>
        <cell id='1' cpus='2-3' memory='512000' unit='KiB'/>
      </numa>
      ...
    </cpu>
    ...
    <numatune>
      <memory mode='strict' nodeset='0'/>
    </numatune>
    ...

Created attachment 1197748 [details]
libvirtd.log and qemu.log both on source and target host
Do you have some matrix of migrations from/to which work and which don't? I'm guessing if this doesn't work, then 7.0 -> 7.3 doesn't work either, also 7.2 <-> 7.3 is broken both ways, right? Make sure you have (at minimum):

    <memoryBacking>
      <hugepages/>
    </memoryBacking>
    <cpu>
      <numa>
        <cell .../>
      </numa>
    </cpu>

but no <numatune> and no nodeset= in <hugepages/>.

Fixed upstream with commit v2.3.0-rc1-10-gff3112f3dc2c:

    commit ff3112f3dc2c276a7e387ff7bb86f4fbbdf7bf2c
    Author: Martin Kletzander <mkletzan>
    Date: Fri Sep 23 11:31:30 2016 +0200

        qemu: Only use memory-backend-file with NUMA if needed

(In reply to Martin Kletzander from comment #4)
> Do you have some matrix of migrations from/to which work and which don't?
> I'm guessing if this doesn't work, then 7.0 -> 7.3 doesn't work either, also
> 7.2 <-> 7.3 is broken both ways, right? Make sure you have (at minimum):
>
> <memoryBacking>
>   <hugepages/>
> </memoryBacking>
> <cpu>
>   <numa>
>     <cell .../>
>   </numa>
> </cpu>
>
> but no <numatune> and no nodeset= in <hugepages/>.

Sorry for the late reply. I just came back from holiday.

With the following setting, but no <numatune> and no nodeset= in <hugepages/>:

    <memoryBacking>
      <hugepages/>
    </memoryBacking>
    <cpu>
      <numa>
        <cell .../>
      </numa>
    </cpu>

Test results are as follows:

1. Migration failed from rhel7.3 to rhel7.0, since the qemu command line uses "memory-backend-file" on rhel7.3, but uses "-mem-prealloc -mem-path /dev/hugepages/libvirt/qemu" on rhel7.0.
2. Migration works well from rhel7.0 to rhel7.3, since both source and target hosts use "-mem-prealloc -mem-path /dev/hugepages/libvirt/qemu".
3. Migration works well between rhel7.2 and rhel7.3, since both rhel7.2 and rhel7.3 use "memory-backend-file".

Created attachment 1208308 [details]
The guest XML
Please see the guest XML in the attachment.

(In reply to yafu from comment #6)
You are saying that 7.0 <-> 7.2 doesn't work either? Would you mind checking 7.0 <-> 7.1 as well? Thanks a lot in advance.

(In reply to Martin Kletzander from comment #9)
> (In reply to yafu from comment #6)
> You are saying that 7.0 <-> 7.2 doesn't work either? Would you mind
> checking 7.0 <-> 7.1 as well? Thanks a lot in advance.

1. rhel7.2->rhel7.0 works well now, since Bug 1266856 - Migration from 7.0 to 7.2 failed with numa+hugepage settings is fixed;
2. rhel7.1->rhel7.0 failed with the same error as rhel7.3->rhel7.0;

(In reply to yafu from comment #10)
Oh, my bad, I just figured it out. So Bug 1266856 fixed the scenario with:

    <memoryBacking>
      <hugepages/>
    </memoryBacking>
    <cpu>
      <numa>
        <cell .../>
      </numa>
    </cpu>

but what we need to fix here is:

    <memoryBacking>
      <hugepages size='...'/>
    </memoryBacking>
    <cpu>
      <numa>
        <cell .../>
      </numa>
    </cpu>

It has a different code path and hence it might be beneficial to test both approaches in the migration matrix, I guess.

Any easy fix that would be provided now could actually break newer migration scenarios (rhel7.2 -> rhel7.3). Since this is a corner case and was not reported by any customer, I'm closing this as WONTFIX. The reasoning behind it is just that we will have fewer broken things this way than if we "fixed" this particular scenario.

This bug only affects migration from rhel7.0 to newer ones, I believe.
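
For anyone filling in the migration matrix, here is a minimal sketch of how the settings discussed in this bug (NUMA cells, `<numatune>`, and `<hugepages>` with or without size/nodeset) could be read out of `virsh dumpxml` output. It assumes Python's standard `xml.etree.ElementTree`; the helper name `describe_memory_config` and the sample XML are hypothetical, the sample being stitched together from fragments quoted in the comments rather than taken from the attachment. It only reports which elements are present and does not predict whether a migration will succeed.

```python
import xml.etree.ElementTree as ET


def describe_memory_config(domain_xml: str) -> dict:
    """Report which NUMA/hugepage settings a domain XML contains.

    Hypothetical helper for illustration; not part of libvirt.
    """
    root = ET.fromstring(domain_xml)
    hugepages = root.find("./memoryBacking/hugepages")

    sized = False
    nodeset = False
    if hugepages is not None:
        # The thread writes size=/nodeset= on <hugepages> itself; to be
        # safe, also look at any child elements (assumption).
        for element in hugepages.iter():
            if element.get("size") is not None:
                sized = True
            if element.get("nodeset") is not None:
                nodeset = True

    return {
        "numa_cells": len(root.findall("./cpu/numa/cell")),
        "numatune": root.find("./numatune") is not None,
        "hugepages": hugepages is not None,
        "hugepages_sized": sized,
        "hugepages_nodeset": nodeset,
    }


# Sample assembled from the fragments in this bug (illustrative only).
SAMPLE = """
<domain>
  <memoryBacking>
    <hugepages/>
  </memoryBacking>
  <cpu>
    <numa>
      <cell id='0' cpus='0-1' memory='512000' unit='KiB'/>
      <cell id='1' cpus='2-3' memory='512000' unit='KiB'/>
    </numa>
  </cpu>
  <numatune>
    <memory mode='strict' nodeset='0'/>
  </numatune>
</domain>
"""

if __name__ == "__main__":
    print(describe_memory_config(SAMPLE))
```

A guest matching the scenario fixed by Bug 1266856 would show `hugepages` set and `hugepages_sized` unset, while the scenario left unfixed here would show `hugepages_sized` set; recording both per guest makes it easier to tell which code path a given matrix entry exercised.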