Description of problem:
Postcopy migration fails when the guest is configured with numa+hugepage+nodeset.

Version-Release number of selected component (if applicable):
libvirt-4.5.0-17.el7.x86_64
qemu-kvm-rhev-2.12.0-27.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Configure hugepages on both the source and target hosts.

2. Start a guest with the numa+hugepage+nodeset setting:
# virsh dumpxml vm1
...
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0-1'/>
    </hugepages>
  </memoryBacking>
...
  <cpu mode='custom' match='exact' check='full'>
    ...
    <numa>
      <cell id='0' cpus='0-1' memory='1025024' unit='KiB'/>
      <cell id='1' cpus='2-3' memory='1025024' unit='KiB'/>
    </numa>
  </cpu>
...

3. Check the qemu command line:
# ps aux | grep -i numa
...-object memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu/1-vm1,size=1049624576 -numa node,nodeid=0,cpus=0-1,memdev=ram-node0 -object memory-backend-file,id=ram-node1,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu/1-vm1,size=1049624576 -numa node,nodeid=1,cpus=2-3,memdev=ram-node1 ...

4. Do postcopy migration:
# virsh migrate vm1 qemu+ssh://10.66.4.143/system --live --verbose --postcopy
error: internal error: unable to execute QEMU command 'migrate-set-capabilities': Postcopy is not supported

5. Check the libvirtd log:
# cat /var/log/libvirt/libvirtd.log | grep -i 'migrate-set'
2019-05-16 03:22:48.260+0000: 14503: debug : virJSONValueToString:2005 : result={"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"events","state":true}]},"id":"libvirt-3"}
2019-05-16 03:22:48.260+0000: 14503: debug : qemuMonitorJSONCommandWithFd:305 : Send command '{"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"events","state":true}]},"id":"libvirt-3"}' for write with FD -1
2019-05-16 03:22:48.260+0000: 14503: info : qemuMonitorSend:1083 : QEMU_MONITOR_SEND_MSG: mon=0x7fc70c014f80 msg={"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"events","state":true}]},"id":"libvirt-3"}
2019-05-16 03:22:48.260+0000: 14500: info : qemuMonitorIOWrite:551 : QEMU_MONITOR_IO_WRITE: mon=0x7fc70c014f80 buf={"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"events","state":true}]},"id":"libvirt-3"}

Actual results:
Postcopy migration fails when the guest is configured with numa+hugepage+nodeset.

Expected results:
Postcopy migration should succeed when the guest is configured with numa+hugepage+nodeset.

Additional info:
1. Migration works well when the guest has only the numa+hugepage setting (no nodeset).
2. Since I could not find any error from the qemu side in the log, I filed the bug against libvirt. Please correct the component if I made a mistake.
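As a quick cross-check of the command line in the steps above: the size= value of each memory-backend-file object matches the NUMA cell memory from the domain XML converted from KiB to bytes. A minimal sketch of that arithmetic follows; the helper name cell_size_bytes is only illustrative and is not part of libvirt or QEMU.

```python
def cell_size_bytes(memory_kib: int) -> int:
    """Convert a <cell memory='...' unit='KiB'/> value to bytes."""
    return memory_kib * 1024


# Values taken from the report: memory='1025024' (KiB) in each <cell>
# and size=1049624576 on each memory-backend-file object.
reported_cell_kib = 1025024
reported_backend_size = 1049624576

print(cell_size_bytes(reported_cell_kib) == reported_backend_size)
```

So both backends are sized exactly as the two NUMA cells request, which suggests the XML was translated to the command line as intended and the failure comes later, at migration time.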
Created attachment 1569326
guest xml
Created attachment 1569328
qemu log and libvirtd log on source host
Created attachment 1569330
guest xml - update

Please see guest xml here.
This looks like a limitation on QEMU side. Could you please recheck with current libvirt and QEMU?
(In reply to Jiri Denemark from comment #5)
> This looks like a limitation on QEMU side. Could you please recheck with
> current libvirt and QEMU?

It works well with:
libvirt-daemon-6.0.0-13.el8.x86_64
qemu-kvm-4.2.0-15.module+el8.2.0+6029+618ef2ec.x86_64
OK, thanks.
Tested this bug with latest RHEL-820AV with the following detailed components and steps.

(SRC host and DST host)
Version:
libvirt-6.0.0-16.module+el8.2.0+6131+4e715f3b.x86_64
qemu-kvm-4.2.0-17.module+el8.2.0+6131+4e715f3b.x86_64
kernel-4.18.0-193.el8.x86_64

Steps:
1. Prepare a VM on the SRC host, start VM and check qemu cmd line
# virsh domstate vm1
shut off

# virsh dumpxml vm1 --inactive
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0-1'/>
    </hugepages>
  </memoryBacking>
  <vcpu placement='static'>4</vcpu>
  <cpu mode='host-model' check='partial'>
    <feature policy='disable' name='vmx'/>
    <numa>
      <cell id='0' cpus='0-1' memory='1025024' unit='KiB'/>
      <cell id='1' cpus='2-3' memory='1025024' unit='KiB'/>
    </numa>
  </cpu>

# virsh start vm1
Domain vm1 started

# ps -ef | grep vm1
-object memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu/2-vm1,size=1049624576 -numa node,nodeid=0,cpus=0-1,memdev=ram-node0 -object memory-backend-file,id=ram-node1,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu/2-vm1,size=1049624576 -numa node,nodeid=1,cpus=2-3,memdev=ram-node1

2. Migrate VM with post-copy parameter from SRC host to DST host
# virsh migrate vm1 qemu+ssh://****/system --live --verbose --postcopy
error: internal error: unable to execute QEMU command 'migrate-set-capabilities': Postcopy is not supported

3. Log from SRC host
# vim /var/log/libvirt/libvirtd.log
2020-04-08 09:40:25.952+0000: 6836: info : qemuMonitorSend:996 : QEMU_MONITOR_SEND_MSG: mon=0x7efc400033d0 msg={"execute":"query-migrate-capabilities","id":"libvirt-2"}^M fd=-1
2020-04-08 09:40:25.953+0000: 6759: info : qemuMonitorIOWrite:453 : QEMU_MONITOR_IO_WRITE: mon=0x7efc400033d0 buf={"execute":"query-migrate-capabilities","id":"libvirt-2"}^M len=59 ret=59 errno=0
2020-04-08 09:40:25.956+0000: 6836: info : qemuMonitorSend:996 : QEMU_MONITOR_SEND_MSG: mon=0x7efc400033d0 msg={"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"events","state":true}]},"id":"libvirt-3"}^M fd=-1
2020-04-08 09:40:25.956+0000: 6759: info : qemuMonitorIOWrite:453 : QEMU_MONITOR_IO_WRITE: mon=0x7efc400033d0 buf={"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"events","state":true}]},"id":"libvirt-3"}^M len=125 ret=125 errno=0
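The log lines quoted above only show the "events" capability being requested. To scan a complete libvirtd.log for every capability passed to migrate-set-capabilities, something like the following sketch could be used. It assumes the QEMU_MONITOR_SEND_MSG line layout seen in this report; the function name requested_capabilities is illustrative, and the sketch deliberately does not assume which capability name libvirt sends for postcopy, since that is not shown in this thread.

```python
import json
import re

# Captures the JSON payload after "msg=" on QEMU_MONITOR_SEND_MSG lines.
# The line layout is assumed from the log excerpts quoted in this report.
MSG_RE = re.compile(r"QEMU_MONITOR_SEND_MSG: mon=\S+ msg=(\{.*\})")


def requested_capabilities(log_text):
    """Return (capability, state) pairs from migrate-set-capabilities commands."""
    found = []
    for line in log_text.splitlines():
        match = MSG_RE.search(line)
        if not match:
            continue
        try:
            cmd = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # skip payloads that are truncated or not valid JSON
        if cmd.get("execute") != "migrate-set-capabilities":
            continue
        for cap in cmd.get("arguments", {}).get("capabilities", []):
            found.append((cap["capability"], cap["state"]))
    return found


# Two lines copied from the report ("^M" is kept as the literal text vim showed).
sample = (
    '2020-04-08 09:40:25.952+0000: 6836: info : qemuMonitorSend:996 : '
    'QEMU_MONITOR_SEND_MSG: mon=0x7efc400033d0 '
    'msg={"execute":"query-migrate-capabilities","id":"libvirt-2"}^M fd=-1\n'
    '2020-04-08 09:40:25.956+0000: 6836: info : qemuMonitorSend:996 : '
    'QEMU_MONITOR_SEND_MSG: mon=0x7efc400033d0 '
    'msg={"execute":"migrate-set-capabilities","arguments":{"capabilities":'
    '[{"capability":"events","state":true}]},"id":"libvirt-3"}^M fd=-1\n'
)
print(requested_capabilities(sample))  # [('events', True)]
```

Running this over the whole log rather than a grep excerpt would show whether a later migrate-set-capabilities command, the one QEMU rejects, was logged as well.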
After confirming with yafu@: the test in comment 6 succeeded because there was no "nodeset" attribute in the memoryBacking/hugepages/page element.
I also tested that scenario (without nodeset) with the following components and steps.

(SRC host and DST host)
Version:
libvirt-6.0.0-16.module+el8.2.0+6131+4e715f3b.x86_64
qemu-kvm-4.2.0-17.module+el8.2.0+6131+4e715f3b.x86_64
kernel-4.18.0-193.el8.x86_64

Steps:
1. Prepare a VM on the SRC host, start VM and check qemu cmd line
# virsh domstate vm1
shut off

# virsh dumpxml vm1 --inactive
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB'/>
    </hugepages>
  </memoryBacking>
  <vcpu placement='static'>4</vcpu>
  <cpu mode='host-model' check='partial'>
    <feature policy='disable' name='vmx'/>
    <numa>
      <cell id='0' cpus='0-1' memory='1025024' unit='KiB'/>
      <cell id='1' cpus='2-3' memory='1025024' unit='KiB'/>
    </numa>
  </cpu>

# virsh start vm1
Domain vm1 started

# ps -ef | grep vm1
-mem-prealloc -mem-path /dev/hugepages/libvirt/qemu/3-vm1 -numa node,nodeid=0,cpus=0-1,mem=1001 -numa node,nodeid=1,cpus=2-3,mem=1001

2. Migrate VM with post-copy parameter from SRC host to DST host
# virsh migrate vm1 qemu+ssh://****/system --live --verbose --postcopy
Migration: [100 %]
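Since the only difference between the failing and the passing configuration in this thread is the nodeset attribute on the hugepages page element, a small check like the one below can tell the two cases apart before attempting a postcopy migration. This is only a sketch: the function name hugepage_nodesets is illustrative, and the XML fragments from the report are wrapped in a <domain> root element here, as they would be in full virsh dumpxml output.

```python
import xml.etree.ElementTree as ET


def hugepage_nodesets(domain_xml):
    """Return the nodeset attribute (or None) of each hugepages <page> element."""
    root = ET.fromstring(domain_xml)
    pages = root.findall("./memoryBacking/hugepages/page")
    return [page.get("nodeset") for page in pages]


# Fragment from the failing case (with nodeset), wrapped in <domain>.
WITH_NODESET = """<domain>
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0-1'/>
    </hugepages>
  </memoryBacking>
</domain>"""

# Fragment from the passing case (no nodeset), wrapped in <domain>.
WITHOUT_NODESET = """<domain>
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB'/>
    </hugepages>
  </memoryBacking>
</domain>"""

print(hugepage_nodesets(WITH_NODESET))     # ['0-1']
print(hugepage_nodesets(WITHOUT_NODESET))  # [None]
```

In the runs recorded here, the configuration returning '0-1' was started with -object memory-backend-file and failed to migrate with --postcopy, while the one returning None was started with -mem-path and migrated successfully.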
Hi Jiri,
Could you please check comment 8 and comment 9 again? Thank you in advance. :)
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.