Bug 2048022

Summary: QEMU crashes when migrating with virtio-mem and huge pages (insufficient free huge pages available on hosts)
Product: Red Hat Enterprise Linux 9
Reporter: Jing Qi <jinqi>
Component: qemu-kvm
Assignee: David Hildenbrand <dhildenb>
qemu-kvm sub component: Live Migration
QA Contact: Mario Casquero <mcasquer>
Status: ASSIGNED
Docs Contact:
Severity: low
Priority: low
CC: chayang, coli, dhildenb, jinzhao, juzhang, lijin, mprivozn, virt-maint, xiaohli, yuhuang, zhenyzha
Version: 9.0
Keywords: RFE, Triaged
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2054134
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 2054134, 2047797

Comment 2 Michal Privoznik 2022-01-29 06:44:44 UTC
So if the QEMU crashes, should this bug be against QEMU?

Comment 3 Jing Qi 2022-01-29 09:59:24 UTC
(In reply to Michal Privoznik from comment #2)
> So if the QEMU crashes, should this bug be against QEMU?

OK. Changed it to qemu-kvm component.

Comment 4 David Hildenbrand 2022-01-29 16:38:27 UTC
Am I right that this setup is using huge pages? Huge pages with virtio-mem are not supported yet.

Comment 5 Chao Yang 2022-01-30 01:42:18 UTC
Hi Jing,

Can you answer David's question?

Comment 6 Jing Qi 2022-01-30 01:56:17 UTC
(In reply to David Hildenbrand from comment #4)
> Am I right that this setup is using huge pages? Huge pages with virtio-mem
> are not supported yet.

Yes. The setup is using hugepages.

Comment 7 Jing Qi 2022-01-30 02:08:21 UTC
In fact, I didn't configure any huge pages on the source host, and the VM doesn't configure huge pages in the domain XML, except for the virtio-mem device with pagesize set to 2048 KiB.
The target host didn't have any huge pages configured either. The VM crashed on the target side during migration.

I tried again with huge pages configured on the target host and none on the source host, and the migration passed. So it is confusing whether a huge page configuration is needed when the virtio-mem device has pagesize set to 2048 KiB. Can you please help clarify?

Comment 8 Li Xiaohui 2022-01-30 03:12:38 UTC
Thanks Jing for the updates.

Note that we need to keep the same configuration on the source and target hosts when doing migration tests.

Comment 9 Jing Qi 2022-01-30 03:34:06 UTC
(In reply to Li Xiaohui from comment #8)
> Thanks Jing for the updates.
> 
> Note that we need to keep the same configuration on the source and target
> hosts when doing migration tests.

Agreed, that is what we normally do. But here is the issue with virtio-mem: the VM starts fine without any huge page configuration, but it can't be migrated to a target that has no huge page configuration.

Comment 10 Li Xiaohui 2022-01-30 04:18:05 UTC
Could you list the QEMU command lines on the source and target hosts so that I can give it a try? (I'm not clear on how to configure pagesize for virtio-mem.)

Comment 13 David Hildenbrand 2022-01-30 10:02:10 UTC
I'll elaborate a bit:

virtio-mem does not support preallocation yet, and consequently we don't support huge pages with virtio-mem. That said, virtio-mem with huge pages works just fine on RHEL9 as long as the user makes sure that there are sufficient free huge pages around. Using huge pages with virtio-mem is therefore prone to user errors if more huge pages than are available are supposed to be given to a VM. Once the VM has to populate a huge page (e.g., during migration) and there are not sufficient free huge pages around, the VM will crash.

On the source we have:
-object '{"qom-type":"memory-backend-file","id":"memvirtiomem0","mem-path":"/dev/hugepages/libvirt/qemu/3-rhel_i","reserve":false,"size":134217728}' \
-device virtio-mem-pci,node=0,block-size=2097152,requested-size=134217728,memdev=memvirtiomem0,id=virtiomem0,bus=pcie.0,addr=0x1 \

On the destination we have:
-object '{"qom-type":"memory-backend-file","id":"memvirtiomem0","mem-path":"/dev/hugepages/libvirt/qemu/7-rhel_i","reserve":false,"size":134217728}' \
-device virtio-mem-pci,node=0,block-size=2097152,requested-size=134217728,memdev=memvirtiomem0,id=virtiomem0,bus=pcie.0,addr=0x1 \
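
For readers who want to see what these options amount to, here is a minimal Python sketch that decodes the destination fragments quoted above. The helper name `parse_device_opts` is made up for illustration, and the page count assumes that the host huge page size equals the 2 MiB virtio-mem block size, which the command line itself does not state.

```python
import json

# Values copied from the destination command line quoted above.
backend_json = (
    '{"qom-type":"memory-backend-file","id":"memvirtiomem0",'
    '"mem-path":"/dev/hugepages/libvirt/qemu/7-rhel_i",'
    '"reserve":false,"size":134217728}'
)
device_opts = (
    "virtio-mem-pci,node=0,block-size=2097152,requested-size=134217728,"
    "memdev=memvirtiomem0,id=virtiomem0,bus=pcie.0,addr=0x1"
)

MIB = 1024 * 1024


def parse_device_opts(opts):
    """Split 'driver,key=value,...' into (driver, dict of properties).

    Hypothetical helper for this sketch; it does not handle escaped commas.
    """
    driver, *pairs = opts.split(",")
    return driver, dict(pair.split("=", 1) for pair in pairs)


backend = json.loads(backend_json)
driver, props = parse_device_opts(device_opts)

block_size = int(props["block-size"])
requested = int(props["requested-size"])

# Assumption: one virtio-mem block maps to one 2 MiB huge page.
pages_needed = backend["size"] // block_size

print(f"backend size:   {backend['size'] // MIB} MiB")
print(f"block size:     {block_size // MIB} MiB")
print(f"requested size: {requested // MIB} MiB")
print(f"huge pages needed if fully populated: {pages_needed}")
```

Under that assumption, the 128 MiB backend needs 64 free 2 MiB huge pages on the destination before every block can be populated.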


What you observed is that the destination doesn't have free huge pages, yet you want to migrate a virtio-mem device that ends up wanting to use huge pages. During migration, we'll crash because we run out of free huge pages when trying to write to a huge page location.

Note that this is somewhat the expected behaviour, although sub-optimal: we crash on the destination because we run out of huge pages, and the VM can continue running on the source. The unfortunate thing is that there is no proper error message, but that might be a bit trickier to handle elegantly during migration for virtio-mem. Ideally we'd fail migration early with "allocation failed".
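
As a rough illustration of the "fail early" idea, a management script could compare the free huge pages on the destination with the size of the virtio-mem backend before starting migration. The Python sketch below is not what QEMU does internally; the function names are hypothetical, and the sample text is a made-up excerpt in the format of /proc/meminfo.

```python
import re


def free_huge_page_bytes(meminfo_text):
    """Return free huge page memory in bytes for the default huge page size."""
    fields = {}
    for line in meminfo_text.splitlines():
        match = re.match(r"(\w+):\s+(\d+)", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    # HugePages_Free is a page count; Hugepagesize is reported in kB.
    return fields["HugePages_Free"] * fields["Hugepagesize"] * 1024


def check_destination(meminfo_text, needed_bytes):
    """Raise before migration instead of crashing on the first page touch."""
    available = free_huge_page_bytes(meminfo_text)
    if available < needed_bytes:
        raise MemoryError(
            f"allocation failed: need {needed_bytes} bytes of huge pages, "
            f"only {available} free"
        )
    return available


# Hypothetical excerpt for a destination with no huge pages configured.
SAMPLE_MEMINFO = """\
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
Hugepagesize:       2048 kB
"""

if __name__ == "__main__":
    try:
        check_destination(SAMPLE_MEMINFO, 134217728)
    except MemoryError as err:
        print(err)
```

On a real host the text would come from reading /proc/meminfo. Such a check is only advisory: pages that are free at check time can be taken by another process before the guest touches them, so it narrows the window for this user error rather than closing it.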


Long story short: Using huge pages with virtio-mem is not supported due to missing preallocation support. Missing preallocation makes using huge pages more prone to user errors. In this example, we fail migration because of a user error -- there are not sufficient free huge pages available on the destination.

Missing support for huge pages has already been documented under: https://docs.google.com/document/d/1LG9Vqm6Q3TKX5X7__RpsnN9K13innr_9H0XVLUELT_M/edit# "What’s NOT Supported in Tech-Preview".

Comment 16 Li Xiaohui 2022-02-06 12:18:21 UTC
I will add the 'RFE' keyword to this bug based on the comments above. Please correct me if I'm wrong. Thanks.

Comment 17 Li Xiaohui 2022-02-07 12:57:47 UTC
Thanks Jing for providing qemu commands and thanks David for analysing this bug.

I could also hit a QEMU core dump when huge pages aren't configured on the source or target host.
But migration succeeded when huge pages were configured correctly on both the source and target hosts.

Test environment:
kernel-5.14.0-42.el9.x86_64 & qemu-kvm-6.2.0-5.el9.x86_64

Qemu related commands:
-m size=5242880k,slots=16,maxmem=10485760k \
-smp 4,sockets=4,cores=1,threads=1 \
-object '{"qom-type":"memory-backend-ram","id":"ram-node0","size":5368709120}' \
-numa node,nodeid=0,cpus=0-3,memdev=ram-node0 \
-object '{"qom-type":"memory-backend-file","id":"memvirtiomem0","mem-path":"/dev/hugepages","reserve":false,"size":134217728}' \
-device virtio-mem-pci,node=0,block-size=2097152,requested-size=134217728,memdev=memvirtiomem0,id=virtiomem0,bus=pcie.0,addr=0x1 \

Comment 19 David Hildenbrand 2022-12-22 11:12:08 UTC
Proposal posted upstream: lore.kernel.org/r/20221222110215.130392-1-david

Comment 20 David Hildenbrand 2023-04-17 07:48:20 UTC
Fix will be included in QEMU 8.0: https://lore.kernel.org/all/20230117112249.244096-1-david@redhat.com/T/#u