Bug 1528259
| Summary: | [Q35][OVMF] Boot guest failed with 8T memory | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | yduan |
| Component: | ovmf | Assignee: | Laszlo Ersek <lersek> |
| Status: | CLOSED NOTABUG | QA Contact: | FuXiangChun <xfu> |
| Severity: | low | Docs Contact: | |
| Priority: | low | | |
| Version: | 7.5 | CC: | chayang, jinzhao, juzhang, michen, xfu, yduan |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-01-04 12:42:42 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

Hello Yanbin,

this guest configuration has huge SMRAM requirements (384 VCPUs and 8TB RAM), so the 48MB SMRAM size may not be enough. I have two suggestions / requests:

(1) Please keep increasing the -global mch.extended-tseg-mbytes=N value until the guest boots OK -- it's hard to tell the exact N value in advance.

(2) I see another thing from the error message. Namely, the following source code location:

UefiCpuPkg/PiSmmCpuDxeSmm/X64/PageTbl.c(212)

implies that the VM configuration does not support 1GB pages. With 1GB pages enabled, the memory footprint of the SMRAM page tables would be much smaller.

The QEMU source code calls this CPU model feature "CPUID_EXT2_PDPE1GB". On the QEMU command line, it is called "pdpe1gb". Only the following CPU models seem to enable it by default:
- phenom
- Skylake-Server
- Opteron_G4
- Opteron_G5
- EPYC

Your current command line says

-cpu SandyBridge,enforce

Please try Skylake-Server instead of SandyBridge:

-cpu Skylake-Server,enforce

Or else, keep SandyBridge, but add "pdpe1gb" explicitly:

-cpu SandyBridge,+pdpe1gb,enforce

(It's entirely possible that QEMU will not launch with these options at all, if the host CPU does not support 1GB pages. In that case, only option (1) remains viable.)

Either way, this does not look like an OVMF bug; it's a domain tuning question. I'll await your response and then I'll likely suggest closing this BZ as NOTABUG.

Thanks!

...

To clarify, options (1) and (2) in comment 2 are alternatives -- please do one or the other; doing both at the same time shouldn't be necessary.

Hi Laszlo,
1. Yes, you're right. The guest works well after I add '+pdpe1gb'.
-cpu SandyBridge,+pdpe1gb,enforce
2. Then I tried increasing the memory to 8.2T, and QEMU dumped core.
(qemu) kvm_set_phys_mem: error registering slot: Invalid argument
rhel-q35-ovmf.sh: line 31: 331557 Aborted (core dumped) /usr/libexec/qemu-kvm -S -name 'RHEL7.5-1' -machine q35,kernel-irqchip=split -device intel-iommu,intremap=on,eim=on -m 8.2T -smp 384,maxcpus=384,sockets=2,cores=96,threads=2 -cpu SandyBridge,+pdpe1gb,enforce -rtc base=localtime,clock=host,driftfix=slew -nodefaults -device AC97 -vga qxl -drive file=/usr/share/OVMF/OVMF_CODE.secboot.fd,if=pflash,format=raw,unit=0,readonly=on -drive file=/mnt/OVMF_VARS.fd,if=pflash,format=raw,unit=1 -serial unix:/tmp/serial0,server,nowait -debugcon file:/mnt/rhel75-q35-ovmf.log -global isa-debugcon.iobase=0x402 -device usb-ehci,id=usb1 -device usb-tablet,id=usb-tablet1 -boot menu=on -enable-kvm -monitor stdio -monitor unix:/tmp/monitor2,server,nowait -device pcie-root-port,id=root1,chassis=1 -netdev tap,id=netdev0,vhost=on -device virtio-net-pci,mac=BA:BC:13:83:3F:1D,id=net0,netdev=netdev0,status=on -spice port=5900,disable-ticketing -qmp tcp:0:9999,server,nowait -drive file=/mnt/rhel75-ovmf-virtio.qcow2,format=qcow2,id=drive_sysdisk,if=none,cache=none,aio=native,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive_sysdisk,id=device_sysdisk,bus=root1,bootindex=1
(gdb) bt
#0 0x00007fffdf4821f7 in raise () from /lib64/libc.so.6
#1 0x00007fffdf4838e8 in abort () from /lib64/libc.so.6
#2 0x00005555557f71b0 in kvm_set_phys_mem (kml=0x555556df10a0,
section=0x7fffffffd560, add=true)
at /usr/src/debug/qemu-2.10.0/accel/kvm/kvm-all.c:786
#3 0x00005555557e84f1 in address_space_update_topology_pass (
as=as@entry=0x55555607f620 <address_space_memory>,
adding=adding@entry=true, new_view=0x55555a03ca80,
new_view=0x55555a03ca80, old_view=<optimized out>,
old_view=<optimized out>) at /usr/src/debug/qemu-2.10.0/memory.c:962
#4 0x00005555557e88a4 in address_space_set_flatview (
as=as@entry=0x55555607f620 <address_space_memory>)
at /usr/src/debug/qemu-2.10.0/memory.c:1037
#5 0x00005555557ea630 in memory_region_transaction_commit ()
at /usr/src/debug/qemu-2.10.0/memory.c:1089
#6 0x000055555583a128 in pc_memory_init (pcms=pcms@entry=0x555556d80380,
system_memory=0x555556d3e780, rom_memory=rom_memory@entry=0x555556d3eb40,
ram_memory=ram_memory@entry=0x7fffffffd728)
at /usr/src/debug/qemu-2.10.0/hw/i386/pc.c:1386
#7 0x000055555583ce30 in pc_q35_init (machine=0x555556d80380)
at /usr/src/debug/qemu-2.10.0/hw/i386/pc_q35.c:148
#8 0x000055555590a4d8 in machine_run_board_init (machine=0x555556d80380)
at hw/core/machine.c:760
#9 0x000055555579a6ef in main (argc=<optimized out>, argv=<optimized out>,
envp=<optimized out>) at vl.c:4645
Thanks!
yduan
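Since '+pdpe1gb' combined with 'enforce' may keep QEMU from launching on hosts whose CPU lacks 1GB page support, it can be useful to check the host before changing the -cpu option. The following is a minimal sketch, assuming a Linux host whose /proc/cpuinfo "flags" line lists CPU feature names (where the 1GB page feature appears as "pdpe1gb"). The helper names and the sample text are made up for illustration; they are not part of QEMU or of this bug's test setup.

```python
def cpuinfo_has_flag(cpuinfo_text, flag):
    """Return True if a 'flags' line in /proc/cpuinfo-style text lists `flag`."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only inspect "flags : ..." lines; compare whole words, not substrings.
        if sep and key.strip() == "flags" and flag in value.split():
            return True
    return False


def host_supports_1g_pages(path="/proc/cpuinfo"):
    """Check the running Linux host (hypothetical helper, not a QEMU API)."""
    with open(path) as f:
        return cpuinfo_has_flag(f.read(), "pdpe1gb")


# Made-up sample text for illustration only, not taken from the test host.
sample = "processor\t: 0\nflags\t\t: fpu vme pse pdpe1gb lm\n"
print(cpuinfo_has_flag(sample, "pdpe1gb"))
```

On a real host, host_supports_1g_pages() reads /proc/cpuinfo directly. A positive result only says the host CPU advertises the feature; whether QEMU accepts '+pdpe1gb,enforce' is still decided by QEMU and KVM at launch.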
Hello Yanbin,
based on your +pdpe1gb result (and my earlier comments), I'm closing this as NOTABUG (for OVMF).
--*--
Regarding the QEMU crash with 8.2TB guest RAM -- it is an intentional abort() on QEMU's part:
    err = kvm_set_user_memory_region(kml, mem);
    if (err) {
        fprintf(stderr, "%s: error registering slot: %s\n", __func__,
                strerror(-err));
        abort();
    }
If we wanted to investigate the error here, a new BZ should be filed for qemu-kvm-rhev. However, the 8TB guest RAM size (which you successfully tested) is already far above the limit that RHV4 supports:
https://access.redhat.com/articles/906543
(Updated April 10 2017 at 9:08 AM)
- Maximum memory in virtualized guest: 4 TB
So, personally I don't think a new qemu-kvm-rhev RHBZ is necessary either.
Thanks!
Laszlo
Hi Laszlo,

Thanks for your detailed explanation. Then I will file a new bug for qemu-kvm-rhev with low priority. It's valuable to track not only the fully supported memory (4T) but also the actual internal maximum memory from QE's perspective.

BR,
yduan

I think this has the same root cause as bz1528149, so there is no need to file a new bug.

(In reply to yduan from comment #8)
> It's valuable to track not only the fully supported memory (4T) but also the
> actual internal maximum memory from QE's perspective.

Makes sense.

(In reply to yduan from comment #9)
> I think this has the same root cause as bz1528149, so there is no need to
> file a new bug.

Right, it seems to be the same.

Thanks!
Created attachment 1370831 [details]
rhel75-q35-ovmf.log

Description of problem:
Boot guest failed with 8T memory.

Version-Release number of selected component (if applicable):
Host: hp-bl920gen8-01.khw.lab.eng.bos.redhat.com
# uname -r
3.10.0-693.5.2.el7.x86_64
# rpm -q qemu-kvm-rhev
qemu-kvm-rhev-2.10.0-13.el7.x86_64
# rpm -q OVMF
OVMF-20171011-4.git92d07e48907f.el7.noarch
Guest:
# uname -r
3.10.0-823.el7.x86_64

How reproducible:
3/3

Steps to Reproduce:
1. On host:
# free -h
              total        used        free      shared  buff/cache   available
Mem:            11T        114G         11T         26M         24G         11T
Swap:          4.0G          0B        4.0G
2. Boot guest with 8T memory:
/usr/libexec/qemu-kvm \
-S \
-name 'RHEL7.5-1' \
-machine q35,kernel-irqchip=split \
-device intel-iommu,intremap=on,eim=on \
-m 8T \
-smp 384,maxcpus=384,sockets=2,cores=96,threads=2 \
-cpu SandyBridge,enforce \
-rtc base=localtime,clock=host,driftfix=slew \
-nodefaults \
-device AC97 \
-vga qxl \
-drive file=/usr/share/OVMF/OVMF_CODE.secboot.fd,if=pflash,format=raw,unit=0,readonly=on \
-drive file=OVMF_VARS.fd,if=pflash,format=raw,unit=1 \
-serial unix:/tmp/serial0,server,nowait \
-debugcon file:test/rhel75-q35-ovmf.log \
-global isa-debugcon.iobase=0x402 \
-global mch.extended-tseg-mbytes=48 \
-device usb-ehci,id=usb1 \
-device usb-tablet,id=usb-tablet1 \
-boot menu=on \
-enable-kvm \
-monitor stdio \
-device pcie-root-port,id=root1,chassis=1 \
-netdev tap,id=netdev0,vhost=on \
-device virtio-net-pci,mac=BA:BC:13:83:3F:1D,id=net0,netdev=netdev0,status=on \
-spice port=5800,disable-ticketing \
-qmp tcp:0:8888,server,nowait \
-drive file=images/rhel75-ovmf-virtio.qcow2,format=qcow2,id=drive_sysdisk,if=none,cache=none,aio=native,werror=stop,rerror=stop \
-device virtio-blk-pci,drive=drive_sysdisk,id=device_sysdisk,bus=root1,bootindex=1

Actual results:
Cannot boot up guest successfully.
OVMF log:
......
mXdSupported - 0x1
One Semaphore Size = 0x40
Total Semaphores Size = 0x12540
1GPageTableSupport - 0x0
PcdCpuSmmStaticPageTable - 0x1
PhysicalAddressBits - 0x2C
ASSERT /builddir/build/BUILD/ovmf-92d07e48907f/UefiCpuPkg/PiSmmCpuDxeSmm/X64/PageTbl.c(212): PageDirectoryEntry != ((void *) 0)

Expected results:
Boot up guest successfully

Additional info:
1. It cannot be reproduced with seabios.
# rpm -q seabios
seabios-1.11.0-1.el7.x86_64
2. Installing RHEL7.5 on the host "hp-bl920gen8-01.khw.lab.eng.bos.redhat.com" failed.
3. OVMF log is attached.
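
The log values in this description (PhysicalAddressBits - 0x2C, PcdCpuSmmStaticPageTable - 0x1, 1GPageTableSupport - 0x0) allow a rough estimate of why the page table allocation could fail with a 48MB TSEG. The sketch below is back-of-the-envelope arithmetic only, under stated assumptions: standard x86-64 4-level paging with 4KB tables of 512 eight-byte entries, and a static page table that maps the whole 2^44-byte physical address space. It ignores every other SMRAM consumer (for example per-VCPU state) and is not a trace of what the firmware actually allocates; the function names are illustrative.

```python
PAGE_SIZE = 4096   # bytes per page-table page (512 entries of 8 bytes)
MIB = 1 << 20


def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


def page_table_bytes(phys_addr_bits, use_1g_pages):
    """Estimate page-table memory needed to map 2**phys_addr_bits bytes."""
    space = 1 << phys_addr_bits
    pml4_pages = 1                            # single top-level table
    pdpt_pages = ceil_div(space, 1 << 39)     # one PDPT per 512GB mapped
    # With 1GB pages the PDPT entries map memory directly; with 2MB pages
    # every 1GB of address space needs its own page directory.
    pd_pages = 0 if use_1g_pages else ceil_div(space, 1 << 30)
    return (pml4_pages + pdpt_pages + pd_pages) * PAGE_SIZE


bits = 0x2C  # PhysicalAddressBits from the OVMF log
for use_1g in (False, True):
    size = page_table_bytes(bits, use_1g)
    print(f"1GB pages={use_1g}: {size / MIB:.2f} MiB of page tables")
```

Under these assumptions, 2MB pages need about 64 MiB of page tables, which alone exceeds a 48MB TSEG, while 1GB pages need roughly 132 KiB. This is consistent with the guest booting once '+pdpe1gb' was added.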