Bug 1560453 - Failed to boot guest with ovmf and with maxmem set safe = 32g
Summary: Failed to boot guest with ovmf and with maxmem set safe = 32g
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: ovmf
Version: 7.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Assignee: Laszlo Ersek
QA Contact: FuXiangChun
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-03-26 08:22 UTC by Sitong Liu
Modified: 2018-03-27 09:22 UTC (History)
6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-03-26 12:30:20 UTC
Target Upstream Version:


Attachments (Terms of Use)
Auto test logs (1.35 MB, application/x-tar)
2018-03-26 12:15 UTC, Sitong Liu
no flags

Description Sitong Liu 2018-03-26 08:22:05 UTC
Description of problem:

1. Boot a guest with q35+ovmf (cli below). With '-m 2048,slots=4,maxmem=32G', the guest cannot reach the guest OS and stops at the OVMF UI. With 'maxmem=16G', the guest boots successfully. Also, with q35+seabios and 'maxmem=32G', the guest boots successfully, so this may be an OVMF bug.

2. According to bug 1353591 comment 2, providing my host hardware here:
# grep "address sizes" /proc/cpuinfo 
address sizes	: 36 bits physical, 48 bits virtual

'maxmem=32G' should accordingly be safe.
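The host check above can be scripted; this is a quick sketch (not part of the original report) that reads the same /proc/cpuinfo line and prints how much guest-physical address space the reported width covers:

```shell
# Read the host CPU's physical address width from /proc/cpuinfo and
# print the size of the address space it can cover.
phys_bits=$(awk '/address sizes/ { print $4; exit }' /proc/cpuinfo)
echo "physical address bits: ${phys_bits}"
echo "addressable space: $(( (1 << phys_bits) / (1024 * 1024 * 1024) )) GiB"
```

With the 36 bits reported above, this prints 64 GiB of addressable space.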


3. q35 version:
# /usr/libexec/qemu-kvm -machine help
...
q35   RHEL-7.5.0 PC (Q35 + ICH9, 2009) (alias of pc-q35-rhel7.5.0)
...

Cli:
/usr/libexec/qemu-kvm \
    -sandbox off  \
    -machine q35  \
    -nodefaults  \
    -vga qxl \
    -device i82801b11-bridge,id=dmi2pci_bridge,bus=pcie.0,addr=0x2 \
    -device pci-bridge,id=pci_bridge,bus=dmi2pci_bridge,addr=0x1,chassis_nr=1 \
    -device intel-hda,bus=pci_bridge,addr=0x1 \
    -device pcie-root-port,id=pcie.0-root-port-6,slot=6,chassis=6,addr=0x6,bus=pcie.0 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie.0-root-port-6,addr=0x0 \
    -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel75-64-virtio-scsi.qcow2 \
    -device scsi-hd,id=image1,drive=drive_image1 \
    -device pcie-root-port,id=pcie.0-root-port-7,slot=7,chassis=7,addr=0x7,bus=pcie.0 \
    -device virtio-net-pci,mac=9a:58:59:5a:5b:5c,id=idR4dB6n,vectors=4,netdev=idzRf0nJ,bus=pcie.0-root-port-7,addr=0x0  \
    -netdev tap,id=idzRf0nJ,vhost=on \
    -m 2048,slots=4,maxmem=32G  \
    -smp 4,maxcpus=4,cores=2,threads=1,sockets=2  \
    -cpu 'SandyBridge',+kvm_pv_unhalt \
    -rtc base=utc,clock=vm,driftfix=slew  \
    -boot menu=off,strict=off,order=cdn,once=c  \
    -no-hpet \
    -drive if=pflash,format=raw,readonly=on,file=/usr/share/OVMF/OVMF_CODE.secboot.fd \
    -drive if=pflash,format=raw,file=/usr/share/OVMF/OVMF_VARS.fd \
    -debugcon file:/home/ovmf.log \
    -global isa-debugcon.iobase=0x402 \
    -enable-kvm  \
    -vnc :0 \
    -monitor stdio



Version-Release number of selected component (if applicable):
kernel-3.10.0-862.el7.x86_64
qemu-kvm-rhev-2.10.0-21.el7_5.1.x86_64
OVMF-20171011-4.git92d07e48907f.el7.noarch

Failed also on 
OVMF-20171011-2.git92d07e48907f.el7.noarch.rpm
OVMF-20171011-3.git92d07e48907f.el7.noarch.rpm


How reproducible:
100%

Steps to Reproduce:
1. Boot a guest with the cli.
2.
3.

Actual results:
The VM fails to reach the guest OS even when maxmem is set to a safe value:
q35 + seabios + maxmem=32G,  good
q35 + ovmf    + maxmem=16G,  good
q35 + ovmf    + maxmem=32G,  failed
q35 + ovmf    + maxmem=64G,  failed


Expected results:
The VM reaches the guest OS when maxmem is set to a safe value.

Additional info:

Comment 2 Sitong Liu 2018-03-26 09:51:31 UTC
Tested on RHEL 7.4.z with
kernel-3.10.0-693.25.1.el7.x86_64
qemu-kvm-rhev-2.9.0-16.el7_4.16.x86_64
OVMF-20170228-5.gitc325e41585e3.el7.noarch

Same issue occurs, so it is not a regression.

Comment 3 Laszlo Ersek 2018-03-26 11:47:01 UTC
I cannot reproduce this issue. Please capture the OVMF debug log and attach it to the BZ. Thanks.

Comment 4 Laszlo Ersek 2018-03-26 11:49:32 UTC
BTW your command line is broken;

    -drive if=pflash,format=raw,file=/usr/share/OVMF/OVMF_VARS.fd \

is invalid. You should never use the varstore *template* file (which is read-only) as an actual (r/w) varstore file.
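A minimal sketch of the intended pflash setup (the destination path here is an example, not from the report):

```shell
# Paths are examples; adjust to your layout.
TEMPLATE=/usr/share/OVMF/OVMF_VARS.fd
VARSTORE=/home/rhel75-vm_VARS.fd

# Copy the read-only varstore *template* to a private, writable per-VM
# file once, before first boot:
[ -e "$VARSTORE" ] || cp "$TEMPLATE" "$VARSTORE"

# Then point the second pflash drive at the writable copy:
#   -drive if=pflash,format=raw,readonly=on,file=/usr/share/OVMF/OVMF_CODE.secboot.fd \
#   -drive if=pflash,format=raw,file=/home/rhel75-vm_VARS.fd \
```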

Comment 5 Sitong Liu 2018-03-26 12:15:31 UTC
Created attachment 1413097 [details]
Auto test logs

Sorry, please see the attached logs.

Comment 6 Laszlo Ersek 2018-03-26 12:25:07 UTC
OK, I think I know what's wrong on your end. I ran a simpler variant of your command line (keeping "-m 2048,slots=4,maxmem=32G") and I am looking at the log.

> ...
> ScanOrAdd64BitE820Ram: Base=0x0 Length=0x80000000 Type=1
> ...
> GetFirstNonAddress: Pci64Base=0x1000000000 Pci64Size=0x800000000
> ...
> PublishPeiMemory: mPhysMemAddressWidth=37 PeiMemoryCap=66056 KB
> ...

When you specify "maxmem=32G", that means the maximum *amount* of RAM that the guest may have (after hotplugging DIMMs) is 32GB. However, because RAM does not contiguously cover the guest-physical address space from zero up, the maximum RAM *address* for this RAM size will be *higher* than 32GB.

In turn, when OVMF attempts to place the 64-bit PCI MMIO aperture, which is 32GB in size, at a naturally aligned address, it cannot put the aperture at 32GB exactly, because that address is not free -- it is covered by RAM. Therefore the aperture is moved up to 64GB. (See Pci64Base in the log.)

For that however, 37 address bits are required (see mPhysMemAddressWidth=37), but your host CPU has only 36 address bits. (From comment 0.)

I couldn't reproduce this issue in comment 3 because my laptop CPU has 39 physical address bits.

Your other results support this analysis. maxmem=64G fails for a similar reason as above. And maxmem=16G works because then the 64-bit PCI MMIO aperture can be placed at 32GB sharp, and we fit into 36 address bits.
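The arithmetic in the analysis above can be checked with a short calculation (a simplified sketch; OVMF's actual placement logic lives in OvmfPkg and is more involved):

```shell
# With maxmem=32G, RAM does not end at 32 GiB (the 32-bit PCI hole
# pushes part of it higher), so the naturally aligned 32 GiB aperture
# cannot start at 32 GiB and moves up to 64 GiB (Pci64Base=0x1000000000
# in the log).
pci64_base_gib=64
aperture_gib=32
top=$(( (pci64_base_gib + aperture_gib) << 30 ))   # end of aperture, in bytes

# Count the address bits needed to reach the top of the aperture.
bits=0; v=$(( top - 1 ))
while [ "$v" -gt 0 ]; do bits=$(( bits + 1 )); v=$(( v >> 1 )); done
echo "address bits needed: ${bits}"   # 37, but this host CPU has only 36
```

In the maxmem=16G case the aperture fits at 32 GiB sharp, the top of the aperture is 64 GiB, and the same loop yields 36 bits, which the host can provide.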

Comment 7 Laszlo Ersek 2018-03-26 12:30:20 UTC
(I wrote comment 6 before comment 5 appeared, and ran into a "mid-air collision" with comment 5 when I tried to save comment 6.)

Comment 5 does not contain any OVMF debug logs. For the future, please consult the virt-QE test plans, or else "/usr/share/doc/OVMF/README", for capturing the OVMF debug log.

Right now however I no longer need the OVMF debug log from your side; the analysis in comment 6 is pretty conclusive -- in fact when I tried to save that comment, I had already set NOTABUG resolution; it just couldn't be saved because of the "mid-air collision". So I'm closing the BZ like that now.

Comment 8 Sitong Liu 2018-03-27 06:32:16 UTC
(In reply to Laszlo Ersek from comment #7)
> (I wrote comment 6 before comment 5 appeared, and ran into a "mid-air
> collision" with comment 5 when I tried to save comment 6.)
> 
> Comment 5 does not contain any OVMF debug logs. For the future, please
> consult the virt-QE test plans, or else "/usr/share/doc/OVMF/README", for
> capturing the OVMF debug log.
> 
> Right now however I no longer need the OVMF debug log from your side; the
> analysis in comment 6 is pretty conclusive -- in fact when I tried to save
> that comment, I had already set NOTABUG resolution; it just couldn't be
> saved because of the "mid-air collision". So I'm closing the BZ like that
> now.

Thanks Laszlo, it's very clear.

This BZ ran into the "mid-air collision" because I wanted to explain that the OVMF debug log is named "seabios-avocado-vt-vm1.log" in the attachment, since this name hasn't been updated in our automation framework yet. Sorry for the confusion. You can check that log if you want.

And for
    -drive if=pflash,format=raw,file=/usr/share/OVMF/OVMF_VARS.fd \
This is my fault; fortunately the template is copied to a new file by our automation framework. Thanks for the reminder.

Best regards,
Sitong

Comment 9 Laszlo Ersek 2018-03-27 07:46:13 UTC
Hi Sitong,

when I wrote in comment 7 that there was no OVMF log attached, that was because I grepped the entire attachment (after extraction) recursively, for the string "mPhysMemAddressWidth". And that string is not present anywhere, so I thought that the firmware log was missing.

Now that you've given me the filename, I've looked at the log. Unfortunately, this log file is not usable, because it is truncated -- the head (not the tail) of the log is missing. I don't know what's going on in avocado, but this logfile is not useful.

It would be nice if this could be fixed in the future. (For this BZ it is not necessary.) Thanks!

Comment 10 Sitong Liu 2018-03-27 09:22:25 UTC
(In reply to Laszlo Ersek from comment #9)
> Hi Sitong,
> 
> when I wrote in comment 7 that there was no OVMF log attached, that was
> because I grepped the entire attachment (after extraction) recursively, for
> the string "mPhysMemAddressWidth". And that string is not present anywhere,
> so I thought that the firmware log was missing.
> 
> Now that you've given me the filename, I've looked at the log.
> Unfortunately, this log file is not usable, because it is truncated -- the
> head (not the tail) of the log is missing. I don't know what's going on in
> avocado, but this logfile is not useful.
> 
> It would be nice if this could be fixed in the future. (For this BZ it is
> not necessary.) Thanks!

Hi Laszlo,

Yes, thanks for pointing this out! We will do some research on avocado and fix it later. Thanks a lot!

Best regards,
Sitong

