Bug 1789797

Summary: Backport upstream patch series: "UefiBootManagerLib, HttpDxe: tweaks for large HTTP(S) downloads" to improve HTTP(S) Boot experience with large (4GiB+) files
Product: Red Hat Enterprise Linux 8 Reporter: Xueqiang Wei <xuwei>
Component: edk2 Assignee: Laszlo Ersek <lersek>
Status: CLOSED ERRATA QA Contact: Xueqiang Wei <xuwei>
Severity: low Docs Contact:
Priority: low    
Version: 8.2 CC: berrange, chayang, coli, ddepaula, jinzhao, juzhang, kraxel, lersek, pbonzini, philmd
Target Milestone: rc   
Target Release: 8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: edk2-20190829git37eef91017ad-5.el8 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-04-28 16:02:22 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
unexpected_network_error none
ipv4_downloading none
ipv4_installation none
ipv6_downloading none
ipv6_installation none

Description Xueqiang Wei 2020-01-10 13:23:25 UTC
Description of problem:

A VM cannot install successfully over UEFI HTTP(S) Boot when the VM's memory is not large enough, and the resulting error is difficult to debug.

According to https://bugzilla.redhat.com/show_bug.cgi?id=1536624#c59, the upstream patch series needs to be backported (or inherited) so that QE can identify this kind of problem more easily.



Version-Release number of selected component (if applicable):
Host:
kernel-4.18.0-165.el8.x86_64
qemu-kvm-4.2.0-4.module+el8.2.0+5220+e82621dc
edk2-ovmf-20190829git37eef91017ad-4.el8.noarch


How reproducible:
5/5


Steps to Reproduce:

1. Set the memory of the net-client VM to 4 GiB.

Try HTTPS Booting the following ISO image:

  RHEL-7.7-20190723.1-Server-x86_64-dvd1.iso

Its size is 4,501,536,768 bytes; that is, 4GiB + 197MiB.

The symptom is that the HTTPS Boot is aborted after downloading just
197MiB. Namely, at approximately 4% into the download, the following
message is written to the UEFI console:

  Error: Unexpected network error

Note that you may or may not be able to reproduce the problem. It also
depends on the HTTPS server. If you use a RHEL7 "net-server" virtual
machine, there is a good chance that the issue will reproduce though.

Also note that you will need to assign a *lot* of RAM to the
"net-client" VM, for successfully testing the fix (8-12 GiB)!


Actual results:
If the VM memory is not large enough, HTTPS Boot fails.
The important clue for determining this error was in
"/var/log/httpd/ssl_request_log", on the net-server virtual machine.

Expected results:
After the upstream patch is backported, such an error is written to the
debug log, so that QE can identify this kind of problem more easily.


Additional info:
Please refer to https://bugzilla.redhat.com/show_bug.cgi?id=1536624#c60 for detailed steps and related test files.

Comment 1 Laszlo Ersek 2020-01-10 14:58:19 UTC
I'd like to clarify the scope of this BZ (and the goals of the upstream
patch series, which I posted at:

[edk2-devel] [PATCH 0/2] UefiBootManagerLib, HttpDxe: tweaks for large HTTP(S) downloads
https://edk2.groups.io/g/devel/message/53034
http://mid.mail-archive.com/20200108234313.28510-1-lersek@redhat.com
)

There are two failure scenarios. Comment#0 currently mixes them up, so
let me isolate one scenario from the other.


(1) The first symptom is when the net-client VM's RAM is simply too
small for accepting the remote file (such as a very large ISO image)
over HTTPS Boot. In this case, the HTTPS Boot attempt in the net-client
VM is aborted very quickly, and there is basically nothing usable
printed to *either* the UEFI console (graphical screen and/or serial
port), *or* the OVMF debug log. The user is simply returned to the UEFI
Setup TextUI, almost immediately. Therefore this situation is difficult
to identify / diagnose.

One hint about this particular problem is the net-server VM's apache log
("/var/log/httpd/ssl_request_log"). When this failure occurs, there will
only be a HEAD request in the log, and no GET request. That's because
OVMF, running on the net-client VM, sends a HEAD request for learning
the size of the file to download. Then the memory allocation fails, and
so there is no GET request. If there is enough RAM in the net-client VM,
then the HEAD request is followed by a GET request, in the net-server
VM's "/var/log/httpd/ssl_request_log" file.
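
As a rough illustration of the log check described above, the following
Python sketch scans request-log lines for HEAD and GET requests on a
given path. The line shape matched by the regular expression, the
function name, and the return labels are assumptions made for
illustration; adjust them to the actual contents of
"/var/log/httpd/ssl_request_log" on the net-server VM.

```python
import re

# Assumption: each log line contains a quoted request of the form
#   "METHOD /path HTTP/x.y"
# somewhere in the line; the other fields are ignored.
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+"')


def classify_boot_attempt(log_lines, path):
    """Report which of HEAD / GET were logged for `path`.

    Returns one of the (hypothetical) labels:
      "head-only"    -> size query only, no download (scenario 1 hint)
      "head-and-get" -> the download was at least started
      "none"         -> no HEAD request for this path was found
    """
    methods = set()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match and match.group("path") == path:
            methods.add(match.group("method"))
    if "HEAD" in methods and "GET" in methods:
        return "head-and-get"
    if "HEAD" in methods:
        return "head-only"
    return "none"
```

A "head-only" result is only a hint: it shows that the client asked for
the file size and never requested the body, which matches the
allocation-failure scenario but does not prove it.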

This issue is remedied by the first patch in the series. Obviously the
HTTPS Boot can still not succeed (there still isn't enough RAM in
net-client), but the OVMF debug log will now contain a single-line
message like:

> UiApp:BmExpandLoadFile: failed to allocate reserved pages:
> BufferSize=4501536768
> LoadFile="PciRoot(0x0)/Pci(0x3,0x0)/MAC(5254001B103E,0x1)/IPv4(0.0.0.0,TCP,DHCP,192.168.124.106,192.168.124.1,255.255.255.0)/Dns(192.168.124.1)/Uri(https://ipv4-server/RHEL-7.7-20190723.1-Server-x86_64-dvd1.iso)"
> FilePath=""

Here, "BufferSize" stands for the remote file size, and the Uri() device
path node contains the URL of the file to download (which was provided
to net-client by net-server's DHCP server).
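
For scanning an OVMF debug log for this message, a minimal Python
sketch could look like the following. The regular expressions are
derived only from the sample message quoted above, and the function
name and returned dictionary keys are hypothetical; this is not part of
the patch series.

```python
import re

# Matches the single-line message shown above; whitespace between the
# fields is treated as flexible.
MESSAGE_RE = re.compile(
    r'BmExpandLoadFile: failed to allocate reserved pages:\s*'
    r'BufferSize=(?P<size>\d+)\s*'
    r'LoadFile="(?P<load_file>[^"]*)"\s*'
    r'FilePath="(?P<file_path>[^"]*)"'
)
# Extracts the URL from the Uri() device path node.
URI_RE = re.compile(r'Uri\((?P<uri>[^)]*)\)')


def parse_alloc_failure(line):
    """Return the remote file size and URL from the message, or None."""
    match = MESSAGE_RE.search(line)
    if match is None:
        return None
    uri = URI_RE.search(match.group("load_file"))
    return {
        "buffer_size": int(match.group("size")),
        "uri": uri.group("uri") if uri else None,
    }
```

Calling parse_alloc_failure() on each line of the debug log and keeping
the non-None results gives the size and URL of every file that could not
be allocated.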


(2) The other failure scenario is different. In this case, there are two
conditions:

(2a) the size of the remote file equals 4 GiB *plus* an integral
multiple of 16 KiB;

(2b) the RAM in the net-client VM *is* sufficient for downloading that
large file (for example 8GiB or 12GiB could be the net-client VM's RAM
size).

To give an example for (2a), consider
"RHEL-7.7-20190723.1-Server-x86_64-dvd1.iso". Its size is 4 GiB plus 197
MiB. Because 197 MiB is a whole multiple of 16 KiB, it satisfies the
requirement (for reproducing the issue).
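
To check whether some other image also satisfies (2a), the arithmetic
can be sketched in Python as below. The function name is hypothetical,
and the sketch assumes that N must be at least 1 (a file of exactly
4 GiB is not covered by the description above).

```python
GIB = 1024 ** 3
KIB = 1024


def multiple_of_16k_above_4g(size):
    """Return N if size == 4 GiB + N * 16 KiB with N >= 1, else None."""
    excess = size - 4 * GIB
    if excess <= 0 or excess % (16 * KIB) != 0:
        return None
    return excess // (16 * KIB)


# The RHEL-7.7 DVD image from this report: 4 GiB + 197 MiB.
print(multiple_of_16k_above_4g(4501536768))  # prints 12608 (= 197 * 64)
```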

The symptom of this HTTPS Boot failure is that only an initial slice of
the remote file is downloaded, and when exactly 4GiB are *left* to
download -- that is, in the above example, after 197 MiB has been
downloaded --, the download is aborted, with "Error: Unexpected network
error".
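
The point at which the abort is expected can be computed from the file
size alone. The following Python sketch (with a hypothetical function
name) only restates the arithmetic of the symptom; it does not model
the underlying defect.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def expected_abort_point(size):
    """Return (bytes downloaded, percent complete) when 4 GiB remain."""
    downloaded = size - 4 * GIB
    if downloaded <= 0:
        raise ValueError("file is not larger than 4 GiB")
    return downloaded, 100.0 * downloaded / size


downloaded, percent = expected_abort_point(4501536768)
print(downloaded // MIB, round(percent, 1))  # prints: 197 4.6
```

This matches the observation that the error appears a little over 4%
into the download of the example image.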

This issue is solved by the second patch in the series. The expected
result is that the HTTPS Boot simply succeeds.


So, in order for virt-QE to verify this BZ, two cases should be
reproduced / checked:

- Remote file too large for net-client's RAM: check for the informative
message in the OVMF log.

- Remote file *not* too large for net-client's RAM, but has a size
(4GiB+N*16KiB): check that the download simply works.

Thanks.

Comment 2 Laszlo Ersek 2020-01-14 11:02:28 UTC
(In reply to Laszlo Ersek from comment #1)

> [edk2-devel] [PATCH 0/2] UefiBootManagerLib, HttpDxe: tweaks for large HTTP(S) downloads
> https://edk2.groups.io/g/devel/message/53034
> http://mid.mail-archive.com/20200108234313.28510-1-lersek@redhat.com

Merged upstream as commit range b112ec225f1c..4cca7923992a.

Comment 5 Danilo de Paula 2020-01-15 14:38:46 UTC
QA_ACK, please?

Comment 8 Xueqiang Wei 2020-01-20 06:29:00 UTC
Laszlo,

Thanks for your detailed explanation. 


I reproduced both issues on edk2-ovmf-20190829git37eef91017ad-4.el8.noarch.
I retested them on edk2-ovmf-20190829git37eef91017ad-5.el8.noarch and both cases work well, so I am setting the bug status to VERIFIED.


Details:

Versions:
kernel-4.18.0-167.el8.x86_64
qemu-kvm-4.2.0-6.module+el8.2.0+5451+991cea0d
edk2-ovmf-20190829git37eef91017ad-5.el8.noarch
openssl-1.1.1c-10.el8.x86_64


Download RHEL-7.7-20190723.1-Server-x86_64-dvd1.iso and place it under "/var/www/html" on the net-server virtual machine.
Try to install it with UEFI IPv4 and UEFI IPv6.
For detailed steps, please refer to https://bugzilla.redhat.com/show_bug.cgi?id=1536624#c60.



Case 1. Set memory to 2 GiB

e.g.
<memory unit='KiB'>2097152</memory>
<currentMemory unit='KiB'>2097152</currentMemory> 

(1) Reproduced on edk2-ovmf-20190829git37eef91017ad-4.el8.noarch

Checked for HEAD and GET requests in "/var/log/httpd/ssl_request_log" on the net-server virtual machine.

In this case, there was only a HEAD request, and no GET request. There was no explicit error message about this in the OVMF log.


(2) Tested on edk2-ovmf-20190829git37eef91017ad-5.el8.noarch; the issue is fixed.

  Found the informative message in the OVMF log:
  For IPv4:
  UiApp:BmExpandLoadFile: failed to allocate reserved pages: BufferSize=4501536768 LoadFile="PciRoot(0x0)/Pci(0x2,0x0)/Pci(0x0,0x0)/MAC(525400DD90E9,0x1)/IPv4(0.0.0.0,TCP,DHCP,192.168.124.101,192.168.124.1,255.255.255.0)/Dns(192.168.124.1)/Uri(https://ipv4-server/RHEL-7.7-20190723.1-Server-x86_64-dvd1.iso)" FilePath=""

  For IPv6:
  UiApp:BmExpandLoadFile: failed to allocate reserved pages: BufferSize=4501536768 LoadFile="PciRoot(0x0)/Pci(0x2,0x0)/Pci(0x0,0x0)/MAC(525400DD90E9,0x1)/IPv6(FD33:EB1B:9B36:0000:0000:0000:0000:0002,TCP,Static,FD33:EB1B:9B36:0000:0000:0000:0000:00C8,0x40,FE80:0000:0000:0000:5054:00FF:FE28:8AFF)/Dns(FD33:EB1B:9B36:0000:0000:0000:0000:0001)/Uri(https://ipv6-server/RHEL-7.7-20190723.1-Server-x86_64-dvd1.iso)" FilePath=""



Case 2. Set memory to 12 GiB

e.g.
<memory unit='KiB'>12582912</memory>
<currentMemory unit='KiB'>12582912</currentMemory>
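
The KiB values used in the <memory> elements for case 1 and case 2
follow directly from the GiB sizes. A small Python sketch of the
conversion (the helper name is hypothetical):

```python
def gib_to_kib(gib):
    """Convert a size in GiB to KiB, as used with unit='KiB'."""
    return gib * 1024 * 1024


print(gib_to_kib(2))   # prints 2097152  (case 1)
print(gib_to_kib(12))  # prints 12582912 (case 2)
```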

(1) Reproduced on edk2-ovmf-20190829git37eef91017ad-4.el8.noarch

  The HTTPS Boot fails as follows: only an initial slice of the remote file is downloaded,
  and when exactly 4GiB are left to download (that is, after 197 MiB has been downloaded),
  the download is aborted with "Error: Unexpected network error".
  
  Please refer to attachment for screenshot.


(2) Tested on edk2-ovmf-20190829git37eef91017ad-5.el8.noarch; the issue is fixed.
  The remote file is downloaded successfully, and the installation succeeds with both UEFI IPv4 and UEFI IPv6.
  
  Please refer to attachment for screenshot.

Comment 9 Xueqiang Wei 2020-01-20 06:29:55 UTC
Created attachment 1653827 [details]
unexpected_network_error

Comment 10 Xueqiang Wei 2020-01-20 06:30:57 UTC
Created attachment 1653828 [details]
ipv4_downloading

Comment 11 Xueqiang Wei 2020-01-20 06:31:37 UTC
Created attachment 1653829 [details]
ipv4_installation

Comment 12 Xueqiang Wei 2020-01-20 06:32:11 UTC
Created attachment 1653830 [details]
ipv6_downloading

Comment 13 Xueqiang Wei 2020-01-20 06:32:43 UTC
Created attachment 1653831 [details]
ipv6_installation

Comment 14 Laszlo Ersek 2020-01-20 08:10:52 UTC
Xueqiang Wei, the results look good to me, many thanks.

Comment 16 errata-xmlrpc 2020-04-28 16:02:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:1712