Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 1654388

Summary: grub2 UEFI PXE boot: exclusivity of SNP open has regressed from RHEL7 to RHEL8
Product: Red Hat Enterprise Linux 8 Reporter: Laszlo Ersek <lersek>
Component: grub2Assignee: Bootloader engineering team <bootloader-eng-team>
Status: CLOSED CURRENTRELEASE QA Contact: Release Test Team <release-test-team-automation>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 8.0CC: choma, fmartine, pholica
Target Milestone: rcKeywords: Regression
Target Release: 8.0Flags: pm-rhel: mirror+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: grub2-2.02-62.el8 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-06-14 01:16:55 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1653140    
Attachments:
Description Flags
edk2 debug patch 1/3
none
edk2 debug patch 2/3
none
edk2 debug patch 3/3 none

Description Laszlo Ersek 2018-11-28 16:20:27 UTC
(click "unwrap comments" near the top)

*** Description of problem:

The Simple Network Protocol (SNP) is unsuitable for use by multiple clients
/ consumers at the same time. For such purposes, UEFI defines MNP (Managed
Network Protocol), which layers on top of SNP.

Grub2 is an application that requires exclusive access to SNP however. In
upstream commit 49426e9fd2e5 ("efinet: open Simple Network Protocol
exclusively", 2015-05-07), the exclusive SNP open was implemented. In
response to an exclusive open, the platform firmware disconnects the MNP
driver from the SNP interface, via the DisconnectController() boot service.
That boot service calls the EFI_DRIVER_BINDING_PROTOCOL.Stop() member
function exposed by the MNP driver that keeps the SNP interface open. Then
the MNP driver releases its references to SNP, and its resources.

This is the logic that has regressed in RHEL8, from RHEL7. The exclusive
open does not occur any longer, and so packet reception on the SNP interface
turns into a competition between grub2 and MNP.

*** Version-Release number of selected component (if applicable):

Neutral (identical) components in the test environment:

- virtualization host:
  - kernel:         3.10.0-862.15.1.el7.x86_64
  - libvirt-daemon: 4.3.0-1.el7.x86_64
  - qemu-kvm-rhev:  qemu-kvm-rhev-2.12.0-18.el7.x86_64
  - virt-manager:   1.5.0-1.el7.noarch
  - OVMF:           built from upstream edk2 @ 18a700945f20

- DHCP/TFTP/PXE server guest:
  - kernel:         3.10.0-948.el7.x86_64
  - xinetd:         2.3.15-13.el7.x86_64
  - tftp-server:    5.2-22.el7.x86_64

Working set of components (served to the PXE client guest from the PXE
server guest):

- shim-x64-12-3.el7.x86_64 (shimx64.efi sha256sum:
  d57f40a0e9018765cd79393a0d57d8e6d6d880d93b95fa57cedbda5a0b4a1ae3)
- grub2-efi-x64-2.02-0.76.el7.x86_64 (grubx64.efi sha256sum:
  2b4c21cb24248aebbcf596421d7226912872ea1605fa4ddec079aaa729729ef6)

Regressed set of components (served to the PXE client guest from the PXE
server guest):

- shim-x64-13-0.7.el8+7.x86_64 (shimx64.efi sha256sum:
  475552c7476ad45e42344eee8b30d44c264d200ac2468428aa86fc8795fb6e34)
- grub2-efi-x64-2.02-61.el8.x86_64 (grubx64.efi sha256sum:
  495ec11446682b3aa4cd7a658a7a06eb0a58af2f18f29b23a7f636fbdb5ae6c0)

*** How reproducible:

Tried about 5 times; all attempts produced the same result.

*** Steps to Reproduce:

Initiate PXE boot in the above environment.

*** Actual results:

The ultimate result is that grub2 trips an ASSERT() in OVMF's built-in
virtio-net-pci device driver (VirtioNetDxe), which implements SNP:

> ASSERT .../OvmfPkg/VirtioNetDxe/SnpSharedHelpers.c(184): ((BOOLEAN)(0==1))

(The VirtioNetTransmit() function enforces (for independent, valid reasons)
that a packet buffer at a specific address *not* be re-queued for TX, until
it is recycled to the caller via a call to SNP.GetStatus(). This
spec-mandated invariant was the topic of BZ#1649048 too, and indeed that BZ
has been fixed, but this is now a different avenue to the same misbehavior.)

In order to understand how the ASSERT() was tripped, despite BZ#1649048
having been fixed, I added a number of debug messages to edk2. Some (not
all) lines of the log below were produced by such debug patches. The logs
were captured while running the "working set of components".

(1)

   3166 MnpAddFreeTxBuf: Add TxBufWrap 7D131018, TxBuf 7D131041
   3167 MnpAddFreeTxBuf: Add TxBufWrap 7D131698, TxBuf 7D1316C1
   3168 MnpAddFreeTxBuf: Add TxBufWrap 7D130018, TxBuf 7D130041
   3169 MnpAddFreeTxBuf: Add TxBufWrap 7D130698, TxBuf 7D1306C1
   3170 MnpAddFreeTxBuf: Add TxBufWrap 7D12F018, TxBuf 7D12F041
   3171 MnpAddFreeTxBuf: Add TxBufWrap 7D12F698, TxBuf 7D12F6C1
   3172 MnpAddFreeTxBuf: Add TxBufWrap 7D12E018, TxBuf 7D12E041
   3173 MnpAddFreeTxBuf: Add TxBufWrap 7D12E698, TxBuf 7D12E6C1
   3174 MnpAddFreeTxBuf: Add TxBufWrap 7D12D018, TxBuf 7D12D041
   3175 MnpAddFreeTxBuf: Add TxBufWrap 7D12D698, TxBuf 7D12D6C1
   3176 MnpAddFreeTxBuf: Add TxBufWrap 7D12C018, TxBuf 7D12C041
   3177 MnpAddFreeTxBuf: Add TxBufWrap 7D12C698, TxBuf 7D12C6C1
   3178 MnpAddFreeTxBuf: Add TxBufWrap 7D12B018, TxBuf 7D12B041
   3179 MnpAddFreeTxBuf: Add TxBufWrap 7D12B698, TxBuf 7D12B6C1
   3180 MnpAddFreeTxBuf: Add TxBufWrap 7D12A018, TxBuf 7D12A041
   3181 MnpAddFreeTxBuf: Add TxBufWrap 7D12A698, TxBuf 7D12A6C1
   3182 MnpAddFreeTxBuf: Add TxBufWrap 7D129018, TxBuf 7D129041
   3183 MnpAddFreeTxBuf: Add TxBufWrap 7D129698, TxBuf 7D1296C1
   3184 MnpAddFreeTxBuf: Add TxBufWrap 7D128018, TxBuf 7D128041
   3185 MnpAddFreeTxBuf: Add TxBufWrap 7D128698, TxBuf 7D1286C1
   3186 MnpAddFreeTxBuf: Add TxBufWrap 7D127018, TxBuf 7D127041
   3187 MnpAddFreeTxBuf: Add TxBufWrap 7D127698, TxBuf 7D1276C1
   3188 MnpAddFreeTxBuf: Add TxBufWrap 7D126018, TxBuf 7D126041
   3189 MnpAddFreeTxBuf: Add TxBufWrap 7D126698, TxBuf 7D1266C1
   3190 MnpAddFreeTxBuf: Add TxBufWrap 7D125018, TxBuf 7D125041
   3191 MnpAddFreeTxBuf: Add TxBufWrap 7D125698, TxBuf 7D1256C1
   3192 MnpAddFreeTxBuf: Add TxBufWrap 7D124018, TxBuf 7D124041
   3193 MnpAddFreeTxBuf: Add TxBufWrap 7D124698, TxBuf 7D1246C1
   3194 MnpAddFreeTxBuf: Add TxBufWrap 7D123018, TxBuf 7D123041
   3195 MnpAddFreeTxBuf: Add TxBufWrap 7D123698, TxBuf 7D1236C1
   3196 MnpAddFreeTxBuf: Add TxBufWrap 7D122018, TxBuf 7D122041
   3197 MnpAddFreeTxBuf: Add TxBufWrap 7D122698, TxBuf 7D1226C1

Here MnpDxe dumps the 32 buffers that it preallocates (as a pool) for
queueing to SNP.Transmit(). This is nothing special.

(2)

   3549 VirtioNetTransmit:150: Buffer=7D131041 BufferSize=46 TxCurPending=0 Crc32=0x5C6D78D6
   3550 VirtioNetTransmit:150: Buffer=7D1316C1 BufferSize=46 TxCurPending=1 Crc32=0x6171F323
   3551 VirtioNetTransmit:150: Buffer=7D130041 BufferSize=46 TxCurPending=2 Crc32=0x6C118117
   3552 VirtioNetTransmit:150: Buffer=7D1306C1 BufferSize=46 TxCurPending=3 Crc32=0x67A7FBE8
   3553 VirtioNetTransmit:150: Buffer=7D12F041 BufferSize=46 TxCurPending=4 Crc32=0x777099B8
   3554 VirtioNetTransmit:150: Buffer=7D12F6C1 BufferSize=46 TxCurPending=5 Crc32=0x6F9E1B3B
   3555 VirtioNetTransmit:150: Buffer=7D12E041 BufferSize=46 TxCurPending=6 Crc32=0x62FE690F
   3556 VirtioNetTransmit:150: Buffer=7D12E6C1 BufferSize=46 TxCurPending=7 Crc32=0x2A35BC94
   3557 VirtioNetTransmit:150: Buffer=7D12D041 BufferSize=46 TxCurPending=8 Crc32=0x9AE83CBC
   3558 VirtioNetTransmit:150: Buffer=7D12D6C1 BufferSize=46 TxCurPending=9 Crc32=0xEC10A5A5
   3559 VirtioNetTransmit:150: Buffer=7D12C041 BufferSize=46 TxCurPending=10 Crc32=0xE170D791
   3560 VirtioNetTransmit:150: Buffer=7D12C6C1 BufferSize=46 TxCurPending=11 Crc32=0xEAC6AD6E
   3561 VirtioNetTransmit:150: Buffer=7D12B041 BufferSize=46 TxCurPending=12 Crc32=0xFA11CF3E
   3562 VirtioNetTransmit:150: Buffer=7D12B6C1 BufferSize=46 TxCurPending=13 Crc32=0xE2FF4DBD
   3563 VirtioNetTransmit:150: Buffer=7D12A041 BufferSize=46 TxCurPending=14 Crc32=0xEF9F3F89
   3564 VirtioNetTransmit:150: Buffer=7D12A6C1 BufferSize=46 TxCurPending=15 Crc32=0xC5238786
   3565 VirtioNetTransmit:150: Buffer=7D129041 BufferSize=46 TxCurPending=16 Crc32=0xEE9AC51E
   3566 VirtioNetTransmit:150: Buffer=7D1296C1 BufferSize=46 TxCurPending=17 Crc32=0xD3864EEB
   3567 VirtioNetTransmit:150: Buffer=7D128041 BufferSize=46 TxCurPending=18 Crc32=0xDEE63CDF
   3568 VirtioNetTransmit:150: Buffer=7D1286C1 BufferSize=46 TxCurPending=19 Crc32=0xD5504620
   3569 VirtioNetTransmit:150: Buffer=7D127041 BufferSize=46 TxCurPending=20 Crc32=0xC5872470
   3570 VirtioNetTransmit:150: Buffer=7D1276C1 BufferSize=46 TxCurPending=21 Crc32=0xDD69A6F3
   3571 VirtioNetTransmit:150: Buffer=7D126041 BufferSize=46 TxCurPending=22 Crc32=0xD009D4C7
   3572 VirtioNetTransmit:150: Buffer=7D1266C1 BufferSize=46 TxCurPending=23 Crc32=0x8688FCE3
   3573 VirtioNetTransmit:150: Buffer=7D125041 BufferSize=46 TxCurPending=24 Crc32=0xDBEDFFEB
   3574 VirtioNetTransmit:150: Buffer=7D1256C1 BufferSize=46 TxCurPending=25 Crc32=0xCE3C0EDB
   3575 VirtioNetTransmit:150: Buffer=7D124041 BufferSize=46 TxCurPending=26 Crc32=0xC35C7CEF
   3576 VirtioNetTransmit:150: Buffer=7D1246C1 BufferSize=46 TxCurPending=27 Crc32=0xC8EA0610
   3577 VirtioNetTransmit:150: Buffer=7D123041 BufferSize=46 TxCurPending=28 Crc32=0xD83D6440
   3578 VirtioNetTransmit:150: Buffer=7D1236C1 BufferSize=46 TxCurPending=29 Crc32=0xC0D3E6C3
   3579 VirtioNetTransmit:150: Buffer=7D122041 BufferSize=46 TxCurPending=30 Crc32=0xCDB394F7
   3580 VirtioNetTransmit:150: Buffer=7D1226C1 BufferSize=46 TxCurPending=31 Crc32=0x9DB54FCF
   3581 VirtioNetGetStatus:175: (*TxBuf)=7D131041 TxCurPending=31
   3582 VirtioNetGetStatus:175: (*TxBuf)=7D1316C1 TxCurPending=30
   3583 VirtioNetGetStatus:175: (*TxBuf)=7D130041 TxCurPending=29
   3584 VirtioNetGetStatus:175: (*TxBuf)=7D1306C1 TxCurPending=28
   3585 VirtioNetGetStatus:175: (*TxBuf)=7D12F041 TxCurPending=27
   3586 VirtioNetGetStatus:175: (*TxBuf)=7D12F6C1 TxCurPending=26
   3587 VirtioNetGetStatus:175: (*TxBuf)=7D12E041 TxCurPending=25
   3588 VirtioNetGetStatus:175: (*TxBuf)=7D12E6C1 TxCurPending=24
   3589 VirtioNetGetStatus:175: (*TxBuf)=7D12D041 TxCurPending=23
   3590 VirtioNetGetStatus:175: (*TxBuf)=7D12D6C1 TxCurPending=22
   3591 VirtioNetGetStatus:175: (*TxBuf)=7D12C041 TxCurPending=21
   3592 VirtioNetGetStatus:175: (*TxBuf)=7D12C6C1 TxCurPending=20
   3593 VirtioNetGetStatus:175: (*TxBuf)=7D12B041 TxCurPending=19
   3594 VirtioNetGetStatus:175: (*TxBuf)=7D12B6C1 TxCurPending=18
   3595 VirtioNetGetStatus:175: (*TxBuf)=7D12A041 TxCurPending=17
   3596 VirtioNetGetStatus:175: (*TxBuf)=7D12A6C1 TxCurPending=16
   3597 VirtioNetGetStatus:175: (*TxBuf)=7D129041 TxCurPending=15
   3598 VirtioNetGetStatus:175: (*TxBuf)=7D1296C1 TxCurPending=14
   3599 VirtioNetGetStatus:175: (*TxBuf)=7D128041 TxCurPending=13
   3600 VirtioNetGetStatus:175: (*TxBuf)=7D1286C1 TxCurPending=12
   3601 VirtioNetGetStatus:175: (*TxBuf)=7D127041 TxCurPending=11
   3602 VirtioNetGetStatus:175: (*TxBuf)=7D1276C1 TxCurPending=10
   3603 VirtioNetGetStatus:175: (*TxBuf)=7D126041 TxCurPending=9
   3604 VirtioNetGetStatus:175: (*TxBuf)=7D1266C1 TxCurPending=8
   3605 VirtioNetGetStatus:175: (*TxBuf)=7D125041 TxCurPending=7
   3606 VirtioNetGetStatus:175: (*TxBuf)=7D1256C1 TxCurPending=6
   3607 VirtioNetGetStatus:175: (*TxBuf)=7D124041 TxCurPending=5
   3608 VirtioNetGetStatus:175: (*TxBuf)=7D1246C1 TxCurPending=4
   3609 VirtioNetGetStatus:175: (*TxBuf)=7D123041 TxCurPending=3
   3610 VirtioNetGetStatus:175: (*TxBuf)=7D1236C1 TxCurPending=2
   3611 VirtioNetGetStatus:175: (*TxBuf)=7D122041 TxCurPending=1
   3612 VirtioNetGetStatus:175: (*TxBuf)=7D1226C1 TxCurPending=0

This sequence is from my debug patches, and it repeats a good number of
times, while OVMF downloads "shimx64.efi", and "shimx64.efi" downloads
"grubx64.efi".

We can see that the buffers that MnpDxe pre-allocated get mass-queued for
TX. The outgoing buffer size is small (46 bytes -- likely TFTP ACKs). We
also see that VirtioNetDxe's internal queue depth grows, from 0 to 32 (the
log message contains the depth just before queueing).

Then we see that MnpDxe mass-recycles its buffers, and ultimately the
VirtioNetDxe queue goes empty (the length is printed after de-queueing).

(3a)

   5349 VirtioNetTransmit:150: Buffer=7D131041 BufferSize=46 TxCurPending=0 Crc32=0x17CF970C
   5350 VirtioNetTransmit:150: Buffer=7D1316C1 BufferSize=46 TxCurPending=1 Crc32=0x14746D0A
   5351 VirtioNetTransmit:150: Buffer=7D130041 BufferSize=46 TxCurPending=2 Crc32=0x27B36ECD
   5352 VirtioNetTransmit:150: Buffer=7D1306C1 BufferSize=46 TxCurPending=3 Crc32=0x240894CB
   5353 VirtioNetTransmit:150: Buffer=7D12F041 BufferSize=46 TxCurPending=4 Crc32=0xA5E0012F
   5354 VirtioNetTransmit:150: Buffer=7D12F6C1 BufferSize=46 TxCurPending=5 Crc32=0xA65BFB29
   5355 VirtioNetTransmit:150: Buffer=7D12E041 BufferSize=46 TxCurPending=6 Crc32=0xB06EF198
   5356 VirtioNetTransmit:150: Buffer=7D12E6C1 BufferSize=46 TxCurPending=7 Crc32=0xB3D50B9E
   5357 PixelBlueGreenRedReserved8BitPerColor
   5358 VirtioNetTransmit:150: Buffer=7AEC14A0 BufferSize=42 TxCurPending=8 Crc32=0x4040269D
   5359 VirtioNetGetStatus:175: (*TxBuf)=7D131041 TxCurPending=8
   5360 VirtioNetTransmit:150: Buffer=7AEC14A0 BufferSize=81 TxCurPending=8 Crc32=0xC424431C
   5361 ASSERT /home/lacos/src/upstream/edk2/OvmfPkg/VirtioNetDxe/SnpSharedHelpers.c(184): ((BOOLEAN)(0==1))

This is the problem site. MnpDxe starts re-queueing its TX packets to SNP
(lines 5349 through 5356). It queues 8 packets.

On line 5357, we see a (very obscure) message from OVMF that indicates that
grub2 performed a GOP call.

On line 5358, we see grub2 queue its very first packet, at address 7AEC14A0,
with size 42, and  CRC32 0x4040269D. (I used CRC32 as a "poor man's
fingerprint" -- CRC32 has zero cryptographic strength, so it is not suitable
against malicious actors, but it was good enough for now.)

On line 5359, we see grub2 recycle a buffer (consume a TX completion
notification). Unfortunately, address 7D131041 belongs to MnpDxe, not grub2.
However, because [a] grub2 only ever queues 1 packet, *and* because [b] it
intends to open SNP in exclusive mode, it (correctly) assumes that whatever
non-NULL buffer it sees recycled, must be its own.

Therefore on line 5360, grub2 attempts to queue a *different* packet
(different size, different CRC32), which is however held at the same
location. That's where the UEFI spec is violated, because, until
VirtioNetGetStatus() hands address 7AEC14A0 back to grub2, grub2 has no
business touching the data at 7AEC14A0 (or queueing 7AEC14A0 again).

So why does this happen? Because the exclusive open [b] is not performed. I
don't know why *that* is not performed -- that's the subject of this BZ.

*** Expected results:

For comparison, consider the logs produced while PXE booting with the
"regressed set of components". Sections (1) and (2) are the same. However,
the log transitions to grub2 differently:

(3b)

   4703 VirtioNetTransmit:150: Buffer=7D131041 BufferSize=46 TxCurPending=0 Crc32=0x7AB0D275
   4704 VirtioNetTransmit:150: Buffer=7D1316C1 BufferSize=46 TxCurPending=1 Crc32=0xE314A243
   4705 VirtioNetTransmit:150: Buffer=7D130041 BufferSize=46 TxCurPending=2 Crc32=0xDE049BB6
   4706 VirtioNetTransmit:150: Buffer=7D1306C1 BufferSize=46 TxCurPending=3 Crc32=0xD3685B82
   4707 VirtioNetTransmit:150: Buffer=7D12F041 BufferSize=46 TxCurPending=4 Crc32=0xA1D92DC
   4708 VirtioNetTransmit:150: Buffer=7D12F6C1 BufferSize=46 TxCurPending=5 Crc32=0x513B3460
   4709 VirtioNetTransmit:150: Buffer=7D12E041 BufferSize=46 TxCurPending=6 Crc32=0x6C2B0D95
   4710 VirtioNetTransmit:150: Buffer=7D12E6C1 BufferSize=46 TxCurPending=7 Crc32=0x44B5C4D7
   4711 VirtioNetTransmit:150: Buffer=7D12D041 BufferSize=46 TxCurPending=8 Crc32=0x23608473
   4712 VirtioNetTransmit:150: Buffer=7D12D6C1 BufferSize=46 TxCurPending=9 Crc32=0x585909BA
   4713 VirtioNetTransmit:150: Buffer=7D12C041 BufferSize=46 TxCurPending=10 Crc32=0x6549304F
   4714 VirtioNetTransmit:150: Buffer=7D12C6C1 BufferSize=46 TxCurPending=11 Crc32=0x23C1E297
   4715 VirtioNetTransmit:150: Buffer=7D12B041 BufferSize=46 TxCurPending=12 Crc32=0xA2477380
   4716 VirtioNetTransmit:150: Buffer=7D12B6C1 BufferSize=46 TxCurPending=13 Crc32=0xF961D53C
   4717 VirtioNetTransmit:150: Buffer=7D12A041 BufferSize=46 TxCurPending=14 Crc32=0xC471ECC9
   4718 VirtioNetTransmit:150: Buffer=7D12A6C1 BufferSize=46 TxCurPending=15 Crc32=0xECEF258B
   4719 VirtioNetTransmit:150: Buffer=7D129041 BufferSize=46 TxCurPending=16 Crc32=0x97C6BB50
   4720 VirtioNetTransmit:150: Buffer=7D1296C1 BufferSize=46 TxCurPending=17 Crc32=0xE62CB66
   4721 VirtioNetTransmit:150: Buffer=7D128041 BufferSize=46 TxCurPending=18 Crc32=0x3372F293
   4722 VirtioNetTransmit:150: Buffer=7D1286C1 BufferSize=46 TxCurPending=19 Crc32=0x3E1E32A7
   4723 VirtioNetTransmit:150: Buffer=7D127041 BufferSize=46 TxCurPending=20 Crc32=0x9DD198CE
   4724 VirtioNetTransmit:150: Buffer=7D1276C1 BufferSize=46 TxCurPending=21 Crc32=0xC6F73E72
   4725 VirtioNetTransmit:150: Buffer=7D126041 BufferSize=46 TxCurPending=22 Crc32=0xFBE70787
   4726 VirtioNetTransmit:150: Buffer=7D1266C1 BufferSize=46 TxCurPending=23 Crc32=0xD379CEC5
   4727 VirtioNetTransmit:150: Buffer=7D125041 BufferSize=46 TxCurPending=24 Crc32=0xDE8C118E
   4728 VirtioNetTransmit:150: Buffer=7D1256C1 BufferSize=46 TxCurPending=25 Crc32=0x195CCAED
   4729 VirtioNetTransmit:150: Buffer=7D124041 BufferSize=46 TxCurPending=26 Crc32=0x244CF318
   4730 VirtioNetTransmit:150: Buffer=7D1246C1 BufferSize=46 TxCurPending=27 Crc32=0xF50C0418
   4731 VirtioNetTransmit:150: Buffer=7D123041 BufferSize=46 TxCurPending=28 Crc32=0x748A950F
   4732 VirtioNetTransmit:150: Buffer=7D1236C1 BufferSize=46 TxCurPending=29 Crc32=0xF668DCC0
   4733 PixelBlueGreenRedReserved8BitPerColor
   4734 MnpDriverBindingStop:328
   4735 VirtioNetGetStatus:175: (*TxBuf)=7D131041 TxCurPending=29
   4736 VirtioNetGetStatus:175: (*TxBuf)=7D1316C1 TxCurPending=28
   4737 VirtioNetGetStatus:175: (*TxBuf)=7D130041 TxCurPending=27
   4738 VirtioNetGetStatus:175: (*TxBuf)=7D1306C1 TxCurPending=26
   4739 VirtioNetGetStatus:175: (*TxBuf)=7D12F041 TxCurPending=25
   4740 VirtioNetGetStatus:175: (*TxBuf)=7D12F6C1 TxCurPending=24
   4741 VirtioNetGetStatus:175: (*TxBuf)=7D12E041 TxCurPending=23
   4742 VirtioNetGetStatus:175: (*TxBuf)=7D12E6C1 TxCurPending=22
   4743 VirtioNetGetStatus:175: (*TxBuf)=7D12D041 TxCurPending=21
   4744 VirtioNetGetStatus:175: (*TxBuf)=7D12D6C1 TxCurPending=20
   4745 VirtioNetGetStatus:175: (*TxBuf)=7D12C041 TxCurPending=19
   4746 VirtioNetGetStatus:175: (*TxBuf)=7D12C6C1 TxCurPending=18
   4747 VirtioNetGetStatus:175: (*TxBuf)=7D12B041 TxCurPending=17
   4748 VirtioNetGetStatus:175: (*TxBuf)=7D12B6C1 TxCurPending=16
   4749 VirtioNetGetStatus:175: (*TxBuf)=7D12A041 TxCurPending=15
   4750 VirtioNetGetStatus:175: (*TxBuf)=7D12A6C1 TxCurPending=14
   4751 VirtioNetGetStatus:175: (*TxBuf)=7D129041 TxCurPending=13
   4752 VirtioNetGetStatus:175: (*TxBuf)=7D1296C1 TxCurPending=12
   4753 VirtioNetGetStatus:175: (*TxBuf)=7D128041 TxCurPending=11
   4754 VirtioNetGetStatus:175: (*TxBuf)=7D1286C1 TxCurPending=10
   4755 VirtioNetGetStatus:175: (*TxBuf)=7D127041 TxCurPending=9
   4756 VirtioNetGetStatus:175: (*TxBuf)=7D1276C1 TxCurPending=8
   4757 VirtioNetGetStatus:175: (*TxBuf)=7D126041 TxCurPending=7
   4758 VirtioNetGetStatus:175: (*TxBuf)=7D1266C1 TxCurPending=6
   4759 VirtioNetGetStatus:175: (*TxBuf)=7D125041 TxCurPending=5
   4760 VirtioNetGetStatus:175: (*TxBuf)=7D1256C1 TxCurPending=4
   4761 VirtioNetGetStatus:175: (*TxBuf)=7D124041 TxCurPending=3
   4762 VirtioNetGetStatus:175: (*TxBuf)=7D1246C1 TxCurPending=2
   4763 VirtioNetGetStatus:175: (*TxBuf)=7D123041 TxCurPending=1
   4764 VirtioNetGetStatus:175: (*TxBuf)=7D1236C1 TxCurPending=0
   4765 VirtioNetShutdown:67
   4766 VirtioNetStop:53
   4767 MnpDriverBindingStop:328
   4768 VirtioNetStart:52
   4769 VirtioNetInitialize:523
   4770 VirtioNetTransmit:150: Buffer=7AF4E4A0 BufferSize=42 TxCurPending=0 Crc32=0x4040269D
   4771 VirtioNetGetStatus:175: (*TxBuf)=7AF4E4A0 TxCurPending=0
   4772 VirtioNetTransmit:150: Buffer=7AF4E4A0 BufferSize=102 TxCurPending=0 Crc32=0x3AE76CAB
   4773 VirtioNetGetStatus:175: (*TxBuf)=7AF4E4A0 TxCurPending=0
   4774 VirtioNetTransmit:150: Buffer=7AF4E4A0 BufferSize=90 TxCurPending=0 Crc32=0x64148BAF
   4775 VirtioNetGetStatus:175: (*TxBuf)=7AF4E4A0 TxCurPending=0
   4776 VirtioNetTransmit:150: Buffer=7AF4E4A0 BufferSize=89 TxCurPending=0 Crc32=0xA18ECC7
   4777 VirtioNetGetStatus:175: (*TxBuf)=7AF4E4A0 TxCurPending=0

On lines 4703-4732, we again see MnpDxe re-queueing its TX packets to SNP.
(Now it manages to queue 30 packets, before grub2 takes over, but that
doesn't really make any difference.)

On line 4733, we see grub2 make the same GOP call.

The crucial difference is on line 4734. We actually see
MnpDriverBindingStop() being called! The same debug patch was obviously
applied to OVMF during the other test too (I listed OVMF under "neutral
components"), but there we don't see the message. Therefore, using RHEL7
grub2, the exclusive open does have the intended effect.

On lines 4735-4764, MnpDxe cleans up (recycles) all the packets it
previously queued for TX (all 30 of them).

On lines 4765-4766, we see MnpDxe shut down, and stop, SNP. Also as part of
disconnecting.

On line 4767, the DisconnectController() boot service calls
MnpDriverBindingStop() again. This most likely corresponds to the two modes
that are possible when calling Stop() against a UEFI bus driver: destroy
child handles, vs. disconnect the driver from the parent handle.

On lines 4768-4769, we see grub2 correctly starting and initializing SNP.

On lines 4770-4777, we see that grub2 correctly uses the same buffer address
for queueing packets for TX -- in strict succession.

*** Summary

In RHEL-7 grub2, the exclusive open of the SNP interface works as expected;
MNP is disconnected, the internal SNP TX queue is cleared, and SNP is
generally power-cycled. Grub2's assumption of "owning" the SNP interface is
valid, and all packets ever pending TX are owned by grub2. (And there is at
most one packet pending TX at any time.)

In RHEL-8 grub2, the exclusive open does not seem to occur, MNP remains
connected, and grub2 mistakes MNP's buffers, when they are recycled, for its
own.

Comment 1 Laszlo Ersek 2018-11-28 16:28:46 UTC
Below I'll attach the three debug patches that I used on top of edk2 @ 18a700945f20.

Comment 2 Laszlo Ersek 2018-11-28 16:29:47 UTC
Created attachment 1509567 [details]
edk2 debug patch 1/3

Comment 3 Laszlo Ersek 2018-11-28 16:30:27 UTC
Created attachment 1509568 [details]
edk2 debug patch 2/3

Comment 4 Laszlo Ersek 2018-11-28 16:30:56 UTC
Created attachment 1509569 [details]
edk2 debug patch 3/3

Comment 5 Laszlo Ersek 2018-11-28 16:52:08 UTC
Sigh. In comment 0, under "Actual results" (i.e., where I described the failure symptom), I used the term "working set of components":

> *** Actual results:
> 
> The ultimate result is that grub2 trips an ASSERT() in OVMF's built-in
> virtio-net-pci device driver (VirtioNetDxe), which implements SNP:
> 
> > ASSERT .../OvmfPkg/VirtioNetDxe/SnpSharedHelpers.c(184): ((BOOLEAN)(0==1))
> 
> [...] The logs
> were captured while running the "working set of components".

And, under "expected results" (i.e., where I described the correct behavior), I used the term "regressed set of components":

> *** Expected results:
> 
> For comparison, consider the logs produced while PXE booting with the
> "regressed set of components". [...]

Obviously, those should be switched over. The actual results were collected from the regressed set of components (RHEL8 grub2), and the expected results were collected from the working set of components (RHEL7 grub2).

Sorry!

Comment 6 Laszlo Ersek 2018-11-29 00:01:57 UTC
* I vaguely suspect that this issue is caused by the following patch, on the rhel-8.0 branch in the rhboot grub2 repo:

- d59a4e119022 Be more aggro about actually using the *configured* network device.

* Considering the rhel-7.* branches, the patch first appeared on rhel-7.2; for BZ#1257475:

- 46d5b1ccce24 Be more aggro about actually using the *configured* network device.

* Later it was reverted (in 7.2.z); for BZ#1348520:

- 5c7efabea8eb Revert "Be more aggro about actually using the *configured* network device."

The commit message said, 'This patch appears not to be needed once the exclusive opens are "correct", and breaks some platforms.'.

* Interestingly, the patch is present on the rhel-7.3 branch (without a revert):

- 46d5b1ccce24 Be more aggro about actually using the *configured* network device.

and not present on rhel-7.4 through rhel-7.6.

So... I'm not quite sure why rhel-8.0 carries it again.

Comment 7 Javier Martinez Canillas 2018-11-29 08:24:39 UTC
(In reply to Laszlo Ersek from comment #6)
> * I vaguely suspect that this issue is caused by the following patch, on the
> rhel-8.0 branch in the rhboot grub2 repo:
> 
> - d59a4e119022 Be more aggro about actually using the *configured* network
> device.
> 

I was also looking at the differences between the rhel-7.6 and rhel-8.0 branches yesterday and came to the same conclusion.

I'm not that familiar with this part of the code but d59a4e119022 changes the logic introduced in upstream commit:

- 52ae40570c efinet: skip virtual IPv4 and IPv6 devices when enumerating cards

That has the following paragraph in its commit message:

    EDK2 PXE driver creates two child devices - IPv4 and IPv6 - with
    bound SNP instance. This means we get three cards for every physical
    adapter when enumerating. Not only is this confusing, this may result
    in grub ignoring packets that come in via the "wrong" card.

Comment 8 Laszlo Ersek 2018-11-29 08:52:26 UTC
(

meta comment:

(In reply to Javier Martinez Canillas from comment #7)

> [...] d59a4e119022 changes the logic introduced in upstream commit:
> 
> - 52ae40570c efinet: skip virtual IPv4 and IPv6 devices when enumerating
> cards

The above commit hash has a typo: the first nibble ("c") was lost, presumably while cutting & pasting. The commit hash is c52ae40570c.

)

Comment 9 Laszlo Ersek 2018-11-29 09:05:41 UTC
Javier, I think you are on to something with this reference.

I think upstream c52ae40570c is correct (picked to rhel-7.6 as 0faa80b06bc1, and to rhel-8.0 as c52ae40570c3).

(

As a side comment, I might have implemented c52ae40570c differently: the UEFI boot service LocateDevicePath() is exactly for this purpose (assuming I understand the purpose of the commit correctly). Namely, it walks the device path from the starting node node-wise, until the last handle on which the current device path prefix is installed also carries the searched-for protocol instance. This means that given a device path, the longest prefix whose handle carries SNP, can be found with LocateDevicePath(). The UEFI spec says, "This function is useful for locating the proper instance of a protocol interface to use from
a logical parent device driver."

)

If rhel-8.0 commit d59a4e119022 ("Be more aggro about actually using the *configured* network device.", 2018-11-27) interferes with this logic, then the exclusive open might be directed at the IPv4() or IPv6() child nodes. That wouldn't give grub2 exclusive access to the parent MAC() node (whose handle carries the SNP interface).

I'll try to scratch-build a RHEL8 grub2 package with d59a4e119022 reverted. Thanks!

Comment 17 Laszlo Ersek 2018-11-29 11:06:14 UTC
* Reproduced the issue with 2.02-61.el8.

* Using "grub2-efi-x64-2.02-62.el8.x86_64.rpm" from Javier's scratch build
  (taskid 19313944, grubx64.efi sha256sum
  64fff2e5725231d1d3ee99104f014ef1adde2dcd3920b792e66ebd4807c6572a), the PXE
  boot completes fine, end-to-end. The PXE-booted installer (Anaconda) comes
  up alright in the PXE client guest.

* I repeated the successful test both with and without the edk2 debug
  patches. The debug log looks as expected too (see (3b) in comment 0).

* As an added bonus (with MnpDxe and grub2 no longer competing), the PXE
  boot is as fast again as it used to be with RHEL-7. :) In my virtualized,
  single-host environment, it took 1 second to download "vmlinuz" (7.6MB)
  and 6 seconds to download "initrd.img" (58M). (Intervals computed from the
  "in.tftpd" timestamps in "/var/log/messages" on the PXE server VM.)

Thank you Javier!

Comment 18 Javier Martinez Canillas 2018-11-29 17:05:35 UTC
(In reply to Laszlo Ersek from comment #17)
> * Reproduced the issue with 2.02-61.el8.
> 
> * Using "grub2-efi-x64-2.02-62.el8.x86_64.rpm" from Javier's scratch build
>   (taskid 19313944, grubx64.efi sha256sum
>   64fff2e5725231d1d3ee99104f014ef1adde2dcd3920b792e66ebd4807c6572a), the PXE
>   boot completes fine, end-to-end. The PXE-booted installer (Anaconda) comes
>   up alright in the PXE client guest.
> 
> * I repeated the successful test both with and without the edk2 debug
>   patches. The debug log looks as expected too (see (3b) in comment 0).
> 
> * As an added bonus (with MnpDxe and grub2 no longer competing), the PXE
>   boot is as fast again as it used to be with RHEL-7. :) In my virtualized,
>   single-host environment, it took 1 second to download "vmlinuz" (7.6MB)
>   and 6 seconds to download "initrd.img" (58M). (Intervals computed from the
>   "in.tftpd" timestamps in "/var/log/messages" on the PXE server VM.)
> 
> Thank you Javier!

Thanks to you for reporting the problem and the test!

Comment 20 Michael 2018-12-05 01:22:00 UTC
Hi all:

KVM QE has been re-test BZ 1653140. The PXE net boot working well and marked that BZ as verified. Maybe this information is good to know in here. 

Thanks all.

Comment 21 Pavel Holica 2018-12-05 05:49:39 UTC
Thank you Michael, I'll hopefully try to verify this bug by end of this week.

Comment 22 Pavel Holica 2018-12-05 13:04:44 UTC
Hm, I'm now struggeling with the TFTP boot, I'm able to get the shim, but it doesn't seem to even try downloading the next file (using tcpdump on server side).

On the client I get:
>>Start PXE over IPv4.
  Station IP address is 192.168.100.100

  Server IP address is 192.168.100.2
  NBP filename is shimx64.efi
  NBP filesize is 1202952 Bytes
 Downloading NBP file...

  NBP file downloaded successfully.
Fetching Netboot Image
Unable to fetch TFTP image: Invalid Parameter
start_image() returned Invalid Parameter


On the host, I have:
# rpm -q ipxe-roms-qemu edk2-ovmf qemu-kvm kernel
ipxe-roms-qemu-20170710-6.git0600d3ae.el8.noarch
edk2-ovmf-20180508gitee3198e672e2-8.el8.noarch
qemu-kvm-2.12.0-44.module+el8+2259+6d80f0a6.x86_64
kernel-4.18.0-48.el8.x86_64

On the tftp server I'm using content from following packages:
shim-x64-15-4.x86_64.rpm
grub2-efi-x64-2.02-62.el8.x86_64.rpm

The client is run by libvirt with no special params set by me (except --boot uefi passed to virt-install). I'm using virtio NIC.

I can see that bug 1653140 is in verified but it doesn't have "fixed in version" and I didn't find in which build is the problem of that bug fixed.

Comment 23 Laszlo Ersek 2018-12-05 13:57:46 UTC
*** Bug 1653140 has been marked as a duplicate of this bug. ***

Comment 24 Laszlo Ersek 2018-12-05 13:58:19 UTC
Hi Pavel,

replying out of order: you are making a good point about bug 1653140. In fact it should not be marked as VERIFIED, given that the component the BZ was reported for (namely, ipxe-roms-qemu) didn't need a fix. Therefore bug 1653140 should either be marked as NOTABUG, or closed as a duplicate of this bug. In general, first we like to keep such bugs open, and just mark them with the TestOnly keyword, until the (suspected) actual component is fixed (which is tracked by a different RHBZ). Given that virt-QE's testing is now complete, I'm going to set the component on that BZ to grub2 as well, and I'll close it as a duplicate now.

Comment 25 Laszlo Ersek 2018-12-05 14:03:29 UTC
Second, please check your TFTP server messages, and the SELinux context on /var/lib/tftpboot files.

It's possible that you extracted the right grubx64.efi binary (i.e., from the "grub2-efi-x64-2.02-62.el8.x86_64.rpm" package), but you forgot to set the file mode bits, and/or ownership, and/or SELinux context on it correctly. I suggest running "restorecon -FvvR /var/lib/tftpboot".

Comment 26 Laszlo Ersek 2018-12-05 14:05:34 UTC
Third, if you must use a RHEL8 virtualization host for testing (implied by your host kernel Version-Release 4.18.0-48.el8), then please use 4.18.0-49.el8 instead. The latter contains an important KVM bugfix series, for bug 1636610. Thanks!

Comment 27 Michael 2018-12-06 00:50:04 UTC
(In reply to Pavel Holica from comment #22)
> Hm, I'm now struggeling with the TFTP boot, I'm able to get the shim, but it
> doesn't seem to even try downloading the next file (using tcpdump on server
> side).
> 
> On the client I get:
> >>Start PXE over IPv4.
>   Station IP address is 192.168.100.100
> 
>   Server IP address is 192.168.100.2
>   NBP filename is shimx64.efi
>   NBP filesize is 1202952 Bytes
>  Downloading NBP file...
> 
>   NBP file downloaded successfully.
> Fetching Netboot Image
> Unable to fetch TFTP image: Invalid Parameter
> start_image() returned Invalid Parameter
> 
> 
> On the host, I have:
> # rpm -q ipxe-roms-qemu edk2-ovmf qemu-kvm kernel
> ipxe-roms-qemu-20170710-6.git0600d3ae.el8.noarch
> edk2-ovmf-20180508gitee3198e672e2-8.el8.noarch
> qemu-kvm-2.12.0-44.module+el8+2259+6d80f0a6.x86_64
> kernel-4.18.0-48.el8.x86_64
> 
> On the tftp server I'm using content from following packages:
> shim-x64-15-4.x86_64.rpm
> grub2-efi-x64-2.02-62.el8.x86_64.rpm
> 
> The client is run by libvirt with no special params set by me (except --boot
> uefi passed to virt-install). I'm using virtio NIC.
> 
> I can see that bug 1653140 is in verified but it doesn't have "fixed in
> version" and I didn't find in which build is the problem of that bug fixed.

Hi pavel : 

Maybe. file "grubx64.efi" needs to change the permission. 

ll /var/lib/tftpboot/
... ...
-rwxrwxrwx  1 root root 1735096 Nov  8 13:33 grubx64.efi
... ...

Hope this can give some help. Thanks.

Comment 28 Pavel Holica 2018-12-06 06:21:07 UTC
Thank you about the explanation about the bug, it just confused me since I'm not used to the TestOnly.

About the acces rights, I've made sure they are following:
-rwxr--r--. 1 root root unconfined_u:object_r:tftpdir_rw_t:s0 1734072 Dec  5 07:15 grubx64.efi
-rwxr--r--. 1 root root unconfined_u:object_r:tftpdir_rw_t:s0 1202952 Dec  5 07:14 shimx64.efi

I've made sure that I'm able to download the files using tftp $server_ip -c get grubx64.efi so there wasn't problem with the access rights, the shim just didn't even tried downloading.

Nevermind I've just found the issue and I'll probably open new bugs on shim and grub. I've created dhcpd.conf from scratch and forgot to add next-server. Firmware correctly downloaded the boot file from the IP address of the DHCP server but then neither shim nor grub used that IP for downloading of other files.

Comment 29 Pavel Holica 2018-12-06 06:39:35 UTC
Great, so I can finally move this bug to verified:

I've reproduced the issue on configuration from comment 22 when used grub2-2.02-61.el8 (version prior to the fix):
error: ../../grub-core/net/net.c:1795:timeout reading
`/images/pxeboot/initrd.img'.

Press any key to continue...


When I used grub2-2.02-62.el8 the netboot worked as expected and grub was able to successfully download files without problem.

Moving to VERIFIED.

Thank you all for you effort on this!

Comment 30 Pavel Holica 2018-12-06 06:55:46 UTC
And here are the promised next-server bugs:
bug 1656708
bug 1656710