Bug 2180046

Summary: virt-install with --boot uefi --pxe fails to get a DHCP address, no Request or Ack stage
Product: [Fedora] Fedora
Reporter: ykuksenko
Component: edk2
Assignee: Paolo Bonzini <pbonzini>
Status: NEW
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 37
CC: berrange, crobinso, fdeutsch, fhirtz, kraxel, lersek, pbonzini, philmd, phoracek, virt-maint
Type: Bug

Description ykuksenko 2023-03-20 15:45:17 UTC
Description of problem:
I am trying to deploy a VM over PXE using virt-install, but when I use UEFI with PXE, DHCP never completes. Turning off UEFI with the same settings makes the process work immediately. The host runs the DHCP server (dnsmasq).
Using tcpdump on the newly created interface of the VM, I see that only the Discover and Offer stages occur, repeatedly, but never the Request or Ack stages. I am using an e1000 NIC to avoid the issue of DHCP/UDP packets with bad checksums. The same thing happens if I use a virtio NIC.

Version-Release number of selected component (if applicable):
edk2-ovmf-20221117gitfff6d81270b5-14.fc37

How reproducible:
always

Steps to Reproduce:
1. `sudo dnf install libvirt virt-install`
2. `sudo virt-install --boot uefi,bios.useserial=on --pxe --network network=default,model=e1000,target.dev=fedora37 --os-variant fedora37 --name fedora37 --graphics none`
3. `sudo tcpdump -i fedora37 -v` in another terminal to see DHCP packets

Actual results:
System says 'PXE-E16: No valid offer received' and does not try to boot.

Expected results:
System should get an IP and attempt to boot.

Additional info:
- Removing `uefi,` from the above command immediately fixes the issue, i.e.:
`sudo virt-install --boot bios.useserial=on --pxe --network network=default,model=e1000,target.dev=fedora37 --os-variant fedora37 --name fedora37 --graphics none`

Comment 1 Fabian Deutsch 2023-08-22 18:53:26 UTC
We seem to see this with UEFI VM on KubeVirt as well.

Paolo, who could help here?

Comment 2 Gerd Hoffmann 2023-08-23 08:32:58 UTC
What does your libvirt network configuration look like (i.e. 'virsh net-dumpxml default')?

Comment 4 Petr Horáček 2023-08-24 13:24:10 UTC
Following up on Fabian's comment, I attached the dom.xml that is problematic in our case. We are not using libvirt's network API. I can't speak for ykuksenko.

Comment 14 Laszlo Ersek 2023-08-26 17:01:13 UTC
(In reply to ykuksenko from comment #0)
> Description of problem:
> I am trying to deploy a VM over PXE using virt-install but when I use
> UEFI with PXE, DHCP never completes. Turning off UEFI, with the same
> settings makes the process work immediately. The host contains the
> DHCP server (dnsmasq).

> 2. `sudo virt-install --boot uefi,bios.useserial=on --pxe --network
> network=default,model=e1000,target.dev=fedora37 --os-variant fedora37
> --name fedora37 --graphics none`

Can you confirm your "default" network looks something like this (see
also Gerd's comment 2):

# virsh net-dumpxml --inactive default

<network>
  <name>default</name>
  <uuid>c71c33d9-96dc-4873-860c-ab525ffc72ca</uuid>
  <forward mode='nat'/>
  <bridge name='virbr0' stp='on' delay='0'/>
  <mac address='52:54:00:50:a8:98'/>
  <ip address='192.168.122.1' netmask='255.255.255.0'>
    <tftp root='/var/lib/tftpboot'/>
    <dhcp>
      <range start='192.168.122.3' end='192.168.122.254'/>
      <bootp file='shim.efi'/>
    </dhcp>
  </ip>
</network>

(Because this config works fine on my end.)

> I am using tcpdump on the newly created interface of the VM to see
> that only Discover, and Offer stages occur, repeatedly but not the
> Request or Ack stages. I am using an e1000 NIC to avoid the DHCP/UDP
> packets with bad checksums issue. The same thing happens if I use
> virtio NIC.

> 3. `sudo tcpdump -i fedora37 -v` in another terminal to see DHCP
> packets

- Can you attach your captured packets?

- Any particular reason for sniffing the fedora37 interface (from
"target.dev=fedora37" on the virt-install cmdline) rather than "virbr0"?

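A capture that can be attached to the bug could be produced roughly as follows. This is a minimal sketch: the helper name `capture_dhcp` and the default output file name are illustrative only, and the filter assumes plain DHCP on UDP ports 67 and 68. Capturing on "virbr0" as well as on the tap device would show whether the Offer looks the same on both sides of the bridge.

```shell
# Hypothetical helper (not part of any tool): save DHCP traffic seen on
# an interface to a pcap file so it can be attached to the report.
#   $1 = interface to sniff (defaults to virbr0)
#   $2 = output file (defaults to dhcp-<interface>.pcap)
capture_dhcp() {
    iface="${1:-virbr0}"
    out="${2:-dhcp-${iface}.pcap}"
    # -w writes raw packets to the file instead of printing them.
    sudo tcpdump -i "$iface" -w "$out" 'udp port 67 or udp port 68'
}
```

Usage would be `capture_dhcp virbr0` in one terminal and `capture_dhcp fedora37` in another, each stopped with Ctrl-C after the guest prints PXE-E16. A saved file can be inspected later with `tcpdump -r dhcp-virbr0.pcap -n -e -vv`.
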
> Additional info:
> - removing `uefi,` from the above command immediately fixes the issue.

The problem with a libvirt-managed dnsmasq for PXE boot is that libvirt
doesn't let us customize the bootp/@file attribute depending on the PXE
client architecture. Meaning you can't specify "shim.efi" for UEFI
guests, vs. "pxelinux.0" for BIOS guests.

dnsmasq itself is capable of such a distinction (I forget the exact
syntax, but a few years ago I had worked it out -- it was difficult),
but libvirt doesn't expose it, AFAIK.
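
For a standalone dnsmasq (not the instance that libvirt generates and manages), the per-architecture distinction could look roughly like the sketch below. Several things here are assumptions rather than facts from this report: the helper name `pxe_arch_conf` and the tag name `efi-x86_64` are made up for illustration, and the match value 7 assumes the x86-64 UEFI firmware reports client architecture 7 in DHCP option 93, which is common but should be checked against the captured Discover packets.

```shell
# Hypothetical helper: print a dnsmasq config fragment that picks the
# boot file based on the PXE client architecture (DHCP option 93).
pxe_arch_conf() {
    cat <<'EOF'
# Tag clients that report architecture 7 (assumed: x86-64 UEFI).
dhcp-match=set:efi-x86_64,option:client-arch,7
# Tagged (UEFI) clients are offered shim.efi.
dhcp-boot=tag:efi-x86_64,shim.efi
# Clients without the tag (BIOS) are offered pxelinux.0.
dhcp-boot=tag:!efi-x86_64,pxelinux.0
EOF
}
```

The output would be written to a file read by that dnsmasq instance, for example via `pxe_arch_conf | sudo tee /etc/dnsmasq.d/pxe-arch.conf`, where the path depends on how dnsmasq is configured on the host.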