Bug 507391 - qemu-kvm PXE boot with e1000 results in bogus packets
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: etherboot
Version: 11
Hardware: All Linux
Priority: high, Severity: medium
Assigned To: Mark McLoughlin
QA Contact: Fedora Extras Quality Assurance
Duplicates: 494541
Depends On:
Blocks: F11VirtTarget 494832
Reported: 2009-06-22 12:03 EDT by Gilboa Davara
Modified: 2009-07-02 01:41 EDT

Fixed In Version: 5.4.4-16.fc11
Doc Type: Bug Fix
Last Closed: 2009-07-02 01:41:58 EDT
Attachments
DSL VM configuration (487 bytes, text/plain)
2009-06-22 12:03 EDT, Gilboa Davara
Private bridge configuration. (Bridge running in promisc mode) (251 bytes, text/plain)
2009-06-22 12:04 EDT, Gilboa Davara
tap42 wireshark recording. (1.21 KB, application/octet-stream)
2009-06-22 12:05 EDT, Gilboa Davara

Description Gilboa Davara 2009-06-22 12:03:07 EDT
Created attachment 348934
DSL VM configuration

Description of problem:
I've upgraded my first KVM host to F11.
I'm trying to PXE-boot DSL (Damn Small Linux).
This test works just fine under F9 and F10.

Version-Release number of selected component (if applicable):
qemu-0.10.4-4.fc11.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Set up a private bridge. (Configuration attached.)
2. Set up an empty qemu VM. (Configuration attached.)
3. Boot.

Actual results:
Client fails to receive an IP address. Host sees invalid packets. (pcap attached)

Expected results:
Client receives an IP address and boots.
Comment 1 Gilboa Davara 2009-06-22 12:04:02 EDT
Created attachment 348935
Private bridge configuration. (Bridge running in promisc mode)
Comment 2 Gilboa Davara 2009-06-22 12:05:37 EDT
Created attachment 348936
tap42 wireshark recording.
Comment 3 Gilboa Davara 2009-06-22 12:31:18 EDT
P.S. dhcp works just fine, once the OS actually boots.
Comment 4 Mark McLoughlin 2009-06-22 12:49:22 EDT
What version of etherboot is this? Does etherboot-5.4.4-15.fc11 help?

  https://admin.fedoraproject.org/updates/etherboot-5.4.4-15.fc11

I doubt it - those frames are pretty messed up. Does it work with e.g. rtl8139, virtio, ne2k_pci or pcnet?
Comment 5 Gilboa Davara 2009-06-22 13:56:57 EDT
Works just fine with rtl8139 with etherboot-5.4.4-13.
I'm still getting trashed 0xff frames with etherboot-5.4.4-15.

- Gilboa
Comment 6 Mark McLoughlin 2009-06-23 07:40:12 EDT
Okay, so the packet dump shows the type field in the ethernet header is (incorrectly) zero.

Enabling debugging in etherboot-5.4.4/drivers/net/e1000.c made the problem go away, which was the first clue.

The code is as follows:

    struct eth_hdr {
        unsigned char dst_addr[ETH_ALEN];
        unsigned char src_addr[ETH_ALEN];
        unsigned short type;
    } hdr;
    ...
    hdr.type = htons (type);
    txhd = tx_base + tx_tail;
    tx_tail = (tx_tail + 1) % 8;
    ...
    txhd->buffer_addr = virt_to_bus (&hdr);
    ...
    E1000_WRITE_REG (&hw, TDT, tx_tail);

i.e. we're setting the type in the header on the stack, setting up a tx descriptor to point to the header on the stack, and then writing the descriptor number to the device queue.

Looking at the assembly, I see:

     36d:       8b 4c 24 38             mov    0x38(%esp),%ecx
     371:       86 cd                   xchg   %cl,%ch
     ...
     3fb:       89 90 18 38 00 00       mov    %edx,0x3818(%eax)
     ...
     407:       66 89 4c 24 1e          mov    %cx,0x1e(%esp)

i.e. we don't actually move the result of the htons() into the header on the stack until after we've set the TDT register. By that point the packet has already been sent.

The problem is that the compiler has no way of knowing that the device reads this memory as a result of us writing to the register, so it is free to delay the store. So, if we do:

-       struct eth_hdr {
+       volatile struct eth_hdr {

we see:

     36c:       8b 44 24 38             mov    0x38(%esp),%eax
     370:       86 c4                   xchg   %al,%ah
     372:       66 89 44 24 1e          mov    %ax,0x1e(%esp)
     ...
     400:       89 90 18 38 00 00       mov    %edx,0x3818(%eax)

This fixes the problem.
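
For readers who want to see where the ordering constraint sits, here is a minimal sketch of the same pattern. It is not the etherboot driver code: `tx_ring`, `write_tdt()`, `send_frame()` and `compiler_barrier()` are hypothetical names, and the "device" is an ordinary C function that reads the header when the tail is written. Because the compiler can see that function, the sketch cannot reproduce the bug itself. It only shows the `volatile` header from the patch above, plus one assumed alternative (a GCC/Clang compiler barrier before the doorbell write) that was not part of the fix described here.

```c
#include <arpa/inet.h>  /* htons, ntohs (POSIX) */
#include <assert.h>
#include <stdint.h>

#define ETH_ALEN     6
#define TX_RING_SIZE 8

struct eth_hdr {
    unsigned char dst_addr[ETH_ALEN];
    unsigned char src_addr[ETH_ALEN];
    uint16_t type;
};

/* Hypothetical, stripped-down tx descriptor: only the buffer pointer. */
struct tx_desc {
    const volatile void *buffer_addr;
};

static struct tx_desc tx_ring[TX_RING_SIZE];
static unsigned int tx_tail;
static uint16_t last_type_on_wire;  /* what the simulated device read, host order */

/* Compiler-only barrier (GCC/Clang syntax): memory accesses are not moved across it. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Simulated doorbell. A real e1000 would DMA the buffer once TDT is written. */
static void write_tdt(unsigned int new_tail)
{
    assert(new_tail < TX_RING_SIZE);
    unsigned int sent = (new_tail + TX_RING_SIZE - 1) % TX_RING_SIZE;
    const volatile struct eth_hdr *h = tx_ring[sent].buffer_addr;
    last_type_on_wire = ntohs(h->type);
}

/* Queue one header and return the type the simulated device saw. */
uint16_t send_frame(uint16_t type)
{
    /* volatile: the stores are emitted and kept in order with other volatile accesses */
    volatile struct eth_hdr hdr;
    struct tx_desc *txhd;

    for (int i = 0; i < ETH_ALEN; i++) {
        hdr.dst_addr[i] = 0xff;     /* broadcast, just for the example */
        hdr.src_addr[i] = 0x00;
    }
    hdr.type = htons(type);

    txhd = tx_ring + tx_tail;
    tx_tail = (tx_tail + 1) % TX_RING_SIZE;
    txhd->buffer_addr = &hdr;

    compiler_barrier();             /* header and descriptor stores stay above the doorbell */
    write_tdt(tx_tail);
    return last_type_on_wire;
}
```

Calling `send_frame(0x0800)` should hand back `0x0800`, i.e. the simulated device sees the type that was stored before the doorbell write. On real hardware either the `volatile` qualifier or the barrier would be expected to keep the store ahead of the register write; using both here is belt and braces for illustration.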
Comment 7 Mark McLoughlin 2009-06-23 07:47:56 EDT
* Tue Jun 23 2009 Mark McLoughlin <markmc@redhat.com> - 5.4.4-16
- Fix e1000 PXE boot - caused by compiler optimization (bug #507391)
Comment 8 Mark McLoughlin 2009-06-23 07:57:41 EDT
*** Bug 494541 has been marked as a duplicate of this bug. ***
Comment 9 Fedora Update System 2009-06-23 07:59:01 EDT
etherboot-5.4.4-16.fc11 has been submitted as an update for Fedora 11.
http://admin.fedoraproject.org/updates/etherboot-5.4.4-16.fc11
Comment 11 Fedora Update System 2009-06-26 22:58:25 EDT
etherboot-5.4.4-16.fc11 has been pushed to the Fedora 11 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with:
  su -c 'yum --enablerepo=updates-testing update etherboot'
You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F11/FEDORA-2009-7024
Comment 12 Gilboa Davara 2009-06-29 10:32:14 EDT
etherboot-5.4.4-16.fc11.noarch seems to solve the problem.

- Gilboa
Comment 13 Kari Hautio 2009-06-30 07:19:22 EDT
etherboot-5.4.4-16.fc11 works for me as well and solves the "no IP" problem (bug #494541).
Comment 14 Mark McLoughlin 2009-06-30 09:22:58 EDT
Gilboa and Kari, thanks for testing - I'll push to stable now

Note, in future, if you go to the update url:

  https://admin.fedoraproject.org/updates/F11/FEDORA-2009-7024

you can log in and add a comment - this increases the update's 'karma'; if enough people comment, the update gets pushed automatically
Comment 15 Gilboa Davara 2009-06-30 10:57:54 EDT
Thanks. Will do.

- Gilboa
Comment 16 Fedora Update System 2009-07-02 01:41:47 EDT
etherboot-5.4.4-16.fc11 has been pushed to the Fedora 11 stable repository.  If problems still persist, please make note of it in this bug report.
