Bug 247641

Summary: guest OS reports same MAC for every virtual NIC
Product: [Fedora] Fedora
Reporter: Peter Hanecak <peter.hanecak>
Component: kvm
Assignee: Jeremy Katz <katzj>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: low
Version: 7
CC: avi, berrange
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2007-07-16 13:27:30 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Attachments:
Description                              Flags
Fix MMIO region mappings                 none
Make QEMU support subpage IO mappings    none

Description Peter Hanecak 2007-07-10 12:58:13 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.4) Gecko/20070603 Galeon/2.0.3 Firefox/2.0.0.4

Description of problem:
I installed F7 on an Intel(R) Core(TM)2 CPU machine and tried to use virt-manager to run a guest F7 system using QEMU/KVM. When running with one virtual NIC on the 'default' virtual network (virbr0), the network works as expected.

Then I created another isolated virtual network using Virtual Manager > Host Details > Virtual Network > Add, not touching IP ranges, etc. (thus creating vlan0). After that I added a second virtual NIC using ... > Machine Details > Hardware > Add, setting the source device to the second virtual network (vlan0).

Now, guest OS boots OK but DHCP fails for eth0. eth1 gets the IP.

After closer inspection I found out, that:

a) `dmesg | grep eth` (in guest) reports both eth0 and eth1 with same MAC address but

b) the virtual machine configuration file in /etc/libvirt/qemu correctly contains two different MAC addresses, one for each configured virtual NIC (<mac address='...)

c) `ps auwx|grep` (in host) reports "-net nic,macaddr=..." arguments for qemu-kvm with correct (i.e. different) MAC addresses

I also tried adding a third virtual network and virtual NIC, and again all three NICs under the guest reported the same MAC address.

After that I retried several times, adding and removing virtual NICs (removing them by hand by editing the XML configuration in /etc/libvirt/qemu, because "Remove" in Virt Manager did not work), and always the "last" MAC is used for all the virtual NICs (by "last" I mean the MAC specified for the last virtual NIC).

I also tried FC6 guest OS with same results.

Version-Release number of selected component (if applicable):
kvm-24-1

How reproducible:
Always


Steps to Reproduce:
1. Create a virtual machine using Virtual Manager and QEMU/KVM HW accelerated hypervisor.
2. Install guest OS and after you successfully finish, shut down the virtual machine.
3. Add second, third, ... virtual NIC to the machine.
4. Boot virtual machine.

Actual Results:
All the virtual NICs under the guest OS have the same MAC address. Some of them fail to get an IP address assigned by DHCP, and only some of them, if any, work (i.e. are pingable from the host OS).

Expected Results:
a) All the virtual NICs report different MACs under the guest, each matching the MAC specified for the device in the virtual machine configuration XML file.

b) All the NICs get the IP address assigned to them via DHCP.

c) All the NICs are pingable from the host OS.

Additional info:
Comment 1 Daniel Berrange 2007-07-15 22:36:10 EDT
I've reproduced this too.

    <interface type='network'>
      <mac address='00:16:3e:58:0d:35'/>
      <source network='default'/>
      <target dev='vnet2'/>
    </interface>
    <interface type='network'>
      <mac address='00:16:3e:63:21:a5'/>
      <source network='private'/>
      <target dev='vnet3'/>
    </interface>

It is clearly passing the MAC addrs on the command line

 /usr/bin/qemu-kvm -M pc -m 500 -smp 1 -monitor pty -no-acpi -boot c -hda
/root/test.img -net nic,macaddr=00:16:3e:58:0d:35,vlan=0 -net
tap,fd=14,script=,vlan=0 -net nic,macaddr=00:16:3e:63:21:a5,vlan=1 -net
tap,fd=16,script=,vlan=1 -vnc :2

But when looking inside the guest they are being probed as identical

eth0: RTL-8139C+ at 0xffffc2000000c000, 00:16:3e:63:21:a5, IRQ 11
eth1: RTL-8139C+ at 0xffffc2000000e100, 00:16:3e:63:21:a5, IRQ 9

NB, both NICs are seeing the same MAC addr, taken from the second NIC's
'macaddr' arg - so the guest is seeing the 'macaddr' args on the command line,
it is just not differentiating between them for some reason.
Comment 2 Daniel Berrange 2007-07-15 22:41:49 EDT
Converting those HEX values into base 10

00:16:3e:58:0d:35 -> 0 22 62 88 13 53
00:16:3e:63:21:a5 -> 0 22 62 99 33 165
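
The conversion above can be reproduced mechanically. A minimal sketch in C follows; parse_mac and print_mac_decimal are illustrative helper names, not part of QEMU or libvirt.

```c
#include <assert.h>
#include <stdio.h>

/* Parse "xx:xx:xx:xx:xx:xx" into six octets.
 * Returns 0 on success, -1 if the text is not exactly six hex octets. */
int parse_mac(const char *text, unsigned char out[6])
{
    unsigned int b[6];
    char trailing;

    assert(text != NULL && out != NULL);
    /* The final %c only matches if something follows the sixth octet,
     * in which case sscanf returns 7 and the input is rejected. */
    if (sscanf(text, "%2x:%2x:%2x:%2x:%2x:%2x%c",
               &b[0], &b[1], &b[2], &b[3], &b[4], &b[5], &trailing) != 6)
        return -1;
    for (int i = 0; i < 6; i++)
        out[i] = (unsigned char)b[i];
    return 0;
}

/* Print the six octets in base 10, separated by spaces. */
void print_mac_decimal(const unsigned char mac[6])
{
    printf("%u %u %u %u %u %u\n",
           mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}
```

Calling parse_mac on each 'macaddr' value and then print_mac_decimal gives the base 10 octets to compare against a debugger dump.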


Now, if I attach GDB to the KVM process and dump the 'nd_table'

 (gdb) print nd_table[0]
$3 = {macaddr = {0 '\0', 22 '\026', 62 '>', 88 'X', 13 '\r', 53 '5'}, model =
0x4de2f6 "rtl8139", vlan = 0x2b510f0}
 (gdb) print nd_table[1]
$4 = {macaddr = {0 '\0', 22 '\026', 62 '>', 99 'c', 33 '!', 165 '\245'}, model =
0x4de2f6 "rtl8139", vlan = 0x2b512b0}

So we can see QEMU's internal data struct has clearly got the correct MAC addr
per NIC.

So two possibilities:

  - The emulated NIC is sending wrong data to the guest
  - The guest driver is fetching the data from the wrong NIC

Very odd
Comment 3 Daniel Berrange 2007-07-15 23:37:10 EDT
More debugging. The PCI device state in QEMU has correct info too

(gdb) print ((PCIRTL8139State*)first_bus->devices[24])->rtl8139->phys
$15 = {0 '\0', 22 '\026', 62 '>', 88 'X', 13 '\r', 53 '5', 0 '\0', 0 '\0'}
(gdb) print ((PCIRTL8139State*)first_bus->devices[32])->rtl8139->phys
$16 = {0 '\0', 22 '\026', 62 '>', 99 'c', 33 '!', 165 '\245', 0 '\0', 0 '\0'}

Reading the Linux driver, it seems to get the MAC address out of the EEPROM
instead though. So I looked at the eeprom contents:

The first NIC

(gdb) print ((unsigned char *)&(((PCIRTL8139State
*)first_bus->devices[24])->rtl8139->eeprom.contents[7]))[0]
$41 = 0 '\0'
(gdb) print ((unsigned char *)&(((PCIRTL8139State
*)first_bus->devices[24])->rtl8139->eeprom.contents[7]))[1]
$42 = 22 '\026'
(gdb) print ((unsigned char *)&(((PCIRTL8139State
*)first_bus->devices[24])->rtl8139->eeprom.contents[7]))[2]
$43 = 62 '>'
(gdb) print ((unsigned char *)&(((PCIRTL8139State
*)first_bus->devices[24])->rtl8139->eeprom.contents[7]))[3]
$44 = 88 'X'
(gdb) print ((unsigned char *)&(((PCIRTL8139State
*)first_bus->devices[24])->rtl8139->eeprom.contents[7]))[4]
$45 = 13 '\r'
(gdb) print ((unsigned char *)&(((PCIRTL8139State
*)first_bus->devices[24])->rtl8139->eeprom.contents[7]))[5]
$46 = 53 '5'

And the second:

(gdb) print ((unsigned char *)&(((PCIRTL8139State
*)first_bus->devices[32])->rtl8139->eeprom.contents[7]))[0]
$35 = 0 '\0'
(gdb) print ((unsigned char *)&(((PCIRTL8139State
*)first_bus->devices[32])->rtl8139->eeprom.contents[7]))[1]
$36 = 22 '\026'
(gdb) print ((unsigned char *)&(((PCIRTL8139State
*)first_bus->devices[32])->rtl8139->eeprom.contents[7]))[2]
$37 = 62 '>'
(gdb) print ((unsigned char *)&(((PCIRTL8139State
*)first_bus->devices[32])->rtl8139->eeprom.contents[7]))[3]
$38 = 99 'c'
(gdb) print ((unsigned char *)&(((PCIRTL8139State
*)first_bus->devices[32])->rtl8139->eeprom.contents[7]))[4]
$39 = 33 '!'
(gdb) print ((unsigned char *)&(((PCIRTL8139State
*)first_bus->devices[32])->rtl8139->eeprom.contents[7]))[5]
$40 = 165 '\245'

Again the EEPROM in QEMU has correct info per NIC. So I'm thinking perhaps the
wrong EEPROM is being read by the guest
Comment 4 Daniel Berrange 2007-07-15 23:56:08 EDT
A little more poking in QEMU and I can confirm that the guest is accessing the
wrong EEPROM for the first NIC. When the 8139cp module loads, it is reading the
first NIC's data from the second NIC's EEPROM.

If I run the guest with an identical QEMU command line, except changing
'qemu-kvm' to 'qemu-system-x86_64', then the correct EEPROM is seen and both NICs
have distinct MACs. So this is definitely a KVM bug - something to do with the way the
NICs' PCI iomem is remapped. Well out of my sphere of knowledge now...
Comment 5 Daniel Berrange 2007-07-16 00:03:25 EDT
Urgh. No I take that back. Regular qemu binary defaults to ne2k NIC, while kvm
defaults to rtl8139. Regular QEMU fails with rtl8139 too if I specify model
explicitly.
Comment 6 Daniel Berrange 2007-07-16 10:45:42 EDT
During startup, each NIC registers an IO memory region for itself with
cpu_register_io_memory. Instrumenting this method, we see the first NIC gets
region 12, and the second region 13. Note the opaque pointer address - this is
the RTL8139State struct that will later be passed back into the IO memory
read/write functions.

  Register IO memory index 12 opaque 0xa863638
  Register IO memory index 13 opaque 0xa863c28

Now when the guest sets up the NICs, these MMIO regions are mapped into physical
RAM. This happens in the rtl8139_mmio_map method. With this instrumented, we see the
first NIC is mapped to 0xf2001000 and the second is mapped to 0xf2001100.

  Register MMIO RTL8139State 0xa863638 phys addr 0xf2001000 
  Register MMIO RTL8139State 0xa863c28 phys addr 0xf2001100

Finally, when the guest reads/writes from/to the MMIO region, QEMU has to
convert from the physical address back to the IO memory index.

This is where it seems to go wrong - the correct physical address is coming in
from the guest - either 0xf2001000 or 0xf2001100, but it looks like QEMU is
always translating both of these addresses to IO index '13', and never '12'. So
EEPROM reads/writes *always* end up going to the second NIC.
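
One way to get exactly this symptom is a lookup table that keeps a single IO index per page. The sketch below is a toy model, not QEMU's code: it assumes 4 KiB pages, and struct page_map, map_region and lookup are hypothetical names. Both NIC addresses share page number 0xf2001, so the second registration overwrites the first.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BITS 12   /* assumption: 4 KiB pages */
#define MAX_PAGES 16   /* toy table capacity */

/* Toy table holding exactly one IO index per physical page. */
struct page_map {
    uint32_t page[MAX_PAGES];
    int io_index[MAX_PAGES];
    int used;
};

/* Record io_index for the page containing phys_addr.
 * A later call for the same page silently replaces the earlier index. */
void map_region(struct page_map *m, uint32_t phys_addr, int io_index)
{
    uint32_t pn = phys_addr >> PAGE_BITS;

    for (int i = 0; i < m->used; i++) {
        if (m->page[i] == pn) {
            m->io_index[i] = io_index;
            return;
        }
    }
    assert(m->used < MAX_PAGES);
    m->page[m->used] = pn;
    m->io_index[m->used] = io_index;
    m->used++;
}

/* Return the IO index for phys_addr, or -1 if its page is unmapped. */
int lookup(const struct page_map *m, uint32_t phys_addr)
{
    uint32_t pn = phys_addr >> PAGE_BITS;

    for (int i = 0; i < m->used; i++) {
        if (m->page[i] == pn)
            return m->io_index[i];
    }
    return -1;
}
```

Mapping 0xf2001000 to index 12 and then 0xf2001100 to index 13 in this model makes both lookups return 13, which matches the observed behaviour.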
Comment 7 Daniel Berrange 2007-07-16 11:10:29 EDT
Found the code which converts from physical address back to IO memory region. It
is in softmmu_template.h, in the first glue method:

   index = (tlb_addr >> IO_MEM_SHIFT) & (IO_MEM_NB_ENTRIES - 1)

Well,  IO_MEM_NB_ENTRIES is defined as

  #define IO_MEM_NB_ENTRIES  (1 << (TARGET_PAGE_BITS  - IO_MEM_SHIFT))
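
To make the arithmetic concrete, here is a small sketch of how an IO index can live in the low bits of a page-aligned value. The constants are assumptions (4 KiB target pages and IO_MEM_SHIFT of 4, which would give 256 entries), and pack_entry and io_index_of are hypothetical helpers rather than QEMU functions.

```c
#include <assert.h>
#include <stdint.h>

#define TARGET_PAGE_BITS  12   /* assumption: 4 KiB target pages */
#define IO_MEM_SHIFT      4    /* assumption: index starts at bit 4 */
#define IO_MEM_NB_ENTRIES (1 << (TARGET_PAGE_BITS - IO_MEM_SHIFT))

/* Combine a page-aligned address with an IO index stored in the
 * otherwise unused low bits of the address. */
uint32_t pack_entry(uint32_t page_addr, unsigned io_index)
{
    assert((page_addr & ((1u << TARGET_PAGE_BITS) - 1)) == 0);
    assert(io_index < IO_MEM_NB_ENTRIES);
    return page_addr | (io_index << IO_MEM_SHIFT);
}

/* Recover the IO index using the same shift and mask as the lookup. */
unsigned io_index_of(uint32_t entry)
{
    return (entry >> IO_MEM_SHIFT) & (IO_MEM_NB_ENTRIES - 1);
}
```

Under these assumptions the index occupies bits 4 to 11, entirely below the page boundary, so a given page can carry only one index.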

So, for the physical address -> IO region index conversion to work, each NIC's IO
region must be at *least* one page apart from the others. Which in turn means
they must be at least one page in size. The comment for
cpu_register_physical_memory even says this:

  /* register physical memory. 'size' must be a multiple of the target
     page size. If (phys_offset & ~TARGET_PAGE_MASK) != 0, then it is an
     io memory page */
  void cpu_register_physical_memory(target_phys_addr_t start_addr, 
                                    unsigned long size,
                                    unsigned long phys_offset)

Unfortunately it seems RTL 8139 code registers its IO regions as 256 bytes in
size - less than page size.

 cpu_register_physical_memory(addr + 0, 0x100, s->rtl8139_mmio_io_addr);

Simply changing this 0x100 to 0x1000 wasn't sufficient though. I also needed to
change the calls to pci_register_io_region in the rtl8139 driver to request 0x1000
instead of 0x100 bytes of memory.

With these two changes made, the conversion works, and thus the correct EEPROM
gets accessed per NIC & I finally see distinct MAC addrs in the guest!
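
The size change matters because PCI memory regions are aligned to their own size, so a 0x1000 byte request yields page-aligned bases. A rough sketch of the arithmetic, assuming a 4 KiB page; round_up_to_page and same_page are hypothetical helpers, not functions from the patch:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_ASSUMED 0x1000u   /* assumption: 4 KiB pages */

/* Round a region size up to a whole number of pages. */
uint32_t round_up_to_page(uint32_t size)
{
    assert(size > 0 && size <= 0xfffff000u);   /* avoid wrap-around */
    return (size + PAGE_SIZE_ASSUMED - 1) & ~(PAGE_SIZE_ASSUMED - 1);
}

/* Non-zero if both addresses fall inside the same page. */
int same_page(uint32_t a, uint32_t b)
{
    return (a / PAGE_SIZE_ASSUMED) == (b / PAGE_SIZE_ASSUMED);
}
```

With 0x100 byte regions the two NICs sit at 0xf2001000 and 0xf2001100, for which same_page is true. Once each region occupies a full page, two adjacent regions can no longer share one.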
Comment 8 Daniel Berrange 2007-07-16 11:48:59 EDT
Created attachment 159338 [details]
Fix MMIO region mappings

This patch applies the fix to make MMIO regions at least 1 page in size
Comment 9 Jeremy Katz 2007-07-16 13:27:30 EDT
Thanks for the fix -- building for rawhide and will push back to F7 when I do an
update there (which is going to be real soon now)
Comment 10 Daniel Berrange 2007-07-16 14:55:57 EDT
Created attachment 159358 [details]
Make QEMU support subpage IO mappings

After discussions with upstream, the better solution is to make QEMU able to
deal with sub-page granularity I/O mappings. This has the benefit of fixing all
the drivers which use < page size mappings, of which rtl 8139 is only one. The
attached patch is from upstream CVS.
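
For illustration only, sub-page dispatch could look roughly like the sketch below. This is not the upstream patch: struct subpage and its functions are hypothetical, 4 KiB pages are assumed, and one IO index is stored per byte offset within the page.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SUBPAGE_SIZE 0x1000u   /* assumption: 4 KiB pages */
#define NO_HANDLER   0xff      /* marks offsets with no IO region */

/* One page whose offsets may belong to different IO regions. */
struct subpage {
    uint32_t base;                    /* page-aligned physical base */
    uint8_t io_index[SUBPAGE_SIZE];   /* IO index per byte offset */
};

void subpage_init(struct subpage *sp, uint32_t base)
{
    assert((base & (SUBPAGE_SIZE - 1)) == 0);
    sp->base = base;
    memset(sp->io_index, NO_HANDLER, sizeof sp->io_index);
}

/* Assign io_index to the byte range [start, start + len) of the page. */
void subpage_register(struct subpage *sp, uint32_t start, uint32_t len,
                      uint8_t io_index)
{
    uint32_t off = start - sp->base;

    assert(start >= sp->base && len <= SUBPAGE_SIZE);
    assert(off <= SUBPAGE_SIZE - len);
    memset(&sp->io_index[off], io_index, len);
}

/* Return the IO index covering addr, or -1 if nothing is mapped there. */
int subpage_lookup(const struct subpage *sp, uint32_t addr)
{
    uint32_t off = addr - sp->base;

    if (addr < sp->base || off >= SUBPAGE_SIZE)
        return -1;
    return sp->io_index[off] == NO_HANDLER ? -1 : sp->io_index[off];
}
```

Registering 0x100 bytes at 0xf2001000 for index 12 and 0x100 bytes at 0xf2001100 for index 13 then resolves each address to its own NIC, even though both live in the same page.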