Bug 247641
Summary: | guest OS reports same MAC for every virtual NIC
---|---
Product: | [Fedora] Fedora
Component: | kvm
Version: | 7
Hardware: | i686
OS: | Linux
Status: | CLOSED RAWHIDE
Severity: | medium
Priority: | low
Reporter: | Peter Hanecak <peter.hanecak>
Assignee: | Jeremy Katz <katzj>
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
CC: | avi, berrange
Doc Type: | Bug Fix
Last Closed: | 2007-07-16 17:27:30 UTC

Description
Peter Hanecak
2007-07-10 16:58:13 UTC
I've reproduced this too.

    <interface type='network'>
      <mac address='00:16:3e:58:0d:35'/>
      <source network='default'/>
      <target dev='vnet2'/>
    </interface>
    <interface type='network'>
      <mac address='00:16:3e:63:21:a5'/>
      <source network='private'/>
      <target dev='vnet3'/>
    </interface>

It is clearly passing the MAC addrs on the command line:

    /usr/bin/qemu-kvm -M pc -m 500 -smp 1 -monitor pty -no-acpi -boot c -hda /root/test.img -net nic,macaddr=00:16:3e:58:0d:35,vlan=0 -net tap,fd=14,script=,vlan=0 -net nic,macaddr=00:16:3e:63:21:a5,vlan=1 -net tap,fd=16,script=,vlan=1 -vnc :2

But when looking inside the guest they are being probed as identical:

    eth0: RTL-8139C+ at 0xffffc2000000c000, 00:16:3e:63:21:a5, IRQ 11
    eth1: RTL-8139C+ at 0xffffc2000000e100, 00:16:3e:63:21:a5, IRQ 9

NB, both NICs are seeing the same MAC addr, so the guest is seeing the 'macaddr' args on the command line, for some reason just not differentiating between them.

Converting those HEX values into base 10:

    00:16:3e:58:0d:35 -> 0 22 62 88 13 53
    00:16:3e:63:21:a5 -> 0 22 62 99 33 165

Now, if I attach GDB to the KVM process and dump the 'nd_table':

    (gdb) print nd_table[0]
    $3 = {macaddr = {0 '\0', 22 '\026', 62 '>', 88 'X', 13 '\r', 53 '5'}, model = 0x4de2f6 "rtl8139", vlan = 0x2b510f0}
    (gdb) print nd_table[1]
    $4 = {macaddr = {0 '\0', 22 '\026', 62 '>', 99 'c', 33 '!', 165 '\245'}, model = 0x4de2f6 "rtl8139", vlan = 0x2b512b0}

So we can see QEMU's internal data struct has clearly got the correct MAC addr per NIC. So two possibilities:

- The emulated NIC is sending wrong data to the guest
- The guest driver is fetching the data from the wrong NIC

Very odd.

More debugging.
The PCI device state in QEMU has correct info too:

    (gdb) print ((PCIRTL8139State*)first_bus->devices[24])->rtl8139->phys
    $15 = {0 '\0', 22 '\026', 62 '>', 88 'X', 13 '\r', 53 '5', 0 '\0', 0 '\0'}
    (gdb) print ((PCIRTL8139State*)first_bus->devices[32])->rtl8139->phys
    $16 = {0 '\0', 22 '\026', 62 '>', 99 'c', 33 '!', 165 '\245', 0 '\0', 0 '\0'}

Reading the Linux driver, it seems to get the MAC address out of the EEPROM instead though. So I looked at the EEPROM contents.

The first NIC:

    (gdb) print ((unsigned char *)&(((PCIRTL8139State *)first_bus->devices[24])->rtl8139->eeprom.contents[7]))[0]
    $41 = 0 '\0'
    (gdb) print ((unsigned char *)&(((PCIRTL8139State *)first_bus->devices[24])->rtl8139->eeprom.contents[7]))[1]
    $42 = 22 '\026'
    (gdb) print ((unsigned char *)&(((PCIRTL8139State *)first_bus->devices[24])->rtl8139->eeprom.contents[7]))[2]
    $43 = 62 '>'
    (gdb) print ((unsigned char *)&(((PCIRTL8139State *)first_bus->devices[24])->rtl8139->eeprom.contents[7]))[3]
    $44 = 88 'X'
    (gdb) print ((unsigned char *)&(((PCIRTL8139State *)first_bus->devices[24])->rtl8139->eeprom.contents[7]))[4]
    $45 = 13 '\r'
    (gdb) print ((unsigned char *)&(((PCIRTL8139State *)first_bus->devices[24])->rtl8139->eeprom.contents[7]))[5]
    $46 = 53 '5'

And the second:

    (gdb) print ((unsigned char *)&(((PCIRTL8139State *)first_bus->devices[32])->rtl8139->eeprom.contents[7]))[0]
    $35 = 0 '\0'
    (gdb) print ((unsigned char *)&(((PCIRTL8139State *)first_bus->devices[32])->rtl8139->eeprom.contents[7]))[1]
    $36 = 22 '\026'
    (gdb) print ((unsigned char *)&(((PCIRTL8139State *)first_bus->devices[32])->rtl8139->eeprom.contents[7]))[2]
    $37 = 62 '>'
    (gdb) print ((unsigned char *)&(((PCIRTL8139State *)first_bus->devices[32])->rtl8139->eeprom.contents[7]))[3]
    $38 = 99 'c'
    (gdb) print ((unsigned char *)&(((PCIRTL8139State *)first_bus->devices[32])->rtl8139->eeprom.contents[7]))[4]
    $39 = 33 '!'
    (gdb) print ((unsigned char *)&(((PCIRTL8139State *)first_bus->devices[32])->rtl8139->eeprom.contents[7]))[5]
    $40 = 165 '\245'

Again the EEPROM in QEMU has correct info per NIC. So I'm thinking perhaps the wrong EEPROM is being read by the guest.

A little more poking in QEMU and I can confirm that the guest is accessing the wrong EEPROM for the first NIC. When the 8139cp module loads, it is reading the first NIC's data from the second NIC's EEPROM.

If I run the guest with an identical QEMU command line, except changing 'qemu-kvm' to 'qemu-system-x86_64', then the correct EEPROM is seen and both NICs have distinct MACs. So this is definitely a KVM bug, something to do with the way the NICs' PCI iomem is remapped. Well out of my sphere of knowledge now...

Urgh. No, I take that back. The regular qemu binary defaults to the ne2k NIC, while kvm defaults to rtl8139. Regular QEMU fails with rtl8139 too if I specify the model explicitly.

During startup, each NIC registers itself an IO memory region with cpu_register_io_memory. Instrumenting this method we see the first NIC gets region 12, and the second region 13. Note the opaque pointer address: this is the RTL8139State struct that will later be passed back into the IO memory read/write functions.

    Register IO memory index 12 opaque 0xa863638
    Register IO memory index 13 opaque 0xa863c28

Now when the guest sets up the NICs, these MMIO regions are mapped into the guest's physical address space. This happens in the rtl8139_mmio_map method. With this instrumented, we see the first NIC is mapped to 0xf2001000 and the second is mapped to 0xf2001100:

    Register MMIO RTL8139State 0xa863638 phys addr 0xf2001000
    Register MMIO RTL8139State 0xa863c28 phys addr 0xf2001100

Finally, when the guest reads/writes from/to the MMIO region, QEMU has to convert from the physical address back to the IO memory index.
This is where it seems to go wrong: the correct physical address is coming in from the guest, either 0xf2001000 or 0xf2001100, but it looks like QEMU is always translating both of these addresses to IO index '13', and never '12'. So EEPROM reads/writes *always* end up going to the second NIC.

Found the code which converts from physical address back to IO memory region. In softmmu_template.h, first glue method:

    index = (tlb_addr >> IO_MEM_SHIFT) & (IO_MEM_NB_ENTRIES - 1)

Well, IO_MEM_NB_ENTRIES is defined as:

    #define IO_MEM_NB_ENTRIES (1 << (TARGET_PAGE_BITS - IO_MEM_SHIFT))

The index is taken from the low bits of the page's TLB entry, so every address within one page resolves to the same IO index. Both 0xf2001000 and 0xf2001100 fall in the same 4 KiB page. So, for the physical address -> IO region index conversion to work, each NIC's IO region must be at *least* one page apart from the others. Which in turn means they must be at least one page in size. The comment for cpu_register_physical_memory even says this:

    /* register physical memory. 'size' must be a multiple of the target
       page size. If (phys_offset & ~TARGET_PAGE_MASK) != 0, then it is an
       io memory page */
    void cpu_register_physical_memory(target_phys_addr_t start_addr,
                                      unsigned long size,
                                      unsigned long phys_offset)

Unfortunately it seems the RTL 8139 code registers its IO regions as 256 bytes in size, less than the page size:

    cpu_register_physical_memory(addr + 0, 0x100, s->rtl8139_mmio_io_addr);

Simply changing this 0x100 to 0x1000 wasn't sufficient though. I also needed to change the pci_register_io_region calls in the rtl8139 driver to request 0x1000 instead of 0x100 bytes of memory. With these two changes made, the conversion works, and thus the correct EEPROM gets accessed per NIC and I finally see distinct MAC addrs in the guest!

Created attachment 159338
Fix MMIO region mappings

This patch applies the fix to make MMIO regions at least 1 page in size
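
To make the page-granularity problem concrete, here is a minimal toy model in C. It is not QEMU code: the `toy_` names, the 16-page window and the window base are all made up for illustration, and it assumes 4 KiB pages as on x86. It only models the one property that matters here, namely that a single IO index is recorded per page, so a later registration in the same page replaces the earlier one.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only: 4 KiB pages, tiny address window. */
#define TOY_PAGE_BITS 12
#define TOY_PAGE_SIZE (1u << TOY_PAGE_BITS)
#define TOY_NUM_PAGES 16
#define TOY_BASE      0xf2000000u

/* One IO index per page: this is the granularity limit being modelled. */
static int toy_page_io_index[TOY_NUM_PAGES];

/* Hypothetical stand-in for registering an MMIO region. Every page the
   region touches is tagged with io_index, replacing any earlier tag. */
static void toy_register_region(uint32_t start, uint32_t size, int io_index)
{
    assert(size > 0 && start >= TOY_BASE);
    uint32_t first = (start - TOY_BASE) >> TOY_PAGE_BITS;
    uint32_t last  = (start - TOY_BASE + size - 1) >> TOY_PAGE_BITS;
    assert(last < TOY_NUM_PAGES);
    for (uint32_t p = first; p <= last; p++)
        toy_page_io_index[p] = io_index;
}

/* Hypothetical stand-in for the lookup done on a guest MMIO access. */
static int toy_lookup(uint32_t addr)
{
    return toy_page_io_index[(addr - TOY_BASE) >> TOY_PAGE_BITS];
}

/* Round a region size up to a whole number of pages (0x100 -> 0x1000). */
static uint32_t toy_round_up_to_page(uint32_t size)
{
    return (size + TOY_PAGE_SIZE - 1) & ~(TOY_PAGE_SIZE - 1);
}
```

In this model, registering 0x100 bytes at 0xf2001000 with index 12 and then 0x100 bytes at 0xf2001100 with index 13 makes `toy_lookup` return 13 for both addresses, which matches the symptom seen above. Registering page-sized regions on separate pages (for example at 0xf2001000 and 0xf2002000, addresses picked here purely as an example) gives 12 and 13 respectively.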
Thanks for the fix -- building for rawhide and will push back to F7 when I do an update there (which is going to be real soon now)

Created attachment 159358
Make QEMU support subpage IO mappings

After discussions with upstream the better solution was to make QEMU able to
deal with sub-page granularity I/O mappings. This has the benefit of fixing all
the drivers which use < page size mappings, of which rtl 8139 is only one. The
attached patch is from upstream CVS
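
For readers who want a feel for what sub-page dispatch means, the following is a rough sketch in C. It is not the upstream patch and does not mirror its data structures: the `ToySubpage` type, the `toy_subpage_` functions and the 256-byte granule are assumptions chosen only to illustrate the idea of a second-level lookup inside one page, again assuming 4 KiB pages.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only. */
#define TOY_PAGE_BITS 12
#define TOY_PAGE_SIZE (1u << TOY_PAGE_BITS)
#define TOY_SUB_BITS  8                                  /* 256-byte granule */
#define TOY_SUB_SLOTS (TOY_PAGE_SIZE >> TOY_SUB_BITS)    /* 16 slots per page */

/* Hypothetical per-page table: one IO index per 256-byte slot. */
typedef struct {
    int io_index[TOY_SUB_SLOTS];
} ToySubpage;

/* Mark every slot as unassigned (-1). */
static void toy_subpage_init(ToySubpage *sp)
{
    for (uint32_t i = 0; i < TOY_SUB_SLOTS; i++)
        sp->io_index[i] = -1;
}

/* Assign io_index to the slots covering [offset, offset + size) in the page. */
static void toy_subpage_register(ToySubpage *sp, uint32_t offset,
                                 uint32_t size, int io_index)
{
    assert(size > 0 && offset < TOY_PAGE_SIZE && size <= TOY_PAGE_SIZE - offset);
    uint32_t first = offset >> TOY_SUB_BITS;
    uint32_t last  = (offset + size - 1) >> TOY_SUB_BITS;
    for (uint32_t i = first; i <= last; i++)
        sp->io_index[i] = io_index;
}

/* Dispatch on the offset within the page instead of on the page alone. */
static int toy_subpage_lookup(const ToySubpage *sp, uint32_t addr)
{
    uint32_t offset = addr & (TOY_PAGE_SIZE - 1);
    return sp->io_index[offset >> TOY_SUB_BITS];
}
```

With the two NICs registered at offsets 0x000 and 0x100 of the same page, a lookup of 0xf2001000 yields the first NIC's index and 0xf2001100 yields the second's, without either region having to grow to a full page.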