Bug 884990 - non deterministic bios dev naming in KVM guests

| Field | Value |
| --- | --- |
| Product | Fedora |
| Component | biosdevname |
| Version | 19 |
| Status | CLOSED EOL |
| Reporter | Dan Kenigsberg <danken> |
| Assignee | Narendra K <narendra_k> |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| Severity | unspecified |
| Priority | unspecified |
| CC | bazulay, harald, iheim, jeder, jordan_hargrave, knoel, lpeer, matt_domsch, mebrown, mst, pbonzini, praveen_paladugu, vpavlin |
| Hardware | Unspecified |
| OS | Unspecified |
| Type | Bug |
| Doc Type | Bug Fix |
| Bug Blocks | 888529 (view as bug list) |
| Last Closed | 2015-02-17 14:36:22 UTC |
Description - Dan Kenigsberg, 2012-12-07 08:30:56 UTC
Hi,

Why is this a biosdevname issue? I went through the thread but I am not sure how the random naming is related to biosdevname - biosdevname should terminate itself if it is running in a virtual machine. Is the problem you see that it terminates, or that it doesn't terminate? This was not clear to me from the thread. Michael, if you are sure it is a biosdevname problem, please provide the patch and I will apply it.

Thanks,
Vaclav

---

I think the problem is that it terminates. Why does it want to terminate on a VM? The reason to terminate is apparently that vmware does not emulate a PCI bus, so biosdevname makes no sense there? But kvm does emulate PCI, with consistent addresses, the same as a physical machine. Generally I see two possible approaches:
- detect a PCI device and do not terminate even on a VM
- detect kvm and do not terminate even on a VM
Thoughts?

---

kvm guests do not have the knowledge (SMBIOS table etc.) of the physical-machine-to-guest hardware mapping to know if a NIC is really embedded or an add-in card. biosdevname should just terminate if running in a guest OS, and all NICs should be named ethX. Unfortunately it appears the CPUID instruction ECX bit is not correctly implemented in some CPUs/kernels.

---

Apparently vmware does not have a BIOS? kvm certainly does, so it can provide all necessary information to guests. There is no reason to special-case kvm guests that I see. What happens on the physical machine seems completely irrelevant: guests run on virtual machines, not physical ones. And yes, all kvm releases set the hypervisor bit in CPUID.

---

> kvm guests do not have the knowledge (SMBIOS table etc.) of the physical
> machine to guest hardware mapping to know if a NIC is really embedded or an
> add-in card.

Guest hardware is never embedded; in KVM it can always be treated as a PCI add-in card, even if it is a passthrough of the host's embedded NIC.

> biosdevname should just terminate if running in a guest OS and all NICs should
> be named ethX.

Wrong.
biosdevname should just check for VPD as usual. It won't find it as of now, but if one day we implement it *in the host* the guest will get it for free. Additionally, if a VPD-enabled network card is passed to the guest with PCI device assignment, it should get stable names *even now*.

---

Biosdevname requires SMBIOS type 9 records to name PCI add-in adapters (and SMBIOS type 41 records to name onboard network interfaces). In the absence of type 9 records, it looks for the Slot # from the 'SltCap'. On a Fedora 17 guest installed on a Fedora 17 host/hypervisor, it seemed like the guest BIOS did not have SMBIOS type 9 records (dmidecode -t 9 did not show any data). I also assigned a PCI device from the host to the guest, but it did not result in a corresponding type 9 record being created. Biosdevname uses Vital Product Data to retrieve NIC partition information on adapters which are NIC-partition capable.

---

To comment 7. First, unfortunately biosdevname bails out early after seeing it's a VM, so it will not check any of this. It's a biosdevname bug and I don't know why it does this - probably some vmware workaround. Yes, current VMs by default don't have this BIOS record, but using the command line one can add SHPC, which gives you a PCI slot capability (is this what you mean by SltCap?). Also, if you assign a NIC with VPD, you can get VPD today. Second, once biosdevname fails, udev should make more effort to make the device name persistent based on the MAC address, like it did for RHEL. That was not done, on the assumption that this case is uncommon, but it is common for VMs. This second bit needs a udev BZ though.

---

> Biosdevname requires SMBIOS type 9 records to name PCI add-in adapters

This can be added. But the slot id in type 9 records would have nothing to do with the slot id of the host machine; it would just be a number from 0 to 31 equal to the PCI device number. The question is _how_ do you want the persistent names to be built?
Should they match physical slots of the host, and thus work only for assigned devices, or should the guest just make them up for the sake of persistent naming? Either choice has a tradeoff. The latter is easy, but it may look strange in the presence of an assigned device. In any case, note that it is basically impossible to take an assigned PCI device that is "emNN" on the host and make it "emNN" in the guest too, because the device may come and go at any time after boot, long after SMBIOS type 41 records have been prepared.

---

> a pci slot capability (this is what you mean by SltCap yes?).

No, he means the physical slot number from the PCI Express capability.

---

Michael - bailing out when it sees it's running in a VM may be considered a bug, I suppose, but it was quite intentional; no VMware conspiracy theories needed, and a quick 'git blame' shows I added that code in Feb 2011 (nearly 2 years ago). At the time, there were no VM BIOSes (KVM, Xen, VMware, VirtualBox, Hyper-V, ...) that exposed SMBIOS type 9 (slot) or type 41 (embedded) device structures, nor was there any SltCap data (and back then, biosdevname wasn't reading the PCI Express capability field either). The only fallback we had was the PCI IRQ Routing Table, which also tended to have arbitrary values in it for the slots. This was causing the "eth0" device to appear in slot 123, which is completely non-intuitive for a VM with a single virtual NIC. So, rather than present non-intuitive names for devices, in the common case where there is but a single NIC, and absent a better way to get information from the VM BIOS for all the different virt platforms, I punted and put in the running_in_vm() check. If KVM can expose the SMBIOS and/or PCI SltCap data now, and that's reliable and reproducible, it's fair to reconsider. As for udev persistent naming based on MAC address - that's exactly what biosdevname intends to avoid.
If I move a card from Slot 1 to Slot 2, the name should move from p1p1 to p2p1, not retain the same name because some secondary lookup in udev forced it back to p1p1, which would then cause a renaming collision with the device biosdevname thinks is p1p1, and we'd be right back where we started this mess.

> The only fallback we had was the PCI IRQ Routing Table, which too tended to
> have arbitrary values in them for the slots. This was causing the "eth0"
> devices to appear in slot 123, which is complete non-intuitive for a VM with a
> single virtual NIC.
What virt platform was this on?
To comment 10: OK, that's good to know. But it seems misguided: how are VMs special here? Any box without SMBIOS and with a single PCI NIC should be handled in exactly the same way as a VM, IMHO. Also, as others pointed out, KVM is able to expose VPD and SltCap for assigned devices, and if it does, biosdevname can use them, right? You also see this issue:

> a renaming collision with the device biosdevname thinks is p1p1

however we should be able to detect that biosdevname bailed out, and therefore no collision is going to occur. Besides, udev names were using a scheme different from biosdevname's, so no conflict seems possible. Anyway, there's now a regression on old hypervisors which we can't fix retroactively. Names changing across reboots is a bad experience. What is your suggestion for a fix?

---

Boxes without SMBIOS won't use biosdevname in Red Hat... it is disabled for systems that aren't running SMBIOS 2.5+. I still think that guests should report ethX instead of emX/pX names, as the name in the guest may not relate to a specific NIC on the hypervisor. If it is causing problems, then use biosdevname=0 on the kernel command line during installation of a guest.

---

> the name in the guest may not relate to a specific NIC on the hypervisor.
E.g. if the NIC has VPD then it does.
also

> Boxes without SMBIOS won't use biosdevname in RedHat... it is disabled for systems that aren't running smbios 2.5+
> I still think that guests should report ethX instead of emX/pX names as the name in the guest may not relate to a specific NIC on the hypervisor. If it is causing problems, then use biosdevname=0 on the kernel command line during installation of a guest.

this will not help fix the issue. The issue is that names are now unstable. Fedora used to rely on the MAC to make names stable. Now it uses biosdevname, but if that fails, you get a random, non-stable name. This makes Fedora unusable as a guest if one hot-unplugs/hot-plugs the NIC a lot. Since biosdevname detects a hypervisor and refuses to give a stable name, there is nothing a hypervisor can do.

> Boxes without SMBIOS won't use biosdevname in RedHat...

so there is no reason to blacklist a hypervisor then: it does not have an SMBIOS.

---

(In reply to comment #15)
> > If it is causing problems, then use biosdevname=0 on the kernel command line
> > during installation of a guest.
>
> this will not help fix the issue.
> The issue is that names are now unstable.
> Fedora used to rely on the mac to make names stable.
> Now it uses biosdevname but if that fails, a random
> non stable name.

On a host, when 'biosdevname=0' is passed, the '/etc/udev/rules.d/70-persistent-net.rules' file is generated and it ensures that names are persistent across future reboots. In a guest, it looks like '70-persistent-net.rules' is not generated, which seems like an issue. Further, this should be automatic in a guest, without any need to tweak the kernel command line: in a guest, 'biosdevname=0' need not be passed, as biosdevname exits without suggesting any name (the current behavior). As a result, interfaces are named ethN, and the expected behavior is the generation of '/etc/udev/rules.d/70-persistent-net.rules' to ensure persistence of names across future reboots.
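For reference, the MAC-based persistence being discussed was implemented by udev's rule generator writing /etc/udev/rules.d/70-persistent-net.rules, pinning a name to a MAC address. A typical generated rule looked like the following (the MAC address and comment are made-up examples):

```
# PCI device 0x1af4:0x1000 (virtio-pci)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:12:34:56", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"
```

This is exactly the secondary MAC lookup that comment 10 argues against for physical slots, but which would make guest names stable across reboots when biosdevname declines to name anything.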
It seems like an issue with udev, and it needs to be addressed in the udev scripts which generate 70-persistent-net.rules?

---

I think there are two issues:
- if SMBIOS or VPD names a device, biosdevname should use that even in a VM. This will let the hypervisor control device naming in the future, making the names in guest and host match.
- a udev issue in case there's no SMBIOS and no VPD.
For the second issue, clone this BZ to udev?

---

This is a WONTFIX for biosdevname... there's just not any way of knowing what NIC the ethX device in the guest is actually referring to on the physical hardware. What happens if you dynamically add a NIC in QEMU? The QEMU BIOS will already have built the type 9/41 structures at boot time; these are not dynamic. biosdevname should always be disabled on VMs... if it is not, it is due to a bug/feature in CPUID that isn't reporting the hypervisor properly. I can add another test to disable on KVM as well, if there is a way to determine this from within the guest? The only option would be to name NICs as something else: vm0, vm1, etc.

---

If a NIC has VPD, then this is a very reliable way to know what the NIC refers to on physical hardware. What you say about hotplug applies to actual hardware as well. In other words, there is no real reason to special-case VMs that I can see.

> What happens if you dynamically add a NIC once in QEMU? QEMU BIOS will
> already have built type 9/41 structures at boot time, these are not dynamic.
You can create type 9/41 structures for empty slots, and populate the corresponding PCI slots later with hotplug.
This bug appears to have been reported against 'rawhide' during the Fedora 19 development cycle. Changing version to '19'. (As we did not run this process for some time, it could also affect pre-Fedora 19 development cycle bugs. We are very sorry. It will help us with cleanup during Fedora 19 End Of Life. Thank you.) More information and the reason for this action is here: https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora19

---

To comment 10:

> So, rather than present non-intuitive names for devices, the common case of
> which there is but a single NIC, and absent a better way to get information
> from the VM BIOS for all the different virt platforms, I punted,
> and put in the running_in_vm() check.

So this makes it impossible for the hypervisor to fix the problem. Also, would it work to simply fail if there are no known NICs, instead of checking whether biosdevname is running on a hypervisor? This way a hypervisor could fix the problem by updating its BIOS or whatever. E.g. like the below (also attached):

```diff
diff -rup biosdevname-0.4.1-bak/src/bios_dev_name.c biosdevname-0.4.1/src/bios_dev_name.c
--- biosdevname-0.4.1-bak/src/bios_dev_name.c	2013-06-20 10:09:15.989303465 +0300
+++ biosdevname-0.4.1/src/bios_dev_name.c	2013-06-20 10:19:36.593337106 +0300
@@ -153,8 +153,6 @@ int main(int argc, char *argv[])
 	if (!running_as_root())
 		exit(3);
-	if (running_in_virtual_machine())
-		exit(4);
 	cookie = setup_bios_devices(opts.namingpolicy, opts.prefix);
 	if (!cookie) {
 		rc = 1;
diff -rup biosdevname-0.4.1-bak/src/naming_policy.c biosdevname-0.4.1/src/naming_policy.c
--- biosdevname-0.4.1-bak/src/naming_policy.c	2013-06-20 10:09:15.988303465 +0300
+++ biosdevname-0.4.1/src/naming_policy.c	2013-06-20 10:18:34.041333715 +0300
@@ -13,7 +13,7 @@
 #include "state.h"
 #include "dmidecode/dmidecode.h"
-static void use_all_ethN(const struct libbiosdevname_state *state)
+static int use_all_ethN(const struct libbiosdevname_state *state)
 {
 	struct bios_device *dev;
 	unsigned int i=0;
@@ -26,9 +26,10 @@ static void use_all_ethN(const struct li
 			dev->bios_name = strdup(buffer);
 		}
 	}
+	return 0;
 }
-static void use_physical(const struct libbiosdevname_state *state, const char *prefix)
+static int use_physical(const struct libbiosdevname_state *state, const char *prefix)
 {
 	struct bios_device *dev;
 	char buffer[IFNAMSIZ];
@@ -37,6 +38,7 @@ static void use_physical(const struct li
 	char interface[IFNAMSIZ];
 	unsigned int portnum=0;
 	int known=0;
+	int status=-1;
 	struct pci_device *vf;
 	list_for_each_entry(dev, &state->bios_devices, node) {
@@ -88,9 +90,11 @@ static void use_physical(const struct li
 		if (known) {
 			snprintf(buffer, sizeof(buffer), "%s%s%s", location, port, interface);
 			dev->bios_name = strdup(buffer);
+			status = 0;
 		}
 	}
+	return status;
 }
@@ -99,11 +103,11 @@
 	int rc = 0;
 	switch (policy) {
 	case all_ethN:
-		use_all_ethN(state);
+		rc = use_all_ethN(state);
 		break;
 	case physical:
 	default:
-		use_physical(state, prefix);
+		rc = use_physical(state, prefix);
 		break;
 	}
```

Created attachment 763294 [details]
patch disabling the running-on-hypervisor check; instead, check whether any devices got persistent names
Jordan, could you please check Michael's patch?

---

VPD still doesn't tell us what slot number the device is in... it only tells us the port number. Even if VPD is implemented in KVM, I don't think this patch is going to work.

---

General rule: everything that real hardware implements, KVM can implement. Even if it doesn't today, that does not mean it does not make sense in a VM - just that the emulation is not perfect. So please don't put the logic "if I'm in a VM, assume xyz is not implemented" anywhere. Always code it up as "if xyz is not implemented".

---

This message is a notice that Fedora 19 is now at end of life. Fedora has stopped maintaining and issuing updates for Fedora 19. It is Fedora's policy to close all bug reports from releases that are no longer maintained. Approximately 4 (four) weeks from now this bug will be closed as EOL if it remains open with a Fedora 'version' of '19'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 19 reached end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version prior to this bug being closed, as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

---

Fedora 19 changed to end-of-life (EOL) status on 2015-01-06. Fedora 19 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora, please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug.

Thank you for reporting this bug and we are sorry it could not be fixed.