Bug 888529 - non deterministic bios dev naming in KVM guests
Summary: non deterministic bios dev naming in KVM guests
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: biosdevname
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Václav Pavlín
QA Contact: Release Test Team
URL:
Whiteboard:
Depends On: 884990
Blocks:
 
Reported: 2012-12-18 20:26 UTC by Dan Kenigsberg
Modified: 2023-09-14 01:39 UTC (History)
CC List: 10 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 884990
Environment:
Last Closed: 2013-10-16 10:50:02 UTC
Target Upstream Version:
Embargoed:


Attachments: none

Description Dan Kenigsberg 2012-12-18 20:26:21 UTC
+++ This bug was initially created as a clone of Bug #884990 +++

Description of problem:
biosdevname fails to allocate a stable address to nic in KVM guests.
More details at
https://lists.fedorahosted.org/pipermail/vdsm-devel/2012-December/001842.html

Version-Release number of selected component (if applicable):
biosdevname-0.4.1-2.fc18.x86_64

How reproducible:
quite often

Steps to Reproduce:
1. Boot a Fedora 18 guest with multiple NICs under the KVM hypervisor.
2. Observe that the NICs receive random names.
  

On Thu, Dec 06, 2012 at 01:12:52PM +0200, Michael S. Tsirkin wrote:
> This is not a qemu issue. This is a biosdevname/VMware issue.
> biosdevname has this code:
> 
> /*
>   Algorithm suggested by:
>   http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009458
> */
> 
> static int
> running_in_virtual_machine (void)
> {
>     u_int32_t eax=1U, ecx=0U;
> 
>     ecx = cpuid (eax, ecx);
>     if (ecx & 0x80000000U)
>        return 1;
>     return 0;
> }
> 
> So it just looks for a hypervisor.
> 
> It should look at the hypervisor leaf
> and either blacklist vmware specifically or whitelist kvm.
> 
> Please open (preferably urgent prio) bugzilla for biosdevname component
> so we can fix it in F18, cc me.
> I can write you a patch but maintainer needs to apply it.

--- Additional comment from Václav Pavlín on 2012-12-07 12:04:07 IST ---

Hi,
Why is this a biosdevname issue? I went through the thread, but I am not sure how the random naming is related to biosdevname - biosdevname should terminate itself if it is in a virtual machine. Is the problem that it terminates, or that it doesn't terminate? That was not clear to me from the thread.

Michael, if you are sure it is biosdevname problem, please provide the patch and I will apply it.

Thanks,
Vaclav

--- Additional comment from Michael S. Tsirkin on 2012-12-07 14:35:26 IST ---

I think the problem is that it terminates.
Why does it want to terminate on a VM?
The reason to terminate is apparently that vmware does not
emulate a pci bus, so biosdevname makes no sense there? But kvm does
emulate pci with consistent addresses, the same as a physical machine.
Generally I see two possible approaches:
- detect a pci device and do not terminate even on a vm
- detect kvm and do not terminate even on a vm

Thoughts?

--- Additional comment from jordan hargrave on 2012-12-10 18:14:07 IST ---

kvm guests do not have the knowledge (SMBIOS table, etc.) of the physical-machine-to-guest hardware mapping needed to know whether a NIC is really embedded or an add-in card.  biosdevname should just terminate if running in a guest OS, and all NICs should be named ethX.  Unfortunately it appears the CPUID instruction's ECX hypervisor bit is not correctly implemented in some CPUs/kernels.

--- Additional comment from Michael S. Tsirkin on 2012-12-10 18:21:20 IST ---

Apparently vmware does not have a bios? kvm certainly does,
so it can provide all the necessary information to guests.
There is no reason to special-case kvm guests that I see.

What happens on the physical machine seems completely irrelevant:
guests run on virtual machines not physical ones.

--- Additional comment from Michael S. Tsirkin on 2012-12-10 18:36:15 IST ---

and yes all kvm releases set hypervisor bit in cpuid.

--- Additional comment from Paolo Bonzini on 2012-12-12 10:27:09 IST ---

> kvm guests do not have the knowledge (SMBIOS table or etc) of the physical 
> machine to guest hardware mapping know if a NIC is really embedded or an 
> add-in card.

Guest hardware is never embedded, in KVM it can always be treated as a PCI add-in card.  Even if it is the passthrough of the host's embedded NIC.

> biosdevname should just terminate if running in a guest OS and all NICs should 
> be named ethX.

Wrong.  biosdevname should just check for VPD as usual.  It won't find it as of now, but if one day we implement it *in the host* the guest will get it for free.

Additionally, if a VPD-enabled network card is passed to the guest with PCI device assignment, it should get stable names *even now*.

--- Additional comment from Narendra K on 2012-12-12 20:16:25 IST ---

Biosdevname requires SMBIOS type 9 records to name PCI add-in adapters (and SMBIOS type 41 records to name onboard network interfaces). In the absence of type 9 records, it looks for Slot # from the 'SltCap'.

On a Fedora 17 guest installed on a Fedora 17 host/hypervisor, it seemed like the guest BIOS did not have SMBIOS type 9 records (dmidecode -t 9 did not show any data). I also assigned a PCI device from host to the guest, but it did not result in a corresponding type 9 record created.

Biosdevname uses Vital Product Data to retrieve NIC Partition information on adapters which are NIC Partition capable.

--- Additional comment from Michael S. Tsirkin on 2012-12-12 23:55:56 IST ---

To comment 7:
First, unfortunately biosdevname bails out early after seeing it's
a VM, so it will not check any of this. It's a biosdevname bug and
I don't know why it does this - probably some vmware workaround.
Yes, current VMs by default don't have this bios record, but
using the command line one can add SHPC, which gives you
a pci slot capability (this is what you mean by SltCap, yes?).
Also, if you assign a NIC with VPD, you can
get VPD today.


Second, once biosdevname has failed, udev should make more effort
to make the device name persistent based on the mac address,
like it did for RHEL.
That was not done, on the assumption that the case is uncommon, but it is
common for VMs. This second bit needs a udev BZ though.

--- Additional comment from Paolo Bonzini on 2012-12-13 15:20:52 IST ---

> Biosdevname requires SMBIOS type 9 records to name PCI add-in adapters

This can be added.  But the slot id in type 9 records would have nothing to do with the slot id of the host machine, it will be just a number from 0 to 31 equal to the PCI device number.  

The question is _how_ do you want the persistent names to be built?  Should they match physical slots of the host, and thus work only for assigned devices, or should the guest just make them up for the sake of persistent naming?  Either choice has a tradeoff.  The latter is easy but it may look strange in the presence of an assigned device.

In any case, note that it is basically impossible to take an assigned PCI device that is "emNN" on the host and make it "emNN" in the guest too, because the device may come and go at any time after boot, long after SMBIOS type 41 records have been prepared.

> a pci slot capability (this is what you mean by SltCap yes?).

No, he means the physical slot number from the PCI Express capability.

--- Additional comment from Matt Domsch on 2012-12-13 16:13:22 IST ---

Michael - bailing out when it sees it's running in a VM may be considered a bug I suppose, but it was quite intentional, no VMware conspiracy theories needed, and a quick 'git blame' shows I added that code in Feb 2011 (nearly 2 years ago).

At the time, there were no VM BIOSes (KVM, Xen, VMware, VirtualBox, Hyper-V, ...) that exposed SMBIOS type 9 (slot) or type 41 (embedded) device structures, nor were there any SltCap data (and then, biosdevname wasn't reading the PCI Express capability field either).

The only fallback we had was the PCI IRQ Routing Table, which also tended to have arbitrary values in it for the slots.  This was causing the "eth0" devices to appear in slot 123, which is completely non-intuitive for a VM with a single virtual NIC.

So, rather than present non-intuitive names for devices - in the common case there is but a single NIC - and absent a better way to get information from the VM BIOS for all the different virt platforms, I punted and put in the running_in_vm() check.

If KVM can expose the SMBIOS and/or PCI SltCap data now, and that's reliable and reproducible, it's fair to reconsider.

As for udev persistent naming based on MAC address - that's exactly what biosdevname intends to avoid. If I move a card from Slot 1 to Slot 2, the name should move from p1p1 to p2p1, not retain the same name because some secondary lookup in udev forced it back to p1p1, which would then cause a renaming collision with the device biosdevname thinks is p1p1, and we'd be right back where we started this mess.

--- Additional comment from Paolo Bonzini on 2012-12-17 18:41:17 IST ---

> The only fallback we had was the PCI IRQ Routing Table, which too tended to 
> have arbitrary values in them for the slots.  This was causing the "eth0" 
> devices to appear in slot 123, which is complete non-intuitive for a VM with a 
> single virtual NIC.

What virt platform was this on?

--- Additional comment from Michael S. Tsirkin on 2012-12-17 18:59:28 IST ---

To comment 10:
OK, that's good to know.

But it seems misguided: how are VMs special here?
Any box without SMBIOS and with a single PCI NIC should be handled in
exactly the same way as a VM, IMHO.
Also, as others pointed out, KVM is able to expose VPD and
SltCap for assigned devices, and if it does, biosdevname can use them, right?

You also see this issue:
> a renaming collision with the device biosdevname thinks is p1p1
however, we should be able to detect that biosdevname bailed out,
and therefore no collision is going to occur.
Besides, udev names used a scheme different from biosdevname's,
so no conflict seems possible.

Anyway, there's now a regression on old hypervisors which we can't fix
retroactively. Names changing across reboots is a bad experience.
What is your suggestion for a fix?

Comment 1 Václav Pavlín 2013-07-31 12:12:32 UTC
A potential patch for this issue can be found in the original bug, but it is still waiting for upstream review. If upstream confirms it fixes this issue, we can probably consider it as well.

Comment 2 Michael S. Tsirkin 2013-07-31 13:46:12 UTC
Upstream doesn't want to take the patch; it does not seem to care about KVM.

Fedora and RHEL will have to fix it on their own.

Comment 3 Václav Pavlín 2013-08-06 10:56:39 UTC
I am still not sure we want to implement VM support in biosdevname. Actually, I am quite sure we don't. Could you please retest whether this issue still occurs with the latest systemd in RHEL 7? It supports Predictable Network Interface Naming (http://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfaceNames/) and thus should provide reliable and stable interface naming, as biosdevname gets out of the way by exiting as soon as it figures out it is in a VM.

Comment 4 Václav Pavlín 2013-10-16 10:50:02 UTC
As neither I nor upstream is going to implement this functionality in biosdevname, I am going to close this bug as WONTFIX.

Comment 5 Red Hat Bugzilla 2023-09-14 01:39:35 UTC
The needinfo request[s] on this closed bug have been removed, as they have been unresolved for 1000 days.

