RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 1720578 - [OpenStack] NIC naming validation can produce false positives and block a valid upgrade
Summary: [OpenStack] NIC naming validation can produce false positives and block a val...
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: leapp-repository
Version: 7.6
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: rc
: ---
Assignee: Leapp team
QA Contact: Alois Mahdal
URL:
Whiteboard:
Depends On:
Blocks: 1708241 1727807
TreeView+ depends on / blocked
 
Reported: 2019-06-14 09:26 UTC by Jiri Stransky
Modified: 2019-08-28 12:24 UTC (History)
3 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-21 08:26:35 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)

Description Jiri Stransky 2019-06-14 09:26:09 UTC
Description of problem:

The PersistentNetNamesDisable actor validates NIC names and blocks the upgrade if the machine has more than one "eth<number>" NICs under the assumption that the names will change after upgrade. While this is probably the usual case, it can also happen that the "eth<number>" names are set up in a way that the naming would be stable after the upgrade. In this scenario the upgrade is safe/valid, but Leapp will still block it.

It is probably unrealistic to produce 100% correct heuristic whether the NIC names will change or not as there are multiple ways to stabilize them (ifcfg files with HWADDR entry, udev rules to name the ones i know about :) ). Therefore it would probably be best to keep the actor enabled by default, but provide a way to disable it if the user is certain they have made their NIC naming stable.

Version-Release number of selected component (if applicable):

leapp-repository-0.7.0-5.el7_6.noarch

https://github.com/oamg/leapp-repository/blob/d8387b491f4d10536ec2927af2825ef4da1187cf/repos/system_upgrade/el7toel8/actors/persistentnetnamesdisable/actor.py

How reproducible: always

Steps to Reproduce:

* Have a machine with multiple eth<number> NICs.

* Make the naming stable e.g. via HWADDR entries in ifcfg files.

* `leapp upgrade` refuses to do the upgrade.

* (After manually editing the PersitentNetNamesDisable actor to mark it as experimental, the upgrade proceeds without issues, and machine boots up with the same NIC names as before the upgrade.)

Expected results:

If i'm certain my NIC names will be the same even after the upgrade, i'd like to be able to disable the PersistentNetNamesDisable actor.

Comment 2 Michal Bocek 2019-06-19 12:31:25 UTC
Michal Sekletar can comment on the feasibility of improving the actor by detecting "stable" multiple-ethX-NIC configurations.

In regards to disabling actors, we may add a generic option to leapp to disable any actor, but this option would always be considered at-your-own-risk option causing the upgrade to be unsupported should any issue related to the disabled actor arise.

Comment 3 Petr Stodulka 2019-06-19 12:44:51 UTC
> In regards to disabling actors, we may add a generic option to leapp to disable any actor, but this option would always be considered at-your-own-risk option causing the upgrade to be unsupported should any issue related to the disabled actor arise.

I really do not like the idea. Maybe we will implement that, but I hope that not. The solution of the problem in this case should be a user question. So the user would specify that issues connected to the NIC names are "False positives" and upgrade should be processed anyway (on their own risk). - Unless there is a bug in actor. Currently it is not possible to ask user for cooperation now. So probably this issues will be postponed until the user questions will be implemented.

Anyway, @msekleta, what do you think about the issue?

Comment 4 Jiri Stransky 2019-06-20 09:46:06 UTC
Our use of Leapp is not interactive, so users will not have the opportunity to answer questions on CLI, let's please also include a non-interactive solution (e.g. CLI parameter or environment variable or generic way to disable an actor).

Regarding improvement of the heuristic, i think one way is to look at `ifcfg` files for the eth interfaces and check if there is HWADDR entry, which will stabilize the name even on RHEL 8. That would reduce the false positives but it does not remove them completely, e.g. another way to stabilize names is via custom udev rules (OpenStack project os-net-config utilizes them [1]), but those could be more difficult to parse as they are much more free-form than ifcfg files. We should probably have a manual override of the actor logic anyway, to ensure users don't get blocked from upgrading.

[1] https://github.com/openstack/os-net-config/blob/22aeee1b87731618356f2792d49795a01f40ec19/os_net_config/sriov_config.py#L194-L208

Comment 5 Jiri Stransky 2019-06-20 09:51:24 UTC
E.g. on my machine right now i have:

[heat-admin@controller-0 ~]$ cat /etc/udev/rules.d/70-persistent-net.rules 
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:0c:3c:41", NAME="eth0"

which stabilizes the name for eth0. But this solution is quite specific to OpenStack so i wouldn't necessarily expect Leapp to look for this. I think we'd be fine with disabling/overriding the actor.

Comment 6 Petr Stodulka 2019-06-20 09:55:23 UTC
@Jiri: We are not talking so much about interactive upgrade as about something like possibility to specify answers using asnwerfile - not telling that answerfile will be the way, but imagine that we will produce something which user would be able to modify and use it for 2nd run.

I will wait for asnwer from Michal. AFAIK ifcfg files are deprecated and are not 100% safe. But Michal knows more about that.

Comment 7 Michal Sekletar 2019-06-21 08:11:33 UTC
Jiri, what makes you believe that ifcfg with NAME+HWADDR where NAME=ethX makes naming stable? It is just a pure luck that you didn't run into problems yet. Imagine system with 2 virtio-net NICs.

It is very likely that two virtio NICs might show up in parallel and things then just blow up (in second column are names configured in ifcfg files).

eth0 (virtio) -> eth1 (eth1 name is already taken, rename fails) 
eth1 (virtio) -> eth0 (eth0 name is already taken, rename fails)

Sure NICs have ethX names but in a swapped order.

What I described above is the situation from RHEL-7. In RHEL-6 udev had couple more tricks up its sleeve to prevent this (temporary rename to rename_X name to free up the name), however, I can make the situation bit more complex by throwing one more NIC into mix (different driver) and that scheme will fall flat on its face (as it did many times in RHEL-6). I mean, there is a reason why we ditched it in RHEL-7.

Btw, NAME+HWADDR works perfectly well, *IF*, you use different prefix for your NICs (e.g. net, lan).

Comment 8 Petr Stodulka 2019-06-21 08:26:35 UTC
Closing as cantfix. Jiri, if you think that this is not right, discuss it with Michal first before you reopen the bug. Without working safe solution around ethX (includes Michal's approval the solution is correct), there will not be plan to do any changes around. Only thing that still would be implemented in future, is possibility of user-question, that will allow user to shoot themselves in leg - but from this point, I don't want to any such solution like that - as we have bunch of other precedences, why we should not do that.

Comment 9 Jiri Stransky 2019-06-21 10:57:21 UTC
Thanks for the responses, i didn't know about the renaming collisions issue which could cause the naming to fail. Given that this is not solvable safely and we know the reasons why now, i agree with CLOSED/CANTFIX.


Note You need to log in before you can comment on or make changes to this bug.