Bug 1118816
Summary: | Unexpected NICs rename after node approval | |
---|---|---|---
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Robert McSwain <rmcswain>
Component: | ovirt-node | Assignee: | Douglas Schilling Landgraf <dougsland>
Status: | CLOSED CURRENTRELEASE | QA Contact: | Virtualization Bugs <virt-bugs>
Severity: | medium | Docs Contact: |
Priority: | medium | |
Version: | 3.3.0 | CC: | asegundo, bazulay, bugs, cshao, dfediuck, dougsland, ederevea, fdeutsch, gouyang, harald, huiwa, iheim, laerciomasala, leiwang, mgoldboi, ovirt-bugs, ovirt-maint, rbalakri, rbarry, rhodain, rmcswain, yaniwang, ycui, yeylon, ylavi
Target Milestone: | --- | |
Target Release: | 3.6.0 | |
Hardware: | All | |
OS: | Linux | |
Whiteboard: | node | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | 1061672 | Environment: |
Last Closed: | 2015-07-22 21:34:07 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | Node | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 1061672 | |
Bug Blocks: | | |
Attachments: | | |
Description
Robert McSwain
2014-07-11 15:12:17 UTC
Could reproduce this issue in rhev-hypervisor6-6.5-20140407.0 + Red Hat Enterprise Virtualization Manager Version: 3.3.4-0.53.el6ev

[root@hp-xw4550-02 admin]# dmesg |grep rename
udev: renamed network interface eth0 to eth4
udev: renamed network interface eth3 to eth5
udev: renamed network interface eth1 to eth6

but this bug has been fixed in rhev-hypervisor6-6.5-20140715.0 + Red Hat Enterprise Virtualization Manager Version: 3.3.4-0.53.el6ev

Haiyang,

could you please provide the files
70-persistent-net.rules
71-persistent-node-net.rules
from /etc/udev/rules.d

From both RHEV-H versions if possible:
rhev-hypervisor6-6.5-20140407.0 and rhev-hypervisor6-6.5-20140715.0

Created attachment 922400 [details]
attached 70-persistent-net.rules from /etc/udev/rules.d

(In reply to Fabian Deutsch from comment #2)
> Haiyang,
>
> could you please provide the files 70-persistent-net.rules
> 71-persistent-node-net.rules from /etc/udev/rules.d

[root@hp-xw4550-02 rules.d]# diff 70-persistent-net.rules 71-persistent-node-net.rules
18,29d17
<
< # PCI device 0x14e4:0x167b (tg3) (custom name provided by external tool)
< SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:23:7d:53:ab:75", ATTR{type}=="1", KERNEL=="eth*", NAME="eth2"
<
< # PCI device 0x14e4:0x1639 (bnx2)
< SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:10:18:81:a4:a2", ATTR{type}=="1", KERNEL=="eth*", NAME="eth4"
<
< # PCI device 0x8086:0x107c (e1000)
< SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:0e:0c:b0:95:6a", ATTR{type}=="1", KERNEL=="eth*", NAME="eth5"
<
< # PCI device 0x14e4:0x1639 (bnx2)
< SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:10:18:81:a4:a0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth6"

Created attachment 922401 [details]
attached 71-persistent-node-net.rules from /etc/udev/rules.d
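The `dmesg | grep rename` output quoted above can be turned into an explicit old-to-new mapping, which makes it easier to compare runs across RHEV-H builds. The following is a minimal sketch, not part of any RHEV-H tooling: the function name `parse_renames` is made up for illustration, and the sample text is copied from the transcript in this bug.

```python
import re

# Matches kernel log lines such as:
#   udev: renamed network interface eth0 to eth4
RENAME_RE = re.compile(r"udev: renamed network interface (\S+) to (\S+)")


def parse_renames(dmesg_text):
    """Return a list of (old_name, new_name) tuples found in dmesg output.

    Hypothetical helper for illustration only.
    """
    return RENAME_RE.findall(dmesg_text)


# Sample copied from the first transcript in this bug.
sample = """\
udev: renamed network interface eth0 to eth4
udev: renamed network interface eth3 to eth5
udev: renamed network interface eth1 to eth6
"""

renames = parse_renames(sample)
for old, new in renames:
    print(f"{old} -> {new}")
```

On a live system the text would typically come from the `dmesg` command; the sketch deliberately works on a string so it can be run anywhere.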
Could reproduce this issue in rhev-hypervisor6-6.5-20140725.0.el6ev + Red Hat Enterprise Virtualization Manager Version: 3.3.4-0.53.el6ev

[root@hp-xw4550-02 admin]# dmesg |grep rename
udev: renamed network interface eth1 to eth4
udev: renamed network interface eth3 to eth5
udev: renamed network interface eth0 to eth6

[root@hp-xw4550-02 admin]# diff /etc/udev/rules.d/70-persistent-net.rules /etc/udev/rules.d/71-persistent-node-net.rules
18,29d17
<
< # PCI device 0x14e4:0x167b (tg3) (custom name provided by external tool)
< SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:23:7d:53:ab:75", ATTR{type}=="1", KERNEL=="eth*", NAME="eth2"
<
< # PCI device 0x14e4:0x1639 (bnx2)
< SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:10:18:81:a4:a2", ATTR{type}=="1", KERNEL=="eth*", NAME="eth4"
<
< # PCI device 0x8086:0x107c (e1000)
< SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:0e:0c:b0:95:6a", ATTR{type}=="1", KERNEL=="eth*", NAME="eth5"
<
< # PCI device 0x14e4:0x1639 (bnx2)
< SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:10:18:81:a4:a0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth6"

but couldn't reproduce it in rhev-hypervisor6-6.5-20140725.0.el6ev + Red Hat Enterprise Virtualization Manager Version: 3.4.1-0.30.el6ev

Hi Douglas,

We need to investigate this bug again.
As noted in comment 1, QE did not encounter this issue with rhev-hypervisor6-6.5-20140715.0.iso + RHEVM 3.3.4-0.53.el6ev.
But in comment 9, QE encountered this issue with rhev-hypervisor6-6.5-20140725.0.el6ev + RHEVM 3.3.4-0.53.el6ev.

# diff 0725_manifest-srpm.txt 0715_manifest-srpm.txt
53c53
< e2fsprogs-1.41.12-18.el6_5.1.src.rpm
---
> e2fsprogs-1.41.12-18.el6.src.rpm
80c80
< glusterfs-3.4.0.57rhs-1.el6_5.src.rpm
---
> glusterfs-3.6.0.24-1.el6_5.src.rpm
107c107
< kernel-2.6.32-431.20.5.el6.src.rpm
---
> kernel-2.6.32-431.20.3.el6.src.rpm
150a151
> librdmacm-1.0.17-1.el6.src.rpm
198c199
< nss-3.16.1-4.el6_5.src.rpm
---
> nss-3.15.3-6.el6_5.src.rpm
208c209
< ovirt-node-3.0.1-18.el6.14.src.rpm
---
> ovirt-node-3.0.1-18.el6_5.13.src.rpm
287c288
< sos-2.2-47.el6_5.7.src.rpm
---
> sos-2.2-47.el6_5.1.src.rpm
316c317
< vdsm-4.14.11-5.el6ev.src.rpm
---
> vdsm-4.14.7-7.el6ev.src.rpm

(In reply to Ying Cui from comment #10)
…
> < ovirt-node-3.0.1-18.el6.14.src.rpm
> ---
> > ovirt-node-3.0.1-18.el6_5.13.src.rpm
…

The node change does not look suspicious.

Is vdsm doing something with those files?

Haiyang, could you please attach the contents of both files.

Created attachment 923337 [details]
70-persistent-net.rules
Created attachment 923338 [details]
rules_after_approval
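Rules files such as the ones attached above can be checked mechanically for the symptom discussed in this bug: one MAC address that ends up with more than one `NAME=` rule. The sketch below is only an illustration under stated assumptions. The helper names `names_by_mac` and `repeated_macs` are hypothetical, the parser only understands the simple `ATTR{address}==... NAME=...` rule form that appears in this report, and the sample reuses two MAC addresses from a later comment in this bug (one of which that comment reports under both eth1 and eth6).

```python
import re
from collections import defaultdict

# Patterns for the rule form seen in this bug's 70-persistent-net.rules.
ADDR_RE = re.compile(r'ATTR\{address\}=="([0-9A-Fa-f:]+)"')
NAME_RE = re.compile(r'NAME="([^"]+)"')


def names_by_mac(rules_text):
    """Map each MAC address to the interface names assigned to it, in file order."""
    result = defaultdict(list)
    for line in rules_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and "# PCI device ..." comments
        addr = ADDR_RE.search(line)
        name = NAME_RE.search(line)
        if addr and name:
            result[addr.group(1).lower()].append(name.group(1))
    return dict(result)


def repeated_macs(rules_text):
    """Return only the MAC addresses that are matched by more than one rule."""
    return {mac: names for mac, names in names_by_mac(rules_text).items()
            if len(names) > 1}


RULE = ('SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", '
        'ATTR{address}=="%s", ATTR{type}=="1", KERNEL=="eth*", NAME="%s"')

# MAC addresses taken from a later comment in this bug.
sample_rules = "\n".join([
    "# PCI device 0x10ec:0x8139 (8139cp)",
    RULE % ("52:54:00:cc:03:2d", "eth0"),
    RULE % ("52:54:00:91:21:c4", "eth1"),
    RULE % ("52:54:00:91:21:c4", "eth6"),
])

for mac, names in repeated_macs(sample_rules).items():
    print(mac, names)
```

Grouping by MAC rather than by interface name is a deliberate choice here, since the rules in this bug key on `ATTR{address}` and the names are what drift.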
(In reply to Fabian Deutsch from comment #11)
> (In reply to Ying Cui from comment #10)
> …
> > < ovirt-node-3.0.1-18.el6.14.src.rpm
> > ---
> > > ovirt-node-3.0.1-18.el6_5.13.src.rpm
> …
>

Hi Fabian,

> The node change does not look suspicious.
>

Thanks for jumping in.

> Is vdsm doing something with those files?

I am investigating; I cannot say it's VDSM yet. I have reproduced it only once.

>
> Haiyang, could you please attach the contents of both files.

I have attached them.

This points to 'udevadm control --reload-rules', which seems to indicate that, for some reason, udev's /lib/udev/write_net_rules writes new rules for devices that already matched when the rules are reloaded.

Well, that's an unsolved problem.

We removed that in RHEL-7. We do not rename in the eth* namespace anymore, because it simply does not work.

Also, not only 70-persistent-net.rules plays a role, but also the ifcfg-* files.

You should check that the HWADDR entries in the ifcfg-* files are in sync.

But in the end it's an illusion that it works every time. Just rename the interfaces out of the eth* namespace to something like net0 and all is fine.

That said, the current scheme "could" work, if _all_ but really _all_ eth* interfaces are defined in ifcfg-* or 70-persistent-net.rules and are in sync.

Guys,

Thanks for all the comments. My findings:

I have reproduced the report on rhev-hypervisor6-6.5-20140407.0 and rhev-hypervisor6-6.5-20140715.0.

Steps to reproduce:
1) Install RHEV-H as a virtual machine on Virt-Manager with 5 (or more) NICs.
   In my case:
   Network source: Virtual Network 'default': NAT
   Device model: rtl8139
2) Configure Network (eth0) via TUI
3) Register the node against RHEV-M
   In my case: RHEV-M-3.3.1-0.48.el6ev

====== What happens after user approval on RHEV-M? ====
/etc/udev/rules.d/70-persistent-net.rules contains duplicate information about the NICs.
However, the duplicated entries use incremented NIC names. For example, eth1:

SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:91:21:c4", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"

is now eth6 as well:

SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:91:21:c4", ATTR{type}=="1", KERNEL=="eth*", NAME="eth6"

The same duplication also happens to the following interfaces:
eth2 now is eth7
eth3 now is eth9
eth4 now is eth8
eth5 now is eth10
===================================================

Based on Harald's comment #16, I did a different test.

1) Configured eth0 (now we have /etc/sysconfig/network-scripts/ifcfg-eth0)
2) Manually added /etc/sysconfig/network-scripts/ifcfg-<interface-name> for interfaces eth2, eth3, eth4, eth5
   Example:
   ...
   DEVICE="eth2"
   HWADDR="52:54:00:D4:07:4E"
   ...

So all interfaces have the same HWADDR in /etc/sysconfig/network-scripts/ifcfg-<interface-name> and in /etc/udev/rules.d/70-persistent-net.rules.

After node approval I don't see eth7, eth8, eth9, eth10 in the TUI. However, in /etc/udev/rules.d/70-persistent-net.rules I see that the rules for eth0, eth1, eth2, eth3, eth4, eth5 have been duplicated.
/etc/udev/rules.d/70-persistent-net.rules:

# PCI device 0x10ec:0x8139 (8139cp)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:cc:03:2d", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

# PCI device 0x10ec:0x8139 (8139cp)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:91:21:c4", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"

# PCI device 0x10ec:0x8139 (8139cp)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:01:35:ef", ATTR{type}=="1", KERNEL=="eth*", NAME="eth3"

# PCI device 0x10ec:0x8139 (8139cp)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:d4:07:4e", ATTR{type}=="1", KERNEL=="eth*", NAME="eth2"

# PCI device 0x10ec:0x8139 (8139cp)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:8c:e7:59", ATTR{type}=="1", KERNEL=="eth*", NAME="eth5"

# PCI device 0x10ec:0x8139 (8139cp)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:68:f4:2b", ATTR{type}=="1", KERNEL=="eth*", NAME="eth4"

======= here starts the duplication ===========

# PCI device 0x10ec:0x8139 (8139cp) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:cc:03:2d", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

# PCI device 0x10ec:0x8139 (8139cp) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:91:21:c4", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"

# PCI device 0x10ec:0x8139 (8139cp) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:d4:07:4e", ATTR{type}=="1", KERNEL=="eth*", NAME="eth2"

# PCI device 0x10ec:0x8139 (8139cp) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:01:35:ef", ATTR{type}=="1", KERNEL=="eth*", NAME="eth3"

# PCI device 0x10ec:0x8139 (8139cp) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:8c:e7:59", ATTR{type}=="1", KERNEL=="eth*", NAME="eth5"

# PCI device 0x10ec:0x8139 (8139cp) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:68:f4:2b", ATTR{type}=="1", KERNEL=="eth*", NAME="eth4"

Possible workarounds are:
- Sync the ifcfg-<interfaces> files in the TUI to contain the same HWADDR as 70-persistent-net.rules.
- In the vdsm plugin, before the registration, run rm -f 70-persistent-net.rules, since during the registration we will regenerate it or, in case of failure, it will be generated at the next node boot.

Thoughts/Suggestions are welcome.

======= Additional data ===========
As I see reports of this working with other ISOs, I have generated rhev-hypervisor6-6.5-20140715.0 with updated udev (udev-147-2.57) or updated kernel (kernel-2.6.32-431.20.5) or updated ovirt-node (ovirt-node-3.0.1-18.el6.14), and it doesn't work either.

We can probably not fix it completely according to comment 16, but we can improve the situation.

A couple of things that can be improved:
1. Prevent duplication of the entries in the 71-persistent file
2. Ensure that the HWADDR entries in ifcfg-* match the 71-persistent entries

But then it still depends on vdsm to do the same thing.

Ying, can your team please try to reproduce this bug on RHEV-H 6.6 and RHEV-H 7.1 from 3.5.1?

See comment 1, haiyang will trace this bug. Thanks.

(In reply to Fabian Deutsch from comment #19)
> Ying, can your team please try to reproduce this bug on RHEV-H 6.6 and RHEV-H
> 7.1 from 3.5.1?

This bug could not be reproduced in the following versions; it seems to have been fixed.

RHEV Hypervisor - 6.6 - 20150421.0.el6ev
RHEV Hypervisor - 7.1 - 20150512.1.el7ev
Red Hat Enterprise Virtualization Manager Version: 3.5.1.1-0.1.el6ev(vt14.4)

Robert, can you please ask the customer if he is still seeing this issue with RHEV-H 3.5.1-1?

After all, I am quite sure that the bug is not generally fixed, because we are aware of the general problem.
But it may be that timing or other udev changes reduce the probability of this bug in 3.5.1-1.

The version of this bug is RHEV 3.3. Since then a lot has been improved in this area, thus lowering the priority and waiting for the reply from Robert.

Can you please get us the needed info?

This bug exists in RHEV 3.3, and some improvements were made in RHEV 3.4. In 3.5 another bug was discovered in that area, which will be fixed with 3.5.4.

However, this bug should not be present in RHEV-H 7 (because predictable device names are used), and thus I'd recommend closing this bug and suggesting that the user move to RHEV-H 7.

(In reply to Fabian Deutsch from comment #29)
> This bug exists in RHEV 3.3, and some improvements were made in RHEV 3.4. In
> 3.5 another bug was discovered in that area, which will be fixed with
> 3.5.4.
>
> However, this bug should not be present in RHEV-H 7 (because predictable
> device names are used), and thus I'd recommend closing this bug and suggesting
> that the user move to RHEV-H 7.

agreed.
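The suggestion made earlier in this thread, that the HWADDR entries in ifcfg-* should match the persistent-net rules, can be expressed as a small consistency check. This is a rough sketch under stated assumptions, not how ovirt-node or vdsm implement anything: the helper names `parse_ifcfg`, `rule_names_for_mac` and `in_sync` are invented for illustration, the ifcfg parser only handles plain `KEY="value"` lines, and the sample values (eth2, 52:54:00:D4:07:4E, and the extra eth7 rule) are taken from the comments above.

```python
import re

ADDR_RE = re.compile(r'ATTR\{address\}=="([0-9A-Fa-f:]+)"')
NAME_RE = re.compile(r'NAME="([^"]+)"')


def parse_ifcfg(text):
    """Parse simple KEY=value lines of an ifcfg-* file, stripping quotes."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("\"'")
    return values


def rule_names_for_mac(rules_text, mac):
    """Return every NAME assigned to the given MAC in a rules file."""
    names = []
    for line in rules_text.splitlines():
        if line.lstrip().startswith("#"):
            continue
        addr = ADDR_RE.search(line)
        name = NAME_RE.search(line)
        # MAC comparison is case-insensitive: ifcfg uses upper case here,
        # the udev rules use lower case.
        if addr and name and addr.group(1).lower() == mac.lower():
            names.append(name.group(1))
    return names


def in_sync(ifcfg_text, rules_text):
    """True if the ifcfg HWADDR maps to exactly the ifcfg DEVICE name in the rules."""
    cfg = parse_ifcfg(ifcfg_text)
    device, hwaddr = cfg.get("DEVICE"), cfg.get("HWADDR")
    if not device or not hwaddr:
        return False
    names = rule_names_for_mac(rules_text, hwaddr)
    return bool(names) and all(name == device for name in names)


RULE = ('SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", '
        'ATTR{address}=="%s", ATTR{type}=="1", KERNEL=="eth*", NAME="%s"')

ifcfg_eth2 = 'DEVICE="eth2"\nHWADDR="52:54:00:D4:07:4E"\n'
rules_ok = RULE % ("52:54:00:d4:07:4e", "eth2")
# The same MAC also mapped to eth7, as reported after node approval.
rules_bad = rules_ok + "\n" + RULE % ("52:54:00:d4:07:4e", "eth7")

print(in_sync(ifcfg_eth2, rules_ok))
print(in_sync(ifcfg_eth2, rules_bad))
```

The check treats a missing HWADDR or a missing rule as "not in sync", which follows the earlier remark that the scheme can only work if all eth* interfaces are defined consistently in both places.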