Description of problem:
After approving the host on RHEV, the NICs are out of order. Only eth0 is OK (I have attached an image). The Ethernet entries are duplicated in the file "70-persistent-net.rules" on the hypervisor.

Version-Release number of selected component (if applicable):
RHEV 3.3, rhev-hypervisor6-6.5-20140407.0, and an HP enclosure with Virtual Connect.

How reproducible:
You only need to add a hypervisor using RHEV 3.3.

Actual results:
The Ethernet interfaces are out of order.

+++ This bug was initially created as a clone of Bug #1061672 +++

Description of problem:
After approving the host on "oVirt", the "NICs" are disordered. Only eth0 is OK...

eth0 -> eth0 (NICE)
eth1 -> eth5
eth2 -> eth6
eth3 -> eth4

/etc/udev/rules.d/70-persistent-net.rules contains duplicate info

Version-Release number of selected component (if applicable):
ovirt-node-iso-3.0.4-1.0.201401291204.vdsm.el6.iso

How reproducible:

Steps to Reproduce:
1. Install oVirt Node
2. Config network / hostname / DNS
3. Enable SSH password authentication
4. Enter Management Server / Port / Password and Save & Register
5. Approve in oVirt Manager

Actual results:
eth0 eth4 eth5 eth6

Expected results:
eth0 eth1 eth2 eth3

Additional info:
[root@ptihost-vdsm-dev04 admin]# cd /etc/udev/rules.d/
[root@ptihost-vdsm-dev04 rules.d]# ls
12-ovirt-iosched.rules 60-raw.rules 70-persistent-cd.rules 70-persistent-net.rules 71-persistent-node-net.rules 80-kvm.rules 90-hal.rules 98-kexec.rules 99-fuse.rules
[root@ptihost-vdsm-dev04 rules.d]# cat 70-persistent-net.rules
# This file was automatically generated by the /lib/udev/write_net_rules
# program, run by the persistent-net-generator.rules rules file.
#
# You can modify it, as long as you keep each rule on a single
# line, and change only the value of the NAME= key.

# PCI device 0x8086:0x105e (e1000e)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1e:68:4a:10:cb", ATTR{type}=="1", KERNEL=="eth*", NAME="eth3"

# PCI device 0x8086:0x1096 (e1000e)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1e:68:4a:10:c8", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

# PCI device 0x8086:0x1096 (e1000e)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1e:68:4a:10:c9", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"

# PCI device 0x8086:0x105e (e1000e)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1e:68:4a:10:ca", ATTR{type}=="1", KERNEL=="eth*", NAME="eth2"

# PCI device 0x8086:0x105e (e1000e)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1e:68:4a:10:cb", ATTR{type}=="1", KERNEL=="eth*", NAME="eth4"

# PCI device 0x8086:0x1096 (e1000e)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1e:68:4a:10:c9", ATTR{type}=="1", KERNEL=="eth*", NAME="eth5"

# PCI device 0x8086:0x1096 (e1000e) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1e:68:4a:10:c8", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

# PCI device 0x8086:0x105e (e1000e)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1e:68:4a:10:ca", ATTR{type}=="1", KERNEL=="eth*", NAME="eth6"
[root@ptihost-vdsm-dev04 rules.d]#
[root@ptihost-vdsm-dev04 rules.d]# dmesg | grep rename
udev: renamed network interface eth3 to eth4
udev: renamed network interface eth1 to eth5
udev: renamed network interface eth2 to eth6

--- Additional comment from Itamar Heim on 2014-02-09 03:52:49 EST ---

Setting target release to current version for consideration and review. please do not push non-RFE bugs to an undefined target release to make sure bugs are reviewed for relevancy, fix, closure, etc.

--- Additional comment from Sandro Bonazzola on 2014-03-04 04:26:39 EST ---

This is an automated message.
Re-targeting all non-blocker bugs still open on 3.4.0 to 3.4.1.

--- Additional comment from Sandro Bonazzola on 2014-05-08 09:50:34 EDT ---

This is an automated message.
oVirt 3.4.1 has been released. This issue has been retargeted to 3.4.2 as it has priority high, please retarget if needed. If this is a blocker please add it to the tracker Bug #1095370

--- Additional comment from Sandro Bonazzola on 2014-06-11 03:04:35 EDT ---

This is an automated message:
oVirt 3.4.2 has been released. This bug has been re-targeted from 3.4.2 to 3.4.3 since priority or severity were high or urgent.

--- Additional comment from Sandro Bonazzola on 2014-06-11 03:05:13 EDT ---

This is an automated message:
oVirt 3.4.2 has been released. This bug has been re-targeted from 3.4.2 to 3.4.3 since priority or severity were high or urgent.
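The duplicated entries in the 70-persistent-net.rules file quoted in the description are easier to spot mechanically. The following is a minimal Python sketch, not part of udev or oVirt Node, that groups the NAME= values by ATTR{address}. The function names and the regular expression are illustrative assumptions, and the sketch assumes the one-rule-per-line layout that the file header describes. The sample contains only two of the reported rules.

```python
import re
from collections import defaultdict

# Hypothetical pattern: captures the MAC address and the assigned NAME of one rule line.
RULE_RE = re.compile(r'ATTR\{address\}=="([0-9A-Fa-f:]{17})".*?NAME="([^"]+)"')

# Two rules copied from the report: the same MAC is given two names.
SAMPLE = '''\
# PCI device 0x8086:0x105e (e1000e)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1e:68:4a:10:cb", ATTR{type}=="1", KERNEL=="eth*", NAME="eth3"

# PCI device 0x8086:0x105e (e1000e)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1e:68:4a:10:cb", ATTR{type}=="1", KERNEL=="eth*", NAME="eth4"
'''


def names_by_mac(rules_text):
    """Map each MAC address to the names assigned to it, in file order."""
    result = defaultdict(list)
    for line in rules_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = RULE_RE.search(line)
        if match:
            result[match.group(1).lower()].append(match.group(2))
    return dict(result)


def duplicated_macs(rules_text):
    """Return only the MAC addresses that appear in more than one rule."""
    return {mac: names
            for mac, names in names_by_mac(rules_text).items()
            if len(names) > 1}


if __name__ == "__main__":
    for mac, names in duplicated_macs(SAMPLE).items():
        print(mac, "->", ", ".join(names))
```

To check a real host, the contents of /etc/udev/rules.d/70-persistent-net.rules would be read and passed to duplicated_macs() in place of SAMPLE.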
Could reproduce this issue in
rhev-hypervisor6-6.5-20140407.0 + Red Hat Enterprise Virtualization Manager Version: 3.3.4-0.53.el6ev

[root@hp-xw4550-02 admin]# dmesg |grep rename
udev: renamed network interface eth0 to eth4
udev: renamed network interface eth3 to eth5
udev: renamed network interface eth1 to eth6

but this bug has been fixed in
rhev-hypervisor6-6.5-20140715.0 + Red Hat Enterprise Virtualization Manager Version: 3.3.4-0.53.el6ev
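When comparing reproductions across builds, it can help to turn the dmesg rename lines into pairs. This is a small Python sketch under the assumption that the messages have exactly the form shown in this report; the names RENAME_RE and parse_renames are hypothetical.

```python
import re

# Assumed message format, taken from the dmesg output in this report.
RENAME_RE = re.compile(r"udev: renamed network interface (\S+) to (\S+)")

# Output copied from the comment above.
DMESG_SAMPLE = """\
udev: renamed network interface eth0 to eth4
udev: renamed network interface eth3 to eth5
udev: renamed network interface eth1 to eth6
"""


def parse_renames(dmesg_text):
    """Return (old_name, new_name) pairs in the order they were logged."""
    return [(m.group(1), m.group(2)) for m in RENAME_RE.finditer(dmesg_text)]


if __name__ == "__main__":
    for old, new in parse_renames(DMESG_SAMPLE):
        print(old, "->", new)
```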
Haiyang, could you please provide the files 70-persistent-net.rules and 71-persistent-node-net.rules from /etc/udev/rules.d, from both RHEV-H versions if possible (rhev-hypervisor6-6.5-20140407.0 and rhev-hypervisor6-6.5-20140715.0)?
Created attachment 922400
attached 70-persistent-net.rules from /etc/udev/rules.d

(In reply to Fabian Deutsch from comment #2)
> Haiyang,
>
> could you please provide the files 70-persistent-net.rules
> 71-persistent-node-net.rules from /etc/udev/rules.d

[root@hp-xw4550-02 rules.d]# diff 70-persistent-net.rules 71-persistent-node-net.rules
18,29d17
<
< # PCI device 0x14e4:0x167b (tg3) (custom name provided by external tool)
< SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:23:7d:53:ab:75", ATTR{type}=="1", KERNEL=="eth*", NAME="eth2"
<
< # PCI device 0x14e4:0x1639 (bnx2)
< SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:10:18:81:a4:a2", ATTR{type}=="1", KERNEL=="eth*", NAME="eth4"
<
< # PCI device 0x8086:0x107c (e1000)
< SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:0e:0c:b0:95:6a", ATTR{type}=="1", KERNEL=="eth*", NAME="eth5"
<
< # PCI device 0x14e4:0x1639 (bnx2)
< SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:10:18:81:a4:a0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth6"
Created attachment 922401
attached 71-persistent-node-net.rules from /etc/udev/rules.d
Could reproduce this issue in
rhev-hypervisor6-6.5-20140725.0.el6ev + Red Hat Enterprise Virtualization Manager Version: 3.3.4-0.53.el6ev

[root@hp-xw4550-02 admin]# dmesg |grep rename
udev: renamed network interface eth1 to eth4
udev: renamed network interface eth3 to eth5
udev: renamed network interface eth0 to eth6

[root@hp-xw4550-02 admin]# diff /etc/udev/rules.d/70-persistent-net.rules /etc/udev/rules.d/71-persistent-node-net.rules
18,29d17
<
< # PCI device 0x14e4:0x167b (tg3) (custom name provided by external tool)
< SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:23:7d:53:ab:75", ATTR{type}=="1", KERNEL=="eth*", NAME="eth2"
<
< # PCI device 0x14e4:0x1639 (bnx2)
< SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:10:18:81:a4:a2", ATTR{type}=="1", KERNEL=="eth*", NAME="eth4"
<
< # PCI device 0x8086:0x107c (e1000)
< SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:0e:0c:b0:95:6a", ATTR{type}=="1", KERNEL=="eth*", NAME="eth5"
<
< # PCI device 0x14e4:0x1639 (bnx2)
< SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:10:18:81:a4:a0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth6"

but couldn't reproduce it in
rhev-hypervisor6-6.5-20140725.0.el6ev + Red Hat Enterprise Virtualization Manager Version: 3.4.1-0.30.el6ev
Hi Douglas,

We need to investigate this bug again.
According to comment 1, QE did not encounter this issue with rhev-hypervisor6-6.5-20140715.0.iso + RHEVM 3.3.4-0.53.el6ev.
But according to comment 9, QE did encounter this issue with rhev-hypervisor6-6.5-20140725.0.el6ev + RHEVM 3.3.4-0.53.el6ev.

# diff 0725_manifest-srpm.txt 0715_manifest-srpm.txt
53c53
< e2fsprogs-1.41.12-18.el6_5.1.src.rpm
---
> e2fsprogs-1.41.12-18.el6.src.rpm
80c80
< glusterfs-3.4.0.57rhs-1.el6_5.src.rpm
---
> glusterfs-3.6.0.24-1.el6_5.src.rpm
107c107
< kernel-2.6.32-431.20.5.el6.src.rpm
---
> kernel-2.6.32-431.20.3.el6.src.rpm
150a151
> librdmacm-1.0.17-1.el6.src.rpm
198c199
< nss-3.16.1-4.el6_5.src.rpm
---
> nss-3.15.3-6.el6_5.src.rpm
208c209
< ovirt-node-3.0.1-18.el6.14.src.rpm
---
> ovirt-node-3.0.1-18.el6_5.13.src.rpm
287c288
< sos-2.2-47.el6_5.7.src.rpm
---
> sos-2.2-47.el6_5.1.src.rpm
316c317
< vdsm-4.14.11-5.el6ev.src.rpm
---
> vdsm-4.14.7-7.el6ev.src.rpm
(In reply to Ying Cui from comment #10)
…
> < ovirt-node-3.0.1-18.el6.14.src.rpm
> ---
> > ovirt-node-3.0.1-18.el6_5.13.src.rpm
…

The node change does not look suspicious.

Is vdsm doing something with those files?

Haiyang, could you please attach the contents of both files?
Created attachment 923337
70-persistent-net.rules
Created attachment 923338
rules_after_approval
(In reply to Fabian Deutsch from comment #11)
> (In reply to Ying Cui from comment #10)
> …
> > < ovirt-node-3.0.1-18.el6.14.src.rpm
> > ---
> > > ovirt-node-3.0.1-18.el6_5.13.src.rpm
> …

Hi Fabian,

> The node change does not look suspicious.

Thanks for jumping in.

> Is vdsm doing something with those files?

I am investigating; I cannot say it's VDSM yet. I have reproduced it only once.

> Haiyang, could you please attach the contents of both files.

I have attached them.
This points to 'udevadm control --reload-rules', which seems to indicate that, for some reason, udev's /lib/udev/write_net_rules writes new rules for devices that already matched when the rules are reloaded.
Well, that's an unsolved problem.

We removed that in RHEL-7. We do not rename in the eth* namespace anymore, because it simply does not work.

Also, 70-persistent-net.rules is not the only file that plays a role; the ifcfg-* files do as well. You should check that the HWADDR entries in the ifcfg-* files are in sync with it.

But in the end it is an illusion that this works every time. Just rename the interfaces out of the eth* namespace to something like net0 and all is fine.

That said, the current scheme "could" work if _all_, but really _all_, eth* interfaces are defined in ifcfg-* or 70-persistent-net.rules and are in sync.
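The sync check suggested here can be sketched in Python. This is only an illustration under stated assumptions: the helper names (parse_ifcfg, names_in_rules, out_of_sync) are hypothetical, the ifcfg files are assumed to use plain KEY=value lines with DEVICE and HWADDR keys, and an interface counts as "in sync" only when the rules file assigns its MAC address to exactly that one name.

```python
import re

# Hypothetical pattern: MAC address and NAME of one persistent-net rule line.
RULE_RE = re.compile(r'ATTR\{address\}=="([0-9A-Fa-f:]{17})".*?NAME="([^"]+)"')


def parse_ifcfg(text):
    """Parse KEY=value lines of an ifcfg-* file, stripping surrounding quotes."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def names_in_rules(rules_text):
    """Map each lower-cased MAC address to the set of names the rules give it."""
    names = {}
    for line in rules_text.splitlines():
        line = line.strip()
        if line.startswith("#"):
            continue
        match = RULE_RE.search(line)
        if match:
            names.setdefault(match.group(1).lower(), set()).add(match.group(2))
    return names


def out_of_sync(ifcfg_texts, rules_text):
    """List (device, hwaddr, names_in_rules) for every ifcfg that disagrees."""
    rules = names_in_rules(rules_text)
    problems = []
    for text in ifcfg_texts:
        cfg = parse_ifcfg(text)
        device, hwaddr = cfg.get("DEVICE"), cfg.get("HWADDR")
        if not device or not hwaddr:
            continue  # nothing to compare for this file
        found = rules.get(hwaddr.lower(), set())
        if found != {device}:
            problems.append((device, hwaddr.lower(), sorted(found)))
    return problems
```

The comparison lower-cases the MAC address because ifcfg files and udev rules may differ in case. An empty result from out_of_sync() does not guarantee stable names; it only says the two sources agree with each other.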
Guys,

Thanks for all the comments. My findings:

I have reproduced the report on rhev-hypervisor6-6.5-20140407.0 and rhev-hypervisor6-6.5-20140715.0.

Steps to reproduce:
1) Install RHEV-H as a virtual machine in Virt-Manager with 5 (or more) NICs.
   In my case:
   Network source: Virtual Network 'default': NAT
   Device model: rtl8139
2) Configure the network (eth0) via the TUI
3) Register the node against RHEV-M, in my case: RHEV-M-3.3.1-0.48.el6ev

====== What happens after user approval on RHEV-M? ====
/etc/udev/rules.d/70-persistent-net.rules contains duplicate information about the NICs, but with higher NIC names. For example, eth1:

SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:91:21:c4", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"

is now eth6 as well:

SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:91:21:c4", ATTR{type}=="1", KERNEL=="eth*", NAME="eth6"

The same duplication also happens to the interfaces below:
eth2 now is eth7
eth3 now is eth9
eth4 now is eth8
eth5 now is eth10
===================================================

Based on Harald's comment #16 I did a different test.

1) Configured eth0 (now we have /etc/sysconfig/network-scripts/ifcfg-eth0)
2) Manually added /etc/sysconfig/network-scripts/ifcfg-<interface-name> for interfaces eth2, eth3, eth4, eth5.
   Example:
   ...
   DEVICE="eth2"
   HWADDR="52:54:00:D4:07:4E"
   ...

So all interfaces have the same HWADDR in /etc/sysconfig/network-scripts/ifcfg-<interface-name> and in /etc/udev/rules.d/70-persistent-net.rules.

After node approval I don't see eth7, eth8, eth9, eth10 in the TUI. However, in /etc/udev/rules.d/70-persistent-net.rules I see that the rules for eth0, eth1, eth2, eth3, eth4, eth5 have been duplicated.

/etc/udev/rules.d/70-persistent-net.rules:
# PCI device 0x10ec:0x8139 (8139cp)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:cc:03:2d", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

# PCI device 0x10ec:0x8139 (8139cp)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:91:21:c4", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"

# PCI device 0x10ec:0x8139 (8139cp)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:01:35:ef", ATTR{type}=="1", KERNEL=="eth*", NAME="eth3"

# PCI device 0x10ec:0x8139 (8139cp)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:d4:07:4e", ATTR{type}=="1", KERNEL=="eth*", NAME="eth2"

# PCI device 0x10ec:0x8139 (8139cp)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:8c:e7:59", ATTR{type}=="1", KERNEL=="eth*", NAME="eth5"

# PCI device 0x10ec:0x8139 (8139cp)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:68:f4:2b", ATTR{type}=="1", KERNEL=="eth*", NAME="eth4"

======= here starts the duplication ===========

# PCI device 0x10ec:0x8139 (8139cp) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:cc:03:2d", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

# PCI device 0x10ec:0x8139 (8139cp) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:91:21:c4", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"

# PCI device 0x10ec:0x8139 (8139cp) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:d4:07:4e", ATTR{type}=="1", KERNEL=="eth*", NAME="eth2"

# PCI device 0x10ec:0x8139 (8139cp) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:01:35:ef", ATTR{type}=="1", KERNEL=="eth*", NAME="eth3"

# PCI device 0x10ec:0x8139 (8139cp) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:8c:e7:59", ATTR{type}=="1", KERNEL=="eth*", NAME="eth5"

# PCI device 0x10ec:0x8139 (8139cp) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:68:f4:2b", ATTR{type}=="1", KERNEL=="eth*", NAME="eth4"

Possible workarounds are:
- Sync the ifcfg-<interfaces> files in the TUI so that they contain the same HWADDR as 70-persistent-net.rules.
- In the vdsm plugin, run rm -f 70-persistent-net.rules before the registration, since the file will be regenerated during the registration or, in case of failure, at the next node boot.

Thoughts/Suggestions are welcome.

======= Additional data ===========
As I see reports of this working with other ISOs, I have generated rhev-hypervisor6-6.5-20140715.0 with updated udev (udev-147-2.57), or updated kernel (kernel-2.6.32-431.20.5), or updated ovirt-node (ovirt-node-3.0.1-18.el6.14), and it doesn't work either.
According to comment 16 we probably cannot fix this completely, but we can improve the situation.

A couple of things that can be improved:
1. Prevent duplication of the entries in the 71-persistent file
2. Ensure that the HWADDR entries in ifcfg-* match the 71-persistent entries

But then it still depends on vdsm to do the same thing.
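As an illustration of the first point, here is a minimal Python sketch that drops repeated rules for a MAC address that already has one. It is not the ovirt-node or vdsm implementation. The function name drop_duplicate_rules is hypothetical, and keeping the first rule per address (rather than the last) is an assumption made for the example.

```python
import re

# Hypothetical pattern: the MAC address key of a persistent-net rule line.
ADDRESS_RE = re.compile(r'ATTR\{address\}=="([0-9A-Fa-f:]{17})"')


def drop_duplicate_rules(rules_text):
    """Keep the first rule seen for each MAC address and drop later ones.

    Comment lines directly above a dropped rule are dropped with it.
    """
    seen = set()
    kept = []
    pending = []  # comment lines waiting for the rule that follows them
    for line in rules_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            pending.append(line)
            continue
        match = ADDRESS_RE.search(stripped)
        if match is None:
            # Blank or unrelated line: keep it and any comments before it.
            kept.extend(pending)
            pending = []
            kept.append(line)
            continue
        mac = match.group(1).lower()
        if mac in seen:
            pending = []  # discard the duplicate rule and its comment
            continue
        seen.add(mac)
        kept.extend(pending)
        pending = []
        kept.append(line)
    kept.extend(pending)
    return "\n".join(kept) + "\n"
```

Whether the first or the last rule should win depends on which name the ifcfg-* files expect, so a real fix would have to make that choice together with the HWADDR check in point 2.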
Ying, can your team please try to reproduce this bug on RHEV-H 6.6 and RHEV-H 7.1 from 3.5.1?
See comment 1, haiyang will trace this bug. Thanks.
(In reply to Fabian Deutsch from comment #19)
> Ying, can you team please try to reproduce this bug on RHEV-H 6.6 and RHEV-H
> 7.1 from 3.5.1?

This bug could not be reproduced with the following versions, so it seems to have been fixed:

RHEV Hypervisor - 6.6 - 20150421.0.el6ev
RHEV Hypervisor - 7.1 - 20150512.1.el7ev
Red Hat Enterprise Virtualization Manager Version: 3.5.1.1-0.1.el6ev(vt14.4)
Robert, can you please ask the customer if he is still seeing this issue with RHEV-H 3.5.1-1?
After all, I am quite sure that the bug is not fixed in general, because we are aware of the underlying general problem. But it may be that timing or other udev changes reduce the probability of hitting this bug in 3.5.1-1.
This bug was reported against RHEV 3.3. A lot has been improved in this area since then, so I am lowering the priority and waiting for the reply from Robert.
Can you please get us the needed info?
This bug exists in RHEV 3.3, and some improvements were made in RHEV 3.4. In 3.5 another bug in that area was discovered, which will be fixed with 3.5.4.

However, this bug should not be present in RHEV-H 7 (because predictable device names are used), and thus I'd recommend closing this bug and suggesting that the user move to RHEV-H 7.
(In reply to Fabian Deutsch from comment #29)
> This bug exists in RHEV 3.3, and some improvement were made in RHEV 3.4. In
> 3.5 there was another bug in that area discovered which will be fixed with
> 3.5.4.
>
> However, this bug should not be present in RHEV-H 7 (because predictive
> device names are used) and thus I'd recommend to close this bug, and suggest
> to the user to move to RHEV-H 7.

Agreed.