Created attachment 1103087 [details] screenshot vlan upgrade fail

Description of problem:
The VLAN network is not up after upgrading from the publicly released RHEVH-7.2/RHEVH-7.1 version to RHEV-H 7.2 for 3.6 beta2.

Version-Release number of selected component (if applicable):
RHEV-H 7.2-20151201.2.el7ev
ovirt-node-3.6.0-0.23.20151201git5eed7af.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. TUI install RHEV-H 7.2-20151129.1.el7ev
2. Log in to RHEV-H 7.2-20151129.1.el7ev and set up a VLAN network via DHCP; the DHCP VLAN IP is obtained successfully
3. Upgrade from RHEV-H 7.2-20151129.1.el7ev to RHEV-H 7.2-20151201.2.el7ev via TUI
4. Log in to RHEV-H 7.2-20151201.2.el7ev and check the VLAN network

Actual results:
After step 4, the VLAN network is not up. The NIC (configured with a VLAN via DHCP) is shown as unconfigured, and the NIC configuration page shows the NIC as Disabled.

Expected results:
After step 4, the VLAN network should be up and should obtain a DHCP VLAN IP successfully.

Additional info:
This issue also occurs on RHEV-H 7.2-20151112.1.el7ev.
Created attachment 1103088 [details] vlan log
Ido, can you tell anything from the logs?
From supervdsm.log it looks like no networks were persisted. This means that nothing is restored after the boot:

restore-net::DEBUG::2015-12-04 10:00:38,367::libvirtconnection::160::root::(get) trying to connect libvirt
restore-net::INFO::2015-12-04 10:00:38,398::vdsm-restore-net-config::385::root::(restore) starting network restoration.
restore-net::DEBUG::2015-12-04 10:00:38,399::vdsm-restore-net-config::183::root::(_remove_networks_in_running_config) Not cleaning running configuration since it is empty.
restore-net::INFO::2015-12-04 10:00:38,402::netconfpersistence::179::root::(_clearDisk) Clearing /var/run/vdsm/netconf/nets/ and /var/run/vdsm/netconf/bonds/
restore-net::DEBUG::2015-12-04 10:00:38,402::netconfpersistence::187::root::(_clearDisk) No existent config to clear.
restore-net::INFO::2015-12-04 10:00:38,402::netconfpersistence::129::root::(save) Saved new config RunningConfig({}, {}) to /var/run/vdsm/netconf/nets/ and /var/run/vdsm/netconf/bonds/
restore-net::DEBUG::2015-12-04 10:00:38,402::vdsm-restore-net-config::329::root::(_wait_for_for_all_devices_up) All devices are up.
restore-net::INFO::2015-12-04 10:00:38,409::vdsm-restore-net-config::396::root::(restore) restoration completed successfully.
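For anyone checking their own supervdsm.log for the same symptom, here is a minimal sketch that looks for the "Saved new config RunningConfig(...)" line shown in the excerpt and reports whether both dictionaries were empty. The function name and the regular expression are illustrative assumptions based only on the log format quoted in this comment; this is not part of vdsm.

```python
import re

# Assumption: the line format matches the supervdsm.log excerpt in this comment.
SAVED_RE = re.compile(r"Saved new config RunningConfig\((\{.*?\}), (\{.*?\})\)")


def restored_config_is_empty(log_text):
    """Return True if the saved RunningConfig had no networks and no bonds,
    False if either was non-empty, and None if no such line was found."""
    match = SAVED_RE.search(log_text)
    if match is None:
        return None
    nets, bonds = match.groups()
    return nets == "{}" and bonds == "{}"
```

A True result corresponds to the situation described here: nothing was persisted, so nothing could be restored at boot.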
*** This bug has been marked as a duplicate of bug 1289026 ***
From the logs I see that a DHCP address is obtained:

Dec 4 09:59:25 localhost dhclient[2194]: DHCPDISCOVER on p3p1.20 to 255.255.255.255 port 67 interval 15 (xid=0x7f635803)
Dec 4 09:59:40 localhost dhclient[2194]: DHCPDISCOVER on p3p1.20 to 255.255.255.255 port 67 interval 11 (xid=0x7f635803)
Dec 4 09:59:51 localhost dhclient[2194]: DHCPDISCOVER on p3p1.20 to 255.255.255.255 port 67 interval 8 (xid=0x7f635803)
Dec 4 09:59:52 localhost dhclient[2194]: DHCPREQUEST on p3p1.20 to 255.255.255.255 port 67 (xid=0x7f635803)
Dec 4 09:59:52 localhost dhclient[2194]: DHCPOFFER from 192.168.20.2
Dec 4 09:59:52 localhost dhclient[2194]: DHCPACK from 192.168.20.2 (xid=0x7f635803)
Dec 4 09:59:54 localhost NET[2279]: /usr/sbin/dhclient-script : updated /etc/resolv.conf
Dec 4 09:59:54 localhost dhclient[2194]: bound to 192.168.20.129 -- renewal in 8463 seconds.
Dec 4 09:59:54 localhost network: Determining IP information for p3p1.20... done.
Dec 4 09:59:54 localhost NET[2330]: /etc/sysconfig/network-scripts/ifup-post : updated /etc/resolv.conf
Dec 4 09:59:55 localhost network: [ OK ]

I suppose the problem is thus just a visual one in the TUI, and thus it's not a dupe of bug 1289026.
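To run the same check on another system's syslog, the following sketch pairs each dhclient process with the interface it was started on and the address it finally bound. It assumes the dhclient message format seen in the lines quoted in this comment; the function and pattern names are made up for illustration.

```python
import re

# Assumption: message formats match the dhclient syslog lines quoted above.
ON_IFACE_RE = re.compile(r"dhclient\[(\d+)\]: DHCP(?:DISCOVER|REQUEST) on (\S+) to ")
BOUND_RE = re.compile(r"dhclient\[(\d+)\]: bound to (\d+(?:\.\d+){3})")


def bound_addresses(syslog_text):
    """Map interface name -> bound IPv4 address, matching lines by dhclient PID."""
    iface_by_pid = {}
    leases = {}
    for line in syslog_text.splitlines():
        match = ON_IFACE_RE.search(line)
        if match:
            iface_by_pid[match.group(1)] = match.group(2)
            continue
        match = BOUND_RE.search(line)
        if match and match.group(1) in iface_by_pid:
            leases[iface_by_pid[match.group(1)]] = match.group(2)
    return leases
```

If the VLAN device shows up in the result, the lease was obtained at the system level, which would point at a display problem rather than a networking failure.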
The question is whether networking is actually working, and if it is, why the TUI does not detect it.
This might be related to bug 1280241
I'm not able to reproduce this.

1. TUI install RHEV-H 7.2-20151129.1.el7ev
2. Log into RHEV-H 7.2-20151129.1.el7ev, set up VLAN network via DHCP, DHCP works
3. Upgrade from RHEV-H 7.2-20151129.1.el7ev to RHEV-H 7.2-20151201.2.el7ev via TUI
4. Log into RHEV-H 7.2-20151201.2.el7ev, check the status page and network page
5. Both say "Configured", networking works.

Were there any other steps taken?

....
2015-12-04 09:53:24,211 INFO Saving network stuff
2015-12-04 09:53:24,245 INFO Effective changes {'nics': 'p3p1'}
.. upgrade ..
2015-12-04 10:01:09,522 INFO Saving network stuff
2015-12-04 10:01:09,570 INFO Effective changes {'nics': 'em1'}
2015-12-04 10:01:10,601 ERROR An error appeared in the UI: UnknownNicError("Unknown network interface: 'em1'",)
....
2015-12-04 10:02:22,238 INFO Saving network stuff
2015-12-04 10:02:22,288 INFO Effective changes {'nics': 'p3p1'}

What happened in the middle here? Can you please provide a test system?
Also, the contents of /etc/default/ovirt both before and after the upgrade would be helpful
Ryan, the test steps in comment 9 are correct, but you may need a machine with at least two NICs to reproduce this issue.

I checked the contents of /etc/default/ovirt before and after the upgrade. OVIRT_BOOTIF differs between the two, which may be the cause.

1. Before the upgrade:

# cat /etc/default/ovirt
OVIRT_BOOTIF="p3p1"
……

[root@localhost admin]# ifconfig
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
        inet 127.0.0.1 netmask 255.0.0.0
        inet6 ::1 prefixlen 128 scopeid 0x10<host>
        loop txqueuelen 0 (Local Loopback)
        RX packets 2254 bytes 365657 (357.0 KiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 2254 bytes 365657 (357.0 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

p3p1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet6 fe80::21b:21ff:fe27:470b prefixlen 64 scopeid 0x20<link>
        ether 00:1b:21:27:47:0b txqueuelen 1000 (Ethernet)
        RX packets 866 bytes 70296 (68.6 KiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 307 bytes 55233 (53.9 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

p3p1.20: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 192.168.20.129 netmask 255.255.255.0 broadcast 192.168.20.255
        inet6 2001:db8:1:0:21b:21ff:fe27:470b prefixlen 64 scopeid 0x0<global>
        inet6 fe80::21b:21ff:fe27:470b prefixlen 64 scopeid 0x20<link>
        ether 00:1b:21:27:47:0b txqueuelen 0 (Ethernet)
        RX packets 269 bytes 30710 (29.9 KiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 290 bytes 43481 (42.4 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

2. After the upgrade:

# cat /etc/default/ovirt
OVIRT_BOOTIF="em1"
……

[root@localhost admin]# ifconfig
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
        inet 127.0.0.1 netmask 255.0.0.0
        inet6 ::1 prefixlen 128 scopeid 0x10<host>
        loop txqueuelen 0 (Local Loopback)
        RX packets 2254 bytes 365657 (357.0 KiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 2254 bytes 365657 (357.0 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

p3p1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet6 fe80::21b:21ff:fe27:470b prefixlen 64 scopeid 0x20<link>
        ether 00:1b:21:27:47:0b txqueuelen 1000 (Ethernet)
        RX packets 866 bytes 70296 (68.6 KiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 307 bytes 55233 (53.9 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

p3p1.20: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 192.168.20.129 netmask 255.255.255.0 broadcast 192.168.20.255
        inet6 2001:db8:1:0:21b:21ff:fe27:470b prefixlen 64 scopeid 0x0<global>
        inet6 fe80::21b:21ff:fe27:470b prefixlen 64 scopeid 0x20<link>
        ether 00:1b:21:27:47:0b txqueuelen 0 (Ethernet)
        RX packets 269 bytes 30710 (29.9 KiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 290 bytes 43481 (42.4 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

Additional info:
Please refer to the attachments for detailed information:
1. the contents of /etc/default/ovirt both before and after the upgrade (vlan.tar.gz)
2. screenshot of ifconfig after upgrade
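The before/after comparison of /etc/default/ovirt can be automated with a small sketch like the one below. It assumes the file consists of simple KEY="value" lines, as in the excerpts in this comment; the helper names are hypothetical and the parser deliberately ignores anything more complex than that.

```python
def parse_defaults(text):
    """Parse simple KEY="value" lines; skip blanks, comments, and other lines."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values


def changed_keys(before_text, after_text):
    """Return {key: (before, after)} for every key whose value differs."""
    before = parse_defaults(before_text)
    after = parse_defaults(after_text)
    return {
        key: (before.get(key), after.get(key))
        for key in sorted(set(before) | set(after))
        if before.get(key) != after.get(key)
    }
```

Fed the two files from the attachment, a result containing OVIRT_BOOTIF with ("p3p1", "em1") would match what is reported here.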
Created attachment 1106332 [details] the contents of /etc/default/ovirt both before and after the upgrade
Created attachment 1106333 [details] screenshot of ifconfig after upgrade
I'll add another NIC. You upgraded via TUI or PXE (or TUI over PXE)? I would guess that you're correct, and the TUI is wrong because OVIRT_BOOTIF changed. My question now is why it changed. This could happen over PXE, but I'll reproduce via TUI over CDROM if that was the boot method.
Ryan, I upgraded via TUI over PXE
(In reply to Huijuan Zhao from comment #12)
> Created attachment 1106332 [details]
> the contents of /etc/default/ovirt both before and after the upgrade

There are two problems.

First is the blank MANAGED_IFNAMES, which is a symptom of bz#1280241, and it will make the networking appear to be unconfigured. Not registering this system to RHEV-M will allow you to see...

Second is that PXE upgrading from a different interface than the one which was configured from the TUI will set OVIRT_BOOTIF to a different value, and the TUI will show the wrong interface as configured after upgrades.

I'd like to track this bug here. But I can't reproduce it, and it looks like bz#1053425, which was fixed two years ago.

I'd also like to lower the severity, because it's cosmetic only, and it would be unexpected for users to set new configuration values in the TUI after upgrades (or even see the TUI, since common upgrade flows are over RHEV-M or PXE). I suspect that PXE booting from a non-management interface is also rare.

I tried the following:

1. TUI install RHEV-H 7.2-20151129.1.el7ev
2. Log into RHEV-H 7.2-20151129.1.el7ev, set up VLAN network via DHCP on ens3, DHCP works
3. Upgrade from RHEV-H 7.2-20151129.1.el7ev to RHEV-H 7.2-20151201.2.el7ev via PXE on ens8
4. Log into RHEV-H 7.2-20151201.2.el7ev, check the status page and network page
5. Both say "Configured", networking works, OVIRT_BOOTIF is still ens3 after the upgrade.

Can you please provide a test environment?
That's perfect. Are both images available over cobbler? VLAN 20?
Created attachment 1109136 [details] vlan.tar.gz
(In reply to Huijuan Zhao from comment #0)
> Created attachment 1103087 [details]
> screenshot vlan upgrade fail
>
> Description of problem:
> The vlan network is not up after upgrade from RHEVH-7.2/RHEVH-7.1 publicly
> released version to RHEV-H 7.2 for 3.6 beta2
>
> Version-Release number of selected component (if applicable):
> RHEV-H 7.2-20151201.2.el7ev
> ovirt-node-3.6.0-0.23.20151201git5eed7af.el7ev.noarch
>
> How reproducible:
> 100%
>
> Steps to Reproduce:
> 1. TUI install RHEV-H 7.2-20151129.1.el7ev
> 2. Login RHEV-H 7.2-20151129.1.el7ev, setup vlan network via dhcp, can
> obtain dhcp vlan ip successful
> 3. Upgrade from RHEV-H 7.2-20151129.1.el7ev to RHEV-H 7.2-20151201.2.el7ev
> via TUI
> 4. Login RHEV-H 7.2-20151201.2.el7ev, check the vlan network
>
> Actual results:
> After step4, the vlan network is not up, it shows NIC(configured vlan via
> dhcp) unconfigured. Enter NIC configure page, it shows NIC Disabled.
>
> Expected results:
> After step4, the vlan network should be up and obtain dhcp vlan ip successful
>
> Additional info:
> Also encounter this issue on RHEV-H 7.2-20151112.1.el7ev

Additional info for the reproduction steps. In the above "Steps to Reproduce":
2. Set up the VLAN network (NIC is: p3p1)
3. Upgrade via PXE + TUI (default NIC in cmdline is: em1)
Please take a look at the system you provided. I made a change there to fix it. Is this the functionality you want?
Created attachment 1111278 [details] network.tar.gz
(In reply to Anatoly Litovsky from comment #26)
> Please take a look at the system you provided.
> I made a change there to fix it.
> Is this what the desired functionality you want ?

No.

Current results:
On the Status page, it shows "Networking: Connected p3p1.20".
On the Network page, NIC p3p1 shows Unconfigured, but p3p1.20 shows Configured (when entered, it actually has no configuration).

Expected results:
On the Status page, it should show "Networking: Connected p3p1".
On the Network page, NIC p3p1 should show Configured (Bootprotocol DHCP, VLAN ID: 20), and there should be no p3p1.20.

Please refer to the attachment "network.tar.gz" for details. It contains two screenshots, one of the Status page and one of the Network page.
(In reply to Huijuan Zhao from comment #25)
> Hi, Tolik and Ying, I reproduced this bug on latest build RHEV-H
> 7.2-20151129.0.el7ev, the ENV:
> 192.168.20.129
> admin/redhat
>

Update: I reproduced this bug on the latest build, RHEV-H 7.2-20151229.0.el7ev
Considering comment 11: Huijuan, is the bug fixed if you do the following?
1. Perform the upgrade
2. Change the BOOTIF value back to p3p1
3. Log into the TUI again

It could be, as Ryan says, that the BOOTIF is changed simply because the upgrade is performed using PXE.

Also: does this bug appear if you perform the upgrade using USB?
(In reply to Fabian Deutsch from comment #30)
> Considering comment 11, Huijuan, is the bug fixed if you
> 1. after upgrade
> 2. change the BOOTIF value to p3p1 again
> 3. and re-login into the tui?
>
> It could be as ryan says, that the BOOTIF is just changed, because the
> upgrade is performed using PXE.
>

Fabian, the bug is fixed after following the above steps.

> Also: Does this bug also appear if you perform the upgrade using USB?

The bug does not occur when I perform the upgrade using USB.
Thanks Huijuan. This supports the assumption that the problem is that the BOOTIF is being updated during the PXE upgrade flow. The solution is then to prevent this.
Okay, I could reproduce it:

1. Install inside a VM (with two NICs, i.e. ens3 + ens11) using CDROM
2. Configure ens3 with static IP and a VLAN
3. Boot from CDROM media again, append to the command line: BOOTIF=ens11
4. Perform the TUI upgrade
5. After installation: Reboot, boot from disk and log in

Findings:
After 2. The network appears as configured in the TUI, and the VLAN is correctly configured on the system, BOOTIF==ens3
After 3. The TUI upgrade will be started
After 5. The TUI shows the network as unconfigured, BOOTIF==ens11

The root cause of this bug has two conditions that need to be met:
1. Boot from PXE
2. Perform upgrade through TUI

In that flow, the BOOTIF will be overwritten during the TUI upgrade.

Huijuan,
1. Please provide all kernel arguments you use for the TUI PXE upgrade.
2. Can you reproduce the issue according to the steps above?

Possible solutions:
1. Do an automatic upgrade by appending "upgrade=1"
2. Fix the TUI flow to unset BOOTIF in the upgrade flow.
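To make possible solution 2 concrete, here is a rough sketch of the decision the upgrade flow would need to make: keep the already persisted BOOTIF on upgrade, and only take the value from the kernel command line on a fresh install. This is purely illustrative. The function name and its parameters are hypothetical and do not correspond to actual ovirt-node code.

```python
def bootif_to_persist(persisted_bootif, cmdline_bootif, is_upgrade):
    """Hypothetical helper: pick the BOOTIF value to write to the config.

    persisted_bootif: value already stored on the installed system (may be empty)
    cmdline_bootif:   value passed on the kernel command line (may be empty)
    is_upgrade:       True when running the TUI upgrade flow
    """
    if is_upgrade and persisted_bootif:
        # On upgrade, the interface configured by the user wins.
        return persisted_bootif
    # Fresh install, or nothing persisted yet: fall back to the command line.
    return cmdline_bootif or persisted_bootif
```

With the values from the reproduction above (persisted ens3, command line ens11, upgrade), this would keep ens3.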
(In reply to Fabian Deutsch from comment #37)
> Okay, I could reproduce it:
>
> 1. Install inside a VM (with two nics, i.e. ens3 + ens11) using CDROM
> 2. Configure ens3 with static IP and a vlan
> 3. Boot from CDROM media again, append to the commandline: BOOTIF=ens11
> 4. Perform the TUI upgrade
> 5. After installation: Reboot, boot from disk and login
>
> Findings:
> After 2. The network appears as configured in the TUI, and the vlan is
> correctly configured on the system, BOOTIF==ens3
> After 3. The TUI upgrade will be started
> After 5. The TUI shows the network as unconfigured, BOOTIF==ens11
>
>
> The root cause of this bug has two conditions that need to be met:
> 1. Boot from PXE
> 2. Perform upgrade through TUI
>
> In that flow, the BOOTIF will be overwritten during the TUI upgrade.
>
> Huijuan,
> 1. please provide all kernel arguments you use for the TUI PXE upgrade.
> 2. Can you reproduce the issue according to the steps above?
>

1. All kernel arguments for the TUI PXE upgrade:

/images/rhevh-vdsm7-7.2-20151229.0_36/vmlinuz0 initrd=/images/rhevh-vdsm7-7.2-20151229.0_36/initrd0.img ksdevice=bootif rootflags=loop rootflags=ro rd.dm=0 rd_NO_MULTIPATH rd.md=0 crashkernel=256M rootfstype=auto lang= max_loop=256 rhgb quiet elevator=deadline rd.live.check rd.luks=0 install ro root=live:/rhev-hypervisor7-7.2-20151229.0.iso rd.live.image BOOTIF=01-d4-be-d9-95-61-ca

2. I can reproduce the issue according to the steps above.
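Note that BOOTIF in these arguments is not an interface name but the PXELINUX-style form: a hardware type prefix (01 for Ethernet) followed by the MAC address of the boot NIC, separated by dashes. The sketch below pulls BOOTIF out of a kernel command line and converts it to the usual colon-separated MAC, which can then be compared against the "ether" lines in ifconfig. The function names are made up for illustration, and only this one format is handled.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {key: [values]}; bare flags map to None."""
    args = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        args.setdefault(key, []).append(value if sep else None)
    return args


def bootif_mac(cmdline):
    """Return the MAC address encoded in BOOTIF=01-xx-xx-xx-xx-xx-xx, else None."""
    values = parse_cmdline(cmdline).get("BOOTIF")
    if not values or not values[-1]:
        return None
    parts = values[-1].lower().split("-")
    # Seven dash-separated fields: hardware type "01" plus six MAC octets.
    if len(parts) == 7 and parts[0] == "01":
        return ":".join(parts[1:])
    return None
```

For the arguments listed here, the decoded MAC is d4:be:d9:95:61:ca, which differs from the MAC of p3p1 (00:1b:21:27:47:0b) shown in the earlier ifconfig output. That is consistent with the PXE boot happening on a different NIC than the one configured in the TUI.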
This bug was accidentally moved from POST to MODIFIED via an error in automation, please see mmccune with any questions
This issue will not be fixed with a patch in RHEV 4.0. Instead, this bug is being fixed by the new functionality in Cockpit.
Encountered this bug on rhev-hypervisor6-6.8-20160630.2.iso, and added it to: Bug 1352452 - [Tracker] Track RHEV-H 6.8 bugs