Verified and tested successfully with the new build:
rhevm-3.5.1-0.3.el6ev.noarch
vdsm-4.16.13-1.el6ev.x86_64
RHEL 6.6, vdsm upgrade from 4.14 >> 4.16.13-1

Followed these steps:
1) bond0 and bond0.162 created manually outside RHEV-M (network restart)
2) installed server in RHEV-M setup successfully
3) attached VM VLAN-tagged network (162) to host via SN
4) copied the relevant repos (vt14.12) to the server and ran 'yum update'; vdsm upgraded successfully, but per BZ 1200467 the vdsmd service had to be restarted manually. The operation was successful and no network configuration was broken (refreshed capabilities).
5) rebooted server successfully. All network configuration was saved and nothing broke, including the manually created bond0 and bond0.162.

I would like to perform the same test with RHEV-H 6.6 including this fix before moving this bug to VERIFIED. Originally I managed to reproduce this issue only with RHEV-H. Dan, do we have such a RHEV-H build with this fix?
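For reference, step 1 above (creating bond0 and bond0.162 outside RHEV-M) amounts to writing ifcfg files along these lines. This is only a sketch: the bonding options, the BOOTPROTO values, and the CFG_DIR parameter are assumptions, not taken from the host under test; on a real host the files live in /etc/sysconfig/network-scripts and a network restart applies them.

```shell
# Sketch of a manually created bond + tagged VLAN config (values are
# assumptions). CFG_DIR is parameterized so this can be tried safely
# outside /etc/sysconfig/network-scripts.
CFG_DIR="${CFG_DIR:-./network-scripts}"
mkdir -p "$CFG_DIR"

cat > "$CFG_DIR/ifcfg-bond0" <<'EOF'
DEVICE=bond0
BONDING_OPTS="mode=active-backup miimon=100"
ONBOOT=yes
BOOTPROTO=none
EOF

cat > "$CFG_DIR/ifcfg-bond0.162" <<'EOF'
DEVICE=bond0.162
VLAN=yes
ONBOOT=yes
BOOTPROTO=dhcp
EOF
```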
tolik - when do you plan to build rhev-h?
Verified and tested successfully with the new rhev-hypervisor6-6.6-20150402.0.el6ev.noarch.rpm, which includes vdsm-4.16.13-1.el6ev.x86_64.
vdsm upgrade from vdsm-4.14.18-6.el6ev >> vdsm-4.16.13-1.el6ev.x86_64

Followed these steps:
1) bond0 and bond0.162 created manually outside RHEV-M, via the TUI
2) installed server in RHEV-M setup (vt14.2) successfully
3) attached VM VLAN-tagged network (162) to host via SN
4) downloaded and installed rhev-hypervisor6-6.6-20150402.0.el6ev.noarch.rpm in the engine
5) put the host into maintenance and ran 'upgrade' via RHEV-M
6) host rebooted successfully. All network configuration was saved and nothing broke, including the manually (TUI) configured bond0 and bond0.162.
Looks like the fix for this bug created another issue. bond0 and bond0.162 persist and nothing breaks, but none of the networks or NICs has a BOOTPROTO= line in its ifcfg file.
Moving back to ASSIGNED instead of creating a new BZ.
Does the network ever have a BOOTPROTO line? Are the networks bridged? Could you please attach fresh logs of the new effect? Include the generated ifcfg-* files and vdsm's own /var/lib/vdsm/persistence/netconf.
[root@navy-vds1 ~]# brctl show
bridge name     bridge id               STP enabled     interfaces
;vdsmdummy;     8000.000000000000       no
net_bondash     8000.001018244afc       no              bond0.162
rhevm           8000.00145edd0924       no              eth2

Yes, the networks should have a BOOTPROTO= line.

[root@navy-vds1 ~]# ls /var/lib/vdsm/persistence/netconf
bonds  nets
[root@navy-vds1 ~]# ls /var/lib/vdsm/persistence/netconf/nets/
net_bondash  rhevm
[root@navy-vds1 ~]# ls /var/lib/vdsm/persistence/netconf/bonds/
bond0

Note that when testing this bug, after reboot the host didn't get an IP because BOOTPROTO= was missing in the ifcfg-rhevm file; I added BOOTPROTO=dhcp manually.
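A quick way to spot this symptom is to scan a network-scripts directory for ifcfg files that lack a BOOTPROTO= line. The helper below is hypothetical (not part of vdsm); the directory is a parameter so the sketch can be tried anywhere, and on the affected host it would be /etc/sysconfig/network-scripts.

```shell
# Hypothetical helper: print every ifcfg-* file in the given directory
# that has no BOOTPROTO= line.
check_bootproto() {
    dir="$1"
    for f in "$dir"/ifcfg-*; do
        [ -e "$f" ] || continue
        grep -q '^BOOTPROTO=' "$f" || echo "missing BOOTPROTO: $f"
    done
}
```

On the host above, `check_bootproto /etc/sysconfig/network-scripts` should flag ifcfg-rhevm.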
Created attachment 1011272 [details] vdsm log and generated ifcfg files
The interesting part is the content of the net_bondash and rhevm files, as well as supervdsm.log from boot time. Can you supply them?
Hi Dan,

I ran more tests and hit several strange behaviors:

- Upgrade from rhev-h 6.6 3.4 >> 3.5.1 (unsigned build rhev-hypervisor6-6.6-20150402.0.el6ev):
  the BOOTPROTO= line exists after reboot, host is up. vdsm.log attached.

- Upgrade from rhev-h 6.6 3.5.0 >> 3.5.1 (unsigned build rhev-hypervisor6-6.6-20150402.0.el6ev):
  no ifcfg files for 'rhevm' and 'net_bondash1' after reboot. Host is in non-responsive state, no IP of course. vdsm.log attached.

- Upgrade from rhev-h 7.1 3.5.1 (vt14.1) >> 3.5.1 (vt14.2) (unsigned build rhev-hypervisor6-6.6-20150402.0.el6ev):
  can't install the server in RHEV-M:

MainThread::ERROR::2015-04-07 08:34:45,251::vdsm::134::vds::(run) Exception raised
Traceback (most recent call last):
  File "/usr/share/vdsm/vdsm", line 132, in run
    serve_clients(log)
  File "/usr/share/vdsm/vdsm", line 82, in serve_clients
    cif = clientIF.getInstance(irs, log)
  File "/usr/share/vdsm/clientIF.py", line 158, in getInstance
  File "/usr/share/vdsm/clientIF.py", line 112, in __init__
  File "/usr/share/vdsm/clientIF.py", line 162, in _createAcceptor
  File "/usr/share/vdsm/clientIF.py", line 173, in _createSSLContext
  File "/usr/lib/python2.7/site-packages/vdsm/sslutils.py", line 141, in __init__
  File "/usr/lib/python2.7/site-packages/vdsm/sslutils.py", line 166, in _initContext
  File "/usr/lib/python2.7/site-packages/vdsm/sslutils.py", line 145, in _loadCertChain
  File "/usr/lib64/python2.7/site-packages/M2Crypto/SSL/Context.py", line 103, in load_cert_chain
SSLError: Permission denied

Dan, we are investigating all of this; it is still not clear what is going on.
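The SSLError above comes from M2Crypto failing to load vdsm's certificate chain, which usually means the certificate or key file is missing or unreadable by the vdsm process. A minimal sanity check, as a sketch: the helper name is hypothetical, and the vdsm PKI paths mentioned in the comment are assumptions that may differ between versions.

```shell
# Hypothetical helper: report whether a file exists and is readable by
# the current user. On the affected host one would run it as the vdsm
# user against the certificate chain files (paths are assumptions),
# e.g. /etc/pki/vdsm/certs/vdsmcert.pem.
check_readable() {
    f="$1"
    if [ ! -e "$f" ]; then
        echo "missing: $f"
        return 1
    elif [ ! -r "$f" ]; then
        echo "not readable: $f"
        return 1
    fi
    echo "ok: $f"
}
```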
Created attachment 1011659 [details] vdsm logs
*** Bug 1209486 has been marked as a duplicate of this bug. ***
To my understanding, an upgrade 3.4.* --> 3.5.1 should be okay, but 3.5.0 --> 3.5.1 needs release notes. Dan, could you supply documentation of what goes wrong during a 3.5.0 --> 3.5.1 upgrade, and how to work around it once it breaks?
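Pending Dan's documentation, one plausible manual workaround for the case already seen above (ifcfg files present but BOOTPROTO= lost) would be to re-add the line by hand. This is a sketch only, not the official procedure: the helper name and the dhcp default are assumptions, the directory is parameterized for safe testing, and on a host it would be /etc/sysconfig/network-scripts followed by a network restart.

```shell
# Hypothetical workaround sketch: append BOOTPROTO=dhcp to any ifcfg-*
# file in the given directory that lacks a BOOTPROTO= line. Idempotent:
# files that already have the line are left untouched.
fix_bootproto() {
    dir="$1"
    for f in "$dir"/ifcfg-*; do
        [ -e "$f" ] || continue
        grep -q '^BOOTPROTO=' "$f" || echo 'BOOTPROTO=dhcp' >> "$f"
    done
}
```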
*** Bug 1209401 has been marked as a duplicate of this bug. ***
Dan, reading your release notes: can we refer to this KCS solution to address the workaround you are talking about? https://access.redhat.com/solutions/1346443 Would it require additional steps? Thanks!
Verified on 3.5.1-0.4.el6ev.

- rhev-h 6.6 3.4.z >> rhev-h 6.6 3.5.1, using the following builds:
  rhev-h 6.6 3.4.z 20150123.1.el6ev >> rhev-h 6.6 3.5.1 20150421.0.el6ev
  1) clean rhev-h 6.6 3.4.z 20150123.1.el6ev installed via USB
  2) bond0.162 configured via TUI with DHCP
  3) installed server in RHEV-M, rhevm network created on top of bond0
  4) attached network to another NIC via SN
  * Host is up after upgrade and reboot, all networks are attached to the server, rhevm got an IP, and the host is up in RHEV-M and can be activated on a 3.5 cluster.

- clean rhev-h 6.6 3.5.1 20150421.0.el6ev
  1) clean rhev-h 6.6 3.5.1 20150421.0.el6ev installed via USB
  2) bond0.162 configured via TUI with DHCP
  3) installed server in RHEV-M, rhevm network created on top of bond0
  4) attached network to another NIC via SN
  * Host is up after reboot, all networks are attached to the server, rhevm got an IP, and the host is up in RHEV-M.

- rhev-h 7.1 3.5.1 20150420.0.el7ev
  1) clean rhev-h 7.1 3.5.1 20150420.0.el7ev installed via USB
  2) bond0.162 configured via TUI with DHCP
  3) installed server in RHEV-M, rhevm network created on top of bond0
  4) attached network to another NIC via SN
  * Host is up after reboot, all networks are attached to the server, rhevm got an IP, and the host is up in RHEV-M.
(In reply to Marina from comment #15) Frankly, I still do not understand why this bug bites us only on upgrade, and not during any reboot. Hence, only testing would tell if https://access.redhat.com/solutions/1346443 is helpful for the upgrade case.
This was released as part of 3.5.1.