Description of problem:
vdsm-restore-net-config is throwing a backtrace and the vdsm service is not starting after setting up a Hosted Engine setup. This happens after a reboot, while trying to ultimately set this system up with bonded interfaces carrying tagged VLAN interfaces.

The host is provisioned with 2 interfaces bonded in mode 4 and four VLANs: three tagged and one native (used for initial kickstart and ssh access). The VLAN/subnet intended for RHEVM is supposed to be NAT-access only via the host system; iptables was provisioned and working in a non-bonded installation.

Note: Installation via single interfaces does not present this issue.

The bonded install was attempted two different ways, both with the same problem: vdsm-restore-net-config fails on reboot. One install method was basic, with the initial bonded interface in place with tagged VLANs. The other method was to run the install on a single interface, then convert to bonded once it was working.

backtrace:
:vdsm-restore-net-config:91:_filter_nets_bonds:KeyError: u'bond0'
:
:Traceback (most recent call last):
:  File "/usr/share/vdsm/vdsm-restore-net-config", line 137, in <module>
:    restore()
:  File "/usr/share/vdsm/vdsm-restore-net-config", line 123, in restore
:    unified_restoration()
:  File "/usr/share/vdsm/vdsm-restore-net-config", line 66, in unified_restoration
:    persistentConfig.bonds)
:  File "/usr/share/vdsm/vdsm-restore-net-config", line 91, in _filter_nets_bonds
:    bonds[bond]['nics'], net)
:KeyError: u'bond0'
:
:Local variables in innermost frame:
:bonds: {}
:available_bonds: {}
:available_nets: {}
:attrs: {u'bondingOptions': u'mode=4 lacp_rate=1 miimon=200', u'vlan': 3009, u'ipaddr': u'c.c.113.211', u'netmask': u'255.255.255.0', u'bonding': u'bond0', u'bootproto': u'static'}
:available_nics: ['eth0', 'eth1', 'eth2', 'eth3']
:net: 'rhevm'
:nets: {'rhevm': {u'bondingOptions': u'mode=4 lacp_rate=1 miimon=200', u'vlan': 3009, u'ipaddr': u'c.c.113.211', u'netmask': u'255.255.255.0', u'bonding': u'bond0', u'bootproto': u'static'}}
:bond: u'bond0'
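The local variables in the backtrace suggest the failure mode: the persisted 'rhevm' network references bond0, but the persisted bonds dictionary is empty, so the lookup bonds[bond]['nics'] raises KeyError. The following is a minimal sketch of that situation, not the actual vdsm code; the helper name find_missing_bonds is hypothetical, and the dictionaries are trimmed copies of the values shown in the backtrace.

```python
def find_missing_bonds(nets, bonds):
    """Return {network: bond} for every network whose bond is not in bonds.

    Hypothetical helper for illustration only; it is not part of vdsm.
    """
    missing = {}
    for net, attrs in nets.items():
        bond = attrs.get('bonding')
        if bond is not None and bond not in bonds:
            missing[net] = bond
    return missing


# Trimmed copies of the values reported in the backtrace.
nets = {'rhevm': {'bonding': 'bond0', 'vlan': 3009, 'bootproto': 'static'}}
bonds = {}

# An unguarded lookup on the empty dict fails the same way as the traceback.
try:
    bonds['bond0']['nics']
except KeyError as err:
    print('KeyError:', err)

# A guarded check reports the inconsistency instead of crashing.
print(find_missing_bonds(nets, bonds))
```

A check like this only diagnoses the mismatch between persisted networks and persisted bonds; it does not restore the missing bond definition.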
Version-Release number of selected component (if applicable):
Single node with manual workaround on boot:
rhevm-3.5.0-0.20.el6ev.noarch
vdsm-4.16.7.4-1.el6ev.x86_64
Hypervisors 6.5 and 6.6 both exhibit the same behavior.

How reproducible:
Unknown

Steps to Reproduce:
1. Install RHEL 6.6 and update.
2. Provision interface with native (untagged) VLAN and one tagged VLAN (nonrouted VLAN for RHEVM).
3. Install/setup rhevm.
4. Install/setup self-hosted engine.
5. Stop self-hosted engine, ovirt-ha-agent and sanlock (kill -9 required) and reboot.
6. On reboot, vdsm-restore-net-config fails with the abrt response noted.

Actual results:
vdsm-restore-net-config fails with the ABRT above.

Expected results:
Everything works as normal.

Additional info:
Worked around the vdsm network problem by installing with a single interface instead of bonded. However, this failed as well: the console address provisioned for a new VM by RHEVM was wrong; it presented the gateway address c.c.113.1 instead of the host system's address.

This bug has the same traceback and is one of the few instances on the internet: https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=1154399
ovirt-hosted-engine-setup currently doesn't support VLANs on a bonded interface: it's an RFE for 3.6, please see https://bugzilla.redhat.com/1134346

Deploying with the plain interface and then moving to the bonded configuration could be a solution, but it requires some manual action as described here: https://bugzilla.redhat.com/show_bug.cgi?id=1154399#c3
Antoni, can you please also take a look at why VDSM doesn't correctly restart on reboots?
Re: https://bugzilla.redhat.com/show_bug.cgi?id=1134346 Minor issue, hosted-engine --deploy DID accept a vlan interface at config time and worked fine until host reboot. FYI.
Has the bond0 interface been created manually? Does /etc/sysconfig/network-scripts/ifcfg-bond0 have any of these headers:
- '# Generated by VDSM version'
- '# automatically generated by vdsm'

If the bond0 interface has been created manually, run the command:
persist /etc/sysconfig/network-scripts/ifcfg-bond0

It worked until reboot because vdsm removes it at reboot if the configuration is not persisted. hosted-engine --deploy takes care only of persisting the bridge, which is created automatically by the tool itself.

Bill, let's discuss the VLAN over bonded interface in bug #1134346; here it seems that the issue is just a missing persistence of the bond0 configuration.
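The header check asked about above can be scripted. This is a rough sketch, assuming the two header strings quoted in the comment are the only markers to look for; the function names is_vdsm_generated and file_is_vdsm_generated are hypothetical and not part of vdsm.

```python
# Header strings quoted in the comment above.
VDSM_HEADERS = (
    '# Generated by VDSM version',
    '# automatically generated by vdsm',
)


def is_vdsm_generated(ifcfg_text):
    """Return True if any line of the ifcfg content starts with a VDSM header."""
    return any(
        line.strip().startswith(VDSM_HEADERS)
        for line in ifcfg_text.splitlines()
    )


def file_is_vdsm_generated(path):
    """Read an ifcfg file and apply the same header check to its content."""
    with open(path) as ifcfg:
        return is_vdsm_generated(ifcfg.read())


# Example contents: one hand-written file, one carrying a VDSM header.
manual = 'DEVICE=bond0\nBONDING_OPTS="mode=4 lacp_rate=1 miimon=200"\n'
generated = '# Generated by VDSM version 4.16.7.4\nDEVICE=bond0\n'

print(is_vdsm_generated(manual))
print(is_vdsm_generated(generated))
```

On a host, file_is_vdsm_generated('/etc/sysconfig/network-scripts/ifcfg-bond0') would answer the question for the file named in the comment.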
Bonded interface was manual configuration, before rhevm installation on host. What package supplies persist?
Sandro,

The hosts in this instance are all RHEL 6.5/6.6, so no persisting needs to be done for this customer.

Regards,
Robert McSwain
Could you attach the supervdsm.log of the attempted startup?

This seems like a dup of bug 1154399, and https://bugzilla.redhat.com/show_bug.cgi?id=1154399#c3 can serve as a workaround. Does it help?

However, the issue of removal of pre-vdsm ifcfg files seems to be more harmful than first perceived; we're tracking that in bug 1188251.
3.5.1 is already full with bugs (over 80), and since none of these bugs were added as urgent for 3.5.1 release in the tracker bug, moving to 3.5.2
Please re-open when requested information is available.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days