Created attachment 1066236
Logs

Description of problem:
Vdsm should recover ifcfg files when they no longer exist and restore all networks on the server. If ifcfg files are removed, missing, or deleted for any reason, vdsm should recreate them on boot and restore all networks.

Version-Release number of selected component (if applicable):
3.6.0-0.12.master.el6
vdsm-4.17.3-1.el7ev.noarch

Steps to Reproduce:
1. Install a clean RHEL 7.2 host in the latest RHEV-M 3.6
2. Configure some networks on the host via Setup Networks
3. Delete all ifcfg files except ifcfg-lo, ifcfg-ovirtmgmt, and the ifcfg files of the NICs that the management network is attached to
4. Reboot the server

Actual results:
All networks are missing except the management network. The host gets an IP address but is in a non-responsive state. Vdsm didn't recover the ifcfg files on boot and couldn't restore the networks on the server.

Expected results:
Vdsm should recover ifcfg files when they are missing and restore all networks.
This happens because the network is still stored in libvirt while the ifcfg file is missing. The ifcfg configurator can't handle a missing NIC ifcfg file when it tries to delete it:

Traceback (most recent call last):
  File "/usr/share/vdsm/network/api.py", line 890, in setupNetworks
    configurator=configurator)
  File "/usr/share/vdsm/network/api.py", line 397, in _delBrokenNetwork
    implicitBonding=False, _netinfo=_netinfo)
  File "/usr/share/vdsm/network/api.py", line 219, in wrapped
    ret = func(**attrs)
  File "/usr/share/vdsm/network/api.py", line 486, in _delNetwork
    net_ent_to_remove.remove()
  File "/usr/share/vdsm/network/models.py", line 188, in remove
    self.configurator.removeBridge(self)
  File "/usr/share/vdsm/network/configurators/ifcfg.py", line 182, in removeBridge
    bridge.port.remove()
  File "/usr/share/vdsm/network/models.py", line 100, in remove
    self.configurator.removeNic(self)
  File "/usr/share/vdsm/network/configurators/ifcfg.py", line 231, in removeNic
    self.configApplier.removeNic(nic.name)
  File "/usr/share/vdsm/network/configurators/ifcfg.py", line 644, in removeNic
    with open(cf) as nicFile:
IOError: [Errno 2] No such file or directory: u'/etc/sysconfig/network-scripts/ifcfg-ens1f1'

I assume this is not a 3.6 regression, and that the configurator never had the ability to handle such an evil situation, where someone removes the ifcfg file of a physical NIC. Lowering urgency.
A similar outcome occurs when a placeholder file containing "# original file did not exist" is placed in /var/lib/vdsm/netconfback/ in place of, e.g., ifcfg-eth0, and the host is rebooted. I have created a patch that lets the configurator continue gracefully (and even read HWADDR from the system).
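For illustration only, here is a minimal sketch of the general idea (this is not the actual patch): tolerate a missing ifcfg file instead of raising IOError, and fall back to the kernel's view of the MAC address. The function names get_hwaddr, read_hwaddr_from_ifcfg and read_hwaddr_from_sysfs are hypothetical and do not exist in vdsm. The sysfs path /sys/class/net/<nic>/address is the standard Linux location, and the directories are parameters so the logic can be tried without touching a real host.

```python
import errno
import os


def read_hwaddr_from_ifcfg(path):
    """Return the HWADDR value from an ifcfg file, or None if it has none."""
    with open(path) as nic_file:
        for line in nic_file:
            line = line.strip()
            if line.startswith('HWADDR='):
                # Values may be quoted, e.g. HWADDR="00:11:22:33:44:55"
                return line.split('=', 1)[1].strip('"\'')
    return None


def read_hwaddr_from_sysfs(nic, sysfs_root='/sys/class/net'):
    """Return the MAC address the kernel reports, or None if unavailable."""
    try:
        with open(os.path.join(sysfs_root, nic, 'address')) as addr_file:
            return addr_file.read().strip()
    except (IOError, OSError):
        return None


def get_hwaddr(nic, conf_dir='/etc/sysconfig/network-scripts',
               sysfs_root='/sys/class/net'):
    """Hypothetical helper: prefer the ifcfg file, fall back to the system."""
    cf = os.path.join(conf_dir, 'ifcfg-' + nic)
    try:
        hwaddr = read_hwaddr_from_ifcfg(cf)
    except (IOError, OSError) as e:
        # Only a missing file is tolerated; other errors still propagate.
        if e.errno != errno.ENOENT:
            raise
        hwaddr = None
    if hwaddr is None:
        hwaddr = read_hwaddr_from_sysfs(nic, sysfs_root)
    return hwaddr
```

With something along these lines, a call such as get_hwaddr('ens1f1') would return the address from sysfs when /etc/sysconfig/network-scripts/ifcfg-ens1f1 is gone, rather than failing as in the traceback above.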
Verified on - 3.6.0-0.18.el6 with vdsm-4.17.8-1.el7ev.noarch
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-0362.html