Created attachment 1066236
Description of problem:
Vdsm should recover ifcfg files when they no longer exist and restore all networks on the server.
If ifcfg files are removed, missing, or deleted for any reason, vdsm should recreate them on boot and restore all networks.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Install a clean RHEL 7.2 host in RHEV-M 3.6 (latest)
2. Configure some networks on the host via Setup Networks
3. Delete all ifcfg files except ifcfg-lo, ifcfg-ovirtmgmt, and the ifcfg files of the NICs that the management network is attached to
4. Reboot server
Actual results:
All networks are missing except the management network. The host got an IP address but is in a non-responsive state.
Vdsm did not recover the ifcfg files on boot and could not restore the networks on the server.
Expected results:
Vdsm should recover ifcfg files in case they are missing and restore all networks.
This happens because the network is still stored in libvirt while the ifcfg file is missing. The ifcfg configurator cannot handle a missing NIC ifcfg file when it tries to delete it:
Traceback (most recent call last):
  File "/usr/share/vdsm/network/api.py", line 890, in setupNetworks
  File "/usr/share/vdsm/network/api.py", line 397, in _delBrokenNetwork
  File "/usr/share/vdsm/network/api.py", line 219, in wrapped
    ret = func(**attrs)
  File "/usr/share/vdsm/network/api.py", line 486, in _delNetwork
  File "/usr/share/vdsm/network/models.py", line 188, in remove
  File "/usr/share/vdsm/network/configurators/ifcfg.py", line 182, in removeBridge
  File "/usr/share/vdsm/network/models.py", line 100, in remove
  File "/usr/share/vdsm/network/configurators/ifcfg.py", line 231, in removeNic
  File "/usr/share/vdsm/network/configurators/ifcfg.py", line 644, in removeNic
    with open(cf) as nicFile:
IOError: [Errno 2] No such file or directory: u'/etc/sysconfig/network-scripts/ifcfg-ens1f1'
I assume this is not a 3.6 regression; the configurator never had the ability to handle such an evil situation, where someone removes the ifcfg file of a physical NIC.
A similar outcome occurs when a placeholder file containing "# original file did not exist" is placed in /var/lib/vdsm/netconfback/ in place of, e.g., ifcfg-eth0, and the host is rebooted.
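To illustrate the placeholder case, here is a minimal sketch of how a restore step could interpret such a backup. This is not vdsm's actual code: the function name restore_backup, its return values, and the assumption that the marker is at the very start of the placeholder file are all hypothetical. Only the marker text itself comes from this report.

```python
import errno
import os

# Marker text quoted in this report. Assumption: it is the first line of
# the placeholder file.
DELETED_HEADER = "# original file did not exist"


def restore_backup(backup_path, target_path):
    """Hypothetical helper: restore target_path from backup_path.

    Returns "removed" when the backup is a placeholder (the target must
    not exist afterwards) and "restored" when real content was copied.
    """
    with open(backup_path) as f:
        content = f.read()

    if content.startswith(DELETED_HEADER):
        # The file did not exist before the change, so make sure it is
        # absent now, and tolerate it being absent already.
        try:
            os.remove(target_path)
        except OSError as e:
            if e.errno != errno.ENOENT:
                raise
        return "removed"

    with open(target_path, "w") as f:
        f.write(content)
    return "restored"
```

The point of the sketch is that a placeholder for a physical NIC leaves no ifcfg file behind after the restore, so any later code that opens that file unconditionally hits the same IOError as in the traceback above.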
I have created a patch that lets a configurator continue gracefully (and even read HWADDR from the system).
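The general idea, as a rough sketch and not the patch itself: treat a missing ifcfg file as a non-fatal condition and take the MAC address from the kernel instead. The function names and the sysfs_root and net_conf_dir parameters are hypothetical and exist only to make the example testable. Reading the address from /sys/class/net/<nic>/address is an assumption about where "the system" value could come from.

```python
import errno
import os


def read_hwaddr(nic, sysfs_root="/sys/class/net"):
    """Return the MAC address the kernel reports for nic, or None."""
    try:
        with open(os.path.join(sysfs_root, nic, "address")) as f:
            return f.read().strip()
    except (IOError, OSError):
        return None


def surviving_nic_lines(nic,
                        net_conf_dir="/etc/sysconfig/network-scripts",
                        sysfs_root="/sys/class/net"):
    """Collect the HWADDR line to keep when a NIC is removed from a network.

    If the NIC's ifcfg file is missing, fall back to the address reported
    by sysfs instead of raising IOError.
    """
    cf = os.path.join(net_conf_dir, "ifcfg-" + nic)
    try:
        with open(cf) as nic_file:
            return [line.rstrip("\n") for line in nic_file
                    if line.startswith("HWADDR=")]
    except (IOError, OSError) as e:
        # Only a missing file is tolerated; other errors still propagate.
        if e.errno != errno.ENOENT:
            raise

    hwaddr = read_hwaddr(nic, sysfs_root)
    return ["HWADDR=" + hwaddr] if hwaddr else []
```

With this shape, a call such as surviving_nic_lines("ens1f1") on a host where ifcfg-ens1f1 was deleted returns either a single HWADDR line built from sysfs or an empty list, and network removal can proceed.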
Verified on - 3.6.0-0.18.el6 with vdsm-4.17.8-1.el7ev.noarch
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.