Bug 1392989

Summary: vdsmd start fails due to dead symlink /etc/resolv.conf
Product: [oVirt] vdsm Reporter: Yedidyah Bar David <didi>
Component: Tools Assignee: Leon Goldberg <lgoldber>
Status: CLOSED CURRENTRELEASE QA Contact: Meni Yakove <myakove>
Severity: medium Docs Contact:
Priority: high    
Version: 4.18.15.2 CC: bugs, danken, mburman, ratamir
Target Milestone: ovirt-4.0.6 Flags: rule-engine: ovirt-4.0.z+
Target Release: 4.18.16   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-01-18 07:28:07 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Network RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description       Flags
engine sosreport  none
host sosreport    none

Description Yedidyah Bar David 2016-11-08 16:00:34 UTC
Created attachment 1218605
engine sosreport

Description of problem:

# ls -l /etc/resolv.conf
lrwxrwxrwx. 1 root root 35 Nov  6 17:36 /etc/resolv.conf -> /var/run/NetworkManager/resolv.conf

# namei /etc/resolv.conf
f: /etc/resolv.conf
 d /
 d etc
 l resolv.conf -> /var/run/NetworkManager/resolv.conf
   d /
   d var
   l run -> ../run
     d ..
     d run
     NetworkManager - No such file or directory

That is, /etc/resolv.conf is a dead symlink. NetworkManager is down.
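
The same condition can be checked from Python. This is only an illustrative sketch; the helper name is_dead_symlink is hypothetical and is not part of vdsm:

```python
import os


def is_dead_symlink(path):
    """Return True if path is a symlink whose target does not exist."""
    # islink() inspects the link itself; exists() follows the link,
    # so a dangling link is "a link" but does not "exist".
    return os.path.islink(path) and not os.path.exists(path)


if __name__ == '__main__':
    print(is_dead_symlink('/etc/resolv.conf'))
```

On the host described here, this would be expected to report the link as dead, since the NetworkManager directory under /var/run is missing.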

I am not fully certain how I reached this state; I might open other bugs later on.

Now, when I run 'systemctl start vdsmd', I get this in the journal:

Nov 08 17:36:08 lvfhost1.home.local vdsm-tool[18191]: Traceback (most recent call last):
Nov 08 17:36:08 lvfhost1.home.local vdsm-tool[18191]:   File "/usr/share/vdsm/vdsm-restore-net-config", line 479, in <module>
Nov 08 17:36:08 lvfhost1.home.local vdsm-tool[18191]:     restore(args)
Nov 08 17:36:08 lvfhost1.home.local vdsm-tool[18191]:   File "/usr/share/vdsm/vdsm-restore-net-config", line 442, in restore
Nov 08 17:36:08 lvfhost1.home.local vdsm-tool[18191]:     unified_restoration()
Nov 08 17:36:08 lvfhost1.home.local vdsm-tool[18191]:   File "/usr/share/vdsm/vdsm-restore-net-config", line 134, in unified_restoration
Nov 08 17:36:08 lvfhost1.home.local vdsm-tool[18191]:     changed_config = _filter_changed_nets_bonds(available_config)
Nov 08 17:36:08 lvfhost1.home.local vdsm-tool[18191]:   File "/usr/share/vdsm/vdsm-restore-net-config", line 261, in _filter_changed_nets_bonds
Nov 08 17:36:08 lvfhost1.home.local vdsm-tool[18191]:     kernel_config = kernelconfig.KernelConfig(NetInfo(netswitch.netinfo()))
Nov 08 17:36:08 lvfhost1.home.local vdsm-tool[18191]:   File "/usr/lib/python2.7/site-packages/vdsm/network/netswitch.py", line 298, in netinfo
Nov 08 17:36:08 lvfhost1.home.local vdsm-tool[18191]:     _netinfo = netinfo_get(compatibility=compatibility)
Nov 08 17:36:08 lvfhost1.home.local vdsm-tool[18191]:   File "/usr/lib/python2.7/site-packages/vdsm/network/netinfo/cache.py", line 103, in get
Nov 08 17:36:08 lvfhost1.home.local vdsm-tool[18191]:     return _get(vdsmnets)
Nov 08 17:36:08 lvfhost1.home.local vdsm-tool[18191]:   File "/usr/lib/python2.7/site-packages/vdsm/network/netinfo/cache.py", line 63, in _get
Nov 08 17:36:08 lvfhost1.home.local vdsm-tool[18191]:     'vlans': {}, 'nameservers': get_host_nameservers()}
Nov 08 17:36:08 lvfhost1.home.local vdsm-tool[18191]:   File "/usr/lib/python2.7/site-packages/vdsm/network/netinfo/dns.py", line 26, in get_host_nameservers
Nov 08 17:36:08 lvfhost1.home.local vdsm-tool[18191]:     with open(DNS_CONF_FILE, 'r') as file_object:
Nov 08 17:36:08 lvfhost1.home.local vdsm-tool[18191]: IOError: [Errno 2] No such file or directory: '/etc/resolv.conf'
Nov 08 17:36:08 lvfhost1.home.local vdsm-tool[18191]: Traceback (most recent call last):
Nov 08 17:36:08 lvfhost1.home.local vdsm-tool[18191]:   File "/usr/bin/vdsm-tool", line 219, in main
Nov 08 17:36:08 lvfhost1.home.local vdsm-tool[18191]:     return tool_command[cmd]["command"](*args)
Nov 08 17:36:08 lvfhost1.home.local vdsm-tool[18191]:   File "/usr/lib/python2.7/site-packages/vdsm/tool/restore_nets.py", line 41, in restore_command
Nov 08 17:36:08 lvfhost1.home.local vdsm-tool[18191]:     exec_restore(cmd)
Nov 08 17:36:08 lvfhost1.home.local vdsm-tool[18191]:   File "/usr/lib/python2.7/site-packages/vdsm/tool/restore_nets.py", line 54, in exec_restore
Nov 08 17:36:08 lvfhost1.home.local vdsm-tool[18191]:     raise EnvironmentError('Failed to restore the persisted networks')
Nov 08 17:36:08 lvfhost1.home.local vdsm-tool[18191]: EnvironmentError: Failed to restore the persisted networks

Version-Release number of selected component (if applicable):

Current master snapshot, Fedora 24

How reproducible:

Not sure, but currently it seems to fail every time.

Steps to Reproduce:
1. Install and set up an engine, add a host
2. Put the host into maintenance
3. Make /etc/resolv.conf on the host a dead symlink as above
4. Restart vdsmd

Actual results:

Fails as above

Expected results:

vdsmd starts successfully, as if the file were empty (no nameservers, etc.)
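
A minimal sketch of that expected behaviour, not the actual vdsm code or fix. The names get_host_nameservers and DNS_CONF_FILE come from the traceback above; the conf_file parameter and the parsing logic are assumptions made for illustration:

```python
import errno

DNS_CONF_FILE = '/etc/resolv.conf'


def get_host_nameservers(conf_file=DNS_CONF_FILE):
    """Return the nameserver addresses listed in conf_file.

    A missing file, including a dead symlink, is treated like an
    empty file and yields an empty list instead of an exception.
    """
    try:
        with open(conf_file, 'r') as file_object:
            lines = file_object.readlines()
    except (IOError, OSError) as e:
        # Opening a dead symlink fails with ENOENT, as in the traceback.
        # Any other error (e.g. permissions) is still raised.
        if e.errno != errno.ENOENT:
            raise
        return []

    nameservers = []
    for line in lines:
        words = line.split()
        # Assumed format: "nameserver <address>", one entry per line.
        if len(words) >= 2 and words[0] == 'nameserver':
            nameservers.append(words[1])
    return nameservers
```

Only ENOENT is swallowed, so other failures to read the file would still surface rather than being silently reported as "no nameservers".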

Additional info:

Attaching engine and vdsm sosreports.

The actual flow was something like this:

Installed and set up an engine and a host, both Fedora 24, around a week ago. All went well.

Put the host into maintenance, then shut down both.

Now started both, ran 'dnf update' on both, and ran engine-setup on the engine.

Tried to reinstall the host from the web admin.

Comment 1 Yedidyah Bar David 2016-11-08 16:01:27 UTC
Created attachment 1218606
host sosreport

Comment 2 Michael Burman 2016-11-20 13:23:24 UTC
Verified on - vdsm-4.18.16-1.el7ev.x86_64