Bug 988004
Summary: | [vdsm] OSError: [Errno 2] No such file or directory: '/sys/class/net/ovirtmgmt/brif' | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | [Retired] oVirt | Reporter: | Martin Pavlik <mpavlik> | ||||||
Component: | vdsm | Assignee: | Assaf Muller <amuller> | ||||||
Status: | CLOSED NEXTRELEASE | QA Contact: | Haim <hateya> | ||||||
Severity: | urgent | Docs Contact: | |||||||
Priority: | unspecified | ||||||||
Version: | 3.3 | CC: | abaron, acathrow, bazulay, danken, gklein, iheim, mgoldboi, mpavlik, yeylon | ||||||
Target Milestone: | --- | ||||||||
Target Release: | 3.3 | ||||||||
Hardware: | x86_64 | ||||||||
OS: | Linux | ||||||||
Whiteboard: | network | ||||||||
Fixed In Version: | 4.12.0 | Doc Type: | Bug Fix | ||||||
Doc Text: | Story Points: | --- | |||||||
Clone Of: | Environment: | ||||||||
Last Closed: | 2013-08-07 14:22:21 UTC | Type: | Bug | ||||||
Regression: | --- | Mount Type: | --- | ||||||
Documentation: | --- | CRM: | |||||||
Verified Versions: | Category: | --- | |||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||
Embargoed: | |||||||||
Bug Depends On: | 988995 | ||||||||
Bug Blocks: | 918494 | ||||||||
Attachments: |
|
vdsm dies while doing the initial setupNetwork: Thread-15::DEBUG::2013-07-24 15:48:31,146::BindingXMLRPC::979::vds::(wrapper) client [10.34.63.66]::call ping with () {} Thread-15::DEBUG::2013-07-24 15:48:31,146::BindingXMLRPC::986::vds::(wrapper) return ping with {'status': {'message': 'Done', 'code': 0}} Thread-16::DEBUG::2013-07-24 15:48:31,148::BindingXMLRPC::979::vds::(wrapper) client [10.34.63.66]::call setupNetworks with ({'ovirtmgmt': {'nic': 'em1', 'bootproto': 'dhcp', 'STP': 'no', 'bridged': 'true'}}, {}, {'connectivityCheck': 'true', 'connectivityTimeout': 120}) {} MainThread::INFO::2013-07-24 15:48:33,896::vdsm::101::vds::(run) (PID: 9487) I am the actual vdsm 4.12.0-0.1.rc3.fc19 dell-r210ii-08.rhev.lab.eng.brq.redhat.com (3.9.5-301.fc19.x86_64) There's a seeming race between setupNetwork and the response from dhclient, but this on its own right does not explain the vdsm crash. MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,149::supervdsmServer::88::SuperVdsm.ServerCallback::(wrapper) calling to setupNetworks with ({'ovirtmgmt': {'nic': 'em1', 'bootproto': 'dhcp', 'STP': 'no', 'bridged': 'true'}}, {}, {'connectivityCheck': 'true', 'connectivityTimeout': 120}) {} MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,149::utils::486::root::(execCmd) '/sbin/ip route show to 0.0.0.0/0 table all' (cwd None) MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,151::utils::505::root::(execCmd) SUCCESS: <err> = ''; <rc> = 0 MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,152::libvirtconnection::124::libvirtconnection::(get) trying to connect libvirt MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,165::configNetwork::480::setupNetworks::(setupNetworks) Setting up network according to configuration: networks:{'ovirtmgmt': {'nic': 'em1', 'bootproto': 'dhcp', 'STP': 'no', 'bridged': 'true'}}, bondings:{}, options:{'connectivityCheck': 'true', 'connectivityTimeout': 120} MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,166::configNetwork::484::root::(setupNetworks) Validating configuration MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,166::utils::486::root::(execCmd) '/sbin/ip route show to 0.0.0.0/0 table all' (cwd None) MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,168::utils::505::root::(execCmd) SUCCESS: <err> = ''; <rc> = 0 MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,172::configNetwork::487::setupNetworks::(setupNetworks) Applying... MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,173::utils::486::root::(execCmd) '/sbin/ip route show to 0.0.0.0/0 table all' (cwd None) MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,175::utils::505::root::(execCmd) SUCCESS: <err> = ''; <rc> = 0 MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,179::utils::486::root::(execCmd) '/sbin/ip route show to 0.0.0.0/0 table all' (cwd None) MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,181::utils::505::root::(execCmd) SUCCESS: <err> = ''; <rc> = 0 MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,186::configNetwork::524::setupNetworks::(setupNetworks) Adding network 'ovirtmgmt' MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,186::utils::486::root::(execCmd) '/sbin/ip route show to 0.0.0.0/0 table all' (cwd None) MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,188::utils::505::root::(execCmd) SUCCESS: <err> = ''; <rc> = 0 MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,192::configNetwork::174::root::(addNetwork) validating network... MainProcess|Thread-16::INFO::2013-07-24 15:48:31,192::configNetwork::188::root::(addNetwork) Adding network ovirtmgmt with vlan=None, bonding=None, nics=['em1'], bondingOptions=None, mtu=None, bridged=True, options={'bootproto': 'dhcp', 'STP': 'no', 'implicitBonding': True} MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,193::ifcfg::362::root::(_persistentBackup) backing up ifcfg-ovirtmgmt: # original file did not exist MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,193::ifcfg::233::root::(writeBackupFile) Persistently backed up /var/lib/vdsm/netconfback/ifcfg-ovirtmgmt (until next 'set safe config') MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,195::ifcfg::644::Storage.Misc.excCmd::(ifdown) '/usr/sbin/ifdown ovirtmgmt' (cwd None) MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,345::ifcfg::644::Storage.Misc.excCmd::(ifdown) SUCCESS: <err> = ''; <rc> = 0 MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,345::utils::486::root::(execCmd) '/sbin/ip route show to 0.0.0.0/0 table all' (cwd None) MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,347::utils::505::root::(execCmd) SUCCESS: <err> = ''; <rc> = 0 MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,352::ifcfg::289::root::(_atomicBackup) Backed up /etc/sysconfig/network-scripts/ifcfg-em1 MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,352::ifcfg::362::root::(_persistentBackup) backing up ifcfg-em1: # Generated by parse-kickstart IPV6INIT=no BOOTPROTO=dhcp DEVICE=em1 ONBOOT=yes UUID=c7b808a5-a299-4fa6-a488-69028ec2f56a HWADDR=D0:67:E5:F0:7F:2F TYPE=Ethernet DEFROUTE=yes PEERDNS=yes PEERROUTES=yes IPV4_FAILURE_FATAL=no NAME="System em1" MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,353::ifcfg::233::root::(writeBackupFile) Persistently backed up /var/lib/vdsm/netconfback/ifcfg-em1 (until next 'set safe config') MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,353::ifcfg::644::Storage.Misc.excCmd::(ifdown) '/usr/sbin/ifdown em1' (cwd None) MainThread::DEBUG::2013-07-24 15:48:31,417::sourceRouteThread::18::root::(process_IN_CLOSE_WRITE_filePath) Responding to DHCP response in /var/run/vdsm/sourceRoutes/1374673711 MainThread::INFO::2013-07-24 15:48:31,418::sourceRouteThread::35::root::(process_IN_CLOSE_WRITE_filePath) interface em1 is not a libvirt interface MainThread::WARNING::2013-07-24 15:48:31,419::utils::113::root::(rmFile) File: /var/run/vdsm/trackedInterfaces/em1 already removed MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,561::ifcfg::644::Storage.Misc.excCmd::(ifdown) SUCCESS: <err> = 'bridge ovirtmgmt does not exist!\n'; <rc> = 0 MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,561::ifcfg::651::Storage.Misc.excCmd::(_ifup) '/usr/sbin/ifup em1' (cwd None) MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,714::ifcfg::651::Storage.Misc.excCmd::(_ifup) SUCCESS: <err> = ''; <rc> = 0 MainProcess|Thread-16::DEBUG::2013-07-24 15:48:31,715::ifcfg::651::Storage.Misc.excCmd::(_ifup) '/usr/sbin/ifup ovirtmgmt' (cwd None) MainProcess|storageRefresh::DEBUG::2013-07-24 15:48:32,376::misc::817::SamplingMethod::(__call__) Returning last result MainProcess::DEBUG::2013-07-24 15:48:34,224::supervdsmServer::88::SuperVdsm.ServerCallback::(wrapper) calling to readMultipathConf with () {} ============================== Unrelated rant: the introduction of the MainProcess|Thread-16::DEBUG::2013-07-24 15:49:15,045::libvirtconnection::101::libvirtconnection::(wrapper) Unknown libvirterror: ecode: 9 edom: 19 level: 2 message: operation failed: network 'vdsm-ovirtmgmt' already exists with uuid 9973dc75-a45a-4c84-b779-53041e899745 log by http://gerrit.ovirt.org/13990 is annoying an misleading. ============================== Another unrelated rant: supervdsm.log should include the result of its functions, not only its calls. Created attachment 779367 [details]
Events aggregated from messages, vdsm and supervdsm logs
I reproduced the bug on a fresh F19 installation.
Two import distinctions:
1) It has NetworkManager on by default, like the host in the original bug report. However: I used a newer version of host_deploy which turns NetworkManager off. The bug still happened.
2) I disabled the source routing dhclient hook's ability to write to /var/run/vdsm/sourceRoutes, in effect neutralizing sourceRouteThread (A thread spawned by supervdsm which configures source routing). The bug still happened.
I added a file that aggregates all the events I find important from the messages, vdsm and supervdsm logs.
What I know so far:
1) VDSM is stopped immediately after its first successful initialization and started again.
2) The initial setupNetworks that configures the ovirtmgmt bridge fails and rolls back the config due to a libvirt exception as detailed in the log Martin attached.
The cause has been discovered and a fix is being worked on by Yeela Kaplan. Please verify that the solution for bug 988995 by vdsm 4.12.0 fixes this issue. |
Created attachment 777806 [details] log_collector Description of problem: When adding new (freshly installed) host (F19) to clean setup. Host install fails with following reason in GUI: Host dell-08 installation failed. Failed to configure manamgent network on the host. Version-Release number of selected component (if applicable): engine: oVirt Engine Version: 3.3.0-0.3.beta1.fc19 (running on F19) host: vdsm-4.12.0-0.1.rc3.fc19 How reproducible: 100% Steps to Reproduce: 1. add clean host to ovirt Actual results: install fails Expected results: install successfull Additional info: vdsm.log Thread-35::INFO::2013-07-24 15:50:18,179::netinfo::428::root::(_getNetInfo) Obtaining info for net ovirtmgmt. Traceback (most recent call last): File "/usr/lib64/python2.7/site-packages/vdsm/netinfo.py", line 411, in _getNetInfo data.update({'ports': ports(iface), 'stp': bridge_stp_state(iface), File "/usr/lib64/python2.7/site-packages/vdsm/netinfo.py", line 181, in ports return os.listdir('/sys/class/net/' + bridge + '/brif') OSError: [Errno 2] No such file or directory: '/sys/class/net/ovirtmgmt/brif'