Created attachment 781041
engine and vdsm logs

Description of problem:
When adding a new host to a cluster, the installation fails and the host becomes non-operational because setupNetworks fails to configure the rhevm bridge on the host. Attaching rhevm to ethX from the setupNetworks dialog solves the problem, and the host comes up.

Version-Release number of selected component (if applicable):
rhevm-3.3.0-0.11.master.el6ev.noarch
vdsm-4.12.0-rc3.12.git139ec2f.el6ev.x86_64

Steps to Reproduce:
1. Add a new host to the cluster

Actual results:
The host is non-operational and rhevm bridge creation fails.

Expected results:
The host is up with the rhevm bridge.
It looks like this can only happen if the latest version of VDSM is already installed and running on the host.
From rose09-2013073114031375268609/var/log/vdsm/supervdsm.log: supervdsm was started on July 29. A day later, setupNetworks was called and failed because libvirtd had been restarted during that interval.

MainThread::DEBUG::2013-07-29 16:38:01,209::supervdsmServer::363::SuperVdsm.Server::(main) Making sure I'm root - SuperVdsm
MainThread::DEBUG::2013-07-29 16:38:01,212::libvirtconnection::124::libvirtconnection::(get) trying to connect libvirt
Thread-15::DEBUG::2013-07-30 15:32:55,803::supervdsmServer::88::SuperVdsm.ServerCallback::(wrapper) calling to setupNetworks with ({'rhevm': {'nic ': 'eth0', 'bootproto': 'dhcp', 'STP': 'no', 'bridged': 'true'}}, {}, {'connectivityCheck': 'true', 'connectivityTimeout': 120}) {}
MainProcess|Thread-15::ERROR::2013-07-30 15:32:55,809::libvirtconnection::94::libvirtconnection::(wrapper) connection to libvirt broken. ecode: 1 edom: 7
MainProcess|Thread-15::ERROR::2013-07-30 15:32:55,809::libvirtconnection::96::libvirtconnection::(wrapper) taking calling process down.

Indeed, libvirtd was restarted 3 seconds earlier (note that supervdsm.log timestamps are in UTC+3, while the libvirtd log is in UTC):

2013-07-30 12:32:52.211+0000: 12320: info : libvirt version: 0.10.2, package: 18.el6_4.9 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2013-06-20-16:56:19, x86-002.build.bos.redhat.com)

That restart was initiated by a failed ovirt-host-deploy run. From meni-rhevm-33-2013073114031375268590/var/log/ovirt-engine/host-deploy/ovirt-20130730152902-rose08.qa.lab.tlv.redhat.com-14346118.log:

2013-07-30 15:29:01 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.execute:441 execute-output: ('/usr/bin/vdsm-tool', 'libvirt-configure') stdout:
Starting configure libvirt to VDSM ...
=Done configuring libvirt=

The first ovirt-host-deploy run failed due to an iptables startup problem. A subsequent deployment finished successfully.

I believe the problem could have been averted if ovirt-host-deploy restarted supervdsmd prior to configuring libvirt, just as it does for vdsmd itself.
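The "3 seconds earlier" figure can be checked by putting both timestamps on a common clock. The following is only a sketch: the UTC+3 offset for supervdsm.log is taken from the note above, and the helper names (`parse_supervdsm`, `parse_libvirtd`) are made up for illustration and are not part of VDSM.

```python
from datetime import datetime, timedelta, timezone

# Assumption taken from the comment above: supervdsm.log uses local time, UTC+3.
SUPERVDSM_TZ = timezone(timedelta(hours=3))


def parse_supervdsm(stamp: str) -> datetime:
    """Parse a supervdsm.log timestamp such as '2013-07-30 15:32:55,803'."""
    naive = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f")
    return naive.replace(tzinfo=SUPERVDSM_TZ)


def parse_libvirtd(stamp: str) -> datetime:
    """Parse a libvirtd log timestamp such as '2013-07-30 12:32:52.211+0000'."""
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f%z")


# Timestamps copied from the two log excerpts quoted above.
libvirtd_start = parse_libvirtd("2013-07-30 12:32:52.211+0000")
setup_call = parse_supervdsm("2013-07-30 15:32:55,803")

# Subtracting timezone-aware datetimes accounts for the differing offsets.
gap_seconds = (setup_call - libvirtd_start).total_seconds()
print(f"setupNetworks was called {gap_seconds:.3f} s after libvirtd started")
```

With the values quoted above, the gap comes out at roughly 3.6 seconds, which is consistent with the already-running supervdsm still holding a connection to the libvirtd instance from before the restart.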
Note that before we made supervdsmd its own service, restarting vdsmd implicitly restarted it as well.
ovirt-host-deploy-1.1.0-0.6.master.el6ev.noarch
Closing - RHEV 3.3 Released