Created attachment 778268 [details]
supervdsm

Description of problem:
Install ovirt-node and configure the network without bridges. Apply workarounds for other issues (enable firewall port 54321, set root password and ssh password authentication manually using /usr/libexec/ovirt-config-password). Add the host from the ovirt-engine add host flow.

Version-Release number of selected component (if applicable):
ovirt-node-iso-3.0.0-5.0.5.vdsm.fc19.iso
vdsm-4.12.0-0.1.rc3

How reproducible:
Always (on f19, haven't tried el6 yet)

Steps to Reproduce:
1. Install ovirt-node
2. Set root password and ssh passwd auth
3. Add host from engine

Actual results:
Fails to set up the management network

Expected results:
Sets up the management network

Additional info:
Created attachment 778269 [details] vdsm
Created attachment 778271 [details] engine log
This also fails on EL6 using ovirt-node-iso-3.0.0-5.1.5.vdsm.el6.iso
Does this work with static (non-dhcp) addresses? Does using ovirt-host-deploy with http://gerrit.ovirt.org/#/c/17306/ make the issue hide away?

MainProcess|Thread-15::DEBUG::2013-07-25 13:22:50,803::ifcfg::651::Storage.Misc.excCmd::(_ifup) SUCCESS: <err> = '/etc/dhcp/dhclient.d/sourceRoute.sh: line 6: /var/run/vdsm/sourceRoutes/1374758570: Permission denied\n'; <rc> = 0
MainProcess|Thread-15::ERROR::2013-07-25 13:22:50,803::supervdsmServer::91::SuperVdsm.ServerCallback::(wrapper) Error in setupNetworks
Traceback (most recent call last):
  File "/usr/share/vdsm/supervdsmServer.py", line 89, in wrapper
  File "/usr/share/vdsm/supervdsmServer.py", line 187, in setupNetworks
  File "/usr/share/vdsm/configNetwork.py", line 541, in setupNetworks
ConfigNetworkError: (10, 'connectivity check failed')

This is, again, the yet-unexplained effect of systemd stopping vdsmd while in setupNetworks(). See bug 988004.

Thread-15::DEBUG::2013-07-25 13:20:32,473::BindingXMLRPC::979::vds::(wrapper) client [172.31.0.3]::call setupNetworks with ({'ovirtmgmt': {'nic': 'eth0', 'bootproto': 'dhcp', 'STP': 'no', 'bridged': 'true'}}, {}, {'connectivityCheck': 'true', 'connectivityTimeout': 120}) {}
Thread-16::DEBUG::2013-07-25 13:20:32,476::BindingXMLRPC::979::vds::(wrapper) client [172.31.0.3]::call ping with () {}
Thread-16::DEBUG::2013-07-25 13:20:32,476::BindingXMLRPC::986::vds::(wrapper) return ping with {'status': {'message': 'Done', 'code': 0}}
MainThread::INFO::2013-07-25 13:20:52,126::vdsm::101::vds::(run) (PID: 4730) I am the actual vdsm 4.12.0-0.1.rc3.fc19 localhost.localdomain (3.9.9-302.fc19.x86_64)
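To put a number on the log excerpt above, here is a minimal Python sketch that measures how long after the setupNetworks call the "I am the actual vdsm" startup line shows up. It assumes the `::`-separated field layout seen in the quoted lines (thread, level, timestamp, ...); the helper name `parse_timestamp` is made up for illustration and is not part of vdsm.

```python
from datetime import datetime


def parse_timestamp(line):
    """Return the timestamp of a vdsm-style log line.

    Assumes the third '::'-separated field holds the timestamp, as in
    the lines quoted in this bug.
    """
    fields = line.split("::", 6)
    return datetime.strptime(fields[2], "%Y-%m-%d %H:%M:%S,%f")


# Both lines are copied from the vdsm.log excerpt in this comment.
call_line = (
    "Thread-15::DEBUG::2013-07-25 13:20:32,473::BindingXMLRPC::979::vds::"
    "(wrapper) client [172.31.0.3]::call setupNetworks with "
    "({'ovirtmgmt': {'nic': 'eth0', 'bootproto': 'dhcp', 'STP': 'no', "
    "'bridged': 'true'}}, {}, {'connectivityCheck': 'true', "
    "'connectivityTimeout': 120}) {}"
)
restart_line = (
    "MainThread::INFO::2013-07-25 13:20:52,126::vdsm::101::vds::(run) "
    "(PID: 4730) I am the actual vdsm 4.12.0-0.1.rc3.fc19 "
    "localhost.localdomain (3.9.9-302.fc19.x86_64)"
)

# connectivityTimeout value taken from the setupNetworks call above.
CONNECTIVITY_TIMEOUT = 120

gap = (parse_timestamp(restart_line) - parse_timestamp(call_line)).total_seconds()
print("vdsm started again %.3f s after the setupNetworks call" % gap)
print("inside the connectivity timeout window: %s" % (gap < CONNECTIVITY_TIMEOUT))
```

With the two quoted lines the gap is a little under 20 seconds, well inside the 120-second connectivityTimeout, which fits the observation that vdsmd was restarted while setupNetworks() was still in progress.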
(In reply to Dan Kenigsberg from comment #4)
> Does this work with static (non-dhcp) addresses? Does using
> ovirt-host-deploy with http://gerrit.ovirt.org/#/c/17306/ make the issue
> hide away?

Static does not help. Running with the newer ovirt-host-deploy does not help either (note: ovirt-host-deploy-offline is used in ovirt-node).
I'm a bit out of touch with ovirt-node these days; can you think of a way to disable NetworkManager and enable the legacy "network" service before setupNetworks takes place?
(In reply to Dan Kenigsberg from comment #6)
> I'm a bit out of touch with ovirt-node these days; can you think of a way to
> disable NetworkManager and enable legacy "network" service before
> setupNetworks takes place?

No need to disable it; we don't use NetworkManager at all in ovirt-node currently (it's on the roadmap, but not used in any way now).
Mike, would you attach logs (vdsm.log, supervdsm.log, journalctl) with static addresses? We need any hint we can get.
I need to re-run the scenario to get the logs, so it will be a little while.

As for steps to reproduce, use the isos from ovirt.org/beta/iso
* boot the iso to the installer and follow the screens to install
* reboot
* when it comes up again, log in as admin with the password you provided in the installer
* press F2 to drop to a shell [1]
* run "/usr/libexec/ovirt-config-password" (interactive command line tool)
** run "set_ssh_password_authentication" in the tool, answer "Y" to the question
** run "set_root_password" in the tool, follow prompts to set root password
** run "quit" to exit the tool
* open firewall port
** f19: firewall-cmd --zone=public --add-port 54321/tcp
** EL6: iptables -A INPUT -p tcp --dport 54321 -j ACCEPT
* exit to leave the shell and go back to the TUI
* configure networking by selecting the Network tab
** choose your nic
** select DHCP or static
** fill out fields if static
** choose save
* go to oVirt Engine
* add a new host with the root password and ip/hostname of the node

[1] the firewall and password setting issues are bugs that are being worked on
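Before adding the host, it can help to confirm from the engine machine that the firewall workaround in the steps above actually took effect. The following is a minimal Python sketch, not part of oVirt; the function name `port_is_open` and the placeholder hostname in the comment are illustrative assumptions. It only checks that a TCP connection to port 54321 can be established.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection resolves the host and applies the timeout
        # to the connect attempt.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and resolution failures.
        return False


# Example (hypothetical hostname, replace with the node's ip/hostname):
# print(port_is_open("node.example.com", 54321))
```

A False result only says the port is not reachable from where the check runs; it does not distinguish a closed firewall from vdsmd not listening.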
Created attachment 778794 [details]
supervdsm f19 static ip

el6 dhcp and static worked today
Created attachment 778795 [details] vdsm f19 static ip
Thread-15::ERROR::2013-07-26 15:14:29,132::API::1261::vds::(setupNetworks) connectivity check failed
Traceback (most recent call last):
  File "/usr/share/vdsm/API.py", line 1259, in setupNetworks
  File "/usr/share/vdsm/supervdsm.py", line 50, in __call__
  File "/usr/share/vdsm/supervdsm.py", line 48, in <lambda>
  File "<string>", line 2, in setupNetworks
  File "/usr/lib64/python2.7/multiprocessing/managers.py", line 773, in _callmethod
ConfigNetworkError: (10, 'connectivity check failed')
Thread-15::DEBUG::2013-07-26 15:14:29,133::BindingXMLRPC::986::vds::(wrapper) return setupNetworks with {'status': {'message': 'connectivity check failed', 'code': 10}}
Reproduced in ovirt-node f19. Funny thing is that the webui first reports a setupNetworks failure on install. Then, without any action on my part, after a while, it notices that the host is reachable (the net was properly created) and sets it to non-operational because it doesn't see ovirtmgmt (it was correctly created but rolled back).
Remaining tasks: Test with the new iso resulting from the resolution of #988916 and, if there are no surprises, this one will be closed.
Well, I tested it with the fix in http://resources.ovirt.org/releases/beta/iso/ovirt-node-iso-3.0.0-5.1.6.vdsm.fc19.iso The result is that ovirtmgmt is created and is up, but setupNetworks operation is marked as failed and the networks are not persisted (I guess that ovirt-host-deploy seeing that it fails doesn't call setSafeNetworkConfig).
Created attachment 778880 [details] webui of the process of fail then magically UP
Created attachment 778884 [details] ovirt-node-fix supervdsm.log supervdsm.log shows clearly how the operation of setupNetworks does not do rollback due to loss of connectivity.
Created attachment 778885 [details] ovirt-node-fix vdsm.log
(In reply to Antoni Segura Puimedon from comment #15)
> Well, I tested it with the fix in
> http://resources.ovirt.org/releases/beta/iso/ovirt-node-iso-3.0.0-5.1.6.vdsm.fc19.iso
>
> The result is that ovirtmgmt is created and is up, but setupNetworks
> operation is marked as failed and the networks are not persisted (I guess
> that ovirt-host-deploy seeing that it fails doesn't call
> setSafeNetworkConfig).

Can I have the log, please? Recent ovirt-host-deploy calls vdsm-store-net-config on success and vdsm-restore-net-config on failure.
Created attachment 778886 [details]
systemd-journald-vdsmd.log

Result of doing journalctl -b -u vdsmd

It shows that during vdsmd operation systemd is stopping and starting the daemon without any apparent reason. This was the reason that the successful setupNetworks never got the chance to return the successful message to the engine.
Created attachment 778887 [details] journal.log Full journal since boot to see more info of what is going on.
Created attachment 778888 [details] json-journal.log This log level shows much more information.
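For anyone going through a JSON-format journal like this one, a small Python sketch for pulling out the service start/stop entries might look as follows. It assumes the input is `journalctl -o json` output (one JSON object per line, with `MESSAGE` and `__REALTIME_TIMESTAMP` fields). The message prefixes it matches are an assumption about systemd's wording and may need adjusting; the function name `lifecycle_events` is hypothetical.

```python
import json
from datetime import datetime, timezone

# Assumed wording of systemd unit lifecycle messages; adjust as needed.
LIFECYCLE_PREFIXES = ("Starting ", "Started ", "Stopping ", "Stopped ")


def lifecycle_events(json_lines):
    """Return (utc_datetime, message) pairs for start/stop entries.

    json_lines is any iterable of strings, each holding one journal
    entry as a JSON object. Blank lines are ignored.
    """
    events = []
    for raw in json_lines:
        raw = raw.strip()
        if not raw:
            continue
        entry = json.loads(raw)
        message = entry.get("MESSAGE")
        # Non-text messages are not exported as JSON strings; skip them.
        if not isinstance(message, str):
            continue
        if message.startswith(LIFECYCLE_PREFIXES):
            # __REALTIME_TIMESTAMP is microseconds since the epoch, as text.
            micros = int(entry["__REALTIME_TIMESTAMP"])
            when = datetime.fromtimestamp(micros / 1e6, tz=timezone.utc)
            events.append((when, message))
    return events


# Example usage against a saved log file:
# with open("json-journal.log") as f:
#     for when, message in lifecycle_events(f):
#         print(when.isoformat(), message)
```

Lining these timestamps up against the setupNetworks call in vdsm.log is one way to check whether a stop/start of vdsmd falls inside the connectivity check window.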
I do not believe there's anything to do here. Let's see if I'm wrong.
closing as this should be in 3.3 (doing so in bulk, so may be incorrect)