Description of problem:
HE-RHEVH6.7 | Failed to deploy HE on bonded network infrastructure of RHEVH6.7.

This bug was cloned from https://bugzilla.redhat.com/show_bug.cgi?id=1235591

Nikolai Sednev 2015-07-01 06:25:35 EDT

Basic deployment using PXE of HE rhevm-3.5.3.1-1.4.el6ev.noarch on a Red Hat Enterprise Linux Server release 6.7 (Santiago) OS VM, over two Red Hat Enterprise Virtualization Hypervisor release 7.1 (20150609.0.el7ev) hosts, while all were using NFS3, succeeded!

Components used during deployment were:

Hosts:
ovirt-node-3.2.3-3.el7.noarch
ovirt-node-branding-rhev-3.2.3-3.el7.noarch
ovirt-node-plugin-hosted-engine-0.2.0-15.0.el7ev.noarch
ovirt-node-plugin-vdsm-0.2.0-25.el7ev.noarch
ovirt-host-deploy-1.3.0-2.el7ev.noarch
ovirt-node-selinux-3.2.3-3.el7.noarch
ovirt-host-deploy-offline-1.3.0-3.el7ev.x86_64
ovirt-hosted-engine-setup-1.2.4-2.el7ev.noarch
ovirt-node-plugin-cim-3.2.3-3.el7.noarch
ovirt-node-plugin-snmp-3.2.3-3.el7.noarch
ovirt-node-plugin-rhn-3.2.3-3.el7.noarch
ovirt-hosted-engine-ha-1.2.6-2.el7ev.noarch
libvirt-daemon-driver-interface-1.2.8-16.el7_1.3.x86_64
libvirt-cim-0.6.3-6.el7.x86_64
libvirt-daemon-kvm-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-1.2.8-16.el7_1.3.x86_64
sanlock-python-3.2.2-2.el7.x86_64
libvirt-daemon-config-nwfilter-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-nodedev-1.2.8-16.el7_1.3.x86_64
libvirt-lock-sanlock-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-network-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-lxc-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-qemu-1.2.8-16.el7_1.3.x86_64
libvirt-1.2.8-16.el7_1.3.x86_64
mom-0.4.1-5.el7ev.noarch
qemu-kvm-rhev-2.1.2-23.el7_1.3.x86_64
vdsm-4.16.20-1.el7ev.x86_64
sanlock-3.2.2-2.el7.x86_64
libvirt-client-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-nwfilter-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-secret-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-config-network-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-storage-1.2.8-16.el7_1.3.x86_64
sanlock-lib-3.2.2-2.el7.x86_64
libvirt-python-1.2.8-7.el7_1.1.x86_64

On engine:
rhevm-branding-rhev-3.5.0-3.el6ev.noarch
rhevm-lib-3.5.3.1-1.4.el6ev.noarch
rhevm-backend-3.5.3.1-1.4.el6ev.noarch
rhevm-dbscripts-3.5.3.1-1.4.el6ev.noarch
rhevm-iso-uploader-3.5.1-1.el6ev.noarch
rhevm-log-collector-3.5.3-2.el6ev.noarch
rhevm-spice-client-x86-cab-3.5-3.el6.noarch
rhevm-dependencies-3.5.1-1.el6ev.noarch
rhevm-doc-3.5.3-1.el6eng.noarch
rhevm-image-uploader-3.5.0-4.el6ev.noarch
rhevm-setup-plugin-ovirt-engine-common-3.5.3.1-1.4.el6ev.noarch
rhevm-restapi-3.5.3.1-1.4.el6ev.noarch
rhevm-tools-3.5.3.1-1.4.el6ev.noarch
rhevm-spice-client-x64-msi-3.5-3.el6.noarch
rhevm-setup-3.5.3.1-1.4.el6ev.noarch
rhevm-sdk-python-3.5.2.1-1.el6ev.noarch
ovirt-host-deploy-java-1.3.0-2.el6ev.noarch
ovirt-host-deploy-1.3.0-2.el6ev.noarch
rhevm-setup-plugin-websocket-proxy-3.5.3.1-1.4.el6ev.noarch
rhevm-webadmin-portal-3.5.3.1-1.4.el6ev.noarch
rhevm-spice-client-x64-cab-3.5-3.el6.noarch
rhevm-userportal-3.5.3.1-1.4.el6ev.noarch
rhevm-3.5.3.1-1.4.el6ev.noarch
rhevm-guest-agent-common-1.0.10-2.el6ev.noarch
rhevm-websocket-proxy-3.5.3.1-1.4.el6ev.noarch
rhevm-spice-client-x86-msi-3.5-3.el6.noarch
rhevm-setup-plugin-ovirt-engine-3.5.3.1-1.4.el6ev.noarch
rhevm-extensions-api-impl-3.5.3.1-1.4.el6ev.noarch
rhevm-cli-3.5.0.5-1.el6ev.noarch
rhevm-setup-base-3.5.3.1-1.4.el6ev.noarch
rhevm-setup-plugins-3.5.1-2.el6ev.noarch

Status: ON_QA → VERIFIED

Nikolai Sednev 2015-07-01 06:26:07 EDT
Keywords: Triaged

Sandro Bonazzola 2015-07-09 07:11:00 EDT
Blocks: 1059435

Sandro Bonazzola 2015-07-09 07:11:00 EDT
Duplicate of this bug: 1241470

Comment 3 wanghui 2015-07-20 05:00:35 EDT

We can still encounter this issue in rhev-hypervisor6-6.7-20150717.0.el6ev (vdsm-4.16.22-1.el6ev.x86_64).

Test version:
rhev-hypervisor6-6.7-20150717.0.el6ev
ovirt-node-3.2.3-11.el6.noarch
ovirt-node-plugin-hosted-engine-0.2.0-16.0.el6ev.noarch
ovirt-hosted-engine-setup-1.2.5.1-1.el6ev.noarch
vdsm-4.16.22-1.el6ev.x86_64

Test steps:
1. Clean install rhev-hypervisor6-6.7-20150717.0.el6ev
2. Create network with bond1 with BONDING_OPTS='mode=1 miimon=100'
3. Set up hosted engine with bond1 as network

Actual results:
1. Setup reports an error and quits.

Error in ovirt-hosted-engine-setup
==============================================================================================
2015-07-20 02:10:02 DEBUG otopi.context context._executeMethod:152 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/otopi/context.py", line 142, in _executeMethod
  File "/usr/share/ovirt-hosted-engine-setup/plugins/ovirt-hosted-engine-setup/network/bridge.py", line 207, in _misc
  File "/usr/share/ovirt-hosted-engine-setup/plugins/ovirt-hosted-engine-setup/network/bridge.py", line 225, in _setupNetworks
RuntimeError: Failed to setup networks {'rhevm': {'bonding': 'bond1', 'bootproto': 'dhcp', 'blockingdhcp': True}}. Error code: "16" message: "Unexpected exception"
2015-07-20 02:10:02 ERROR otopi.context context._executeMethod:161 Failed to execute stage 'Misc configuration': Failed to setup networks {'rhevm': {'bonding': 'bond1', 'bootproto': 'dhcp', 'blockingdhcp': True}}. Error code: "16" message: "Unexpected exception"

I noticed that the issue you verified is on rhevh7.1. Since rhevh6.7 still has this issue, do I need to reassign this bug or report a new one to track it?
Could you attach the vdsm and supervdsm logs? /var/log/messages would be useful, too.
(In reply to Dan Kenigsberg from comment #1)
> could you attach vdsm and supervdsm logs?
> /var/log/message would be useful, too.

Hi Huiwa,
I only cloned this bug from the original, as you encountered the same error on bonded network infrastructure. Logs are required, so can you please provide them?
We can't reproduce this report with RHEL 6.7.
Hosted-engine deployment over a bond works as expected; the management network is configured successfully over the bonded interface.

Could it be that after the bond is created, the host gets a different IP from DHCP and then can't be resolved by its hostname?
Created attachment 1055163: log file
(In reply to Michael Burman from comment #3)
> We can't reproduce this report with rhel 6.7
> Hosted-engine deploy over bond working as expected, management network is
> configured with success over bonded interface.
>
> Could it be that after creating the bond, it getting other ip from dhcp and
> then can't be resolved by it's hostname?

All hostnames are resolved locally. During the setup process, the engine hostname should be resolved. And if the engine could not be resolved, setup should report that it cannot reach the engine, not the "SSH connection error". So hostname resolution should not be the issue here.
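To rule the hostname theory in or out on a given host, a check along the following lines could be run after the bond comes up. This is only a minimal sketch: the function name `hostname_matches` and its `resolve` parameter are made up for illustration, and the list of local addresses has to be supplied by whoever runs it (for example, copied from the bond's current DHCP lease).

```python
import socket


def hostname_matches(hostname, local_addresses, resolve=socket.gethostbyname_ex):
    """Return True if `hostname` resolves to at least one of `local_addresses`.

    `resolve` is injectable so the check can be exercised without a real
    resolver; the default, socket.gethostbyname_ex, returns a tuple of
    (canonical_name, alias_list, ip_address_list).
    """
    try:
        _name, _aliases, resolved = resolve(hostname)
    except OSError:
        # socket.gaierror and socket.herror are both OSError subclasses:
        # the name could not be resolved at all.
        return False
    # True only if the resolver and the host agree on at least one address.
    return bool(set(resolved) & set(local_addresses))
```

Calling `hostname_matches(socket.getfqdn(), [<address currently on the bond>])` and getting False would support the "different IP from DHCP" explanation; True would point elsewhere, as the reply above argues.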
In the logs from comment 4 I see:

Thread-16::DEBUG::2015-07-20 02:10:02,925::BindingXMLRPC::1133::vds::(wrapper) client [127.0.0.1]::call setupNetworks with ({'rhevm': {'bonding': 'bond1', 'bootproto': 'dhcp', 'blockingdhcp': True}}, {}, {'connectivityCheck': False}) {}
Thread-16::ERROR::2015-07-20 02:10:02,931::BindingXMLRPC::1143::vds::(wrapper) libvirt error
Traceback (most recent call last):
  File "/usr/share/vdsm/rpc/BindingXMLRPC.py", line 1136, in wrapper
  File "/usr/share/vdsm/rpc/BindingXMLRPC.py", line 554, in setupNetworks
  File "/usr/share/vdsm/API.py", line 1398, in setupNetworks
  File "/usr/share/vdsm/supervdsm.py", line 50, in __call__
  File "/usr/share/vdsm/supervdsm.py", line 48, in <lambda>
  File "<string>", line 2, in setupNetworks
  File "/usr/lib64/python2.6/multiprocessing/managers.py", line 740, in _callmethod
libvirtError: internal error client socket is closed

And thus it looks like a dupe of bug 1206884 to me.

*** This bug has been marked as a duplicate of bug 1206884 ***
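For anyone triaging similar vdsm logs, the pattern quoted above (an ERROR entry followed by a traceback whose last line names the exception) can be pulled out with a short script. This is a rough sketch, not a vdsm tool: the `HEADER` regular expression is inferred from the two log lines quoted in this comment and may not match every vdsm version, and `error_summaries` is a made-up helper name.

```python
import re

# Assumed entry header, inferred from the excerpt in this bug:
#   <thread>::<LEVEL>::<YYYY-MM-DD HH:MM:SS,mmm>::<module>::<line>::...
HEADER = re.compile(
    r"^(?P<thread>[^:\s]+)::(?P<level>[A-Z]+)::"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d+)::"
)


def error_summaries(log_text):
    """Yield (timestamp, thread, last_line) for each ERROR entry.

    last_line is the final non-empty line belonging to the entry; when the
    entry carries a traceback, that is the exception type and message.
    """
    entry = None
    for line in log_text.splitlines():
        match = HEADER.match(line)
        if match:
            # A new header closes the previous entry.
            if entry and entry["level"] == "ERROR":
                yield entry["ts"], entry["thread"], entry["last"]
            entry = {
                "level": match.group("level"),
                "ts": match.group("ts"),
                "thread": match.group("thread"),
                "last": line.strip(),
            }
        elif entry and line.strip():
            # Continuation line (e.g. traceback frame or exception message).
            entry["last"] = line.strip()
    if entry and entry["level"] == "ERROR":
        yield entry["ts"], entry["thread"], entry["last"]
```

Run against the excerpt in this comment, it would report the `libvirtError: internal error client socket is closed` line together with the Thread-16 timestamp, which is the detail that ties this report to bug 1206884.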