Created attachment 1633902 [details]
connection refused picture

Description of problem:
Hosted engine deploy failed with libvirt.libvirtError: unable to connect to server: Connection refused.

2019-11-08 00:11:59,001-0500 ERROR ansible failed {'status': 'FAILED', 'ansible_type': 'task', 'ansible_playbook': '/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml', 'ansible_host': 'localhost', 'ansible_task': 'Shutdown local VM', 'ansible_result': 'type: <class \'dict\'>\nstr: {\'msg\': "unable to connect to server at \'hp-dl388g9-04.lab.eng.pek2.**FILTERED**.com:16514\': Connection refused", \'exception\': \'Traceback (most recent call last):\\n File "/tmp/ansible_virt_payload_83pxn_ex/__main__.py", line 593, in main\\n rc, result = core(module)\\n File "/tmp/ansible_virt_payload_8', 'task_duration': 2}
2019-11-08 00:11:59,001-0500 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fd064ac1908> kwargs ignore_errors:None
2019-11-08 00:11:59,003-0500 INFO ansible stats {
    "ansible_playbook": "/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml",
    "ansible_playbook_duration": "03:20 Minutes",
    "ansible_result": "type: <class 'dict'>\nstr: {'localhost': {'ok': 87, 'failures': 1, 'unreachable': 0, 'changed': 24, 'skipped': 10, 'rescued': 0, 'ignored': 0}}",
    "ansible_type": "finish",
    "status": "FAILED"
}

Version-Release number of selected component (if applicable):
rhvh-4.4.0.8-0.20191107.0
cockpit-packagekit-197.3-1.el8.noarch
cockpit-196.3-1.el8.x86_64
cockpit-system-196.3-1.el8.noarch
cockpit-dashboard-197.3-1.el8.noarch
cockpit-storaged-197.3-1.el8.noarch
cockpit-bridge-196.3-1.el8.x86_64
subscription-manager-cockpit-1.25.17-1.el8.noarch
cockpit-ovirt-dashboard-0.13.8-1.el8ev.noarch
cockpit-ws-196.3-1.el8.x86_64
ovirt-hosted-engine-setup-2.4.0-0.1.master.20191104160243.git0c51343.el8ev.noarch
ovirt-hosted-engine-ha-2.4.0-0.0.master.git633a1db.el8ev.noarch
rhvm-appliance-4.4-20190823.0.el8.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Clean install rhvh-4.4.0.8-0.20191107.0
2. Deploy hosted engine via the cockpit UI

Actual results:
Hosted engine deploy failed with libvirt.libvirtError: unable to connect to server: Connection refused.

Expected results:
Hosted engine deploy succeeds without any error.

Additional info:
A minimal connectivity check for the refused libvirtd port is sketched below.
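The refused connection can be confirmed outside the deploy flow. The following is a minimal sketch (the host name is a placeholder, since the real one is filtered above); it only tests whether anything is listening on the libvirtd TLS port (16514) that the Ansible virt module tried to reach:

#!/usr/bin/python3
# Minimal connectivity sketch, not part of the deployment flow.
# HOST is a placeholder for the filtered host name above.
import socket

HOST = "hypervisor.example.com"  # placeholder, fill in the real host name
PORT = 16514                     # libvirtd TLS port used by the virt module

try:
    with socket.create_connection((HOST, PORT), timeout=5):
        print("libvirtd is accepting connections on port 16514")
except OSError as err:
    # "Connection refused" here matches the symptom in the Ansible error above.
    print(f"connection to {HOST}:{PORT} failed: {err}")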
Created attachment 1633903 [details] connection refused logs
Isn't this a duplicate of bug 1766556?
(In reply to Yedidyah Bar David from comment #2)
> Isn't this a duplicate of bug 1766556?

I think they are different issues; they occur at different deploy stages. This bug occurs after the local VM is created, while copying the local VM to the engine VM on the added storage. Bug 1766556 occurs at "Wait for the host to be up", during local VM creation.
I have now spent some time looking at the attached logs, and could not find what shuts down libvirtd at this point. If this is reproducible, please attach all of /var/log and the journal logs. Thanks.

For the record, I have now successfully finished a hosted-engine deploy using an up-to-date CentOS 8 host (with the nightly master snapshot) and ovirt-engine-appliance-4.4-20191221175026.1.el8.x86_64.
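Collecting both in one go could look like the minimal sketch below (assuming root access and enough free space in /tmp; the output file names are arbitrary):

#!/usr/bin/python3
# Minimal log-collection sketch: archive /var/log and dump the full journal
# so both can be attached to this bug. Run as root.
import subprocess
import tarfile

# Pack /var/log into a single archive.
with tarfile.open("/tmp/var-log.tar.gz", "w:gz") as tar:
    tar.add("/var/log", arcname="var-log")

# Export the complete journal to a plain-text file.
with open("/tmp/journal.txt", "w") as out:
    subprocess.run(["journalctl", "--no-pager"], stdout=out, check=False)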
(In reply to Yedidyah Bar David from comment #4)
> I have now spent some time looking at the attached logs, and could not find
> what shuts down libvirtd at this point. If this is reproducible, please
> attach all of /var/log and the journal logs. Thanks.
>
> For the record, I have now successfully finished a hosted-engine deploy
> using an up-to-date CentOS 8 host (with the nightly master snapshot) and
> ovirt-engine-appliance-4.4-20191221175026.1.el8.x86_64.

I will try this after I finish the RHVH 4.3.8 Tier 2 testing in the next few days.
Created attachment 1647664 [details] /var/log files
Created attachment 1647665 [details] journalctl log
According to the attached dnf.rpm.log, you have two-months-old packages:

2019-11-07T16:51:29Z SUBDEBUG Installed: ovirt-ansible-hosted-engine-setup-1.0.31-1.el8ev.noarch
2019-11-07T16:53:33Z SUBDEBUG Installed: ovirt-hosted-engine-setup-2.4.0-0.1.master.20191104160243.git0c51343.el8ev.noarch

Please try again with more recent packages, built after 2019-12-22. Thanks.
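To see which setup builds are actually installed, a quick check of the same log could look like this sketch (the log path and filter strings are assumptions; adjust as needed):

#!/usr/bin/python3
# Quick sketch: list when the hosted-engine setup packages were installed,
# according to dnf's RPM log, to spot builds older than the 2019-12-22 cut-off.
LOG_PATH = "/var/log/dnf.rpm.log"  # default dnf RPM transaction log

with open(LOG_PATH) as log:
    for line in log:
        if "Installed:" in line and "hosted-engine-setup" in line:
            print(line.rstrip())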
Test version (the latest oVirt version):
ovirt-node-ng-installer-4.4.0-2019122607.el8.iso
ovirt-engine-appliance-4.4-20191226174442.1.el8.x86_64

Test result:
Hosted engine deploy failed at:

[ INFO ] TASK [ovirt.hosted_engine_setup : Wait for the host to be up]
[ ERROR ] fatal: [localhost]: FAILED! => {"ansible_facts": {"ovirt_hosts": [{"address": "hp-dl388g9-04.lab.eng.pek2.redhat.com", "affinity_labels": [], "auto_numa_status": "unknown", "certificate": {"organization": "lab.eng.pek2.redhat.com", "subject": "O=lab.eng.pek2.redhat.com,CN=hp-dl388g9-04.lab.eng.pek2.redhat.com"}, "cluster": {"href": "/ovirt-engine/api/clusters/5bce182a-28c0-11ea-a794-5254003404b0", "id": "5bce182a-28c0-11ea-a794-5254003404b0"}, "comment": "", "cpu": {"speed": 0.0, "topology": {}}, "device_passthrough": {"enabled": false}, "devices": [], "external_network_provider_configurations": [], "external_status": "ok", "hardware_information": {"supported_rng_sources": []}, "hooks": [], "href": "/ovirt-engine/api/hosts/9013d38e-71ec-460e-b1af-f510d85a2592", "id": "9013d38e-71ec-460e-b1af-f510d85a2592", "katello_errata": [], "kdump_status": "unknown", "ksm": {"enabled": false}, "max_scheduling_memory": 0, "memory": 0, "name": "hp-dl388g9-04.lab.eng.pek2.redhat.com", "network_attachments": [], "nics": [], "numa_nodes": [], "numa_supported": false, "os": {"custom_kernel_cmdline": ""}, "permissions": [], "port": 54321, "power_management": {"automatic_pm_enabled": true, "enabled": false, "kdump_detection": true, "pm_proxies": []}, "protocol": "stomp", "se_linux": {}, "spm": {"priority": 5, "status": "none"}, "ssh": {"fingerprint": "SHA256:CNEZTFA8dISuv4k96apQsdOPWdOS9YvOltDeVKyEAtU", "port": 22}, "statistics": [], "status": "install_failed", "storage_connection_extensions": [], "summary": {"total": 0}, "tags": [], "transparent_huge_pages": {"enabled": false}, "type": "rhel", "unmanaged_networks": [], "update_available": false, "vgpu_placement": "consolidated"}]}, "attempts": 120, "changed": false, "deprecations": [{"msg": "The 'ovirt_host_facts' module has been renamed to 'ovirt_host_info', and the renamed one no longer returns ansible_facts", "version": "2.13"}]}

didi, please check the logs in my machine environment (reserved until Jan 2, 2020). I will send the info via email.
Thanks. I looked at the machine. It indeed failed in host deploy, as noted in the previous comment. The host-deploy log (in /var/log/ovirt-hosted-engine-setup/engine-logs-2019-12-27T15:43:34Z/ovirt-engine/host-deploy/ovirt-host-deploy-ansible-20191227235344-hp-dl388g9-04.lab.eng.pek2.redhat.com-3ad05d87.log) has:

2019-12-27 23:54:01 CST - TASK [ovirt-host-deploy-facts : Install yum-utils] *****************************
2019-12-27 23:54:10 CST - fatal: [hp-dl388g9-04.lab.eng.pek2.redhat.com]: FAILED! => {"changed": false, "failures": ["No package yum-utils available."], "msg": "Failed to install some of the specified packages", "rc": 1, "results": []}

which looks like bug 1785272. So, how to continue? Some options:

1. Close as a duplicate of bug 1785272. But note that 1785272 only affects Node, so you can also:
2. Try again on EL8 (not Node).
3. It's not clear if the current bug is just a plain 'hosted-engine deploy' or some more specific flow. If it's more specific, just keep it open and make it depend on bug 1785272.

Thanks again.
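For reference, the failing task boils down to whether yum-utils can be resolved on the host at all. A quick check, sketched below (an assumed helper, not part of host deploy), shows the same thing without running the whole flow:

#!/usr/bin/python3
# Quick sketch: check whether yum-utils can be resolved on the host, which is
# what the failing "ovirt-host-deploy-facts : Install yum-utils" task needs.
import subprocess

result = subprocess.run(
    ["dnf", "info", "yum-utils"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,
)
if result.returncode != 0:
    # Expected on RHVH/oVirt Node if the package is missing from the enabled repos.
    print("yum-utils is not available:", result.stderr.strip())
else:
    print(result.stdout)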
(In reply to Yedidyah Bar David from comment #11)
> Thanks. I looked at the machine. It indeed failed in host deploy, as noted
> in the previous comment. The host-deploy log (in
> /var/log/ovirt-hosted-engine-setup/engine-logs-2019-12-27T15:43:34Z/ovirt-
> engine/host-deploy/ovirt-host-deploy-ansible-20191227235344-hp-dl388g9-04.
> lab.eng.pek2.redhat.com-3ad05d87.log) has:
>
> 2019-12-27 23:54:01 CST - TASK [ovirt-host-deploy-facts : Install yum-utils]
> *****************************
>
> 2019-12-27 23:54:10 CST - fatal: [hp-dl388g9-04.lab.eng.pek2.redhat.com]:
> FAILED! => {"changed": false, "failures": ["No package yum-utils
> available."], "msg": "Failed to install some of the specified packages",
> "rc": 1, "results": []}
>
> which looks like bug 1785272. So, how to continue? Some options:
>
> 1. Close as a duplicate of bug 1785272. But note that 1785272 only affects
> Node, so you can also:
>
> 2. Try again on EL8 (not Node).

Sorry, do you mean I should use a RHEL 8 host instead of oVirt Node to retry it?

> 3. It's not clear if the current bug is just a plain 'hosted-engine deploy'
> or some more specific flow. If it's more specific, just keep it open and
> make it depend on bug 1785272.

I think it is the plain flow; there is no more specific operation during the hosted-engine setup. It fails when the node tries to come up after registering to the engine VM, so I think it is the same as bug 1785272. But the original bug occurred after the node was up, while creating the target VM, and that is not reproduced with the latest oVirt version.

> Thanks again.
(In reply to Wei Wang from comment #12)
> (In reply to Yedidyah Bar David from comment #11)
> > which looks like bug 1785272. So, how to continue? Some options:
> >
> > 1. Close as a duplicate of bug 1785272. But note that 1785272 only affects
> > Node, so you can also:
> >
> > 2. Try again on EL8 (not Node).
>
> Sorry, do you mean I should use a RHEL 8 host instead of oVirt Node to retry it?

Yes.

> > 3. It's not clear if the current bug is just a plain 'hosted-engine deploy'
> > or some more specific flow. If it's more specific, just keep it open and
> > make it depend on bug 1785272.
>
> I think it is the plain flow; there is no more specific operation during the
> hosted-engine setup. It fails when the node tries to come up after
> registering to the engine VM, so I think it is the same as bug 1785272.

So I'd just close this as a duplicate. If either Node or EL8 fails after that one is fixed, please open a new bug. Thanks!
Marking this bug as dependent on bug #1785272 and moving it to MODIFIED, since that bug is in MODIFIED state. We'll move both to QE at the same time, and we'll reopen this one if it still reproduces once the yum-utils issue is fixed.
Bug #1785272 moved to QE; moving this one as well.
QE will verify this bug once the new 4.4 build is available.
Test version:
RHVH-4.4-20200205.1-RHVH-x86_64-dvd1.iso
cockpit-ovirt-dashboard-0.14.1-1.el8ev.noarch
cockpit-bridge-211.1-1.el8.x86_64
cockpit-dashboard-211-1.el8.noarch
cockpit-system-211.1-1.el8.noarch
cockpit-ws-211.1-1.el8.x86_64
cockpit-211.1-1.el8.x86_64
cockpit-storaged-211-1.el8.noarch
rhvm-appliance-4.4-20200123.0.el8ev.x86_64

Test result:
Hosted engine deploy failed at:

[ INFO ] TASK [ovirt.hosted_engine_setup : Wait for the host to be up]
[ ERROR ] fatal: [localhost]: FAILED! => {"ansible_facts": {"ovirt_hosts": [{"address": "hp-dl388g9-04.lab.eng.pek2.redhat.com", "affinity_labels": [], "auto_numa_status": "unknown", "certificate": {"organization": "lab.eng.pek2.redhat.com", "subject": "O=lab.eng.pek2.redhat.com,CN=hp-dl388g9-04.lab.eng.pek2.redhat.com"}, "cluster": {"href": "/ovirt-engine/api/clusters/3674fe3a-48b5-11ea-a66c-5254003404b0", "id": "3674fe3a-48b5-11ea-a66c-5254003404b0"}, "comment": "", "cpu": {"speed": 0.0, "topology": {}}, "device_passthrough": {"enabled": false}, "devices": [], "external_network_provider_configurations": [], "external_status": "ok", "hardware_information": {"supported_rng_sources": []}, "hooks": [], "href": "/ovirt-engine/api/hosts/54280d80-f3aa-4f68-b39c-1dcbbfcc8b45", "id": "54280d80-f3aa-4f68-b39c-1dcbbfcc8b45", "katello_errata": [], "kdump_status": "unknown", "ksm": {"enabled": false}, "max_scheduling_memory": 0, "memory": 0, "name": "hp-dl388g9-04.lab.eng.pek2.redhat.com", "network_attachments": [], "nics": [], "numa_nodes": [], "numa_supported": false, "os": {"custom_kernel_cmdline": ""}, "permissions": [], "port": 54321, "power_management": {"automatic_pm_enabled": true, "enabled": false, "kdump_detection": true, "pm_proxies": []}, "protocol": "stomp", "se_linux": {}, "spm": {"priority": 5, "status": "none"}, "ssh": {"fingerprint": "SHA256:piuq9fnOwos/lbsFaQKgYl7Mz+0rqlWNo/vqhZ39IPY", "port": 22}, "statistics": [], "status": "install_failed", "storage_connection_extensions": [], "summary": {"total": 0}, "tags": [], "transparent_huge_pages": {"enabled": false}, "type": "rhel", "unmanaged_networks": [], "update_available": false, "vgpu_placement": "consolidated"}]}, "attempts": 120, "changed": false, "deprecations": [{"msg": "The 'ovirt_host_facts' module has been renamed to 'ovirt_host_info', and the renamed one no longer returns ansible_facts", "version": "2.13"}]}

vdsm.log:

2020-02-06 16:01:34,627+0800 ERROR (vm/7b97fbc8) [virt.vm] (vmId='7b97fbc8-4d9b-455c-b14a-afad0e136e5a') Failed to connect to guest agent channel (vm:2232)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 2230, in _vmDependentInit
    self.guestAgent.start()
  File "/usr/lib/python3.6/site-packages/vdsm/virt/guestagent.py", line 247, in start
    self._prepare_socket()
  File "/usr/lib/python3.6/site-packages/vdsm/virt/guestagent.py", line 289, in _prepare_socket
    supervdsm.getProxy().prepareVmChannel(self._socketName)
  File "/usr/lib/python3.6/site-packages/vdsm/common/supervdsm.py", line 56, in __call__
    return callMethod()
  File "/usr/lib/python3.6/site-packages/vdsm/common/supervdsm.py", line 54, in <lambda>
    **kwargs)
  File "<string>", line 2, in prepareVmChannel
  File "/usr/lib64/python3.6/multiprocessing/managers.py", line 772, in _callmethod
    raise convert_to_error(kind, result)
FileNotFoundError: [Errno 2] No such file or directory: '/var/lib/libvirt/qemu/channels/7b97fbc8-4d9b-455c-b14a-afad0e136e5a.com.redhat.rhevm.vdsm'

QE is moving the bug status to ASSIGNED.
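For what it's worth, the FileNotFoundError above can be confirmed directly by checking whether the per-VM channel socket that prepareVmChannel() expects actually exists. A minimal sketch, using the VM id from the traceback:

#!/usr/bin/python3
# Minimal sketch: check whether the per-VM channel socket from the traceback
# exists under libvirt's channels directory.
import os

vm_id = "7b97fbc8-4d9b-455c-b14a-afad0e136e5a"  # VM id from the traceback above
socket_path = "/var/lib/libvirt/qemu/channels/%s.com.redhat.rhevm.vdsm" % vm_id

print(socket_path, "exists" if os.path.exists(socket_path) else "is missing")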
Looks like the original issue, libvirt.libvirtError: unable to connect to server: Connection refused, is fixed. The new issue in comment #17 should be handled in a different bug; can you please open one?

Moving this back to QE to verify the "unable to connect to server: Connection refused" issue.
With the latest 4.4 build, verification of this issue is blocked by bugs 1808253 and 1701491. QE will try to verify this bug after 1808253 and 1701491 are fixed.
The original issue is gone, so verifying this bug.
This bugzilla is included in oVirt 4.4.0 release, published on May 20th 2020.

Since the problem described in this bug report should be resolved in oVirt 4.4.0 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.