Description of problem:
Corrupt data in the vm_dynamic table was preventing the engine from starting up completely.

Version-Release number of selected component (if applicable):
We are running version 4.2.2.6 of the ovirt-engine.

How reproducible:
Not sure

Steps to Reproduce:
Not exactly sure about this. We had a Windows 10 virtual machine with guest-tools version 4.2 running. At one point we added an interface, then I believe we deleted the interface. Only one interface was necessary, so the additional interface was added and deleted while troubleshooting an unrelated problem.

Additional info:
[deployments]# more engine.ear.failed
"{\"WFLYCTL0080: Failed services\" => {\"jboss.deployment.subunit.\\\"engine.ear\\\".\\\"bll.jar\\\".component.Backend.START\" => \"java.lang.IllegalStateException: WFLYEE0042: Failed to construct component instance
    Caused by: java.lang.IllegalStateException: WFLYEE0042: Failed to construct component instance
    Caused by: javax.ejb.EJBException: java.lang.IllegalStateException: WFLYEE0042: Failed to construct component instance
    Caused by: java.lang.IllegalStateException: WFLYEE0042: Failed to construct component instance
    Caused by: javax.ejb.EJBException: org.springframework.dao.DataIntegrityViolationException: PreparedStatementCallback; SQL [select * from getvmsbyclusterid(?)]; ERROR: invalid input syntax for type inet: \\\"fe80::243c:15f1:53e2:5b74%9\\\"
  Where: PL/pgSQL function fn_get_comparable_ip_list(text) line 8 at RETURN
PL/pgSQL function getvmsbyclusterid(uuid) line 3 at RETURN QUERY; nested exception is org.postgresql.util.PSQLException: ERROR: invalid input syntax for type inet: \\\"fe80::243c:15f1:53e2:5b74%9\\\"
  Where: PL/pgSQL function fn_get_comparable_ip_list(text) line 8 at RETURN
PL/pgSQL function getvmsbyclusterid(uuid) line 3 at RETURN QUERY
    Caused by: org.springframework.dao.DataIntegrityViolationException: PreparedStatementCallback; SQL [select * from getvmsbyclusterid(?)]; ERROR: invalid input syntax for type inet: \\\"fe80::243c:15f1:53e2:5b74%9\\\"
  Where: PL/pgSQL function fn_get_comparable_ip_list(text) line 8 at RETURN
PL/pgSQL function getvmsbyclusterid(uuid) line 3 at RETURN QUERY; nested exception is org.postgresql.util.PSQLException: ERROR: invalid input syntax for type inet: \\\"fe80::243c:15f1:53e2:5b74%9\\\"
  Where: PL/pgSQL function fn_get_comparable_ip_list(text) line 8 at RETURN
PL/pgSQL function getvmsbyclusterid(uuid) line 3 at RETURN QUERY
    Caused by: org.postgresql.util.PSQLException: ERROR: invalid input syntax for type inet: \\\"fe80::243c:15f1:53e2:5b74%9\\\"
  Where: PL/pgSQL function fn_get_comparable_ip_list(text) line 8 at RETURN
PL/pgSQL function getvmsbyclusterid(uuid) line 3 at RETURN QUERY\"}}"

This was tracked back to the following entry in the vm_dynamic table:

engine# select vm_ip from vm_dynamic;
...
172.29.164.139 127.0.0.1 2001:0:9d38:953c:243c:15f1:53e2:5b74 fe80::243c:15f1:53e2:5b74%9 fe80::588d:760b:26d1:f55b%7 ::1

Deleting the row from vm_dynamic allowed the engine to start successfully.
I guess we'd need to know the exact versions in use when it happened. AFAIR there were some fixes for this recently.
[kmorgan@ovctl01-mn ~]$ rpm -qa | grep -i virt
ovirt-engine-metrics-1.1.3.4-1.el7.centos.noarch
ovirt-engine-setup-plugin-vmconsole-proxy-helper-4.2.2.6-1.el7.centos.noarch
ovirt-engine-extension-aaa-ldap-1.3.7-1.el7.centos.noarch
ovirt-engine-sdk-python-3.6.9.1-1.el7.noarch
ovirt-web-ui-1.3.7-2.el7.centos.noarch
ovirt-engine-setup-base-4.2.2.6-1.el7.centos.noarch
ovirt-engine-cli-3.6.9.2-1.el7.centos.noarch
ovirt-ansible-cluster-upgrade-1.1.6-1.el7.centos.noarch
ovirt-ansible-repositories-1.1.0-1.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-common-4.2.2.6-1.el7.centos.noarch
ovirt-engine-extensions-api-impl-4.2.2.6-1.el7.centos.noarch
ovirt-engine-backend-4.2.2.6-1.el7.centos.noarch
ovirt-engine-api-explorer-0.0.2-1.el7.centos.noarch
ovirt-vmconsole-1.0.4-1.el7.noarch
ovirt-provider-ovn-1.2.9-1.el7.centos.noarch
ovirt-engine-tools-backup-4.2.2.6-1.el7.centos.noarch
virt-what-1.13-10.el7.x86_64
ovirt-engine-dbscripts-4.2.2.6-1.el7.centos.noarch
virtio-win-0.1.149-2.noarch
ovirt-js-dependencies-1.2.0-3.1.el7.centos.noarch
ovirt-ansible-engine-setup-1.1.0-1.el7.centos.noarch
ovirt-ansible-infra-1.1.4-1.el7.centos.noarch
ovirt-ansible-roles-1.1.3-1.el7.centos.noarch
ovirt-setup-lib-1.1.4-1.el7.centos.noarch
ovirt-guest-tools-iso-4.2-1.el7.centos.noarch
ovirt-engine-wildfly-overlay-11.0.1-1.el7.centos.noarch
ovirt-engine-websocket-proxy-4.2.2.6-1.el7.centos.noarch
ovirt-engine-dwh-4.2.2.2-1.el7.centos.noarch
ovirt-engine-dashboard-1.2.2-3.el7.centos.noarch
ovirt-engine-tools-4.2.2.6-1.el7.centos.noarch
ovirt-engine-setup-4.2.2.6-1.el7.centos.noarch
ovirt-ansible-vm-infra-1.1.5-1.el7.centos.noarch
ovirt-ansible-manageiq-1.1.6-1.el7.centos.noarch
ovirt-engine-extension-aaa-jdbc-1.1.7-1.el7.centos.noarch
ovirt-engine-setup-plugin-websocket-proxy-4.2.2.6-1.el7.centos.noarch
ovirt-host-deploy-java-1.7.3-1.el7.centos.noarch
ovirt-engine-restapi-4.2.2.6-1.el7.centos.noarch
ovirt-engine-4.2.2.6-1.el7.centos.noarch
ovirt-imageio-common-1.2.2-0.el7.centos.noarch
python-ovirt-engine-sdk4-4.2.4-2.el7.centos.x86_64
ovirt-cockpit-sso-0.0.4-1.el7.noarch
ovirt-ansible-disaster-recovery-0.3-1.el7.centos.noarch
ovirt-imageio-proxy-setup-1.2.2-0.el7.centos.noarch
ovirt-engine-vmconsole-proxy-helper-4.2.2.6-1.el7.centos.noarch
ovirt-iso-uploader-4.2.0-1.el7.centos.noarch
ovirt-ansible-image-template-1.1.5-1.el7.centos.noarch
ovirt-imageio-proxy-1.2.2-0.el7.centos.noarch
ovirt-engine-dwh-setup-4.2.2.2-1.el7.centos.noarch
ovirt-engine-webadmin-portal-4.2.2.6-1.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-4.2.2.6-1.el7.centos.noarch
ovirt-engine-extension-aaa-ldap-setup-1.3.7-1.el7.centos.noarch
ovirt-host-deploy-1.7.3-1.el7.centos.noarch
ovirt-vmconsole-proxy-1.0.4-1.el7.noarch
ovirt-engine-lib-4.2.2.6-1.el7.centos.noarch
ovirt-engine-wildfly-11.0.0-1.el7.centos.x86_64
Karl, if possible, can you share your vdsm.log, or at least the line containing "fe80::243c:15f1:53e2:5b74%9", of the host which executed the related VM?
Eitan, is this a duplicate of bug 1566059?
Unfortunately, I no longer have the original vdsm.log; it doesn't go back that far. We have since had this occur a couple more times, but nothing recent, so we don't have logs for those either. If it does occur, we simply stop the guest agent and then make sure IPv6 is disabled on the Windows guests, since we don't use IPv6 anyway. The refresh seems to clear up the entry with the extra character, so eventually the system rights itself.
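For anyone wanting to spot such an entry before it blocks startup, here is a minimal Python sketch. It assumes a vm_ip value is a whitespace-separated list of addresses (as the query output in the description suggests) and that the "extra character" is the "%<zone>" suffix Windows appends to link-local IPv6 addresses. The function name find_scoped_addresses is hypothetical and not part of oVirt.

```python
from typing import List


def find_scoped_addresses(vm_ip: str) -> List[str]:
    """Return the entries of a vm_ip value that carry a '%<zone>' suffix.

    Assumption: vm_ip is a whitespace-separated list of addresses.
    """
    return [entry for entry in vm_ip.split() if "%" in entry]


if __name__ == "__main__":
    # Value taken from the vm_dynamic output in the bug description.
    sample = (
        "172.29.164.139 127.0.0.1 2001:0:9d38:953c:243c:15f1:53e2:5b74 "
        "fe80::243c:15f1:53e2:5b74%9 fe80::588d:760b:26d1:f55b%7 ::1"
    )
    for entry in find_scoped_addresses(sample):
        print(entry)
```

Run against the value from the description, this should flag the two fe80:: entries, which are the ones PostgreSQL refuses to cast to inet.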
Karl, thanks for the update
In response to comment 4: yes, this is the same bug. The SQL function "fn_get_comparable_ip_list" is the culprit, as it appears in the stack trace above and also in BZ1566059. A fix for BZ1566059 was introduced in 4.2.3z and 4.2, while according to the output of rpm -qa in comment #2 above, the installed ovirt-engine is 4.2.2.6, so it looks like the fix is not in this version.
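To illustrate why the cast fails: the inet type rejects the "%9" zone suffix, but the address in front of it is a valid IPv6 address. The Python sketch below strips the suffix and validates the remainder. It only illustrates the idea and is not the actual change made for BZ1566059; the function name strip_zone_id is hypothetical.

```python
import ipaddress


def strip_zone_id(address: str) -> str:
    """Drop a trailing '%<zone>' suffix and check the remaining address.

    Raises ValueError if what is left is not a valid IPv4/IPv6 address.
    """
    bare, _, _zone = address.partition("%")
    ipaddress.ip_address(bare)  # validation only; the result is discarded
    return bare


if __name__ == "__main__":
    # Address taken from the stack trace in the bug description.
    print(strip_zone_id("fe80::243c:15f1:53e2:5b74%9"))
```

Addresses without a zone suffix pass through unchanged, so the helper could be applied to every entry of a vm_ip value.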
Thanks, Eitan. I will close this bug as a duplicate because it is already fixed and Karl found a workaround. Thanks for reporting the bug! *** This bug has been marked as a duplicate of bug 1566059 ***