Created attachment 1451001 [details]
Log files of the attempts to fix

Description of problem:
oVirt cannot seem to import its OWN OVA exported files.

Version-Release number of selected component (if applicable):
4.2.3.5

How reproducible:
Always

Steps to Reproduce:
1. Export an OVA using the web GUI.
2. Transfer it to another instance of oVirt.
3. Using the web GUI, try to import the OVA: point it at the file and select the VM to load.

Actual results:
Failure to import, with a message like 'Importing VM Windows7 to Cluster Default'.

Expected results:
Successful import.

Additional info:
I worked through 3 different iterations of things here.

Specific root problem: the OVF provides a UUID for the <rasd:Parent> and <rasd:template> elements, which causes a parse failure because virt-v2v's conversion code expects an integer there. I don't know what the right answer is or who is wrong here, just that virt-v2v doesn't like it. This corresponds to the 1-original log directory in the attached archive.

There are 3 directories of logs (named 1-, 2-, 3-) included here. Each contains the OVF file that was packed into the OVA (with the same disk each time) at the iteration that caused the error. I incrementally fixed different parts of the OVF to get further in the import process, but ultimately I was still left unable to import oVirt's OWN OVA file into another oVirt instance. (A sketch of these manual edits is at the end of this comment.)

In the second try (2-removedparent), I removed the Parent and template elements altogether. That attempt fell down because the oVirt-exported OVA contains a syntactical error according to the spec I found: it uses 'disk/<uuid>' instead of '/disk/<uuid>' for the <rasd:HostResource> element. I fixed this manually and tried again.

In the third attempt (3-fixeddisk), it fell down because of a VMDK image descriptor problem.

[root@server srv]# cat /etc/redhat-release
CentOS Linux release 7.5.1804 (Core)

[root@server srv]# rpm -qa | grep vdsm
vdsm-jsonrpc-4.20.27.1-1.el7.centos.noarch
vdsm-hook-vhostmd-4.20.27.1-1.el7.centos.noarch
vdsm-common-4.20.27.1-1.el7.centos.noarch
vdsm-network-4.20.27.1-1.el7.centos.x86_64
vdsm-4.20.27.1-1.el7.centos.x86_64
vdsm-yajsonrpc-4.20.27.1-1.el7.centos.noarch
vdsm-python-4.20.27.1-1.el7.centos.noarch
vdsm-hook-fcoe-4.20.27.1-1.el7.centos.noarch
vdsm-api-4.20.27.1-1.el7.centos.noarch
vdsm-client-4.20.27.1-1.el7.centos.noarch
vdsm-hook-openstacknet-4.20.27.1-1.el7.centos.noarch
vdsm-hook-ethtool-options-4.20.27.1-1.el7.centos.noarch
vdsm-hook-vfio-mdev-4.20.27.1-1.el7.centos.noarch
vdsm-hook-vmfex-dev-4.20.27.1-1.el7.centos.noarch
vdsm-http-4.20.27.1-1.el7.centos.noarch

[root@server srv]# rpm -qa | grep virt
libvirt-python-4.3.0-1.el7.x86_64
libvirt-daemon-driver-storage-logical-4.3.0-1.el7.x86_64
libvirt-daemon-driver-interface-4.3.0-1.el7.x86_64
libvirt-bash-completion-4.3.0-1.el7.x86_64
ovirt-host-dependencies-4.2.2-2.el7.centos.x86_64
cockpit-ovirt-dashboard-0.11.24-1.el7.centos.noarch
libvirt-libs-4.3.0-1.el7.x86_64
virt-manager-common-1.4.3-3.el7.noarch
libvirt-daemon-driver-storage-4.3.0-1.el7.x86_64
ovirt-setup-lib-1.1.4-1.el7.centos.noarch
libvirt-daemon-4.3.0-1.el7.x86_64
ovirt-vmconsole-1.0.5-4.el7.centos.noarch
libvirt-daemon-driver-nwfilter-4.3.0-1.el7.x86_64
libvirt-daemon-driver-storage-rbd-4.3.0-1.el7.x86_64
libvirt-daemon-driver-storage-disk-4.3.0-1.el7.x86_64
libvirt-daemon-driver-secret-4.3.0-1.el7.x86_64
libvirt-lock-sanlock-4.3.0-1.el7.x86_64
ovirt-host-deploy-1.7.3-1.el7.centos.noarch
virt-v2v-1.36.10-6.el7_5.2.x86_64
ovirt-imageio-daemon-1.3.1.2-0.el7.centos.noarch
ovirt-engine-sdk-python-3.6.9.1-1.el7.noarch
fence-virt-0.3.2-13.el7.x86_64
ovirt-host-4.2.2-2.el7.centos.x86_64
libvirt-daemon-driver-storage-core-4.3.0-1.el7.x86_64
ovirt-vmconsole-host-1.0.5-4.el7.centos.noarch
libvirt-daemon-driver-storage-scsi-4.3.0-1.el7.x86_64
libvirt-daemon-driver-storage-iscsi-4.3.0-1.el7.x86_64
libvirt-daemon-driver-nodedev-4.3.0-1.el7.x86_64
python-ovirt-engine-sdk4-4.2.6-2.el7.centos.x86_64
ovirt-hosted-engine-ha-2.2.11-1.el7.centos.noarch
ovirt-engine-appliance-4.2-20180504.1.el7.centos.noarch
ovirt-release42-4.2.3.1-1.el7.noarch
libvirt-daemon-driver-qemu-4.3.0-1.el7.x86_64
libvirt-daemon-config-nwfilter-4.3.0-1.el7.x86_64
libvirt-daemon-config-network-4.3.0-1.el7.x86_64
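To make the manual edits concrete, they amounted to roughly the following. This is only a sketch: the sed expressions are illustrative (they assume the xmllint-formatted vm.ovf with one element per line) and are not the literal commands I ran.

# Attempt 2 (2-removedparent): drop the <rasd:Parent> and <rasd:template> elements entirely
sed -i -e '/<rasd:Parent>/d' -e '/<rasd:template>/d' vm.ovf

# Going into attempt 3 (3-fixeddisk): make the HostResource reference spec-compliant,
# i.e. 'disk/<uuid>' -> '/disk/<uuid>'
sed -i 's|<rasd:HostResource>disk/|<rasd:HostResource>/disk/|' vm.ovf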
Please note: I've run the OVF through xmllint --format to get nicer, more readable XML.
(In reply to Kyle Stapp from comment #1)
> Please note: I've run the OVF through xmllint --format to get nicer, more
> readable XML.

I can't find the OVF in the attached tar file - did you forget to include it, maybe?

That's interesting - we failed to parse the OVF and therefore fell back to the old mechanism of querying external (non-oVirt) OVAs and trying to import it using virt-v2v.

It would be great to have that OVA. If it's too big, or you can't attach it for any other reason, it would also be great to have at least the OVF ('tar xvf /srv/Windows7.ova vm.ovf' should do the trick).
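In case it helps, the sequence for pulling just the descriptor out for inspection is roughly the following (a sketch, assuming the OVA is at /srv/Windows7.ova and the descriptor inside it is named vm.ovf, as above):

# List the archive members, extract only the OVF, and pretty-print it
tar -tvf /srv/Windows7.ova
tar -xvf /srv/Windows7.ova vm.ovf
xmllint --format vm.ovf | less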
Apologies, you are right - I forgot to add them. I'll add them later tonight, around 11 PM EST.
Created attachment 1451079 [details]
Logs (Including ovf files from OVA).
There - I've included the OVF files this time. Sorry about the mistake before. I can't share the OS images themselves, as they contain proprietary company material. The OVFs are in the respective directories, along with the manual edits I made in 2- and 3- to try to get the import to work. Again, the ovirt_ova_query_ansible logs are essentially empty, which seems bizarre to me and leaves me with nothing to read :(.
Created attachment 1451080 [details]
Logs (Including ovf files from OVA) -- zipped.
Created attachment 1451081 [details]
Logs (Including ovf files from OVA) -- zipped and working
The good news is that the OVF configuration is perfectly fine. After removing the newlines from vm.ovf and tar-ing it, the query works properly.

The bad news is that this seems like a general issue in invoking ansible playbooks on that host (de000d1d-ceb6-45a5-bb7e-214ad7516043):

Ansible playbook command has exited with value: 4

Therefore, moving to infra.
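For the record, the check was along these lines (a sketch only - the file names follow the attachment, and the exact commands may have differed):

# Undo the xmllint --format line breaks, then repack the descriptor for the query
tr -d '\n' < vm.ovf > vm.flat.ovf
mv vm.flat.ovf vm.ovf
tar -cvf Windows7-test.ova vm.ovf    # the disk images would be added here for a real import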
(In reply to Arik from comment #8)
> The good news is that the OVF configuration is perfectly fine. After
> removing the newlines from vm.ovf and tar-ing it, the query works properly.
>
> The bad news is that this seems like a general issue in invoking ansible
> playbooks on that host (de000d1d-ceb6-45a5-bb7e-214ad7516043):
>
> Ansible playbook command has exited with value: 4
>
> Therefore, moving to infra.

According to [1], exit code 4 means a parser error, so are we sure we don't have a code bug in the playbook/role? Is it possible to execute the flow from the command line to see the debug output of the execution?

[1] https://github.com/ansible/ansible/issues/19720
If you tell me the exact command structure to invoke it, I will happily do that.
I looked at the other bug. If the trigger is an unreachable host, it might have to do with firewall rules we have in place. I'll check when I get into work, in about an hour. Thanks for the help.
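A quick way to check that from the engine side would be something like the following (a sketch; the host name here is a placeholder, not taken from my environment):

# Can ansible reach the host at all, and is sshd answering on the default port?
ansible -i "myhost.example.com," all -m ping -u root
nc -zv myhost.example.com 22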
Wow...so the problem was the firewall rules/sshd port. We have additional lockdown rules that we add after oVirt is installed; among other things, we change the port for sshd.

I was unaware that ansible was part of the local-host OVA reading machinery.

Could we get better error messages around the oVirt ansible steps? Could we get a more explicit error message from ansible itself into one of our logs or into engine.log? I didn't realise the import needed sshd to be on port 22.

Out of curiosity, does the ansible code path here respect the OVEHOSTED_NETWORK/sshdPort setting? I did not use it in this case, but I know details can fall through the cracks.
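For anyone who hits the same thing, checks along these lines would have exposed it quickly (a sketch; run the last one from the engine machine, with <host-address> as a placeholder):

ss -tlnp | grep sshd            # which port is sshd actually listening on?
firewall-cmd --list-all         # is 22/tcp (or the ssh service) allowed through?
nc -zv <host-address> 22        # is port 22 reachable from the engine side?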
Why is the ovirt-query-ova-ansible log file empty instead of full of the ansible error message?
We also might want to look at fixing our OVA output to be closer to the spec, at least for the 'disk/<uuid>' --> '/disk/<uuid>' issue in the <rasd:HostResource> element. I don't know the proper place to open a new bug for that, so I'll leave it to others :).
(In reply to Kyle Stapp from comment #13)
> Wow...so the problem was the firewall rules/sshd port. We have additional
> lockdown rules that we add after oVirt is installed; among other things, we
> change the port for sshd.
>
> I was unaware that ansible was part of the local-host OVA reading machinery.
>
> Could we get better error messages around the oVirt ansible steps? Could we
> get a more explicit error message from ansible itself into one of our logs
> or into engine.log?

It would be better in 4.2.4: the 'load' operation will fail when we fail to execute the ansible playbook that queries the OVA, so we won't try to import it as a VMware OVA, nor miss the error I mentioned in comment 8.

> I didn't realise the import needed sshd to be on port 22.

It's not only the import flow that needs it. I wonder how you installed your host(s) - didn't you use host-deploy? Or did you set the firewall rules after the hosts were installed? Anyway, both host-deploy and the checks for updates should now work for you.

Kyle, could you please share this knowledge with the users-list on that thread?
I indeed set up the firewall rules AFTER host-deploy. I think I did that after testing with OVEHOSTED_NETWORK/sshdPort set to an exotic port number and some things broke, but I might have misattributed the breakage to ssh when it was really the cruft I describe below.

While repeatedly installing and reinstalling, trying to get an automated workflow that uses ansible to configure our host and deploy oVirt along with our own software, I found that my/oVirt's uninstall was leaving cruft around that caused spurious failures in later re-installs. I eventually found https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-failed-install and built a foolproof purge that seems to get me back to a pristine state.

I use the following ansible snippet to do a full purge that seems to allow a correct installation after a borked one (I pulled it from my code base, so one or two references don't directly apply to oVirt, but most of it does):

- name: Clean Old Install
  # This attempts to remove all old cruft from previous install attempts.
  # The reason we include the ovirt packages is so that they can be reinstalled,
  # potentially at newer versions.
  block:
    - name: Detect existing cleanup script
      shell: which ovirt-hosted-engine-cleanup | cat
      register: ohes_cleanup

    - name: Debug ohes_cleanup.stdout
      debug:
        var: ohes_cleanup.stdout

    - name: Run Ovirt's Hosted Engine Cleanup Script
      shell: ovirt-hosted-engine-cleanup -q
      when: ohes_cleanup.stdout != ""

    - name: Clean old packages
      package:
        name: "{{item}}"
        state: absent
      with_items:
        - "*vdsm*"
        - "*ovirt*"
        - "*libvirt*"
        - "*cockpit*"

    - name: Remove old configs etc
      # Items are absolute paths, so remove them directly
      shell: "rm -rf {{item}}"
      args:
        warn: False
      with_items:
        - "/etc/*ovirt*"
        - "/etc/*vdsm*"
        - "/etc/libvirt/qemu/HostedEngine*"
        - "/etc/*libvirt*"
        - "/etc/pki/vdsm"
        - "/etc/pki/keystore"
        - "/etc/ovirt-hosted-engine"
        - "/var/lib/libvirt/"
        - "/var/lib/vdsm/"
        - "/var/lib/ovirt-hosted-engine-*"
        - "/var/log/ovirt-hosted-engine-setup/"
        - "/var/cache/libvirt/"

    - name: Clean old repo files
      shell: "rm -rf /etc/yum.repos.d/{{item}}"
      args:
        warn: False
      with_items:
        - "ovirt*"
        - "virt*"

    - name: Remove old firewalld rules
      file:
        path: "/etc/firewalld/direct.xml"
        state: absent

    - name: Clear firewall rules
      shell: "iptables --flush"
      args:
        warn: False
      ignore_errors: yes

    - name: Clean interface configs
      shell: "rm -rf /etc/sysconfig/network-scripts/{{nic_device_name}}* /etc/sysconfig/network-scripts/ifcfg-dummy_0* /etc/sysconfig/network-scripts/ifcfg-ovirtmgmt"
      args:
        warn: False

    - name: Clean network stuff
      shell: "{{item}}"
      args:
        warn: False
      with_items:
        - "brctl delbr ovirtmgmt | cat"
        - "ip link del ovirtmgmt | cat"
        - "ip link del dummy0 | cat"
        - "ip link del virbr0 | cat"
        - "ip link del virbr0-nic | cat"
        - "ip link del dummy_0 | cat"
        - 'ip link del \;vdsmdummy\; | cat'
This can be closed. But please open a new one for:
* A better ansible error message, in some log, that makes it clear the engine cannot connect to the host
* Fixing the 'disk/' -> '/disk/' value in the <rasd:HostResource> element