Bug 1259467
| Field | Value |
|---|---|
| Summary | Migration issues Importing Storage Domain no more VM |
| Product | [oVirt] ovirt-engine |
| Component | BLL.Storage |
| Version | 3.5.0 |
| Status | CLOSED CURRENTRELEASE |
| Severity | unspecified |
| Priority | unspecified |
| Reporter | Alain <avondra> |
| Assignee | Maor <mlipchuk> |
| QA Contact | Elad <ebenahar> |
| CC | acanan, amureini, avondra, bugs, ecohen, lsurette, mgoldboi, mlipchuk, rbalakri, tnisan, yeylon |
| Target Milestone | ovirt-3.6.0-rc |
| Target Release | 3.6.0 |
| Flags | rule-engine: ovirt-3.6.0+, ylavi: planning_ack+, amureini: devel_ack+, rule-engine: testing_ack+ |
| Hardware | x86_64 |
| OS | Linux |
| Whiteboard | storage |
| Doc Type | Bug Fix |
| Type | Bug |
| oVirt Team | Storage |
| Last Closed | 2015-11-04 11:17:14 UTC |

Description
Alain
2015-09-02 16:27:51 UTC
Maor, have a look ASAP please; seems like unregistered entities, perhaps?

Yes Tal, it seems to be something like that.
Should I upload vdsm.log or engine.log?

Hi Alain,
There are no logs; can you please upload them again, including those from the previously destroyed engine?
Thanks,
Maor

Alain,
I'm not sure I understood your description. Was your DC version 3.4 or 3.5? What kind of configuration problem did you have?

Maor,
My DC was migrated from 3.4 to 3.5 about one year ago (maybe I did something wrong...).
When I try to migrate the oVirt manager to a new one, I can't import VMs from the Storage Domains previously attached to my current DC, which is at version 3.5.
Do you need the entire engine and vdsm logs or only part of them (i.e. with ERROR or WARNING tags)?

(In reply to Alain from comment #5)
> Maor,
> My DC was migrated from 3.4 to 3.5 about one year ago (maybe I did
> something wrong...).

What do you mean by migrate? Can you please describe the steps of this 3.5 upgrade?
Do you still have the logs from this upgrade?
Will it be possible to reproduce this and send the logs?

> Do you need the entire engine and vdsm logs or only part of them (i.e. with
> ERROR or WARNING tags)?

Please attach the full engine and vdsm logs (from the destroyed environment and the current environment).

I updated the DC via the Webadmin Portal by selecting the DC, then "Edit Data Center" -> "Compatibility Version", and chose "3.5".
Did I forget something?

Created attachment 1069657 [details]
Engine.log
These logs come from the current oVirt manager, version 3.5.0-1.
It currently runs as a hosted engine, as described in my article.
Created attachment 1069658 [details]
vdsm logs from Hypervisor 1
These are current logs. I don't have the logs from August 28th, the date of the migration attempt, because I had to restore the hypervisor from an Acronis backup taken on the 27th.
Created attachment 1069659 [details]
vdsm logs from Hypervisor 2
These are current logs. As with hypervisor 1, I don't have the logs from August 28th, the date of the migration attempt, because I had to restore the hypervisor from an Acronis backup taken on the 27th.
Created attachment 1069660 [details]
Engine.log from the new oVirt Manager 3.5.3
These logs come from the (almost) new manager running oVirt 3.5.3, which failed to import VMs from the imported Storage Domain.
These logs were generated during the migration on August 28th, which started at 1 PM.
Thanks for the logs.

It looks like your previous setup used 4 Storage Domains that had an OVF_STORE disk. Their IDs are:
0fec0486-7863-49bc-a4ab-d2c7ac48258a
1f6dec51-12a6-41ed-9d14-8f0ad4e062d2
7e40772a-fe94-4fb2-94c4-6198bed04a6a
d7b9d7cc-f7d6-43c7-ae13-e720951657c9

It also looks like the VMs are fetched from the OVF_STORE disks, for example:
"[1700470f] Retrieve OVF Entity from storage domain ID 7e40772a-fe94-4fb2-94c4-6198bed04a6a for entity ID 82d1653d-78ad-4859-b9af-8fb02bfdae15, entity name unc-srv-qual03 and VM Type of VM"

Can you please point me to a specific Storage Domain that doesn't offer you the VMs or Templates to import and, if you remember, the name of a specific VM that you wanted to register?

(In reply to Maor from comment #12)
> Can you please point me to a specific Storage Domain that doesn't offer
> you the VMs or Templates to import and, if you remember, the name of a
> specific VM that you wanted to register?

For instance, the VOL-UNC-PROD-02 domain and the VM unc-srv-ad1.

It looks like your Hosts were running during the recovery, so the VM unc-srv-ad1 has been running as an external VM.
Before recovering your setup, the Hosts must be rebooted, as mentioned in the documentation.

It looks like you tried to import it but failed because there was a running external VM:
"2015-08-28 15:43:38,546 WARN [org.ovirt.engine.core.bll.ImportVmFromConfigurationCommand] (ajp--127.0.0.1-8702-4) [5ef73a48] CanDoAction of action ImportVmFromConfiguration failed for user admin@internal. Reasons: VAR__ACTION__IMPORT,VAR__TYPE__VM,VM_CANNOT_IMPORT_VM_EXISTS,$VmName external-unc-srv-ad1"

I suggest you attach this Storage Domain to a new setup, using new Hosts (or rebooted hosts), and try to register this VM once again.

(In reply to Maor from comment #14)
That's right, I didn't reboot the hosts. Where is the documentation about that?
I did indeed see some VMs with the prefix "external", but do you think it's normal that I don't see any VMs in the "Import VM" tab of the Storage Domain?

(In reply to Alain from comment #15)
> That's right, I didn't reboot the hosts. Where is the documentation about
> that?

See http://www.ovirt.org/Features/ImportStorageDomain#Restrictions:
"In a disaster recovery scenario, if the Host, which the user about to use, was in the environment which was destroyed, it is recommended to reboot this Host before adding it to the new setup. The reason for that is first, to kill any qemu processes which are still running and might be automatically be added as VMs into the new setup, and also to avoid any sanlock issues."

> I did indeed see some VMs with the prefix "external", but do you think it's
> normal that I don't see any VMs in the "Import VM" tab of the Storage Domain?

Weird; in the logs it looks like you were trying to import them. Can you please try to attach this Storage Domain to a new setup with rebooted hosts, and let me know if you still don't see those VMs in the import subtab?

(In reply to Maor from comment #16)
I am sorry for my last question; of course I know where the documentation is :-)
Just to be complete: must I reboot the host after installing it in the DC, or is it better to also reboot the host before creating it in the new DC?

(In reply to Alain from comment #17)
> Just to be complete: must I reboot the host after installing it in the DC,
> or is it better to also reboot the host before creating it in the new DC?

It is better to reboot the Hosts just before you add them to the new setup.

OK Maor, I plan to make another attempt next week and will keep you informed of the results.
Thank you
Regards

(In reply to Alain from comment #19)
Thanks, please let me know if you need any help with the process.
I'm changing the severity to undefined for now, until we get more details about the next attempt.

(In reply to Maor from comment #20)
I will perform the operation tomorrow morning between 9h30 and 12h30.
If I run into big trouble, I will contact you, if you're not too busy.
Thanks
Regards

(In reply to Alain from comment #21)
No problem, I will try to be available then.

(In reply to Maor from comment #22)
Hi Maor,
I am ready to begin. One last question: do you think I'd better remove the hosts from the old DC before creating them in the new one?
Thanks
Regards

(In reply to Alain from comment #23)
Yes, please do.

(In reply to Maor from comment #24)
Maor,
That's what I've done, but now, after rebooting and installing the hosts, they are all non-responsive and the network doesn't come up, only loopback...

It looks like, since the hosts were not rebooted, the external VMs that were automatically imported into the recovered engine overwrote the existing unregistered entities in the OVF_STORE disk.
The engine should filter out all external VMs when updating the OVF_STORE disk.

The bug here occurs when an external VM is running in the setup and the OVF_STORE is updated with it.
You can reproduce this bug with the following steps:
1. Create a VM with a disk on a Storage Domain
2. Move the Storage Domain to maintenance
   - At this point the VM will be saved in the OVF_STORE disk
3. Move the Storage Domain back to up again
4. Run the VM
   - At this point copy the qemu process command line on the Host to use it later
At this point you can DR the setup (or do the following steps):
5. Remove the Storage Domain from the setup
6. Try to run the VM again from the Host
   - The VM will then be added automatically to the setup as an external VM
7. Import the Storage Domain back to the setup
   - At this point the Storage Domain should have the original VM as a candidate entity to register to the setup
8. Stop the external VM and remove it from the setup
9. Try to import the candidate entity from the imported Storage Domain

When an external VM has the same UUID as a VM that is unregistered in the OVF_STORE disk of the imported storage domain, the OVF upload to the OVF_STORE disk no longer overrides that VM with the external one.

Steps I did:
1) Created a domain and a VM with a disk located in it
2) Deactivated the domain so the OVF_STORE would be updated
3) Activated the domain again
4) Started the VM
5) Stopped the ovirt-engine service on the engine
6) On a second RHEVM setup: added the host that has the qemu process of the VM from the first setup and created a new DC with a new domain (master). The VM was reported as an external one
7) Imported the domain from the first setup into the second one and activated the domain
8) Deactivated the imported domain
9) Removed the external VM from the setup
10) Activated the imported domain
11) Registered (imported) the VM

The VM I registered is the original VM and not the external one.

Verified using RHEV-3.6.0-15
rhevm-3.6.0-0.18.el6.noarch
vdsm-4.17.8-1.el7ev.noarch

oVirt 3.6.0 was released on November 4th, 2015 and should fix this issue.
If problems still persist, please open a new BZ and reference this one.
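
The behaviour requested in this bug (external VMs must not overwrite the unregistered entities already stored in the OVF_STORE disk) can be illustrated with a minimal Python sketch. This is not the ovirt-engine fix, which lives in the engine's Java code; the `Vm` class, its `origin` field, and both function names are hypothetical and only model the idea of filtering before the update.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Vm:
    """Hypothetical, simplified VM record (not an ovirt-engine class)."""
    vm_id: str   # UUID shared by the original VM and its external twin
    name: str
    origin: str  # assumed values: "managed" or "external"


def vms_for_ovf_update(vms: Iterable[Vm]) -> List[Vm]:
    """Keep only the VMs that may be written to the OVF_STORE disk."""
    return [vm for vm in vms if vm.origin != "external"]


def update_ovf_store(existing: Dict[str, str], vms: Iterable[Vm]) -> Dict[str, str]:
    """Return a new id -> name mapping, ignoring external VMs.

    `existing` models the entities already stored in OVF_STORE, including
    unregistered ones that are still candidates for import.
    """
    updated = dict(existing)
    for vm in vms_for_ovf_update(vms):
        updated[vm.vm_id] = vm.name
    return updated


if __name__ == "__main__":
    store = {"vm-1": "unc-srv-ad1"}  # original, still unregistered VM
    running = [
        Vm("vm-1", "external-unc-srv-ad1", "external"),  # same UUID, external
        Vm("vm-2", "new-vm", "managed"),
    ]
    print(update_ovf_store(store, running))
```

In this sketch the entry for `vm-1` keeps its original name because the external VM with the same ID is dropped before the update, which mirrors the verified behaviour described above.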