Description of problem:
Cleanly detaching an SD from one RHEVM and importing it into an uninitialized DC in a new RHEVM ignores the existing OVF_STORES and creates new ones. This leads to the customer not seeing the "VM Import" tab. Any attempt by the customer to move the SD back to the original RHEVM in order to recover the missing VM definitions can result in ALL VM CONFIGURATION BEING OVERWRITTEN.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
* Cleanly detach SD from original RHEVM
* Import into an uninitialized DC on new RHEVM
* Wait until the OVF update occurs
* Check for the "VM Import" tab - It will not exist
* Verify OVF_STORE contents (and number) - FAIL - Two new OVF_STORE disks were created. The old OVF_STORE disks were ignored, but still contained the original VM information.
At this point, the customer has successfully imported the SD from one RHEVM to another, but cannot see any VM import information. Behind the scenes, RHEV has ignored the original primary/secondary OVF_STORE disks and created two new ones. Of course, the customer cannot see this via the UI. At this point, the customer wants to revert back to the original configuration in order to recover the missing VM information. This can be done several ways:
* Scenario 1 - Detach SD from new RHEVM - FAIL! Cannot detach from a DC with only one SD! So, force remove the DC. Import back into an already initialized DC on the original RHEVM - FAIL - Although the import will succeed, the import process sees ALL FOUR OVF_STORE disks, and uses the newer ones to determine what VMs are available for import.
* Scenario 2 - Do nothing to the new RHEVM, just try to import the SD again on the original RHEVM - FAIL - Unable to attach the SD back into the initialized DC on the original RHEVM.
* Scenario 3 - Power off the new RHEVM and RHEVH, re-import on the original RHEVM - FAIL - Although the import will succeed, the import process sees ALL FOUR OVF_STORE disks (not just the first two), and uses the newer ones to determine what VMs are available for import, and there was no "VM Import" tab.
* Scenario 4 - Add another SD to the new RHEVM DC and make it master, detach the SD cleanly, and re-import to original RHEVM DC - FAIL - Although the import will succeed, the import process sees ALL FOUR OVF_STORE disks (not just the first two), and uses the newer ones to determine what VMs are available for import, and there was no "VM Import" tab.
In all of the above scenarios where the re-import into the original RHEVM was successful (1, 3, and 4), all VM information was destroyed when the OVF update process kicked off, since it used the newer OVF_STORE information and propagated it to ALL OVF_STORE disks.
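The data loss described above can be illustrated with a toy model. This is only a sketch of the behavior as reported in this bug, not the engine's actual code: the `OvfStore` class, the logical timestamps, and the VM name `vm-a` are all hypothetical. It assumes that the import trusts whichever OVF_STORE was updated most recently and that the periodic OVF update then writes that content to every OVF_STORE disk.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OvfStore:
    """Toy stand-in for one OVF_STORE disk (hypothetical structure)."""
    name: str
    updated: int                      # logical timestamp of the last OVF update
    vms: List[str] = field(default_factory=list)


def vms_offered_for_import(stores: List[OvfStore]) -> List[str]:
    """Trust the most recently updated store, as the report describes."""
    newest = max(stores, key=lambda s: s.updated)
    return list(newest.vms)


def run_ovf_update(stores: List[OvfStore], now: int) -> None:
    """Propagate the newest store's content to every store."""
    current = vms_offered_for_import(stores)
    for store in stores:
        store.vms = list(current)
        store.updated = now


# Two original stores holding one VM, plus two newer, empty stores
# created by the attach to the uninitialized DC.
stores = [
    OvfStore("old-primary", updated=1, vms=["vm-a"]),
    OvfStore("old-secondary", updated=1, vms=["vm-a"]),
    OvfStore("new-primary", updated=2),
    OvfStore("new-secondary", updated=2),
]

offered = vms_offered_for_import(stores)   # the newer, empty stores win
run_ovf_update(stores, now=3)              # the old stores are overwritten too
print(offered, [s.vms for s in stores])
```

In this model the original VM list survives only until the first OVF update after the re-import, which matches the window in which the old OVF_STORE disks still contained the original VM information.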
Actual results:
Import succeeds, but no VMs are available for import. Any attempt to move back to the original RHEVM resulted in catastrophic data loss.
Expected results:
Import succeeds, original OVF_STORES are found, and VM information is available for import.
To run this test, I lowered "OvfUpdateIntervalInMinutes" from 60 to 5 so that the results would show up more quickly.
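For anyone repeating the test, the interval change can be scripted. The sketch below assumes the setting is changed with the `engine-config` tool on the RHEVM host; the helper names `ovf_interval_command` and `set_ovf_interval` are hypothetical. By default it only builds the command and does not run it. The engine service generally needs a restart before a changed `engine-config` value takes effect.

```python
import subprocess
from typing import List


def ovf_interval_command(minutes: int) -> List[str]:
    """Build the engine-config invocation that sets the OVF update interval."""
    if minutes < 1:
        raise ValueError("interval must be at least 1 minute")
    return ["engine-config", "-s", f"OvfUpdateIntervalInMinutes={minutes}"]


def set_ovf_interval(minutes: int, dry_run: bool = True) -> List[str]:
    """Return the command; run it only when dry_run is False."""
    cmd = ovf_interval_command(minutes)
    if not dry_run:
        # Requires engine-config on PATH and sufficient privileges.
        subprocess.run(cmd, check=True)
    return cmd


print(" ".join(set_ovf_interval(5)))
```

Remember to set the value back to 60 after testing, since a 5 minute interval is only meant to shorten the wait for the OVF update.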
Maor - please take a look.
This looks very similar to something you already solved for 3.5.1, no?
This is indeed a duplicate of bug https://bugzilla.redhat.com/1138114
which was fixed in version org.ovirt.engine-root-3.5.0-13
(In reply to Maor from comment #3)
> This is indeed a duplicate of bug https://bugzilla.redhat.com/1138114
> which was fixed in version org.ovirt.engine-root-3.5.0-13
I don't understand how this is possible.
Bug 1138114 was solved in 3.5.0 vt4, and the customer is using the GA release.
Is it possible we have a regression on our hands?
(In reply to Allon Mureinik from comment #5)
> (In reply to Maor from comment #3)
> > This is indeed a duplicate of bug https://bugzilla.redhat.com/1138114
> > which was fixed in version org.ovirt.engine-root-3.5.0-13
> I don't understand how this is possible.
> Bug 1138114 was solved in 3.5.0 vt4, and the customer is using the GA
> release.
> Is it possible we have a regression on our hands?
No, or at least it doesn't look like that on my setup.
Maybe the problem is something different, maybe the OVF_STORE disks are not valid to read from.
James, can you please attach the engine and VDSM logs?
As you and I discussed via email, I can easily reproduce this, even on a single RHEVM instance. The key is to detach/remove the SD (cleanly), and then import it into a RHEVM and attach it to an *uninitialized* DC. I have recreated this scenario many times at this point, using two RHEVM instances and also just using a single RHEVM instance. Here is the scenario I used:
* Create a single VM/disk on the SD
Created "TESTVM1" with a single 1GB drive
* Put the SD into maintenance
I can verify the OVF_STORE disks have been created at this point
directly in the FS:
# grep OVF */*meta
Updated":"Mon Apr 27 09:31:25 CDT 2015","Size":10240}
Updated":"Mon Apr 27 09:31:25 CDT 2015","Size":10240}
# strings 44e10311-f0f2-4cae-8dca-5e7a70e68684/d5f91f24-1cc0-41cc-a1d2-fe4add91dabf
<?xml version='1.0' encoding='UTF-8'?><ovf:Envelope
* Detach and Remove the SD (without formatting)
At this point, I still see the OVF_STORE, there are still only two of
them, and the XML information is still intact.
* Create a new DC in the same RHEVM instance with no attached SD
* Import the SD back into RHEVM
* Attach the imported SD to the uninitialized DC
At this point, it is exactly the same as before. The original OVF_STORE disks were ignored, and two new ones were created.
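The `grep OVF */*meta` check used in this scenario can be repeated before and after the attach to see whether the number of OVF_STORE disks changed. The following is a minimal sketch, assuming a file-based SD where each image directory holds a volume metadata file ending in `meta`; the function name `find_ovf_meta_files` and the sample file contents are hypothetical and only mimic the truncated output shown above.

```python
import tempfile
from pathlib import Path
from typing import List


def find_ovf_meta_files(images_dir: Path) -> List[Path]:
    """Return the */*meta files under images_dir whose text mentions "OVF"."""
    hits = []
    for meta in sorted(images_dir.glob("*/*meta")):
        if "OVF" in meta.read_text(errors="replace"):
            hits.append(meta)
    return hits


# Demo on a throwaway directory tree; the file contents are made up.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    samples = {
        "img-a/vol-a.meta": 'DESCRIPTION={"Disk Description":"OVF_STORE"}\n',
        "img-b/vol-b.meta": 'DESCRIPTION={"Disk Description":"OVF_STORE"}\n',
        "img-c/vol-c.meta": "DESCRIPTION=TESTVM1_Disk1\n",
    }
    for rel, text in samples.items():
        path = root / rel
        path.parent.mkdir(parents=True)
        path.write_text(text)
    ovf_meta = find_ovf_meta_files(root)
    print(len(ovf_meta), "OVF_STORE metadata files found")
```

Seeing four hits on an SD that had two before the attach would correspond to the duplicate OVF_STORE creation described in this comment.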
I've attached logs from engine/vdsm for the above scenario.
Just to be extra thorough, and also to verify that the original OVF_STORE images were still perfectly readable, I added a new SD to the DC, detached/removed the SD with the duplicate OVF_STORE disks, *removed* the newly added OVF_STORE images on the filesystem, and then re-imported the SD into the DC (which is now initialized as opposed to uninitialized).
The "TESTVM1" VM and disk were available for import.
To be perfectly clear, the problem appears to be with importing an SD into an uninitialized DC.
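The readability check on the original OVF_STORE images (done earlier with `strings`) can be approximated by scanning the raw volume bytes for OVF envelopes. This is only a sketch: it assumes the volume can be read as a flat file and that each definition ends with a closing `</ovf:Envelope>` tag, which the truncated output above does not show. The function name `extract_ovf_envelopes` and the sample bytes are hypothetical.

```python
import re
from typing import List

# Non-greedy match from the opening envelope tag to the assumed closing tag.
_ENVELOPE = re.compile(rb"<ovf:Envelope\b.*?</ovf:Envelope>", re.DOTALL)


def extract_ovf_envelopes(raw: bytes) -> List[str]:
    """Return every OVF envelope found in raw volume bytes as text."""
    return [
        match.group(0).decode("utf-8", errors="replace")
        for match in _ENVELOPE.finditer(raw)
    ]


# Made-up volume content: padding around a single, simplified envelope.
sample = (
    b"\x00" * 512
    + b"<?xml version='1.0' encoding='UTF-8'?>"
    + b"<ovf:Envelope><Name>TESTVM1</Name></ovf:Envelope>"
    + b"\x00" * 512
)
envelopes = extract_ovf_envelopes(sample)
print(len(envelopes), "envelope(s) found")
```

On a real volume this would be called as `extract_ovf_envelopes(Path(volume_path).read_bytes())`; an empty result for the old OVF_STORE volumes would suggest they are not valid to read from, which was not the case here.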
Created attachment 1019358
engine.log capturing new OVF_STORE creation
Created attachment 1019359
vdsm log well before and during duplicate OVF_STORE creation
Thanks for the logs James,
It looks like the problem is that the customer used an uninitialized Storage Pool to attach the imported Storage Domain (see ), although the engine did not block this operation.
I'm working on a fix for that so the user will be able to attach a "detached" Storage Domain with existing OVF_STORE disks to an uninitialized Storage Pool.
"Attaching an imported Storage Domain can only be applied with an initialized Data Center."
(In reply to Maor from comment #11)
> Thanks for the logs James,
> It looks that the problem is that the customer used an uninitialized Storage
> Pool to attach the imported Storage Domain.(see ) although engine did not
> block this operation.
Note that in RHEV-M 3.5.1, this operation will be blocked with a user-friendly message (see bug 1178646), so the customer will get a clear indication of what they are doing wrong, eliminating the risk of potential data loss.
Reducing priority based on this analysis.
PM/GSS stakeholders, please chime in if you disagree with this move.
1. Cleanly detach SD from original RHEVM
2. Import into an uninitialized DC on new RHEVM
3. Wait until the OVF update occurs
4. Check for the "VM Import" tab - It should exist
The import succeeds and the VMs are available for import.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.