Bug 1116055
| Summary: | Restarting VDSM during Live snapshot will cause ERROR message in engine | | |
|---|---|---|---|
| Product: | [Retired] oVirt | Reporter: | Raz Tamir <ratamir> |
| Component: | ovirt-engine-core | Assignee: | bugs <bugs> |
| Status: | CLOSED WONTFIX | QA Contact: | Pavel Stehlik <pstehlik> |
| Severity: | low | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.5 | CC: | amureini, gklein, iheim, michal.skrivanek, ratamir, rbalakri, yeylon |
| Target Milestone: | --- | | |
| Target Release: | 3.6.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | virt | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2015-03-29 09:08:52 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Virt | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

As far as I can see, the snapshot was created (i.e., createVolume succeeded), but there's no way of knowing whether the VM was snapshotted to it or not. Hence, in order to avoid corruption, a restart is recommended.

Raz, a few questions:
1. Can you confirm the above statement?
2. Which VDSM did you restart? HSM? SPM?

Hi Allon,
1. Yes, I agree.
2. HSM - the VM runs on it.

> 1. Yes I agree

In that case, I'm not sure there's anything better we can do.
In any event, since the createVolume succeeded, it's more of a virt issue than a storage one.

Well, it doesn't even sound interesting to solve. Just don't restart VDSM while it's doing something :) And when it happens there's no real harm, just the error message and the recommendation.
However, Allon, why are we recommending a restart of the VM? Isn't it enough to just remove the snapshot and say it failed (as a generic recovery when we don't know why or where it failed)?

(In reply to Michal Skrivanek from comment #4)
> However, Allon, why are we recommending restart of the VM, isn't it enough
> to just remove the snapshot and say it failed (as a generic recovery when we
> don't know why or where it failed)

The point is that you WANTED to snapshot at this time - if the VM is still running, it'll write to the old volumes. If you force the VM to restart, you're forcing it to write to the new volumes. Not perfect, but works.

This bug won't fit into the 3.5 release and is being deferred to a later release. If you care deeply about this bug and think it deserves to be re-evaluated, please let me know.

Closing old bugs. If this issue is still relevant/important in the current version, please re-open the bug.

Created attachment 914517: vdsm and engine logs

Description of problem:
An ERROR message appears in the engine when vdsm is restarted during live snapshot creation. The snapshot is created successfully despite the message:
"Failed to create live snapshot '2' for VM 'vm_0'. VM restart is recommended."
which is followed by:
"Failed to complete snapshot '2' creation for VM 'vm_0'."

In the engine log, a VDSErrorException is raised:

2014-07-03 18:26:39,135 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand] (org.ovirt.thread.pool-6-thread-35) [6b9813f7] Command SnapshotVDSCommand(HostName = aqua-vds4.qa.lab.tlv.redhat.com, HostId = 82893bc5-b294-4b84-a9b9-f044cdb23bda, vmId=e62574cb-5c41-4393-9ec9-a01f718050f6) execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to SnapshotVDS, error = Snapshot failed, code = 48

Version-Release number of selected component (if applicable):
ovirt-engine-3.5.0-0.0.master.20140605145557.git3ddd2de.el6.noarch

How reproducible:
100%

Steps to Reproduce:
1. Start a live snapshot
2. Restart vdsm

Actual results:
Explained above

Expected results:

Additional info:
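
When checking whether an environment has hit this same failure, it may help to pull the VM id, error message and error code out of the engine log line quoted in the description. The following is only a minimal sketch: the function name `parse_snapshot_failure` and both regular expressions are hypothetical, and the field layout is inferred from the single log line in this report, not from any documented oVirt log format.

```python
import re

# Assumption: the error details appear as "error = <message>, code = <number>",
# as in the one log line quoted in this report.
ERROR_RE = re.compile(r"error = (?P<message>.+?), code = (?P<code>\d+)")
# Assumption: the VM id appears as "vmId=<36-character UUID>".
VM_ID_RE = re.compile(r"vmId=(?P<vm_id>[0-9a-f-]{36})")


def parse_snapshot_failure(line):
    """Return a dict with vm_id, message and code, or None if no error details are found."""
    error = ERROR_RE.search(line)
    if error is None:
        return None
    vm = VM_ID_RE.search(line)
    return {
        "vm_id": vm.group("vm_id") if vm else None,
        "message": error.group("message"),
        "code": int(error.group("code")),
    }


# The engine log line from this report, copied verbatim.
LOG_LINE = (
    "2014-07-03 18:26:39,135 ERROR "
    "[org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand] "
    "(org.ovirt.thread.pool-6-thread-35) [6b9813f7] Command "
    "SnapshotVDSCommand(HostName = aqua-vds4.qa.lab.tlv.redhat.com, "
    "HostId = 82893bc5-b294-4b84-a9b9-f044cdb23bda, "
    "vmId=e62574cb-5c41-4393-9ec9-a01f718050f6) execution failed. "
    "Exception: VDSErrorException: VDSGenericException: VDSErrorException: "
    "Failed to SnapshotVDS, error = Snapshot failed, code = 48"
)

if __name__ == "__main__":
    print(parse_snapshot_failure(LOG_LINE))
```

Lines from other engine versions may be laid out differently, so a `None` result only means the assumed pattern did not match, not that no snapshot failure occurred.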