Bug 1430358
| Summary: | Restarting the SPM vdsm process during a cold merge, after cannot preview other snapshots | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [oVirt] vdsm | Reporter: | Carlos Mestre González <cmestreg> | ||||
| Component: | Core | Assignee: | Ala Hino <ahino> | ||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Carlos Mestre González <cmestreg> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | high | ||||||
| Version: | 4.19.0 | CC: | ahino, alitke, amureini, bugs, cmestreg, stirabos, tnisan, ylavi | ||||
| Target Milestone: | ovirt-4.1.3 | Flags: | rule-engine: ovirt-4.1+ rule-engine: exception+ ylavi: planning_ack+ amureini: devel_ack+ ratamir: testing_ack+ | ||||
| Target Release: | 4.19.19 | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2017-07-06 13:31:41 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
Description
Carlos Mestre González
2017-03-08 12:52:38 UTC
Created attachment 1261267: vdsm and engine logs
- Remove snapshot_18901_iscsi_1.
- Immediately restart vdsm on host_mixed_2, which is the SPM (systemctl restart).
- The remove operation fails.
- Try to preview snapshot_18901_iscsi_2 ===> FAILS (the SPM is now host_mixed_3).
The new cold merge flow first changes the base volume to illegal, then performs the prepare/merge phases, and finally changes the volume status back to legal (as part of the finalizeMerge operation).
In this scenario vdsm was restarted during the merge, so the volume remained illegal, which causes further operations on the chain (like createVolume) to fail.
It seems we shouldn't set the volume to ILLEGAL (after verifying that the volume can indeed be safely used even after a failure during any of the operations).
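To make the failure mode concrete, here is a minimal, self-contained sketch of the flow described above. It is a toy model, not vdsm code: the `Volume` class, `cold_merge`, `Interrupted` and `PrepareIllegalVolumeError` are hypothetical stand-ins, and the real implementation persists legality in volume metadata rather than in memory.

```python
LEGAL = "LEGAL"
ILLEGAL = "ILLEGAL"


class PrepareIllegalVolumeError(Exception):
    """Hypothetical stand-in for the error vdsm raises on prepare."""


class Interrupted(Exception):
    """Simulates the vdsm process being restarted mid-operation."""


class Volume:
    """Toy in-memory volume (assumption: legality is the only state)."""

    def __init__(self, vol_id):
        self.vol_id = vol_id
        self.legality = LEGAL

    def prepare(self):
        # Mirrors the check that fails in the traceback: an illegal
        # volume cannot be prepared, so later chain operations fail.
        if self.legality == ILLEGAL:
            raise PrepareIllegalVolumeError(
                "Cannot prepare illegal volume: %s" % self.vol_id)


def cold_merge(base, restart_during_merge=False):
    # Step 1: the base volume is marked illegal before merging.
    base.legality = ILLEGAL
    # Step 2: prepare/merge phases; a restart here skips step 3.
    if restart_during_merge:
        raise Interrupted("vdsm restarted during merge")
    # Step 3: finalize restores the legal status.
    base.legality = LEGAL


if __name__ == "__main__":
    base = Volume("base")
    try:
        cold_merge(base, restart_during_merge=True)
    except Interrupted:
        pass
    print(base.legality)  # the volume is left illegal after the restart
```

Because nothing in this model restores the status after the interruption, every later `prepare()` on the base volume raises, which matches the behaviour reported here.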
2017-03-08 13:14:05,044+0200 INFO (jsonrpc/1) [jsonrpc.JsonRpcServer] In recovery, ignoring 'SDM.merge' in bridge with {'subchain_info': {'img_id': 'f45bba2d-7e2f-4d02-9c30-a71818c1a20a', 'sd_id': '9eed63d8-bf77-442a-b4a0-fd313324d783', 'top_id': '45d68105-965f-4f93-b8f0-485ba6809700', 'base_id': '5f2c7357-c772-4e8e-acf4-c190c0e0d7ed', 'base_generation': 0}, 'job_id': 'd69b099a-1d8b-4897-946e-65cc2dd5667a'} (__init__:527)
2017-03-08 13:16:41,941+0200 ERROR (tasks/4) [storage.VolumeManifest] Unexpected error (volume:580)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/volume.py", line 578, in prepare
    chainrw=chainrw, setrw=setrw)
  File "/usr/share/vdsm/storage/volume.py", line 557, in prepare
    raise se.prepareIllegalVolumeError(self.volUUID)
prepareIllegalVolumeError: Cannot prepare illegal volume: ('5f2c7357-c772-4e8e-acf4-c190c0e0d7ed',)
Or alternatively - we can set it back to LEGAL again. Ala/Adam - what's your take on that?

Hmm, it might make sense to leave the base volume LEGAL, since the chain and data remain valid even during an interrupted merge. We just need to make sure that the engine marks the snapshot as ILLEGAL (in the DB) so we do not attempt to preview or revert to the partially deleted snapshot.

In order to start leaving the base vol LEGAL, we need to audit the entity polling that engine does to determine the status of a cold merge when there is no host job reported. AFAIK the current design checks for base volume legality. If we can move that to using volume generation validation instead, then we should be covered.

Carlos, this is a vdsm bug. Please change the Product to vdsm and provide the version where this bug was observed.

INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:
[Tag 'v4.19.18' doesn't contain patch 'https://gerrit.ovirt.org/77610']
gitweb: https://gerrit.ovirt.org/gitweb?p=vdsm.git;a=shortlog;h=refs/tags/v4.19.18
For more info please contact: infra

Verified on: vdsm-4.19.20-1.el7ev.x86_64 (rhevm-4.1.3)
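For reference, the idea raised in the discussion of polling on volume generation instead of base volume legality could look roughly like the sketch below. This rests on an assumption not stated in this bug: that a successful merge bumps the base volume generation by one relative to the `base_generation` sent with the job (0 in the log above). The function name `merge_outcome` and its return values are hypothetical, not engine or vdsm API.

```python
def merge_outcome(expected_generation, current_generation):
    """Infer a cold merge result when no host job is reported.

    Assumption: a completed merge increments the base volume
    generation by exactly one; an interrupted merge leaves it as is.
    """
    if current_generation == expected_generation + 1:
        return "succeeded"
    if current_generation == expected_generation:
        return "failed"
    # Any other value means the volume changed in a way this job
    # did not expect, so no conclusion is drawn.
    return "unknown"


if __name__ == "__main__":
    # base_generation was 0 when the job was submitted.
    print(merge_outcome(0, 0))
    print(merge_outcome(0, 1))
```

With a check of this kind the base volume would not need to be flipped to ILLEGAL to signal an in-flight merge, so an interrupted merge would not block later operations on the chain.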