Bug 1231535
| Field | Value |
|---|---|
| Summary | VM block snapshot disks become illegal after failed Live Delete Snapshot Merge |
| Product | Red Hat Enterprise Virtualization Manager |
| Reporter | rhev-integ |
| Component | ovirt-engine |
| Assignee | Greg Padgett <gpadgett> |
| Status | CLOSED CURRENTRELEASE |
| QA Contact | Aharon Canan <acanan> |
| Severity | urgent |
| Docs Contact | |
| Priority | unspecified |
| Version | 3.5.1 |
| CC | acanan, alitke, amureini, ecohen, gklein, gpadgett, kgoldbla, lpeer, lsurette, rbalakri, Rhev-m-bugs, tnisan, yeylon, ylavi |
| Target Milestone | --- |
| Keywords | ZStream |
| Target Release | 3.5.4 |
| Flags | ylavi: Triaged+ |
| Hardware | x86_64 |
| OS | Unspecified |
| Whiteboard | storage |
| Fixed In Version | ovirt-engine-3.5.4 |
| Doc Type | Bug Fix |
| Doc Text | |
| Story Points | --- |
| Clone Of | 1213157 |
| Environment | |
| Last Closed | 2015-09-06 17:09:31 UTC |
| Type | --- |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| CRM | |
| Verified Versions | |
| Category | --- |
| oVirt Team | Storage |
| RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- |
| Target Upstream Version | |
| Embargoed | |
| Bug Depends On | 1213157 |
| Bug Blocks | |
| Attachments | |
Comment 1
Allon Mureinik
2015-06-14 14:22:19 UTC
Greg, can you please provide the QA with steps to reproduce THIS bug?

Comment 3
Greg Padgett

(In reply to Allon Mureinik from comment #2)
> Greg, can you please provide the QA with steps to reproduce THIS bug?

Sure, the same steps as the original bug that inspired this one.

Steps to Reproduce:
1. Create a VM with several disks, including block preallocated and thin, and NFS preallocated and thin.
2. Start the VM.
3. Create three snapshots: snsa1, snsa2, snsa3.
4. Delete snapshot snsa2; while the snapshot is locked, restart vdsm.

Expected results:
The deletion succeeds; if it does, the fix works.

Actual results:
The deletion fails and the disks become illegal. Attempts to delete the snapshot again also fail.

Greg Padgett

(In reply to Greg Padgett from comment #3)
> [...]

Also note that for reproducing this, the type of disk isn't as important as performing multiple deletions.

Created attachment 1051099 [details]
Logs01
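As a hedged sketch only, the reproduction steps from comment #3 could be driven by a script along the following lines. The `rhevm-shell` command strings and the names `testvm` and `host1` are illustrative assumptions, not verified CLI syntax; with `DRY_RUN=1` (the default) the script only records and prints each step instead of executing it.

```shell
#!/bin/sh
# Hedged sketch of the reproduction flow in comment #3.
# The rhevm-shell command strings and the names testvm/host1 are
# illustrative assumptions, not verified syntax. With DRY_RUN=1
# (the default), each step is only recorded and printed.
DRY_RUN=${DRY_RUN:-1}
STEPS=""

run() {
    # Record the step, then either print it (dry run) or execute it.
    STEPS="$STEPS$*
"
    if [ "$DRY_RUN" = "1" ]; then
        echo "WOULD RUN: $*"
    else
        "$@"
    fi
}

# 1. A VM named testvm with a mix of block/NFS, preallocated/thin
#    disks is assumed to exist already.
# 2. Start the VM.
run rhevm-shell -c -E "action vm testvm start"

# 3. Create three snapshots.
for snap in snsa1 snsa2 snsa3; do
    run rhevm-shell -c -E "add snapshot --parent-vm-name testvm --description $snap"
done

# 4. Remove the middle snapshot, then restart vdsm on the host while
#    the snapshot is still locked (live merge in progress).
run rhevm-shell -c -E "remove snapshot --parent-vm-name testvm --description snsa2"
run ssh root@host1 "service vdsmd restart"  # RHEL 6 host; systemctl restart vdsmd on RHEL 7
```

The per-disk details (block vs. NFS, thin vs. preallocated) matter less than the delete-while-locked timing, per Greg's follow-up note above.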
Comment 6
Aharon Canan

Issue reproduced on vt16.1 (rhevm-3.5.4-1.1.el6ev.noarch) using the comment #3 steps. Screenshot and logs attached.

Comment 7
Greg Padgett

(In reply to Aharon Canan from comment #6)
> Issue reproduced on vt16.1 (rhevm-3.5.4-1.1.el6ev.noarch) using comment #3
> steps
>
> screenshot and logs attached.

Hi Aharon, I see several communication errors (non-responsive host) in the engine log and some storage-related errors in the vdsm log, which leads me to a couple of questions:

1) Did the storage come back up as expected after the hosts were up?
2) Did you attempt to remove the snapshot again after the host was back up?

I didn't emphasize it much in the steps to reproduce, but the original issue left the snapshots in a state where subsequent removal after failure was impossible. There are some cases (this may be one) where the deletion fails, but it /should/ allow you to remove it after a retry--this is the expected behavior. Knowing more about the test would help determine whether this is truly a bug or an unfortunate but expected failure case. Thanks.

Comment 8
Aharon Canan

(In reply to Greg Padgett from comment #7)
> 1) Did the storage come back up as expected after the hosts were up?

Yes

> 2) Did you attempt to remove the snapshot again after the host was back up?

Yes

> [...]

Let me know if you want me to try it again.

Greg Padgett

(In reply to Aharon Canan from comment #8)
> > 1) Did the storage come back up as expected after the hosts were up?
> Yes
> > 2) Did you attempt to remove the snapshot again after the host was back up?
> Yes
> [...]
> Let me know if you want me to try it again.

Thanks. It sounds like there's a fair chance this is something I haven't seen before, but the prior logs didn't have quite enough for me to go on. It would be great if you could reproduce it and provide:

- steps/details (including number of disks, snapshots, storage type, etc.)
- engine log
- host log
- engine db dump; OR point me to the environment where I can poke around a little

That should be enough to get started.

Following comments #12 and #13, verified.

RHEV 3.5.4 released. Closing current release.