Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1978672

Summary: VMs with block based storage do not recover from hibernation (suspend)
Product: Red Hat Enterprise Virtualization Manager
Reporter: Frank DeLorey <fdelorey>
Component: vdsm
Assignee: Liran Rotenberg <lrotenbe>
Status: CLOSED ERRATA
QA Contact: Tamir <tamir>
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.4.6
CC: ahadas, aromito, ddacosta, dfodor, eshames, lrotenbe, lsurette, mavital, michal.skrivanek, mzamazal, sfishbai, srevivo, tamir, ycui
Target Milestone: ovirt-4.4.9
Keywords: ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: vdsm-4.40.90.2
Doc Type: Bug Fix
Doc Text:
Previously, virtual machines failed to restore after hibernation when the memory dump resided on block-based storage. In the current release, the data is written as raw data, allowing the virtual machine restore to succeed.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-11-16 15:12:47 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Virt
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1417161

Description Frank DeLorey 2021-07-02 13:09:58 UTC
Description of problem:
If a VM with block-based storage is put into suspend mode (hibernation), it fails to resume when it is run again.

Version-Release number of selected component (if applicable):
RHV 4.4.6

How reproducible:
Every time

Steps to Reproduce:
1. Select a VM with block-based storage and suspend it.
2. Select Run to bring the VM out of hibernation.
3. The VM fails to exit hibernation.
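For reference, the same suspend/run cycle can be driven through the oVirt Python SDK instead of the Admin Portal. This is a minimal sketch, assuming an engine at engine.example.com and a VM named myvm (the URL, credentials, and VM name are all placeholders):

```python
# Hypothetical sketch using the oVirt Python SDK (ovirt-engine-sdk-python).
# Engine URL, credentials, and VM name are placeholders for a real deployment.
import time

import ovirtsdk4 as sdk
import ovirtsdk4.types as types

connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',
    username='admin@internal',
    password='secret',
    insecure=True,  # use ca_file=... in a real deployment
)
vms_service = connection.system_service().vms_service()
vm = vms_service.list(search='name=myvm')[0]
vm_service = vms_service.vm_service(vm.id)

# Hibernate: the memory dump is written to a storage domain volume.
vm_service.suspend()
while vms_service.list(search='name=myvm')[0].status != types.VmStatus.SUSPENDED:
    time.sleep(5)

# Resume; with the bug, a VM whose dump is on block storage stalls here.
vm_service.start()
connection.close()
```

The SDK's suspend action maps to the REST API's POST /vms/{id}/suspend; polling the VM status is needed because both actions are asynchronous.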

Actual results:
The VM fails to exit suspension 100% of the time. It must be rebooted or shut down to come back up, and even then its state in the UI never reaches Up; it stays at "Powering Up" or "Rebooting".

Expected results:

The VM should resume without any failures.

Additional info:

This works as expected with file-based storage. The customer also states that this was working in RHV 4.4.5 but stopped working after upgrading to RHV 4.4.6.

Comment 5 Arik 2021-07-05 10:56:07 UTC
Same as bz 1733804 and bz 1708031, which were supposed to be resolved by the fix for bz 1503468, but the latter was closed as WORKSFORME.
I managed to reproduce it as well.

Comment 6 Arik 2021-07-05 11:10:10 UTC
It seems to work for snapshots with memory, even though the handling of the memory volumes should be the same.

Comment 11 Arik 2021-07-15 06:54:39 UTC
How about creating a loopback device with --sizelimit in case the memory dump resides on a block device when restoring the memory?
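Such a loopback device might be set up roughly as follows. This is a hedged sketch, not the actual VDSM change: the LV path and dump size are placeholders, and virsh restore stands in for VDSM's internal restore path. The idea is to cap the device at the real dump size so the restore code never reads the trailing padding that LVM extent alignment adds to block volumes:

```shell
# Hypothetical sketch: expose only the memory-dump region of a block volume.
DUMP_DEV=/dev/vg_storage/memdump_lv   # placeholder: LV holding the memory dump
DUMP_SIZE=$((2 * 1024 * 1024 * 1024)) # placeholder: actual dump size in bytes

# Attach a loop device limited to the dump size (--sizelimit caps reads).
LOOP=$(losetup --find --show --sizelimit "$DUMP_SIZE" "$DUMP_DEV")

# Restore the VM state from the size-capped device.
virsh restore "$LOOP"

# Detach the loop device when done.
losetup -d "$LOOP"
```

These commands require root and a real logical volume, so they are only illustrative of the --sizelimit idea raised above.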

Comment 15 Tamir 2021-10-14 15:12:23 UTC
Verified on RHV 4.4.9-4. All looks good to me.

Env:
  - Engine instance with RHV 4.4.9-4 (ovirt-engine-4.4.9.1-0.13.el8ev) and RHEL 8.5 installed.
  - 3 Hosts with RHV 4.4.9-4 and RHEL 8.5, vdsm-4.40.90.2-1.el8ev, ovirt-engine-4.4.9.1-0.13.el8ev.

Steps:

In Admin Portal:

1. Create a 4.6 data center and a 4.6 cluster.
2. Install the hosts.
3. Add NFS, iSCSI and GlusterFS storage domains.
4. Create 3 RHEL 8.5 VMs, each with a bootable disk in one of those storage domains.
5. Run the VMs.
6. Suspend the VMs.
7. Run the VMs.

Results (As Expected):
1. The 4.6 data center and cluster were created.
2. The hosts were installed.
3. The NFS, iSCSI and GlusterFS storage domains were added.
4. The 3 RHEL 8.5 VMs were created.
5. The VMs ran.
6. The VMs were suspended.
7. The VMs ran successfully.

Snapshot test:

Setup: shutdown the VMs.

1. Run the iSCSI VM.
2. Create a snapshot with all disks.
3. Stop the VM.
4. Add 2 more iSCSI disks.
5. Start the VM.
6. Snapshot it with all disks.
7. Stop the VM.
8. Preview the first snapshot.
9. Run the VM (with all the disks or only part of them; I checked both cases).
10. Suspend the VM.
11. Run the VM.

Results:
In steps 1, 5, 9, and 11, the VM runs correctly without any error.
In steps 3 and 7, the VM is stopped.
In steps 2 and 6, the snapshot is created.
In step 4, the disks are added.
In step 8, the first snapshot is previewed.
In step 10, the VM is suspended without any errors.

Comment 16 Arik 2021-10-14 21:26:00 UTC
Tamir, only steps 10 and 11 are relevant in this context, and what we are most interested in is where the memory dump volume is stored.
We need to make sure that when suspending a VM whose memory dump volume resides on any of the three storages you have (especially iSCSI, since we had a problem on block storage before), we are able to resume from suspension properly (not only that the VM runs, but also that the memory is restored).
Can you please make sure that is covered?

Comment 17 Tamir 2021-10-19 14:54:43 UTC
Verified on RHV 4.4.9-5. Thanks for the comment, Arik.

Env:
  - Engine instance with RHV 4.4.9-5 (ovirt-engine-4.4.9.2-0.6.el8ev) and RHEL 8.5 installed.
  - 3 Hosts with RHV 4.4.9-5 and RHEL 8.5, vdsm-4.40.90.2-1.el8ev, ovirt-engine-4.4.9.2-0.6.el8ev.

Steps:

In Admin Portal:

1. Create a 4.6 data center and a 4.6 cluster.
2. Install the hosts.
3. Add NFS, iSCSI and GlusterFS storage domains.
4. Create 3 RHEL 8.5 VMs, each with a bootable disk in one of those storage domains.
5. Run the VMs.
6. Create a mount dir (mkdir /mnt/ramdisk).
7. Mount ramfs onto the mount dir (mount -t ramfs -o size=20m ramfs /mnt/ramdisk).
8. Create a small file with content in the mount dir.
9. Run Firefox and open 3 different tabs.
10. Suspend the VMs.
11. Run the VMs.
12. Check that Firefox is still open with those 3 tabs.
13. Check that the file contents match those from step 8.

* The NFS and GlusterFS are tested for regression purposes.
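Steps 6-8 and 13 inside the guest can be sketched as follows. The marker contents and hash-file path are illustrative; the hash file is kept on disk so the comparison does not itself depend on RAM surviving:

```shell
# Inside the guest, before suspending (illustrative paths and contents):
mkdir -p /mnt/ramdisk
mount -t ramfs -o size=20m ramfs /mnt/ramdisk
echo "pre-suspend marker" > /mnt/ramdisk/marker
# Record a reference checksum on persistent storage.
sha256sum /mnt/ramdisk/marker > /root/marker.sha256

# ... suspend and resume the VM from the Admin Portal ...

# After resume: this check succeeds only if the RAM-backed file, and hence
# the restored guest memory, survived the suspend/resume cycle intact.
sha256sum -c /root/marker.sha256
```

The mount requires root inside the guest; ramfs (unlike tmpfs) is never swapped to disk, so the file genuinely lives in guest memory.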

Results (As Expected):
1. The 4.6 data center and cluster were created.
2. The hosts were installed.
3. The NFS, iSCSI and GlusterFS storage domains were added.
4. The 3 RHEL 8.5 VMs were created.
5. The VMs ran.
6. The mount dir was created.
7. The ramfs was mounted.
8. The file with content was created in the mount dir.
9. Firefox is running with 3 different tabs.
10. The VMs were suspended.
11. The VMs ran successfully.
12. Firefox was still open with those 3 tabs.
13. The file exists with the same data.

Snapshot test:

Setup: shutdown the VMs.

1. Run the iSCSI VM.
2. Open Firefox with 3 different tabs.
3. Create a snapshot with all disks.
4. Add 2 more iSCSI disks.
5. Snapshot it with all disks.
6. Stop the VM.
7. Preview the first snapshot.
8. Run the VM (with all the disks or only part of them; I checked both cases).
9. Open another window of Firefox with one tab.
10. Suspend the VM.
11. Run the VM.
12. Check that all the Firefox instances are open as stated before. 

Results:
In steps 1, 8, and 11, the VM runs correctly without any error.
In step 6, the VM is stopped.
In steps 3 and 5, the snapshot is created.
In step 4, the disks are added.
In step 7, the first snapshot is previewed.
In step 10, the VM is suspended without any errors.
In steps 2 and 9, a Firefox window is opened with the number of tabs specified.
In step 12, the Firefox instances are still open with the same tabs.

Comment 21 errata-xmlrpc 2021-11-16 15:12:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (RHV RHEL Host (ovirt-host) [ovirt-4.4.9]), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4704