Red Hat Bugzilla – Bug 1311762
Unable to resume a suspended instance
Last modified: 2018-02-08 06:15:43 EST
Description of problem:
Instance enters error state when resuming from suspend. The following error is seen in the nova-compute log:
libvirtError: Cannot access backing file '/var/lib/nova/instances/_base/swap_1024' of storage file '/var/lib/nova/instances/813878c9-47c0-4430-9640-bddda7fe5b10/disk.swap' (as uid:107, gid:107): No such file or directory
Confirmed that the file does not exist when this occurs. A partial workaround is to cycle the instance with nova stop/start before suspending it. The same problem also affects nova migrate in a different environment.
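To check for this condition before suspending or migrating, the backing file recorded in each instance disk can be compared against what is actually on the compute host. The following is only a minimal sketch, not part of nova: it assumes qemu-img is installed and that its JSON output carries the "backing-filename" / "full-backing-filename" keys; the helper names are hypothetical. On newer QEMU releases, inspecting the disk of a running instance may additionally require the -U (force-share) option.

```python
import json
import os
import subprocess


def parse_backing_file(info_json):
    """Extract the backing file path from `qemu-img info --output=json` output.

    Returns None when the image has no backing file (e.g. a raw image).
    """
    info = json.loads(info_json)
    return info.get("full-backing-filename") or info.get("backing-filename")


def backing_file_of(disk_path):
    """Ask qemu-img which backing file the image at disk_path refers to."""
    out = subprocess.check_output(
        ["qemu-img", "info", "--output=json", disk_path]
    )
    return parse_backing_file(out)


def missing_backing_files(disk_paths, lookup=backing_file_of,
                          exists=os.path.exists):
    """Map each disk to its backing file when that backing file is absent.

    `lookup` and `exists` are injectable so the logic can be exercised
    without qemu-img or real files.
    """
    missing = {}
    for disk in disk_paths:
        backing = lookup(disk)
        if backing and not exists(backing):
            missing[disk] = backing
    return missing
```

For the instance in this report, calling missing_backing_files() with the path to its disk.swap would be expected to report the _base/swap_1024 file whenever that file has been removed.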
How reproducible:
Every time an instance that has been up for a while is suspended or migrated.
Steps to Reproduce:
1. nova suspend <uuid>
2. nova resume <uuid>
Actual results:
Instance enters error state.
Expected results:
Instance resumes successfully.
Recreating the file manually and rebooting the compute node is a workaround. Would it be possible to get a hotfix for this? This is a severe bug from an operational point of view.
(In reply to David Hill from comment #3)
> Recreating the file manually and rebooting the compute node is a workaround.
> Would it be possible to get a hotfix for this? This is a severe bug from an
> operational point of view.
I've submitted a patch for review: https://code.engineering.redhat.com/gerrit/#/c/68594/
Were they resuming an instance that was created after the fix was applied, or one that was created before the hotfix?
The patch fixes the block device mapping of an instance, which previously did not track the ephemeral and swap disks.
Instances that were created before the fix was applied will still have the old block device mapping.
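To tell whether a given instance still carries the old mapping, its block device mapping entries can be checked for swap and ephemeral disks. The sketch below is illustrative only and is not the patch itself: it assumes the entries are available as plain dictionaries using the source_type, destination_type and guest_format fields of nova's block device mapping format, and the function names and flavor parameters are hypothetical.

```python
def is_swap(bdm):
    """True for a local blank disk formatted as swap."""
    return (bdm.get("source_type") == "blank"
            and bdm.get("destination_type") == "local"
            and bdm.get("guest_format") == "swap")


def is_ephemeral(bdm):
    """True for a local blank disk that is not swap."""
    return (bdm.get("source_type") == "blank"
            and bdm.get("destination_type") == "local"
            and bdm.get("guest_format") != "swap")


def untracked_local_disks(bdms, flavor_swap_mb, flavor_ephemeral_gb):
    """List the disk kinds the flavor provides but the mapping omits."""
    missing = []
    if flavor_swap_mb > 0 and not any(is_swap(b) for b in bdms):
        missing.append("swap")
    if flavor_ephemeral_gb > 0 and not any(is_ephemeral(b) for b in bdms):
        missing.append("ephemeral")
    return missing
```

A non-empty result for an instance whose flavor includes swap or ephemeral storage would suggest it was created before the fix and still has the old block device mapping.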
rpm is in
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.