Bug 1964475 - [OSP 13] After host reboot VM goes to error state due to nova-compute EmptyCatalog error
Summary: [OSP 13] After host reboot VM goes to error state due to nova-compute EmptyCa...
Keywords:
Status: CLOSED EOL
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 13.0 (Queens)
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: async
Target Release: 13.0 (Queens)
Assignee: Lee Yarwood
QA Contact: OSP DFG:Compute
URL:
Whiteboard:
Depends On:
Blocks: 1979850
Reported: 2021-05-25 14:57 UTC by David Hill
Modified: 2023-07-10 17:21 UTC (History)

Fixed In Version: openstack-nova-17.0.13-38.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-07-10 17:21:17 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1905701 0 None None None 2021-05-27 17:20:19 UTC
Red Hat Issue Tracker OSP-4137 0 None None None 2021-11-17 09:33:03 UTC

Description David Hill 2021-05-25 14:57:13 UTC
Description of problem:
Instances with encrypted volumes won't come back up without a hard reboot after a sysrq crash of the hypervisor, but they boot normally after a normal reboot of the hypervisor. This is an improvement over [1], but given that hosts might not always reboot cleanly and might crash or lose power, this might still be a problem that requires manual intervention to get VMs back after such a crash. I suspect libvirt writes data to disk when it is stopped normally, but that data never makes it to disk in the event of a power loss or hard crash. I filed this issue against nova as we might be missing a bind path in the container, but it might just as well be a by-design libvirt issue or a bug.

[1] https://bugzilla.redhat.com/1905017
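
To illustrate the suspicion above (data written but never flushed before a hard crash), here is a minimal sketch of what a crash-safe file write generally looks like on Linux. This is not libvirt's code; the function name write_durably and the ".tmp" suffix are hypothetical and only used for illustration.

```python
import os


def write_durably(path, data):
    """Write bytes to path so they survive a power loss (illustrative only)."""
    tmp = path + ".tmp"  # hypothetical temporary name
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()               # push Python's buffer to the kernel
        os.fsync(f.fileno())    # ask the kernel to flush the file to disk
    os.replace(tmp, path)       # atomically move the file into place

    # fsync the parent directory so the new directory entry is persisted too
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

If either fsync step is skipped, a file can exist after a clean shutdown yet be missing after a sysrq crash or power loss, which would match the symptom described here. Whether that is what actually happens in this environment is unconfirmed.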


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 David Hill 2021-05-25 15:36:16 UTC
Before the crash, we see /etc/libvirt/secrets/$UUID.${EXTS} ... after the crash, that file is gone.

If we normally reboot, that file stays.

I tried manually creating a secret using virsh in the nova_libvirt container and power-cycled the VM ... the file remained present. My next step is to try to reproduce this using nova ...
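
A small helper along these lines could make the before/after comparison repeatable. It is only a sketch: the function name missing_secret_files is hypothetical, and the default extensions ("xml" and "base64") are an assumption about what ${EXTS} expands to, so adjust them to match what is actually seen on the host.

```python
from pathlib import Path


def missing_secret_files(uuids, secrets_dir="/etc/libvirt/secrets",
                         exts=("xml", "base64")):
    """Return {uuid: [missing extensions]} for secrets lacking on-disk files.

    exts is an assumption about the ${EXTS} suffixes; change as needed.
    """
    base = Path(secrets_dir)
    missing = {}
    for uuid in uuids:
        # Collect every expected file for this secret that is not present
        absent = [ext for ext in exts
                  if not (base / f"{uuid}.{ext}").is_file()]
        if absent:
            missing[uuid] = absent
    return missing
```

Running it inside the nova_libvirt container before and after the crash, with the same list of secret UUIDs, would show exactly which files disappeared.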

Comment 2 Lee Yarwood 2021-05-25 18:03:21 UTC
(In reply to David Hill from comment #1)
> Before the crash, we see /etc/libvirt/secrets/$UUID.${EXTS} ... after the
> crash, that file is gone.
> 
> If we normally reboot, that file stays.
> 
> I tried manually creating a secret using virsh in the nova_libvirt container
> and powercycled the VM ... the file remained present.   My next step is to
> try to reproduce this using nova ...

Reproducing the removal after a sysrq crash in an OSP env would be super useful. I'll try to do some background reading on how Docker is using device-mapper in OSP 13 to see if there's something we've missed to ensure these secrets get persisted all the time.

Comment 3 David Hill 2021-05-26 15:39:44 UTC
I tried reproducing this issue with an LVM-backed cinder volume with LUKS encryption and I wasn't able to.

This is interesting. Is there a service that might be starting in their environment that would clean up /etc/libvirt/secrets?

Comment 14 Lon Hohberger 2023-03-16 10:32:39 UTC
According to our records, this should be resolved by openstack-nova-17.0.13-40.el7ost.  This build is available now.

Comment 15 Lon Hohberger 2023-07-10 17:21:17 UTC
OSP13 support officially ended on 27 June 2023.

