Bug 2175225

Summary: Several failures in encryption (and attachment?)-related cinder tests (lvm/iSCSI) with recent OSP17.1/RHEL 9.2
Product: Red Hat OpenStack
Reporter: Luigi Toscano <ltoscano>
Component: openstack-nova
Assignee: OSP DFG:Compute <osp-dfg-compute>
Status: CLOSED DUPLICATE
QA Contact: OSP DFG:Compute <osp-dfg-compute>
Severity: high
Docs Contact:
Priority: unspecified
Version: 17.1 (Wallaby)
CC: dasmith, eglynn, eharney, jhakimra, kchamart, osp-dfg-compute, sbauza, sgordon, vromanso
Target Milestone: ---
Keywords: Regression
Target Release: ---
Flags: ifrangs: needinfo? (osp-dfg-compute)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-03-10 21:35:35 UTC
Type: Bug

Description Luigi Toscano 2023-03-03 15:37:24 UTC
Description of problem:
After the switch to RHEL 9.2 as the base OS, several encryption-related volume tests started to fail consistently when cinder uses the lvm/iSCSI backend: the tests time out while trying to boot from or attach an encrypted volume.

A few sample tests that can be used as reference are:

- cinder_tempest_plugin.scenario.test_volume_encrypted.TestEncryptedCinderVolumes.test_attach_cloned_encrypted_volume
- cinder_tempest_plugin.scenario.test_volume_encrypted.TestEncryptedCinderVolumes.test_boot_cloned_encrypted_volume
- tempest.scenario.test_volume_boot_pattern.TestVolumeBootPattern.test_boot_server_from_encrypted_volume_luks
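
Any of the listed tests can be re-run individually to reproduce the failure; a minimal sketch, assuming a configured tempest workspace with cinder-tempest-plugin installed on the tester node:

# re-run one of the failing scenarios by name
tempest run --regex 'test_boot_server_from_encrypted_volume_luks'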


The only visible error on the OpenStack side is in nova-conductor.log:

Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/nova/compute/manager.py", line 2414, in _build_and_run_instance
    self.driver.spawn(context, instance, image_meta,
  File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/driver.py", line 4285, in spawn
    self._create_guest_with_network(
  File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/driver.py", line 7408, in _create_guest_with_network
    self._cleanup(
  File "/usr/lib/python3.9/site-packages/oslo_utils/excutils.py", line 227, in __exit__
    self.force_reraise()
  File "/usr/lib/python3.9/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
    raise self.value
  File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/driver.py", line 7377, in _create_guest_with_network
    guest = self._create_guest(
  File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/driver.py", line 7317, in _create_guest
    guest.launch(pause=pause)
  File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/guest.py", line 168, in launch
    LOG.exception('Error launching a defined domain with XML: %s',
  File "/usr/lib/python3.9/site-packages/oslo_utils/excutils.py", line 227, in __exit__
    self.force_reraise()
  File "/usr/lib/python3.9/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
    raise self.value
  File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/guest.py", line 165, in launch
    return self._domain.createWithFlags(flags)
  File "/usr/lib/python3.9/site-packages/eventlet/tpool.py", line 190, in doit
    result = proxy_call(self._autowrap, f, *args, **kwargs)
  File "/usr/lib/python3.9/site-packages/eventlet/tpool.py", line 148, in proxy_call
    rv = execute(f, *args, **kwargs)
  File "/usr/lib/python3.9/site-packages/eventlet/tpool.py", line 129, in execute
    six.reraise(c, e, tb)
  File "/usr/lib/python3.9/site-packages/six.py", line 709, in reraise
    raise value
  File "/usr/lib/python3.9/site-packages/eventlet/tpool.py", line 83, in tworker
    rv = meth(*args, **kwargs)
  File "/usr/lib64/python3.9/site-packages/libvirt.py", line 1409, in createWithFlags
    raise libvirtError('virDomainCreateWithFlags() failed')
libvirt.libvirtError: internal error: process exited while connecting to monitor: 2023-03-01T17:37:32.064771Z qemu-kvm: -blockdev {"driver":"host_device","filename":"/dev/disk/by-id/os-brick+dev+sda","aio":"native","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}: Could not open '/dev/disk/by-id/os-brick+dev+sda': Permission denied

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/nova/compute/manager.py", line 2237, in _do_build_and_run_instance
    self._build_and_run_instance(context, instance, image,
  File "/usr/lib/python3.9/site-packages/nova/compute/manager.py", line 2510, in _build_and_run_instance
    raise exception.RescheduledException(
nova.exception.RescheduledException: Build of instance a2ac6ba6-4554-4d97-89ca-926aa6e7ab19 was re-scheduled: internal error: process exited while connecting to monitor: 2023-03-01T17:37:32.064771Z qemu-kvm: -blockdev {"driver":"host_device","filename":"/dev/disk/by-id/os-brick+dev+sda","aio":"native","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}: Could not open '/dev/disk/by-id/os-brick+dev+sda': Permission denied
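
The same failure can be correlated on the compute node, for example by grepping the nova compute log for the instance UUID from the exception above (log path assumes the usual containerized OSP 17.1 layout):

# instance UUID taken from the RescheduledException above
grep a2ac6ba6-4554-4d97-89ca-926aa6e7ab19 /var/log/containers/nova/nova-compute.log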


On the compute node, /var/log/containers/libvirt/virtqemud.log contains the following (the timestamp differs because it comes from a separate run, but the failure is consistently reproducible):

2023-03-02 20:57:24.760+0000: 22522: info : qemuMonitorSend:861 : QEMU_MONITOR_SEND_MSG: mon=0x7f5a2804ad00 msg={"execute":"blockdev-add","arguments":{"driver":"host_device","filename":"/dev/disk/by-id/os-brick+dev+sda","aio":"native","node-name":"libvirt-3-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"},"id":"libvirt-416"} fd=-1
2023-03-02 20:57:24.761+0000: 840946: info : qemuMonitorIOWrite:366 : QEMU_MONITOR_IO_WRITE: mon=0x7f5a2804ad00 buf={"execute":"blockdev-add","arguments":{"driver":"host_device","filename":"/dev/disk/by-id/os-brick+dev+sda","aio":"native","node-name":"libvirt-3-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"},"id":"libvirt-416"} len=258 ret=258 errno=0
2023-03-02 20:57:24.764+0000: 840946: debug : qemuMonitorJSONIOProcessLine:189 : Line [{"id": "libvirt-416", "error": {"class": "GenericError", "desc": "Could not open '/dev/disk/by-id/os-brick+dev+sda': Permission denied"}}]
2023-03-02 20:57:24.764+0000: 840946: info : qemuMonitorJSONIOProcessLine:208 : QEMU_MONITOR_RECV_REPLY: mon=0x7f5a2804ad00 reply={"id": "libvirt-416", "error": {"class": "GenericError", "desc": "Could not open '/dev/disk/by-id/os-brick+dev+sda': Permission denied"}}
2023-03-02 20:57:24.765+0000: 22522: debug : qemuMonitorJSONCheckErrorFull:354 : unable to execute QEMU command {"execute":"blockdev-add","arguments":{"driver":"host_device","filename":"/dev/disk/by-id/os-brick+dev+sda","aio":"native","node-name":"libvirt-3-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"},"id":"libvirt-416"}: {"id":"libvirt-416","error":{"class":"GenericError","desc":"Could not open '/dev/disk/by-id/os-brick+dev+sda': Permission denied"}}
2023-03-02 20:57:24.765+0000: 22522: error : qemuMonitorJSONCheckErrorFull:366 : internal error: unable to execute QEMU command 'blockdev-add': Could not open '/dev/disk/by-id/os-brick+dev+sda': Permission denied
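
Since qemu-kvm is denied access to the device node handed over by os-brick, capturing its ownership and SELinux label on the compute host while the attach is being retried may help with triage; a possible check (assuming the by-id symlink still exists at that point):

# -Z also prints the SELinux context; the second command checks the resolved device node
ls -lZ /dev/disk/by-id/os-brick+dev+sda
ls -lZ "$(readlink -f /dev/disk/by-id/os-brick+dev+sda)"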




Version-Release number of selected component (if applicable):
libvirt-daemon-9.0.0-5.el9.x86_64
openstack-nova-*-23.2.3-1.20230208100952.797c5fa.el9ost.noarch
python3-os-brick-4.3.4-1.20230128060810.cf69f92.el9ost.noarch
openstack-cinder-18.2.2-1.20230214171250.83c60e3.el9ost.noarch

containers: 17.1_20230216.1
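
For reference, the package versions above can be re-collected on the affected nodes with something along the lines of:

# list the relevant virt/storage packages and their versions
rpm -qa | grep -E 'libvirt-daemon|openstack-nova|os-brick|openstack-cinder'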

Comment 2 Luigi Toscano 2023-03-03 16:44:24 UTC
It was pointed out to me that this may be a duplicate of bug 2168966 (whose description may need to be made public); apologies for not searching more thoroughly.

Comment 3 Luigi Toscano 2023-03-10 21:35:35 UTC

*** This bug has been marked as a duplicate of bug 2168966 ***