Bug 2084193

Summary: Volume backed live migration failing with TLS-e due to Cannot load certificate '/etc/pki/qemu/server-cert.pem'
Product: Red Hat OpenStack Reporter: James Parker <jparker>
Component: puppet-tripleo Assignee: Bogdan Dobrelya <bdobreli>
Status: CLOSED ERRATA QA Contact: James Parker <jparker>
Severity: high Docs Contact:
Priority: high    
Version: 16.2 (Train) CC: alee, alifshit, bdobreli, bshephar, jjoyce, jschluet, mburns, slinaber, spower, stchen, tvignaud
Target Milestone: z3 Keywords: Triaged
Target Release: 16.2 (Train on RHEL 8.4) Flags: bdobreli: needinfo-
bdobreli: needinfo-
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: puppet-tripleo-11.7.0-2.20220405015038.el8ost Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-06-22 16:07:20 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Logs associated with instance when migration fails none

Description James Parker 2022-05-11 15:56:30 UTC
Created attachment 1878694
Logs associated with instance when migration fails

Description of problem: Volume-backed live migration is failing in the 16.2 TLS-e CI job with the following error:

2022-05-10 23:29:20.224 7 ERROR nova.virt.libvirt.driver [-] [instance: 09dbc268-f2c8-453a-9417-826e77cd7b76] Live Migration failure: internal error: unable to execute QEMU command 'object-add': Cannot load certificate '/etc/pki/qemu/server-cert.pem' & key '/etc/pki/qemu/server-key.pem': Error while reading file.: libvirt.libvirtError: internal error: unable to execute QEMU command 'object-add': Cannot load certificate '/etc/pki/qemu/server-cert.pem' & key '/etc/pki/qemu/server-key.pem': Error while reading file.
2022-05-10 23:29:20.227 7 DEBUG nova.virt.libvirt.driver [-] [instance: 09dbc268-f2c8-453a-9417-826e77cd7b76] Migration operation thread notification thread_finished /usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py:9358
2022-05-10 23:29:20.352 15 INFO neutron.wsgi [req-41bfaaff-ed17-4014-81ae-3e271bc2a6ff 857a971e4a0a48e58ebac05e67cb678f fcae9e831eb54cf4a0a8edcf6835f6d5 - default default] 172.17.1.93,::1 "GET /v2.0/ports?device_id=09dbc268-f2c8-453a-9417-826e77cd7b76 HTTP/1.1" status: 200  len: 1125 time: 0.2093558
2022-05-10 23:29:20.420 12 INFO nova.api.openstack.requestlog [req-38fe1089-2845-41a8-8827-004c4488301c 857a971e4a0a48e58ebac05e67cb678f fcae9e831eb54cf4a0a8edcf6835f6d5 - default default] 172.17.1.30 "GET /v2.1/servers/09dbc268-f2c8-453a-9417-826e77cd7b76" status: 200 len: 1578 microversion: 2.25 time: 0.385653
2022-05-10 23:29:20.442 7 DEBUG nova.virt.libvirt.migration [-] [instance: 09dbc268-f2c8-453a-9417-826e77cd7b76] VM running on src, migration failed _log /usr/lib/python3.6/site-packages/nova/virt/libvirt/migration.py:427
2022-05-10 23:29:20.443 7 DEBUG nova.virt.libvirt.driver [-] [instance: 09dbc268-f2c8-453a-9417-826e77cd7b76] Fixed incorrect job type to be 4 _live_migration_monitor /usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py:9172
2022-05-10 23:29:20.443 7 ERROR nova.virt.libvirt.driver [-] [instance: 09dbc268-f2c8-453a-9417-826e77cd7b76] Migration operation has aborted
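
When triaging an "Error while reading file" failure like the one above, a first step is to confirm whether the certificate and key named in the message exist and are readable. The following is a minimal Python sketch, not part of the original report: it pulls the two paths out of the libvirt error text and checks each one. The helper names (extract_cert_paths, describe_path) are hypothetical, and the check is only meaningful if it runs in the same context (container and user) as the QEMU process, which is an assumption about the deployment. The report does not state the root cause, so this only narrows down the symptom.

```python
import os
import re

# Message text copied from the nova-compute log excerpt above.
ERROR_MESSAGE = (
    "internal error: unable to execute QEMU command 'object-add': "
    "Cannot load certificate '/etc/pki/qemu/server-cert.pem' & key "
    "'/etc/pki/qemu/server-key.pem': Error while reading file."
)

# Matches the certificate and key paths quoted in the error message.
_CERT_RE = re.compile(r"Cannot load certificate '([^']+)' & key '([^']+)'")


def extract_cert_paths(message):
    """Return (cert_path, key_path) named in the error, or None if absent."""
    match = _CERT_RE.search(message)
    return match.groups() if match else None


def describe_path(path):
    """Report whether a path exists and is readable by the current user."""
    if not os.path.exists(path):
        return "missing"
    if not os.access(path, os.R_OK):
        return "not readable"
    return "readable"


if __name__ == "__main__":
    # Hypothetical usage: run where the QEMU process runs (assumption).
    for path in extract_cert_paths(ERROR_MESSAGE) or ():
        print(f"{path}: {describe_path(path)}")
```

A "missing" result would point toward the files not being present in that context, while "not readable" would point toward ownership or permissions; either way the result is a hint rather than a diagnosis.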



Version-Release number of selected component (if applicable):
RHOS-16.2-RHEL-8-20220427.n.3

How reproducible:
100%

Steps to Reproduce:
1. Deploy a TLS-e environment with the above puddle
2. Execute tempest tests:
 a. tempest.api.compute.admin.test_live_migration.LiveAutoBlockMigrationV225Test.test_volume_backed_live_migration
 b. tempest.api.compute.admin.test_live_migration.LiveMigrationTest.test_volume_backed_live_migration
3.
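
For reproducing outside the CI job, the two tests listed in step 2 can be selected with a single regular expression passed to `tempest run --regex`. The sketch below only builds the command line. The helper name build_tempest_command is hypothetical, and the CI job may invoke tempest differently, so treat it as an illustration under those assumptions rather than the job's actual invocation.

```python
import re
import shlex

# Test identifiers copied from step 2 of the reproduction steps.
TESTS = [
    "tempest.api.compute.admin.test_live_migration."
    "LiveAutoBlockMigrationV225Test.test_volume_backed_live_migration",
    "tempest.api.compute.admin.test_live_migration."
    "LiveMigrationTest.test_volume_backed_live_migration",
]


def build_tempest_command(tests):
    """Build a 'tempest run' argument list selecting exactly these tests."""
    # Escape each name so dots match literally, then join as alternatives.
    pattern = "|".join(re.escape(name) for name in tests)
    return ["tempest", "run", "--regex", f"({pattern})"]


if __name__ == "__main__":
    # Print a shell-safe command line; it is not executed here.
    print(shlex.join(build_tempest_command(TESTS)))
```

The resulting list can be handed to subprocess.run from within a configured tempest workspace.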

Actual results:
Live migration fails

Expected results:
Live migration should succeed


Additional info:
Logs associated with one of the failures are attached.

1. Build info: https://rhos-ci-staging-jenkins.lab.eng.tlv2.redhat.com/job/DFG-all-unified-16.2_director-rhel-virthost-3cont_2comp_3ceph_1freeipa-ipv4-geneve-ceph-nfs-ganesha-default/14/
2. Test Report: https://rhos-ci-staging-jenkins.lab.eng.tlv2.redhat.com/job/DFG-all-unified-16.2_director-rhel-virthost-3cont_2comp_3ceph_1freeipa-ipv4-geneve-ceph-nfs-ganesha-default/14/testReport/tempest.api.compute.admin.test_live_migration/LiveMigrationTest/test_volume_backed_live_migration_id_5071cf17_3004_4257_ae61_73a84e28badd_volume_/

Comment 2 Artom Lifshitz 2022-05-16 14:55:40 UTC
The downstream package that includes the fix is puppet-tripleo-11.7.0-2.20220405015037.el8ost. Can you retry and report back if the same problem persists?

Comment 3 Artom Lifshitz 2022-05-16 15:51:05 UTC
"Edit": most likely dupe of https://bugzilla.redhat.com/show_bug.cgi?id=2079767

Comment 7 Bogdan Dobrelya 2022-05-17 13:28:31 UTC
*** Bug 2084208 has been marked as a duplicate of this bug. ***

Comment 8 Bogdan Dobrelya 2022-05-17 13:57:20 UTC
This is highly likely a blocker

Comment 19 errata-xmlrpc 2022-06-22 16:07:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 16.2.3 (Train)), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:4793