Description of problem:
Deployment of hosted engine fails with restored backup from hosted engine 4.3.9 when CA renewal is selected

Version-Release number of selected component (if applicable):
4.4.0

How reproducible:
Always

Steps to Reproduce:
1. Create backup of production hosted engine
   engine-backup --scope=all --mode=backup --file=/backups/migration-4.4/backup.bck --log=/backups/migration-4.4/backuplog.log
2. Clean install of host with oVirt Node 4.4.0 release ISO
3. hosted-engine --deploy --restore-from-file=/root/backup.bck, select yes when asked for CA renewal

Actual results:
Deploy fails:
A log file in /var/log/ovirt-engine/setup in the running but unfinished VM shows:

2020-05-27 00:17:09,660+0200 DEBUG otopi.context context._executeMethod:145 method exception
Traceback (most recent call last):
  File "/usr/lib64/python3.6/site-packages/M2Crypto/BIO.py", line 279, in openfile
    f = open(filename, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/etc/pki/ovirt-engine/qemu-ca.pem'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/otopi/context.py", line 132, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/pki/ca.py", line 699, in _miscUpgrade
    if self._expired(self._x509_load_cert(ca_file)):
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/pki/ca.py", line 94, in _x509_load_cert
    res = X509.load_cert(f)
  File "/usr/lib64/python3.6/site-packages/M2Crypto/X509.py", line 802, in load_cert
    with BIO.openfile(file) as bio:
  File "/usr/lib64/python3.6/site-packages/M2Crypto/BIO.py", line 281, in openfile
    raise BIOError(ex.args)
M2Crypto.BIO.BIOError: (2, 'No such file or directory')
2020-05-27 00:17:09,663+0200 ERROR otopi.context context._executeMethod:154 Failed to execute stage 'Misc configuration': (2, 'No such file or directory')

Expected results:
Setup completes.

Additional info:
CA in production HE is valid.
The documentation text flag should only be set after 'doc text' field is provided. Please provide the documentation text and set the flag to '?' again.
do we want to do the renewal during restore? we can also just leave that as a post-restore task if it's complicated...
Renewal was not strictly necessary in this case (this was more a case of "Why not while we're at it .."). I had this issue on the users mailing list and Didi asked me to raise this here.
(In reply to Michal Skrivanek from comment #2)
> do we want to do the renewal during restore? we can also just leave that as
> a post-restore task if it's complicated...

We can. The main reason for combining them is that both upgrade and PKI renewal require significant downtime, and we assume that many people would prefer a single such downtime instead of two. See also bug 1648190.

(In reply to Oliver Leinfelder from comment #3)
> Renewal was not strictly necessary in this case (this was more a case of
> "Why not while we're at it .."). I had this issue on the users mailing list
> and Didi asked me to raise this here.

Indeed. Thanks!

The problem seems to be around our new code for the qemu CA, affecting only 4.4. I haven't tried to reproduce it myself yet.
Didi isn't this a duplicate of bug #1841203 ?
(In reply to Sandro Bonazzola from comment #5)
> Didi isn't this a duplicate of bug #1841203 ?

I don't think they are related.

It seems to have been introduced by a patch for bug 1739557 [1]. It adds code to handle the qemu-ca authority:

- A method _create_qemu_ca, with name=QEMU_CA_AVAILABLE, that creates the CA

- Some other stuff, including generalizing the code handling the engine's CA, and in particular, handling renewal of qemu-ca as well, in the existing method _miscUpgrade.

_miscUpgrade was changed to run before=QEMU_CA_AVAILABLE. This means that we try to upgrade before we create. So if we do need to upgrade, we fail.

Milan - do you remember why you added this before=QEMU_CA_AVAILABLE? If it's needed, we should add a small test to not fail if the file is missing, something like "os.path.exists(ca_file) and ...". Otherwise, perhaps we should run them in the opposite order, changing to "after=QEMU_CA_AVAILABLE".

[1] https://gerrit.ovirt.org/104240
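To make the "os.path.exists(ca_file) and ..." idea concrete, here is a minimal standalone sketch. It is not the actual ca.py code: the function name ca_needs_renewal and the is_expired callable are hypothetical stand-ins for the self._expired(self._x509_load_cert(ca_file)) check seen in the traceback, and the files are placeholders rather than real certificates.

```python
import os
import tempfile


def ca_needs_renewal(ca_file, is_expired):
    """Return True only if ca_file exists and is_expired(ca_file) is true.

    is_expired is a hypothetical callable standing in for the real
    certificate check; it is never called for a missing file, so a CA
    that has not been created yet cannot raise "No such file or directory".
    """
    return os.path.exists(ca_file) and is_expired(ca_file)


# Demonstration with placeholder files (assumed names, not real certificates).
with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, "qemu-ca.pem")   # never created
    present = os.path.join(tmp, "ca.pem")
    with open(present, "w") as f:
        f.write("placeholder, not a real certificate\n")

    def always_expired(path):
        # Pretend every existing certificate is expired.
        return True

    missing_result = ca_needs_renewal(missing, always_expired)
    present_result = ca_needs_renewal(present, always_expired)

print(missing_result, present_result)
```

With such a guard, a missing qemu-ca.pem is simply skipped during the upgrade step and left for the creation step to handle.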
(In reply to Yedidyah Bar David from comment #6)
> Milan - do you remember why you added this before=QEMU_CA_AVAILABLE?

I think I just followed what happens to CA_AVAILABLE. I tried to handle the QEMU CA in a similar way as the primary CA. That may be right or wrong in this case; IIRC I didn't have any extra reason beyond copy&paste to put the thing at that particular place.
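For illustration only, the effect of before= versus after= on the two methods discussed here can be modelled with a toy scheduler. This is not how otopi orders its methods internally. The stage_order helper and the tuple format are assumptions made for this sketch, and it uses graphlib, which needs Python 3.9 or later (the setup code in the traceback runs on Python 3.6).

```python
from graphlib import TopologicalSorter


def stage_order(constraints):
    """Order stage names from (method, relation, other) tuples.

    relation is "before" or "after". This is a toy model of ordering
    constraints, not otopi's real scheduler.
    """
    sorter = TopologicalSorter()
    for method, relation, other in constraints:
        if relation == "before":
            # method runs first, so other depends on method
            sorter.add(other, method)
        elif relation == "after":
            # other runs first, so method depends on other
            sorter.add(method, other)
        else:
            raise ValueError(f"unknown relation: {relation!r}")
    return list(sorter.static_order())


# Ordering described in the bug: renewal check runs before the CA is created.
current = stage_order([("_miscUpgrade", "before", "QEMU_CA_AVAILABLE")])

# Proposed alternative: create the CA first, then run the renewal check.
proposed = stage_order([("_miscUpgrade", "after", "QEMU_CA_AVAILABLE")])

print(current)
print(proposed)
```

In the first ordering the renewal check looks for qemu-ca.pem before anything has created it, which matches the failure in the log; in the second the file would already exist by the time it is checked.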
Blocked by the currently outdated rhvm-appliance-4.4-20200722.0.el8ev.x86_64.rpm appliance. We will have to wait for the latest bits to arrive.
Upgrade worked for me with CA renewal as expected on these components:
ovirt-engine-setup-4.4.2.3-0.6.el8ev.noarch
ovirt-ansible-hosted-engine-setup-1.1.8-1.el8ev.noarch
ovirt-hosted-engine-setup-2.4.6-1.el8ev.noarch
ovirt-hosted-engine-ha-2.4.4-1.el8ev.noarch
Red Hat Enterprise Linux release 8.2 (Ootpa)
Linux 4.18.0-193.14.3.el8_2.x86_64 #1 SMP Mon Jul 20 15:02:29 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
This bugzilla is included in oVirt 4.4.2 release, published on September 17th 2020. Since the problem described in this bug report should be resolved in oVirt 4.4.2 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.