Description of problem:
Deployment of hosted engine fails with restored backup from hosted engine 4.3.9 when CA renewal is selected

Version-Release number of selected component (if applicable):
4.4.0

How reproducible:
Always

Steps to Reproduce:
1. Create backup of production hosted engine
   engine-backup --scope=all --mode=backup --file=/backups/migration-4.4/backup.bck --log=/backups/migration-4.4/backuplog.log
2. Clean install of host with oVirt Node 4.4.0 release ISO
3. hosted-engine --deploy --restore-from-file=/root/backup.bck

Actual results:
Deploy fails with the following error:

[ ERROR ] ovirtsdk4.AuthError: Error during SSO authentication server_error : PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
[ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 50, "changed": false, "msg": "Error during SSO authentication server_error : PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target"}

/var/log/engine.log in the HE shows:

2020-05-27 16:10:43,695+02 ERROR [org.ovirt.engine.core.sso.utils.SsoUtils] (default task-8) [] OAuthException server_error: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
2020-05-27 16:10:53,962+02 INFO [org.ovirt.engine.extension.aaa.jdbc.core.Authentication] (default task-8) [] locking user: admin due to interval failures
2020-05-27 16:10:58,956+02 ERROR [org.ovirt.engine.core.sso.utils.SsoUtils] (default task-8) [] OAuthException server_error: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
2020-05-27 16:11:09,222+02 INFO [org.ovirt.engine.extension.aaa.jdbc.core.Authentication] (default task-8) [] locking user: admin due to interval failures
2020-05-27 16:11:14,217+02 ERROR [org.ovirt.engine.core.sso.utils.SsoUtils] (default task-8) [] OAuthException server_error: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
2020-05-27 16:11:24,484+02 INFO [org.ovirt.engine.extension.aaa.jdbc.core.Authentication] (default task-8) [] locking user: admin due to interval failures
2020-05-27 16:11:29,480+02 ERROR [org.ovirt.engine.core.sso.utils.SsoUtils] (default task-8) [] OAuthException server_error: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

Expected results:
Setup completes.
Additional info: Possibly relevant: Production hosted engine uses user supplied certificate for the web server (and only for the web service, other certs/CA were generated by oVirt).
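To check whether another engine.log shows the same failure signature as the excerpt above, a small shell sketch can count the relevant lines. The function names and the way the log path is passed in are assumptions for illustration and not part of oVirt; only the matched strings are taken from the log in this report.

```shell
#!/bin/sh
# Hypothetical helper (not part of oVirt): count the lines in an engine log
# that carry the SSO failure signature quoted in this report.
count_pkix_failures() {
    log_file="$1"
    # grep -c prints the number of matching lines. It exits non-zero when the
    # count is 0, so "|| true" keeps the function usable under "set -e".
    grep -c 'PKIX path building failed' "$log_file" || true
}

# Hypothetical helper: count how often the admin account was locked as a
# side effect of the repeated failed logins.
count_account_locks() {
    log_file="$1"
    grep -c 'locking user: admin' "$log_file" || true
}
```

Calling `count_pkix_failures /path/to/engine.log` (path to be filled in for the setup at hand) prints a single number; a non-zero count together with matching account locks suggests the same symptom as described here, though not necessarily the same cause.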
The documentation text flag should only be set after 'doc text' field is provided. Please provide the documentation text and set the flag to '?' again.
sounds like bug 1816648. Can you doublecheck the restore script you've used has that fix, and that you are running postgres 12?
Just to be sure there is no mixup: This is about a hosted engine deployment on a clean oVirt node installation (based on 4.4.0 release ISO from ovirt.org) so postgres was never installed directly by me.
Understood. It still needs to be enabled by dnf module enable postgresql:12 first, before you start the installation
I'm still at a loss here, I'm sorry. Wouldn't that be the job of some script (presumably Ansible) during deployment, since postgres is inside the VM, which is being built and installed with minimal interaction from me?
Test Version:
4.3.9 build: ovirt-node-ng-installer-4.3.9-2020031917.el7.iso
             ovirt-engine-appliance-4.3-20200319.1.el7.x86_64.rpm
4.4.0 build: ovirt-node-ng-installer-4.4.0-2020052110.el8.iso
             ovirt-engine-appliance-4.4-20200520111649.1.el8.x86_64.rpm

Test Steps:
According to comment 0

Test Result:
Deployment of hosted engine succeeds with a restored backup from hosted engine 4.3.9 if the reply is "No" to the question "Renew engine CA on restore if needed? Please notice that if you choose Yes, all hosts will have to be later manually reinstalled from the engine.".

[ INFO ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20200606002713.conf'
[ INFO ] Generating answer file '/etc/ovirt-hosted-engine/answers.conf'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ INFO ] Hosted Engine successfully deployed
[ INFO ] Other hosted-engine hosts have to be reinstalled in order to update their storage configuration. From the engine, host by host, please set maintenance mode and then click on reinstall button ensuring you choose DEPLOY in hosted engine tab.
[ INFO ] Please note that the engine VM ssh keys have changed. Please remove the engine VM entry in ssh known_hosts on your clients.

QE cannot reproduce this issue.

oliver.leinfelder, did you perform any special operations? If so, please list the detailed steps.
My backup file comes from a hosted engine 4.3.9-1.el7.

I'm using "hosted-engine --deploy --restore-from-file=/root/backup.bck" on a clean install of oVirt Node NG 4.4.0-1.el8.

I replied "No" to the question "Renew engine CA on restore if needed? Please notice that if you choose Yes, all hosts will have to be later manually reinstalled from the engine."

The production hosted engine has its apache-ca.pem replaced by a user provided CA according to this: https://www.ovirt.org/documentation/admin-guide/appe-oVirt_and_SSL.html (and only the Apache cert, all other certs remain untouched)
> replaced by a user provided CA

Do you mean a valid CA signed by Let's Encrypt or another authority?

And does the production hosted engine have a valid domain name that can be reached from the public internet?
No, the CA is user created but imported into the host trust store (as outlined in https://www.ovirt.org/documentation/admin-guide/appe-oVirt_and_SSL.html).

The hosted engine is not reachable globally but has a certificate from our (user created) CA for its FQDN + subject alternative name (both of which are resolvable via internal DNS).
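The PKIX error in this report means that no chain could be built from the presented certificate to a trusted CA. The same relationship can be checked outside the engine with `openssl verify`. The sketch below is only an illustration under stated assumptions: the helper name is hypothetical, the certificate file that the web server actually presents has to be filled in for the setup at hand, and the check says nothing about the Java trust store that the engine itself consults.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if cert_file can be verified against the
# CA certificate(s) in ca_file.
# Note: depending on the OpenSSL build, default system trust locations may
# be consulted in addition to ca_file.
cert_chains_to_ca() {
    ca_file="$1"
    cert_file="$2"
    # "openssl verify" exits non-zero when no valid path is found.
    openssl verify -CAfile "$ca_file" "$cert_file" >/dev/null 2>&1
}
```

With the custom CA described here, the first argument would be the user-created CA file (for example /etc/pki/ovirt-engine/apache-ca.pem, as named elsewhere in this report) and the second the certificate issued for the engine FQDN.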
Test Version:
4.3.9 build: ovirt-node-ng-installer-4.3.9-2020031917.el7.iso
             ovirt-engine-appliance-4.3-20200319.1.el7.x86_64.rpm
4.4.0 build: ovirt-node-ng-installer-4.4.0-2020052110.el8.iso
             ovirt-engine-appliance-4.4-20200520111649.1.el8.x86_64.rpm

Test according to comment 0 with a third-party CA certificate according to this:
https://www.ovirt.org/documentation/admin-guide/appe-oVirt_and_SSL.html
(and only Apache cert, all other certs remain untouched)

Result:
hosted engine restore failed with a similar issue

~~~~
engine.log
~~~~
2020-06-11 14:08:09,738+08 ERROR [org.ovirt.engine.core.aaa.filters.SsoRestApiAuthFilter] (default task-1) [] Cannot authenticate using authentication Headers: server_error: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
2020-06-11 14:08:09,764+08 ERROR [org.ovirt.engine.core.sso.utils.SsoUtils] (default task-1) [] OAuthException access_denied: Cannot authenticate user 'None@N/A': No valid profile found in credentials..

Is this the same issue as in comment 0? If so, there is a d/s bug, https://bugzilla.redhat.com/show_bug.cgi?id=1715767, similar to this one. It was fixed as a documentation issue: https://bugzilla.redhat.com/show_bug.cgi?id=1744522.
Created attachment 1696671 [details] he restore with self-ca logs
Please let us know if documentation at https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.3/html-single/administration_guide/index?lb_target=production#Replacing_the_Manager_CA_Certificate or equivalent: https://ovirt.org/documentation/admin-guide/appe-oVirt_and_SSL.html solves the issue.
(In reply to Sandro Bonazzola from comment #12)
> Please let us know if documentation at
> https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.3/
> html-single/administration_guide/
> index?lb_target=production#Replacing_the_Manager_CA_Certificate

Notably, step 14 there, which makes engine-backup include the custom cert.

> or equivalent:
> https://ovirt.org/documentation/admin-guide/appe-oVirt_and_SSL.html
> solves the issue.

That's an old, broken link. The correct one is:

https://ovirt.org/documentation/administration_guide/#Replacing_the_Manager_CA_Certificate

(In reply to Wei Wang from comment #10)
> Test Version:
> 4.3.9 build: ovirt-node-ng-installer-4.3.9-2020031917.el7.iso
> ovirt-engine-appliance-4.3-20200319.1.el7.x86_64.rpm
> 4.4.0 build: ovirt-node-ng-installer-4.4.0-2020052110.el8.iso
> ovirt-engine-appliance-4.4-20200520111649.1.el8.x86_64.rpm
>
> Test according to comment 0 with a third-party CA certificate according to
> this:
> https://www.ovirt.org/documentation/admin-guide/appe-oVirt_and_SSL.html (and
> only Apache cert, all other certs remain untouched)
>
> Result:
> hosted engine restore failed at a similar issue

Wei, can you please attach also the backup file generated by engine-backup? Thanks. If you still have the machines, I'd like to have a look at them.

>
> ~~~~
> engine.log
> ~~~~
> 2020-06-11 14:08:09,738+08 ERROR
> [org.ovirt.engine.core.aaa.filters.SsoRestApiAuthFilter] (default task-1) []
> Cannot authenticate using authentication Headers: server_error: PKIX path
> building failed: sun.security.provider.certpath.SunCertPathBuilderException:
> unable to find valid certification path to requested target
> 2020-06-11 14:08:09,764+08 ERROR [org.ovirt.engine.core.sso.utils.SsoUtils]
> (default task-1) [] OAuthException access_denied: Cannot authenticate user
> 'None@N/A': No valid profile found in credentials..
>
> Is this issue same with comment 0? If so, There is a d/s bug
> https://bugzilla.redhat.com/show_bug.cgi?id=1715767, similar to this one.
> And it was fixed as a document issue
> https://bugzilla.redhat.com/show_bug.cgi?id=1744522.

Indeed, the result of this was adding step 14 that I mentioned above.
(In reply to Yedidyah Bar David from comment #13)
> (In reply to Sandro Bonazzola from comment #12)
> > Please let us know if documentation at
> > https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.3/
> > html-single/administration_guide/
> > index?lb_target=production#Replacing_the_Manager_CA_Certificate
>
> Notably, step 14 there, which makes engine-backup include the custom cert.
>
> > or equivalent:
> > https://ovirt.org/documentation/admin-guide/appe-oVirt_and_SSL.html
> > solves the issue.
>
> That's an old, broken link. The correct one is:
>
> https://ovirt.org/documentation/administration_guide/
> #Replacing_the_Manager_CA_Certificate
>
> (In reply to Wei Wang from comment #10)
> > Test Version:
> > 4.3.9 build: ovirt-node-ng-installer-4.3.9-2020031917.el7.iso
> > ovirt-engine-appliance-4.3-20200319.1.el7.x86_64.rpm
> > 4.4.0 build: ovirt-node-ng-installer-4.4.0-2020052110.el8.iso
> > ovirt-engine-appliance-4.4-20200520111649.1.el8.x86_64.rpm
> >
> > Test according to comment 0 with a third-party CA certificate according to
> > this:
> > https://www.ovirt.org/documentation/admin-guide/appe-oVirt_and_SSL.html (and
> > only Apache cert, all other certs remain untouched)
> >
> > Result:
> > hosted engine restore failed at a similar issue
>
> Wei, can you please attach also the backup file generated by engine-backup?
> Thanks. If you still have the machines, I'd like to have a look at them.
>

I have the backup file on my local computer, but the test environment is already gone. I have attached the backup file.

> >
> > ~~~~
> > engine.log
> > ~~~~
> > 2020-06-11 14:08:09,738+08 ERROR
> > [org.ovirt.engine.core.aaa.filters.SsoRestApiAuthFilter] (default task-1) []
> > Cannot authenticate using authentication Headers: server_error: PKIX path
> > building failed: sun.security.provider.certpath.SunCertPathBuilderException:
> > unable to find valid certification path to requested target
> > 2020-06-11 14:08:09,764+08 ERROR [org.ovirt.engine.core.sso.utils.SsoUtils]
> > (default task-1) [] OAuthException access_denied: Cannot authenticate user
> > 'None@N/A': No valid profile found in credentials..
> >
> > Is this issue same with comment 0? If so, There is a d/s bug
> > https://bugzilla.redhat.com/show_bug.cgi?id=1715767, similar to this one.
> > And it was fixed as a document issue
> > https://bugzilla.redhat.com/show_bug.cgi?id=1744522.
>
> Indeed, the result of this was adding step 14 that I mentioned above.
Created attachment 1701696 [details] backup file
(In reply to Wei Wang from comment #15)
> Created attachment 1701696 [details]
> backup file

The content of this file does not seem to include what I'd have expected if you followed step 14 (for engine-backup). Are you sure you did? Can you please retry?

I now noticed that this step 14 is broken - filed bug 1859505 for that.

Oliver - can you please also check this on your setup?

Thanks!
(In reply to Yedidyah Bar David from comment #16)
> (In reply to Wei Wang from comment #15)
> > Created attachment 1701696 [details]
> > backup file
>
> The content of this file do not seem to include what I'd have expected if
> you followed step 14 (for engine-backup). Are you sure you did? Can you
> please retry?

I am not sure the problem in comment 10 is the same as in comment 0, so I haven't tried step 14. I will try it after I finish 4.3.11 svvp testing.

>
> I now noticed that this step 14 is broken - filed bug 1859505 for that.
>
> Oliver - can you please also check this on your setup?
>
> Thanks!
(In reply to Wei Wang from comment #17)
> (In reply to Yedidyah Bar David from comment #16)
> > (In reply to Wei Wang from comment #15)
> > > Created attachment 1701696 [details]
> > > backup file
> >
> > The content of this file do not seem to include what I'd have expected if
> > you followed step 14 (for engine-backup). Are you sure you did? Can you
> > please retry?
> I am not sure the problem in comment 10 is same with comment 0, so I haven't
> try the step 14. I will try it after I finish 4.3.11 svvp testing.

Tested with the script below instead of the one in step 14; the hosted engine restore succeeds.

=========================================================
BACKUP_PATHS="${BACKUP_PATHS} /etc/ovirt-engine-backup"
cp -f /etc/pki/ovirt-engine/apache-ca.pem \
  /etc/pki/ca-trust/source/anchors/3rd-party-ca-cert.pem
update-ca-trust
=========================================================

> >
> > I now noticed that this step 14 is broken - filed bug 1859505 for that.
> >
> > Oliver - can you please also check this on your setup?
> >
> > Thanks!
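A slightly more defensive variant of the copy step in the script above can be sketched as follows. The function name, its two arguments, and the existence check are assumptions added for illustration; the target file name 3rd-party-ca-cert.pem comes from the script in this comment.

```shell
#!/bin/sh
# Hypothetical helper: copy a custom CA certificate into a trust-anchor
# directory, refusing to continue if the source file is missing or empty.
install_custom_ca() {
    src="$1"
    anchors_dir="$2"
    if [ ! -s "$src" ]; then
        echo "custom CA not found or empty: $src" >&2
        return 1
    fi
    # Same target file name as in the script quoted in this comment.
    cp -f "$src" "$anchors_dir/3rd-party-ca-cert.pem"
}
```

On the engine machine this would be called as `install_custom_ca /etc/pki/ovirt-engine/apache-ca.pem /etc/pki/ca-trust/source/anchors`, followed by `update-ca-trust` as in the script above. The guard only makes a missing certificate visible early; it does not change what the tested script does when the file is present.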
Thanks! Anything else missing for marking this bug verified, other than bug 1859505?
(In reply to Yedidyah Bar David from comment #19)
> Thanks! Anything else missing for marking this bug verified, other than bug
> 1859505?

You are welcome! Nothing. QE will verify this bug after bug 1859505 is fixed and ON_QA status is changed.
Very well. Moving to MODIFIED for now.
The status of https://bugzilla.redhat.com/show_bug.cgi?id=1859505 is NEW; moving this bug to ASSIGNED.
Moving back to MODIFIED, not sure what the process should ideally be, exactly. Perhaps we should not automatically move bugs from MODIFIED to QE if they depend on bugs in state <= MODIFIED? Perhaps this is hard...
Dependent doc bug 1859505 closed/published, moving to QE.
Based on comment 18, the procedure after applying the fix in bug 1859505 was already tested, so it's safe to move to VERIFIED. Will leave it for QE to do.
According to comment 24 and comment 25, QE is moving this bug to VERIFIED.
This bugzilla is included in oVirt 4.4.3 release, published on November 10th 2020. Since the problem described in this bug report should be resolved in oVirt 4.4.3 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days