+++ This bug is a downstream clone. The original bug is: +++
+++ bug 1791007 +++
======================================================================

Description of problem:
An attempt to connect to a VM with vm-console fails with an error:

key_cert_check_authority: invalid certificate
Certificate invalid: not a host certificate
Host key verification failed.
Connection to <engine_fqdn> closed.

Version-Release number of selected component (if applicable):
- ovirt-engine-4.4.0-0.14.master.el7.noarch
- vdsm-4.40.0-180.giteba0b75.el8ev.ppc64le (on the host the vm is running on)
- libvirt-daemon-5.6.0-6.module+el8.1.0+4244+9aa4e6bb.x86_64 (on the host the vm is running on)
- qemu-kvm-4.1.0-14.module+el8.1.0+4548+ed1300f4.x86_64 (on the host the vm is running on)

How reproducible:
100%

Steps to Reproduce:
1. Create VM with 'Console > Enable VirtIO serial console' enabled and start the VM.
2. Create ssh-key on engine by
   ssh-keygen -t rsa -f /root/.ssh/sc_test_key -q -N ''
3. Set the pub key (/root/.ssh/sc_test_key.pub) on engine web-UI under 'Options'
4. Make sure the VM is available for connection by running the following on the engine:
   ssh -o StrictHostKeyChecking=no -t -i /root/.ssh/sc_test_key -p 2222 ovirt-vmconsole@<engine_fqdn> list
5. Try connecting to the VM by running the following on the engine:
   ssh -o StrictHostKeyChecking=no -t -i /root/.ssh/sc_test_key -p 2222 ovirt-vmconsole@<engine_fqdn> connect --vm-name=<test_vm>

Actual results:
The following error appears:

key_cert_check_authority: invalid certificate
Certificate invalid: not a host certificate
Host key verification failed.
Connection to <engine_fqdn> closed.

Expected results:
The VM console should appear.

Additional info:
- The serial-getty service should be running on the vm in order to connect by vm-console (ex. systemctl start serial-getty@ttyS0)
- Same flow as above works as expected on RHV-4.3 (ovirt-engine-4.3.8.1-0.1.master.el7.noarch)

(Originally by Beni Pelled)
For QE: The problem was with a wrongly generated certificate. It has been fixed and newly deployed hosts or hosts after new certificate enrollment should be fine. However, the fix helps only with a proxy on el7. On el8, the connection still doesn't work, for a different reason, apparently due to changes in ssh. It's going to be solved as part of the ovirt-vmconsole el8 port. (Originally by Milan Zamazal)
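A quick way to confirm the "not a host certificate" symptom is to look at the Type line that `ssh-keygen -L` prints for an OpenSSH certificate, which ends in either "host certificate" or "user certificate". The sketch below is only an illustration: the helper name `cert_kind` is made up here, and the certificate path in the usage comment is a placeholder, since this report does not name the file.

```shell
# cert_kind (hypothetical helper): reads `ssh-keygen -L` output on stdin
# and prints "host", "user", or "unknown" depending on the Type line.
cert_kind() {
    awk '
        $1 == "Type:" {
            if ($0 ~ /host certificate/) {
                kind = "host"
            } else if ($0 ~ /user certificate/) {
                kind = "user"
            }
        }
        END {
            if (kind == "") kind = "unknown"
            print kind
        }
    '
}

# Usage on the virtualization host (placeholder path, adjust to your setup):
#   ssh-keygen -L -f /path/to/vmconsole-host-cert.pub | cert_kind
```

A result of "user" for the certificate presented by the host's vmconsole sshd would match the error reported in this bug.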
This bug is targeting 4.4.2 and is in MODIFIED state. Can we retarget to 4.4.0 and move to QE? (Originally by Sandro Bonazzola)
(In reply to Sandro Bonazzola from comment #2)
> this bug is targeting 4.4.2 and is in modified state. Can we retarget to
> 4.4.0 and move to QE?

Yes.
(Originally by Milan Zamazal)
I've run into this issue after building a new RHHI cluster from RHVH-4.3-20191211.3 and then upgrading to redhat-virtualization-host-image-update-4.3.8-20200126.0.el7_7.noarch.

Can you provide any details on how to correct the certificate issue?
(Originally by John Call)
(In reply to John Call from comment #4)
> Can you provide any details on how to correct the certificate issue?

Enroll new certificates for the host, e.g. from the Web UI.
(Originally by Milan Zamazal)
(In reply to Milan Zamazal from comment #5)
> Enroll new certificates for the host, e.g. from the Web UI.

Thanks! I found the "Enroll Certificate" button after clicking on the Host screen and then clicking into the "Installation" drop-down. But it didn't solve my problem...

I put each of my hosts into maintenance mode (one at a time) and did the "Enroll Certificate" action. Do I need to do anything on the rhvm host, since that is where my SSH connection is terminating and where the ovirt-vmconsole-proxy service is running?

[root@rhvm ~]# ssh -i ~/.ssh/id_rsa -p 2222 -t ovirt-vmconsole.iad.redhat.com
Available Serial Consoles:
00 HostedEngine[b421251f-73f4-4ea6-a598-50175efbe63c]
01 ocp4-helper[3407335a-a2cf-44bb-a511-cf0cdd9d4cf3]
02 ocp4-master0[00a2a19a-3bcd-4968-b990-dfb0c7c17700]
03 ocp4-master1[800c06c9-3e40-460b-9a83-876f20b35ddc]
04 ocp4-master2[b6d5ff40-ed3a-4ba1-9650-9eb67bb7b72c]
05 ocp4-worker0[d4b02a5b-2798-429b-a2c2-5141ab952665]
06 ocp4-worker1[80d7d43b-5e42-4d06-a7e9-96e9ad661e26]
07 usaf-edge-helper[cc0deb16-46c8-4666-83dd-25278ec57937]
Please, enter the id of the Serial Console you want to connect to.
To disconnect from a Serial Console, enter the sequence: <Enter><~><.>
SELECT> 1
key_cert_check_authority: invalid certificate
Certificate invalid: not a host certificate
Host key verification failed.
Connection to rhvm.dota-lab.iad.redhat.com closed.
[root@rhvm ~]#
(Originally by John Call)
(In reply to John Call from comment #6)
> Do I need to do anything on
> the rhvm host, since that is where my SSH connection is terminating and
> where the ovirt-vmconsole-proxy service is running?

No, Engine should do everything needed. You can try to restart ovirt-vmconsole-host-sshd service on the host to be sure.

> key_cert_check_authority: invalid certificate
> Certificate invalid: not a host certificate
> Host key verification failed.

This looks like the situation before the fix. What's your Engine version?
(Originally by Milan Zamazal)
(In reply to Milan Zamazal from comment #7)
> You can try to restart ovirt-vmconsole-host-sshd service on the host to be sure.
>
> This looks like the situation before the fix. What's your Engine version?

I tried restarting the ovirt-vmconsole-host-sshd service, but I get the same error. I also tried the old-fashioned approach of rebooting the entire RHVM, but that didn't help either. I'm running this version of engine...

[root@rhvm ~]# rpm -q ovirt-engine
ovirt-engine-4.3.8.2-0.4.el7.noarch
(Originally by John Call)
(In reply to John Call from comment #8)
> I'm running this version of engine...
>
> [root@rhvm ~]# rpm -q ovirt-engine
> ovirt-engine-4.3.8.2-0.4.el7.noarch

This bug concerns only 4.4, specifically the move to ansible. AFAIK a different mechanism is used for certificate updates in 4.3. I'd suggest filing a bug on Engine 4.3 to handle the problem in your environment.
(Originally by Milan Zamazal)
There are two problems I experienced on 4.4.

There are Python 3 problems in the helper script after switching Engine to el8. I posted fixes to gerrit.

The other problem is that /etc/pki/ovirt-engine/private/ca.pem and /etc/pki/ovirt-vmconsole/ca.pub are used quite interchangeably with vmconsole. ca.pem is used to sign host vmconsole keys, while ca.pub is used to check them and to authenticate against them. In other words, they are expected to be the same. ca.pem is placed in a private location, inaccessible to ovirt-vmconsole user.

When vmconsole certificates and keys already exist, engine-setup doesn't update them. This was a problem I've experienced in my environment. Deleting /etc/pki/ovirt-vmconsole and running engine-setup again fixed it. It should be ensured that vmconsole certificates and keys are updated in case ca.pem is changed.
(Originally by Milan Zamazal)
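Since ca.pem and ca.pub are expected to hold the same key, one way to spot a stale ca.pub is to derive the public key from ca.pem with `ssh-keygen -y` and compare it with the contents of ca.pub. The following is a minimal sketch, not part of any fix: the helper name `same_pubkey` is invented for illustration, and it assumes ca.pub is a single-line OpenSSH public key and ca.pem is an unencrypted private key readable by the user running the check.

```shell
# same_pubkey (hypothetical helper): succeeds when two OpenSSH public key
# lines carry the same key type and key data; any trailing comment field
# is ignored. Runs in a subshell so its variables do not leak.
same_pubkey() (
    a=$(printf '%s\n' "$1" | awk 'NF >= 2 { print $1, $2; exit }')
    b=$(printf '%s\n' "$2" | awk 'NF >= 2 { print $1, $2; exit }')
    [ -n "$a" ] && [ "$a" = "$b" ]
)

# Usage on the Engine machine, as root (paths taken from the comment above):
#   if same_pubkey "$(ssh-keygen -y -f /etc/pki/ovirt-engine/private/ca.pem)" \
#                  "$(cat /etc/pki/ovirt-vmconsole/ca.pub)"; then
#       echo "ca.pub matches ca.pem"
#   else
#       echo "ca.pub does NOT match ca.pem"
#   fi
```

If the two differ, the situation matches the one described above, where removing /etc/pki/ovirt-vmconsole and running engine-setup again regenerated consistent keys.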