- Here is a simple architecture for my demo deployment using KVM VMs:

1) rhevm1.ab.lab -> oVirt Engine (CentOS 7.4 x86_64)
2) rhevh1.ab.lab -> oVirt Enterprise Linux Host (CentOS 7.4 x86_64)
3) rhevh2.ab.lab -> oVirt Enterprise Linux Host (CentOS 7.4 x86_64)

- How to reproduce the issue:
=============================

1) I installed 'ovirt-engine' successfully.

2) I tried to add 'rhevh1.ab.lab' from 'rhevm1.ab.lab' using both the root password and an SSH public key. Everything was OK until I got the following error message with the SSH public key (I got the same with the root password):

"Host rhevh1 installation failed. Failed to execute Ansible host-deploy role. Please check logs for more details: /var/log/ovirt-engine/host-deploy/ovirt-host-deploy-ansible-20171229144439-rhevh1.ab.lab-13b2b5c4.log."

Checking the log file showed the following:

[root@rhevm1 tmp]# cat /var/log/ovirt-engine/host-deploy/ovirt-host-deploy-ansible-20171229144439-rhevh1.ab.lab-13b2b5c4.log
2017-12-29 14:44:40,320 p=9666 u=ovirt | [DEPRECATION WARNING]: The use of 'include' for tasks has been deprecated. Use 'import_tasks' for static inclusions or 'include_tasks' for dynamic inclusions. This feature will be removed in a future release. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
2017-12-29 14:44:40,321 p=9666 u=ovirt | [DEPRECATION WARNING]: include is kept for backwards compatibility but usage is discouraged. The module documentation details page may explain more about this rationale.. This feature will be removed in a future release. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
2017-12-29 14:44:40,458 p=9666 u=ovirt | PLAY [all] *********************************************************************
2017-12-29 14:44:40,469 p=9666 u=ovirt | TASK [Gathering Facts] *********************************************************
2017-12-29 14:44:40,801 p=9666 u=ovirt | fatal: [rhevh1.ab.lab]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh_exchange_identification: Connection closed by remote host\r\n", "unreachable": true}
2017-12-29 14:44:40,802 p=9666 u=ovirt | PLAY RECAP *********************************************************************
2017-12-29 14:44:40,802 p=9666 u=ovirt | rhevh1.ab.lab : ok=0 changed=0 unreachable=1 failed=0

3) I tried to simulate the Ansible role's work by running a simple ansible module as the 'root' user:

[root@rhevm1 tmp]# ansible -m ping rhevhs -i /tmp/tmphosts
The authenticity of host '192.168.150.21 (<no hostip for proxy command>)' can't be established.
ECDSA key fingerprint is SHA256:fPem/Gn67v6tEbQpX/NJix/Ff7gPJsQ9Hz9SSBBc5ts.
ECDSA key fingerprint is MD5:12:ca:c7:84:93:50:3f:33:b5:39:e2:20:2d:13:17:43.
Are you sure you want to continue connecting (yes/no)? yes
192.168.150.21 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

4) It seems that the above Ansible role cannot deal with the SSH fingerprint the first time, even though I was able to collect the 'rhevh1.ab.lab' fingerprint during the add-host procedure.

5) I tried to disable this feature in the ansible.cfg file:

[root@rhevm1 tmp]# grep host_key_checking /etc/ansible/ansible.cfg
host_key_checking = False

6) Again, the same result:

[root@rhevm1 tmp]# cat /var/log/ovirt-engine/host-deploy/ovirt-host-deploy-ansible-20171229145907-rhevh1.ab.lab-3e8bace0.log
2017-12-29 14:59:08,005 p=10135 u=ovirt | [DEPRECATION WARNING]: The use of 'include' for tasks has been deprecated. Use 'import_tasks' for static inclusions or 'include_tasks' for dynamic inclusions. This feature will be removed in a future release. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
2017-12-29 14:59:08,006 p=10135 u=ovirt | [DEPRECATION WARNING]: include is kept for backwards compatibility but usage is discouraged. The module documentation details page may explain more about this rationale.. This feature will be removed in a future release. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
2017-12-29 14:59:08,139 p=10135 u=ovirt | PLAY [all] *********************************************************************
2017-12-29 14:59:08,151 p=10135 u=ovirt | TASK [Gathering Facts] *********************************************************
2017-12-29 14:59:08,477 p=10135 u=ovirt | fatal: [rhevh1.ab.lab]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh_exchange_identification: Connection closed by remote host\r\n", "unreachable": true}
2017-12-29 14:59:08,478 p=10135 u=ovirt | PLAY RECAP *********************************************************************
2017-12-29 14:59:08,478 p=10135 u=ovirt | rhevh1.ab.lab : ok=0 changed=0 unreachable=1 failed=0

- Questions:

1) How can I resolve this issue?
2) Is this behavior considered a bug that should be reported?

Thanks,
Ab
That's really strange. Could you please attach the engine logs to the bug? They are located on rhevm1.ab.lab in /var/log/ovirt-engine and its subdirectories.
Created attachment 1374507 [details] ovirt-engine logs
Created attachment 1374508 [details] sosreport file

Complete sosreport file from 'rhevm1.ab.lab'.
Hello Martin,

I have attached the requested files plus the complete sosreport file for your review. Please let me know if you need further data/files.

Thank you again for your help.
Ab
As you can see in the log, you are connecting to the remote machine as the ovirt user, which is incorrect, as the SSH keys are for the root user. Can you please check whether /usr/share/ovirt-engine/playbooks/ovirt-host-deploy.yml contains 'remote_user: root' (or, if possible, share the whole file)?

Also, can you please share the verbose log of the following command:

/usr/bin/ansible-playbook -vvvv --private-key=/etc/pki/ovirt-engine/keys/engine_id_rsa --inventory=192.168.150.21, --extra-vars=host_deploy_cluster_version=4.2 --extra-vars=host_deploy_gluster_enabled=false --extra-vars=host_deploy_virt_enabled=true --extra-vars=host_deploy_vdsm_port=54321 --extra-vars=host_deploy_override_firewall=true --extra-vars=host_deploy_firewall_type=FIREWALLD --extra-vars=ansible_port=22 --extra-vars=host_deploy_post_tasks=/etc/ovirt-engine/ansible/ovirt-host-deploy-post-tasks.yml /usr/share/ovirt-engine/playbooks/ovirt-host-deploy.yml
Hello Ondra,

- This was strange to me as well. I logged in to the "ovirt-engine" web interface using the "admin" user.

- The "/usr/share/ovirt-engine/playbooks/ovirt-host-deploy.yml" file does not have a "remote_user" entry:

[root@rhevm1 ~]# cat /usr/share/ovirt-engine/playbooks/ovirt-host-deploy.yml
- hosts: all
  remote_user: root
  roles:
    - ovirt-host-deploy
  tasks:
    - name: Check if post tasks file exists
      stat:
        path: "{{ host_deploy_post_tasks }}"
      delegate_to: localhost
      register: stat_post_tasks
      when: "host_deploy_post_tasks is defined"

    - name: Executing post tasks defined by user
      include: "{{ host_deploy_post_tasks }}"
      when: "host_deploy_post_tasks is defined and stat_post_tasks.stat.exists"

[root@rhevm1 ~]# cat /usr/share/ovirt-engine/playbooks/roles/ovirt-host-deploy/meta/main.yml
dependencies:
  - ovirt-host-deploy-facts
  - ovirt-provider-ovn-driver
  - ovirt-host-deploy-libvirt-guests
  - ovirt-host-deploy-firewalld

[root@rhevm1 ~]# cat /usr/share/ansible/roles/ovirt-infra/roles/oVirt.hosts/tasks/main.yml
---
- name: Get hosts
  ovirt_hosts_facts:
    auth: "{{ ovirt_auth }}"
    pattern: "{{ hosts | map(attribute='name') | map('regex_replace', '(.*)', 'name=\\1') | list | join(' or ') }} status=installfailed"
  tags:
    - hosts
    - reinstall

- name: Reinstall hosts
  ovirt_hosts:
    auth: "{{ ovirt_auth }}"
    state: reinstalled
    name: "{{ item.name }}"
    public_key: "{{ item.password is undefined }}"
    password: "{{ item.password | default(omit) }}"
  with_items:
    - "{{ ovirt_hosts | default([]) }}"
  tags:
    - hosts
    - reinstall

- name: Add hosts
  ovirt_hosts:
    auth: "{{ ovirt_auth }}"
    state: "{{ item.state | default(omit) }}"
    name: "{{ item.name }}"
    address: "{{ item.address }}"
    cluster: "{{ item.cluster }}"
    password: "{{ item.password | default(omit) }}"
    public_key: "{{ item.public_key | default(omit) }}"
    override_iptables: true
    timeout: "{{ item.timeout | default(ovirt_hosts_add_timeout) }}"
    poll_interval: "{{ item.poll_interval | default(20) }}"
  with_items:
    - "{{ hosts | default([]) }}"
  async: "{{ ovirt_hosts_max_timeout }}"
  poll: 0
  register: add_hosts
  tags:
    - hosts

- name: Wait for hosts to be added
  async_status: "jid={{ item.ansible_job_id }}"
  register: job_result
  with_items:
    - "{{ add_hosts.results }}"
  tags:
    - hosts
  until: job_result.finished
  retries: "{{ ovirt_hosts_max_timeout // 20 }}"
  delay: 20

- I am going to attach the output of the mentioned command.

Best regards,
Ab
Created attachment 1374653 [details] ansible-playbook results

[root@rhevm1 ~]# /usr/bin/ansible-playbook -vvvv --private-key=/etc/pki/ovirt-engine/keys/engine_id_rsa --inventory=192.168.150.21, --extra-vars=host_deploy_cluster_version=4.2 --extra-vars=host_deploy_gluster_enabled=false --extra-vars=host_deploy_virt_enabled=true --extra-vars=host_deploy_vdsm_port=54321 --extra-vars=host_deploy_override_firewall=true --extra-vars=host_deploy_firewall_type=FIREWALLD --extra-vars=ansible_port=22 --extra-vars=host_deploy_post_tasks=/etc/ovirt-engine/ansible/ovirt-host-deploy-post-tasks.yml /usr/share/ovirt-engine/playbooks/ovirt-host-deploy.yml

Results attached.
It looks like your private key is correctly readable by the root user, but not by the ovirt user. Can you please share the output of the following command:

# ls -l /etc/pki/ovirt-engine/keys/engine_id_rsa

Thanks!

Also, did you do a clean oVirt 4.2 installation, or did you upgrade? I saw in the log that you had incorrect repos on the hosts. Did you install 4.1 first?
Hello Ondra,

- Happy new year, and please accept my apologies for the late reply.

- The 'ovirt' user owns the mentioned SSH private key:

[root@rhevm1 ~]# ls -l /etc/pki/ovirt-engine/keys/engine_id_rsa
-rw-------. 1 ovirt ovirt 1828 Dec 29 12:22 /etc/pki/ovirt-engine/keys/engine_id_rsa
[root@rhevm1 ~]# stat /etc/pki/ovirt-engine/keys/engine_id_rsa
  File: ‘/etc/pki/ovirt-engine/keys/engine_id_rsa’
  Size: 1828      Blocks: 8          IO Block: 4096   regular file
Device: fd00h/64768d    Inode: 2884595     Links: 1
Access: (0600/-rw-------)  Uid: ( 108/   ovirt)   Gid: ( 108/   ovirt)
Context: unconfined_u:object_r:cert_t:s0
Access: 2017-12-30 19:16:30.184603432 +0100
Modify: 2017-12-29 12:22:41.279130934 +0100
Change: 2017-12-29 12:31:49.909717510 +0100
 Birth: -

- Yes, I did a clean ovirt-engine installation using 4.2. The only thing is that I tried the installation multiple times by reverting to KVM snapshots, as I am using the KVM hypervisor for my demo. Here are the current ovirt-engine repos:

[root@rhevm1 ~]# yum repolist
Loaded plugins: fastestmirror, versionlock
Loading mirror speeds from cached hostfile
 * base: centos.mirror.colo-serv.net
 * extras: mirror.its.dal.ca
 * ovirt-4.2: mirrors.rit.edu
 * ovirt-4.2-epel: mirrors.rit.edu
 * updates: centos.mirror.ca.planethoster.net
repo id                              repo name                                                    status
!base/7/x86_64                       CentOS-7 - Base                                               9,591
!centos-sclo-rh-release/x86_64       CentOS-7 - SCLo rh                                            6,534
!extras/7/x86_64                     CentOS-7 - Extras                                               327
!ovirt-4.2/7                         Latest oVirt 4.2 Release                                        179
!ovirt-4.2-centos-gluster312/x86_64  CentOS-7 - Gluster 3.12                                          93
!ovirt-4.2-centos-opstools/x86_64    CentOS-7 - OpsTools - release                                   421
!ovirt-4.2-centos-ovirt42/x86_64     CentOS-7 - oVirt 4.2                                            201
!ovirt-4.2-centos-qemu-ev/x86_64     CentOS-7 - QEMU EV                                               39
!ovirt-4.2-epel/x86_64               Extra Packages for Enterprise Linux 7 - x86_64               12,184
!ovirt-4.2-virtio-win-latest         virtio-win builds roughly matching what will be shipped in       35
!updates/7/x86_64                    CentOS-7 - Updates                                            1,573
repolist: 31,177
[root@rhevm1 ~]# rpm -qa | grep ovirt-release
ovirt-release42-4.2.0-1.el7.centos.noarch
Also, the current 'rhevh1.ab.lab' repos:

[root@rhevh1 ~]# yum repolist
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: centos.zswap.net
 * extras: centos.mirror.globo.tech
 * ovirt-4.2: mirrors.rit.edu
 * ovirt-4.2-epel: mirror.umd.edu
 * updates: centos.zswap.net
repo id                              repo name                                                    status
!base/7/x86_64                       CentOS-7 - Base                                               9,591
!centos-sclo-rh-release/x86_64       CentOS-7 - SCLo rh                                            6,534
!extras/7/x86_64                     CentOS-7 - Extras                                               327
!ovirt-4.2/7                         Latest oVirt 4.2 Release                                        179
!ovirt-4.2-centos-gluster312/x86_64  CentOS-7 - Gluster 3.12                                          93
!ovirt-4.2-centos-opstools/x86_64    CentOS-7 - OpsTools - release                                   421
!ovirt-4.2-centos-ovirt42/x86_64     CentOS-7 - oVirt 4.2                                            201
!ovirt-4.2-centos-qemu-ev/x86_64     CentOS-7 - QEMU EV                                               39
!ovirt-4.2-epel/x86_64               Extra Packages for Enterprise Linux 7 - x86_64               12,184
!ovirt-4.2-virtio-win-latest         virtio-win builds roughly matching what will be shipped in       35
!updates/7/x86_64                    CentOS-7 - Updates                                            1,573
repolist: 31,177
[root@rhevh1 ~]# rpm -qa | grep ovirt-release
ovirt-release42-4.2.0-1.el7.centos.noarch

Which part of the log are you in doubt about?

- Please let me know if you need further files/info.

Best regards,
Ab
Hello,

I have a similar problem after the oVirt upgrade from 4.1 to 4.2.0.2-1. No ansible action works from the UI now (check for upgrade, upgrade, host re/install).

You recommend manually running the ansible playbooks. That works well under the root account. But I think the engine runs under the ovirt user, which has /sbin/nologin set as its shell by default. So if you run this command under 'sudo -u ovirt', the problem occurs (ssh_exchange_identification: Connection closed by remote host). If I change the login shell of the ovirt user to /bin/bash, or run 'sudo -s -u ovirt', the ansible playbook works well again. But it is still not working from the oVirt UI.

One hint: if I run ansible-playbook from the command line against the same server in parallel, the web UI function works. It looks like ansible is sharing the SSH connection.

@Ab, can you confirm the same behaviour? Case: try to check host updates from the UI and then from the command line. Then run both in parallel.

Best Regards

Petr Istenik
Cloud Expert
(In reply to tm-pi from comment #10)
> Hello,
> I have the similar problem after ovirt upgrade from 4.1 to 4.2.0.2-1.
> No ansible action is working now from UI (check for upgrade, upgrade, host
> re/install)
>
> You recommend to manually run ansible playbooks. It's working well under
> root account. But I think engine is running under ovirt user which have set
> /sbin/nologin shell by default.
> So if you run this command under 'sudo -u ovirt', problem occurs
> (ssh_exchange_identification: Connection closed by remote host).
> If I try to change login shell for ovirt user to /bin/bash or run 'sudo -s
> -u ovirt', ansible playbook is working well again.
> But it's still not working from ovirt UI.

The spawned ansible-playbook process does not have its own terminal, so the value in /etc/passwd shouldn't be relevant. I cannot reproduce the issue, but I will keep investigating. By the way, you don't use IBM Java, do you? What Java version/vendor do you use?

> One hint: If I run ansible-playlist from commandline to the same server in
> parallel, web UI function works. It looks like ansible sharing SSH
> connection.
>
> @Ab, can you confirm the same behaviour?
> Case: Try to check host updates from UI and next from cmdline. Then run both
> in parallel.
>
> Best Regards
>
> Petr Istenik
> Cloud Expert
No IBM Java, only the repository version.

I tried to reproduce my problem and found its root cause: the error occurs only if I install and configure ipa-client (sssd) on the engine server.
Thanks for the info!

@Ab, can you confirm you have the same issue? Did you configure ipa-client?
To work around the problem, you can comment out the line:

ProxyCommand /usr/bin/sss_ssh_knownhostsproxy -p %p %h

in /etc/ssh/ssh_config on the ovirt-engine machine.
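If you prefer to script the workaround rather than edit the file by hand, a minimal sketch follows. It is demonstrated on a scratch copy of the config; on the engine you would point `cfg` at /etc/ssh/ssh_config after taking a backup (the sed expression is my own, not something from this bug):

```shell
# Work on a scratch copy here; on the engine, set cfg=/etc/ssh/ssh_config
# (after backing the file up) to apply the real workaround.
cfg=./ssh_config.demo
printf '%s\n' \
    'Host *' \
    '  ProxyCommand /usr/bin/sss_ssh_knownhostsproxy -p %p %h' \
    '  GlobalKnownHostsFile /var/lib/sss/pubconf/known_hosts' > "$cfg"

# Comment out the SSSD ProxyCommand line, preserving its indentation.
sed -i 's|^\([[:space:]]*\)ProxyCommand /usr/bin/sss_ssh_knownhostsproxy|\1# ProxyCommand /usr/bin/sss_ssh_knownhostsproxy|' "$cfg"

# No service restart is needed: ssh_config is read by the ssh client on
# every invocation, so the next ansible/ssh run picks the change up.
cat "$cfg"
```

Note that `sed -i` edits in place; keep a backup of the original so the line can be restored once a proper fix lands.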
*** Bug 1528974 has been marked as a duplicate of this bug. ***
*** Bug 1531967 has been marked as a duplicate of this bug. ***
Ondra,

Since you closed my ticket as a duplicate, I will comment on my issue here. I tried your workaround of commenting out the line in /etc/ssh/ssh_config on the engine and then checked for updates. I still get "Failed to check for available updates", so in my case it does not resolve the issue.
(In reply to jas from comment #17)
> Ondra,
> Since you closed my ticket as a duplicate, I will comment on my issue here.
> I tried your workaround re commenting the line in my /etc/ssh/ssh_config on
> engine and then checked for updates. I still get "Failed to check for
> available updates", so in my case, it does not resolve the issue.

Can you still see:

Ansible playbook command has exited with value: 4

or do you see some other new issue? Can you please share the log if there is a new error message?
Yup:

2018-01-09 10:44:58,197-05 INFO [org.ovirt.engine.core.common.utils.ansible.AnsibleExecutor] (EE-ManagedThreadFactory-commandCoordinator-Thread-3) [44947ca3-1e4e-422f-9c18-f2e5c7b7d81d] Ansible playbook command has exited with value: 4

and in /etc/ssh/ssh_config:

#ProxyCommand /usr/bin/sss_ssh_knownhostsproxy -p %p %h
(In reply to tm-pi from comment #12)
> No IBM java, repository version only.
> I tryied to reproduce my problem and I have found root of the problem. Error
> occurs only if I install and configure ipa-client(sssd) on engine server.

Hmmm! My oVirt engine 'rhevm1.ab.lab' is also a domain member of the 'ab.lab' domain. I use FreeIPA 3.4 as the server, and 'rhevm1.ab.lab' uses ipa-client. Do I understand correctly that you tested the same scenario without letting the oVirt engine join the IPA server, and it worked fine?

Best regards,
Ab
(In reply to Ondra Machacek from comment #13)
> Thanks for the info!
>
> @Ab can you confirm you have same issue? Did you configure ipa-client?

Yes, I did. rhevm1.ab.lab (the oVirt engine) uses 'ipa-client' to join 'ab.lab'. Do you think that is the root cause?

Best regards,
Ab
(In reply to Ondra Machacek from comment #14)
> To workaround the problem you can comment the line:
>
> ProxyCommand /usr/bin/sss_ssh_knownhostsproxy -p %p %h
>
> in /etc/ssh/ssh_config on ovirt-engine machine.

Hi Ondra,

Your workaround worked fine for me, and I managed to register 'rhevh1.ab.lab'. Are you going to fix this behaviour soon? I do not know what the side effects of commenting out the mentioned line in /etc/ssh/ssh_config are.

Also, I will try to keep the mentioned line in /etc/ssh/ssh_config as it is, but I will withdraw 'rhevm1.ab.lab' from the 'ab.lab' domain and see whether the IPA domain is what causes the problem.

Best regards,
Ab
Hi Ondra,

Here are my steps:
==================

1) Remove 'rhevm1.ab.lab' from the 'ab.lab' domain:

[root@rhevm1 ~]# ipa-client-install --uninstall
[...]
The ipa-client-install command was successful
[root@rhevm1 ~]# systemctl stop ovirt-engine
[root@rhevm1 ~]# reboot

2) Add the 'rhevh1.ab.lab' host.

Result: it DOES work!

- So, my summary: if the oVirt engine is registered to a FreeIPA 4.4 domain using ipa-client, we cannot register oVirt hosts.

How to fix (workarounds):
1) Adjust the /etc/ssh/ssh_config file, whose drawbacks we do not know.
2) Remove the oVirt engine from FreeIPA, which may not be suitable for some environments.

What do you think?

Ab
(In reply to jas from comment #19)
> Yup:
> 2018-01-09 10:44:58,197-05 INFO
> [org.ovirt.engine.core.common.utils.ansible.AnsibleExecutor]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-3)
> [44947ca3-1e4e-422f-9c18-f2e5c7b7d81d] Ansible playbook command has exited
> with value: 4
>
> and in /etc/ssh/ssh_config:
> #ProxyCommand /usr/bin/sss_ssh_knownhostsproxy -p %p %h

Have you restarted the sshd service after making the above change?
(In reply to Martin Perina from comment #25)
> (In reply to jas from comment #19)
> > Yup:
> > 2018-01-09 10:44:58,197-05 INFO
> > [org.ovirt.engine.core.common.utils.ansible.AnsibleExecutor]
> > (EE-ManagedThreadFactory-commandCoordinator-Thread-3)
> > [44947ca3-1e4e-422f-9c18-f2e5c7b7d81d] Ansible playbook command has exited
> > with value: 4
> >
> > and in /etc/ssh/ssh_config:
> > #ProxyCommand /usr/bin/sss_ssh_knownhostsproxy -p %p %h
>
> Have you restarted sshd service after making above change?

Ahh, please ignore. I noticed a second after sending the reply that this change is in the client config, not the server one. Sorry.
Ab,

The following two lines in the /etc/ssh/ssh_config file configure SSH to fetch hosts' public key information from their directory entries in FreeIPA:

ProxyCommand /usr/bin/sss_ssh_knownhostsproxy -p %p %h
GlobalKnownHostsFile /var/lib/sss/pubconf/known_hosts

The ovirt user has /sbin/nologin set as its shell, so the /usr/bin/sss_ssh_knownhostsproxy command cannot be executed as the ovirt user; that's why this whole thing fails.

jas.ca,

Very strange. Can you please share the whole /etc/ssh/ssh_config? Are there any more ProxyCommands?
Martin: No problem. :)

Ondra: It's just the basics; I didn't change it from the default, but I'll add it here. No more ProxyCommand entries though. I do notice that /var/lib/sss/pubconf/known_hosts is empty. Is that an issue?
Created attachment 1379234 [details] ssh_config from engine on my host
OK, thanks. One last thing, which will hopefully help me figure out why it doesn't work for you. Can you please execute the following command:

# sudo -u ovirt /usr/bin/ansible-playbook -vvvvv --check --private-key=/etc/pki/ovirt-engine/keys/engine_id_rsa --inventory=IP_OF_YOUR_HOST, /usr/share/ovirt-engine/playbooks/ovirt-host-upgrade.yml

and share the output? Thanks a lot!
I did:

sudo -u ovirt /usr/bin/ansible-playbook -vvvvv --check --private-key=/etc/pki/ovirt-engine/keys/engine_id_rsa --inventory=virt2-mgmt, /usr/share/ovirt-engine/playbooks/ovirt-host-upgrade.yml

Now we may be getting somewhere. When it tries to connect via SSH, I see:

The authenticity of host 'virt2-mgmt (192.168.0.35)' can't be established.
ECDSA key fingerprint is <fingerprint>.
ECDSA key fingerprint is <fingerprint>.
Are you sure you want to continue connecting (yes/no)?

If I say "yes", then it actually works from the command prompt, the end result being:

PLAY RECAP ***********************************************************************************************
virt2-mgmt                 : ok=2    changed=1    unreachable=0    failed=0

It's not clear why the host isn't "known". Now when I go back into the engine and repeat the check, there's no error, but the update status doesn't clear either.
(In reply to Ondra Machacek from comment #27)
> Ab,
>
> The following two lines in the /etc/ssh/ssh_config file are used to
> configure SSH to fetch hosts public key information from their directory
> entries in FreeIPA
>
> ProxyCommand /usr/bin/sss_ssh_knownhostsproxy -p %p %h
> GlobalKnownHostsFile /var/lib/sss/pubconf/known_hosts
>
> The ovirt user has set /sbin/nologin shell, so we can't execute
> /usr/bin/sss_ssh_knownhostsproxy command as ovirt user, that's why this
> whole thing fails.

Well, will the oVirt team have a fix for this behavior (the oVirt engine as an IPA client) in the next releases, to avoid such a workaround? Or will you consider it a works-as-designed case, with the workaround as the only way? From my side, I would go with option #1: a fix for this behavior in the next releases.

Best regards,
Ab
Hello,

If I am not mistaken, Martin's last update indicates that you are going to fix the oVirt engine's behavior as a FreeIPA client in the next release(s), right?

Best regards,
Ab
Well, we don't have a fix for that, because the FreeIPA code is too invasive. Here are our options:

1. We cannot disable execution of a ProxyCommand enabled in the system-wide settings; we can only ignore the system-wide ssh settings completely. But that could cause side effects (for example, part of those settings is the list of enabled ciphers).

2. We could change the shell of the ovirt user to /bin/bash, but we have /sbin/nologin there on purpose: we don't want to allow anything other than ovirt services to be executed under this user. And even with this change, we would need to connect to the host via the ssh command line before adding it to the oVirt engine, just to add it to the list of known hosts, which is completely unnecessary for all other non-FreeIPA use cases.

3. We could try to edit /etc/ssh/ssh_config as part of engine-setup to remove the offending ProxyCommand setting, but that's quite error prone.

None of those options is good from the oVirt point of view, so we have decided to add a check to engine-setup that detects whether the FreeIPA ProxyCommand is enabled; if so, engine-setup will fail with a message that the administrator needs to remove that option manually if they want to continue with the oVirt 4.2 installation/upgrade (of course we need to mention that in the release notes as well). That way the administrator will know which option causes the issue and can still use FreeIPA domain host attachment with oVirt on the same machine.
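In spirit, the engine-setup check described above amounts to grepping the client config for an active ProxyCommand. A rough illustration only (this is not the actual otopi/engine-setup code, and the sample file stands in for /etc/ssh/ssh_config):

```shell
# Build a sample client config containing the offending option.
cfg=./ssh_config.sample
printf 'Host *\n\tProxyCommand /usr/bin/sss_ssh_knownhostsproxy -p %%p %%h\n' > "$cfg"

# Warn if an uncommented ProxyCommand is present, mirroring the kind of
# detection engine-setup is said to adopt. Commented-out lines ('# Proxy...')
# do not match the pattern, so the documented workaround passes the check.
if grep -Eq '^[[:space:]]*ProxyCommand[[:space:]]' "$cfg"; then
    echo "WARNING: active 'ProxyCommand' found in $cfg;" \
         "remove or comment it before continuing." >&2
fi
```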
Thank you Martin for your reply.

- Regarding option #2: why not allow the 'ovirt' user temporary bash access, but only if the machine belongs to a FreeIPA domain? I have tested the following on my oVirt engine:

[root@rhevm1 ~]# su - ovirt -s /bin/bash
Last login: Fri Dec 29 14:02:19 CET 2017 on pts/0
-bash-4.2$ echo $SHELL
/bin/bash
-bash-4.2$ id
uid=108(ovirt) gid=108(ovirt) groups=108(ovirt) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
-bash-4.2$ pwd
/var/lib/ovirt-engine
-bash-4.2$ ssh root@rhevh1
The authenticity of host 'rhevh1 (192.168.150.21)' can't be established.
ECDSA key fingerprint is SHA256:fPem/Gn67v6tEbQpX/NJix/Ff7gPJsQ9Hz9SSBBc5ts.
ECDSA key fingerprint is MD5:12:ca:c7:84:93:50:3f:33:b5:39:e2:20:2d:13:17:43.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'rhevh1' (ECDSA) to the list of known hosts.
Password:
Last login: Tue Jan  9 20:03:47 2018 from gateway
[root@rhevh1 ~]# id
uid=0(root) gid=0(root) groups=0(root) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
[root@rhevh1 ~]# logout
-bash-4.2$ ls -lh /var/lib/ovirt-engine/.ssh/
total 4.0K
-rw-r--r--. 1 ovirt ovirt 548 Jan 10 17:53 known_hosts
-bash-4.2$ cat /var/lib/ovirt-engine/.ssh/known_hosts
rhevh1.ab.lab,192.168.150.21 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBKwLob5OBRyIKbT0wxSpnayoZ2aWCBy9vRJc0bweIkZXRGbC9YPstXLwXQ8mZ4vQk6+ZMRbIswOBMrOSsyO9bPA=
rhevh3.ab.lab,192.168.150.23 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBD3X1fzndU8cyT9q9ckWdI6i0hDuKg53flMRcUt5wWhse/4jqEXDXFWyWaaJzHGWfr/DitHO+YjPImif1Snj5Wg=
rhevh1 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBKwLob5OBRyIKbT0wxSpnayoZ2aWCBy9vRJc0bweIkZXRGbC9YPstXLwXQ8mZ4vQk6+ZMRbIswOBMrOSsyO9bPA=

What do you think?

- Also, I think this will be the same behavior with Red Hat IdM Server? I did not test that, but I assume it will be the same scenario, as FreeIPA is the upstream project for Red Hat IdM Server.

Best regards,
Ab
I just wanted to add that after manually adding hosts to known_hosts, if there are updates, they can now be installed and the "new updates" symbol vanishes. However, if I've already installed the updates via yum and then run the check, the update flag is not cleared like it used to be. That's still a bug.

I understand the issue with supporting IPA, but I also think that since it's a Red Hat product, it should be supported out of the box without this manual change. It's not clear whether manually adding hosts to known_hosts is something users need to do as well. I don't understand why known_hosts isn't prepopulated, since the engine had to contact all the hosts once to register them; I guess that had something to do with commenting out the line in ssh_config, or maybe that's something just broken in my config. At one point after running updates.
Hello,

Any updates?

Best regards,
Ab
Perhaps we can override ProxyCommand for ansible, using ansible_ssh_common_args? See e.g.: http://docs.ansible.com/ansible/latest/faq.html#how-do-i-configure-a-jump-host-to-access-servers-that-i-have-no-direct-access-to
(In reply to Yedidyah Bar David from comment #38)
> Perhaps we can override ProxyCommand for ansible, using
> ansible_ssh_common_args? See e.g.:
>
> http://docs.ansible.com/ansible/latest/faq.html#how-do-i-configure-a-jump-
> host-to-access-servers-that-i-have-no-direct-access-to

It doesn't override it; any ProxyCommand defined in /etc/ssh/ssh_config is executed as well. I've tried to find a way to override or ignore it, but I didn't find any.
The fix for this bug breaks on 7.3, with OpenSSH 6.6, which does not have '-G'. Moving back to ASSIGNED for now; we need to decide whether to just update the 'Requires:' line to >= 7.4 or to revert the patch and implement it some other way.
(In reply to Yedidyah Bar David from comment #40)
> Fix for this bug breaks on 7.3, with openssh 6.6, which does not have '-G'.
> Moving back to assigned for now, need to decide if to just update the
> 'Requires:' line to >= 7.4 or to revert the patch and implement some other
> way.

Let's revert and implement with 'ssh -F configfile'.
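For context, 'ssh -F configfile' makes the ssh client read only the given file and skip /etc/ssh/ssh_config entirely, so an engine-owned config with no ProxyCommand sidesteps the SSSD proxy. A sketch under those assumptions (the file name, its contents, and the ansible.cfg stanza are illustrative, not the actual patch):

```shell
# An engine-private client config with no ProxyCommand in it; when ssh is
# invoked with -F pointing here, the system-wide /etc/ssh/ssh_config
# (and its sss_ssh_knownhostsproxy ProxyCommand) is not consulted at all.
cat > ./engine-ssh.cfg <<'EOF'
Host *
    UserKnownHostsFile /var/lib/ovirt-engine/.ssh/known_hosts
EOF

# Ansible could then be pointed at it via ssh_args, e.g. in ansible.cfg:
#   [ssh_connection]
#   ssh_args = -F ./engine-ssh.cfg -C -o ControlMaster=auto -o ControlPersist=60s
cat ./engine-ssh.cfg
```

The trade-off is the one Ondra already noted: with -F, other system-wide client settings (ciphers, etc.) stop applying too, so the private config has to carry anything the engine still needs.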
Verified in ovirt-engine-4.2.2.4-0.1.el7.noarch

          --== END OF CONFIGURATION ==--

[ INFO  ] Stage: Setup validation
          During execution engine service will be stopped (OK, Cancel) [OK]:
[WARNING] There is 'ProxyCommand' configuration option in /etc/ssh/ssh_config file.
          If you have configured IPA client on your machine, please re-configure it with --no-ssh option.
          If you don't have IPA client configured please remove or comment 'ProxyCommand' configuration.
          It is very important to remove the configuration option or the engine won't work properly.
          Do you want Setup to continue, with the incorrect ssh configuration? (Yes, No) [No]: Yes

If you want to automate this question with an answerfile, you can use:

[environment:default]
QUESTION/1/OVESETUP_VERIFY_SSH_CLIENT_CONFIG=bool:True (or False)
ProxyCommand is a valid ssh client option and could be set to anything, not just what FreeIPA (IdM) or SSSD sets it to. So its use is correct, and it could be set to user-definable commands which in turn could perhaps break things again in the future. Instead of providing the prompt above, have you considered giving the user a choice of one of the 3 options above via configurable startup security options to oVirt? Through a separate config file, perhaps? For example:

# Reconfigure the ovirt user to use /bin/bash instead of /sbin/nologin.
SECURITY/1/OVE_OVIRT_USER_DEFAULT_SHELL=/bin/bash

# Remove ProxyCommand from the config.
SECURITY/1/OVE_AUTO_REMOVE_PROXY_COMMAND=true

# This could cause oVirt to, say, generate a new ssh_config based on the old one but
# without the ProxyCommand, for use only when adding new hosts or taking other actions.
SECURITY/1/OVE_IGNORE_PROXY_COMMAND=true

# Provide logic to temporarily set /bin/bash for the ovirt user, run a test ssh command,
# then reconfigure the ovirt user back to /sbin/nologin. As this is a one-time event
# during initial host setup, it might not be an issue?
SECURITY/1/OVE_AUTO_SSH_SETUP_ON_FAILURE=true

# As the name implies, automatically configure the known_hosts file that
# /usr/bin/sss_ssh_knownhostsproxy uses (i.e. /var/lib/sss/pubconf/known_hosts).
SECURITY/1/OVE_AUTO_SETUP_OVIRT_KNOWN_HOSTS=true

# Preconfigure a connection to a host via a staging user before work gets done to add
# a host via the ovirt user (i.e. populate /var/lib/sss/pubconf/known_hosts on behalf
# of the ovirt user, if that's possible).
SECURITY/1/OVE_PRE_CONF_CONN_ENV=true

etc. This security config could contain firewalld and SELinux checks as well?

>> It is very important to remove the configuration option or the engine won't work properly.
>> Do you want Setup to continue, with the incorrect ssh configuration?

The above prompt, at least in my view, gives the user a rather tough choice when all they really want to do is get the two products to work together:

1) If you want oVirt to install, you'll need to undo the config of the other app (but we can't really guarantee what that will do, and that is on the end user). What if the user reconfigures things and resets the config of the other app at a later date, breaking oVirt again?

2) Keep your other app's config, but oVirt won't work.

This way you're letting the user decide whether they prefer to, for example, lower oVirt security in order to keep the ProxyCommand app, or would rather reconfigure the ProxyCommand app instead of oVirt. I'm facing the same issue, but I will be trying to set the ovirt user's shell from /sbin/nologin to /bin/bash instead. In my case I prefer this option over removing the ProxyCommand.

Thoughts?

Cheers,
Tom K.

-------------------------------------------

[root@ovirt01 host-deploy]# cat ovirt-host-deploy-ansible-20180326155154-mdskvm-p02.my.dom-2e0aa2de.log
2018-03-26 15:51:56,018 p=29228 u=ovirt | Using /usr/share/ovirt-engine/playbooks/ansible.cfg as config file
2018-03-26 15:51:56,695 p=29228 u=ovirt | [DEPRECATION WARNING]: The use of 'include' for tasks has been deprecated. Use 'import_tasks' for static inclusions or 'include_tasks' for dynamic inclusions. This feature will be removed in a future release. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
2018-03-26 15:51:56,697 p=29228 u=ovirt | [DEPRECATION WARNING]: include is kept for backwards compatibility but usage is discouraged. The module documentation details page may explain more about this rationale.. This feature will be removed in a future release. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
2018-03-26 15:51:57,084 p=29228 u=ovirt | PLAY [all] *********************************************************************
2018-03-26 15:51:57,115 p=29228 u=ovirt | TASK [Gathering Facts] *********************************************************
2018-03-26 15:51:58,182 p=29228 u=ovirt | fatal: [mdskvm-p02.my.dom]: UNREACHABLE! => {
    "changed": false,
    "unreachable": true
}

MSG:

Failed to connect to the host via ssh: ssh_exchange_identification: Connection closed by remote host

2018-03-26 15:51:58,184 p=29228 u=ovirt | PLAY RECAP *********************************************************************
2018-03-26 15:51:58,184 p=29228 u=ovirt | mdskvm-p02.my.dom : ok=0 changed=0 unreachable=1 failed=0
[root@ovirt01 host-deploy]#

[root@ovirt01 host-deploy]# sudo -u ovirt /usr/bin/ansible-playbook -vvvvv --check --private-key=/etc/pki/ovirt-engine/keys/engine_id_rsa --inventory=mdskvm-p02.my.dom, /usr/share/ovirt-engine/playbooks/ovirt-host-upgrade.yml
ansible-playbook 2.4.2.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/var/lib/ovirt-engine/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible-playbook
  python version = 2.7.5 (default, Aug 4 2017, 00:39:18) [GCC 4.8.5 20150623 (Red Hat 4.8.5-16)]
Using /etc/ansible/ansible.cfg as config file
setting up inventory plugins
Parsed mdskvm-p02.my.dom, inventory source with host_list plugin
Loading callback plugin default of type stdout, v2.0 from /usr/lib/python2.7/site-packages/ansible/plugins/callback/__init__.pyc

PLAYBOOK: ovirt-host-upgrade.yml ***********************************************************************************************************************************************
1 plays in /usr/share/ovirt-engine/playbooks/ovirt-host-upgrade.yml

PLAY [all]
*********************************************************************************************************************************************************************
META: ran handlers

TASK [ovirt-host-upgrade : Install ovirt-host package if it isn't installed] ***************************************************************************************************
task path: /usr/share/ovirt-engine/playbooks/roles/ovirt-host-upgrade/tasks/main.yml:8
Using module_utils file /usr/lib/python2.7/site-packages/ansible/module_utils/_text.py
Using module_utils file /usr/lib/python2.7/site-packages/ansible/module_utils/basic.py
Using module_utils file /usr/lib/python2.7/site-packages/ansible/module_utils/urls.py
Using module_utils file /usr/lib/python2.7/site-packages/ansible/module_utils/six/__init__.py
Using module_utils file /usr/lib/python2.7/site-packages/ansible/module_utils/parsing/convert_bool.py
Using module_utils file /usr/lib/python2.7/site-packages/ansible/module_utils/parsing/__init__.py
Using module_utils file /usr/lib/python2.7/site-packages/ansible/module_utils/pycompat24.py
Using module file /usr/lib/python2.7/site-packages/ansible/modules/packaging/os/yum.py
<mdskvm-p02.my.dom> ESTABLISH SSH CONNECTION FOR USER: root
<mdskvm-p02.my.dom> SSH: ansible.cfg set ssh_args: (-C)(-o)(ControlMaster=auto)(-o)(ControlPersist=60s)
<mdskvm-p02.my.dom> SSH: ANSIBLE_PRIVATE_KEY_FILE/private_key_file/ansible_ssh_private_key_file set: (-o)(IdentityFile="/etc/pki/ovirt-engine/keys/engine_id_rsa")
<mdskvm-p02.my.dom> SSH: ansible_password/ansible_ssh_pass not set: (-o)(KbdInteractiveAuthentication=no)(-o)(PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey)(-o)(PasswordAuthentication=no)
<mdskvm-p02.my.dom> SSH: ANSIBLE_REMOTE_USER/remote_user/ansible_user/user/-u set: (-o)(User=root)
<mdskvm-p02.my.dom> SSH: ANSIBLE_TIMEOUT/timeout set: (-o)(ConnectTimeout=10)
<mdskvm-p02.my.dom> SSH: found only ControlPersist; added ControlPath:
(-o)(ControlPath=/var/lib/ovirt-engine/.ansible/cp/582d02949b)
<mdskvm-p02.my.dom> SSH: EXEC ssh -vvv -C -o ControlMaster=auto -o ControlPersist=60s -o 'IdentityFile="/etc/pki/ovirt-engine/keys/engine_id_rsa"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/var/lib/ovirt-engine/.ansible/cp/582d02949b mdskvm-p02.my.dom '/bin/sh -c '"'"'echo ~ && sleep 0'"'"''
<mdskvm-p02.my.dom> (255, '', 'OpenSSH_7.4p1, OpenSSL 1.0.2k-fips 26 Jan 2017\r\ndebug1: Reading configuration data /etc/ssh/ssh_config\r\ndebug1: /etc/ssh/ssh_config line 62: Applying options for *\r\ndebug1: auto-mux: Trying existing master\r\ndebug1: Control socket "/var/lib/ovirt-engine/.ansible/cp/582d02949b" does not exist\r\ndebug1: Executing proxy command: exec /usr/bin/sss_ssh_knownhostsproxy -p 22 mdskvm-p02.my.dom\r\ndebug3: timeout: 10000 ms remain after connect\r\ndebug1: permanently_drop_suid: 108\r\ndebug1: key_load_public: No such file or directory\r\ndebug1: identity file /etc/pki/ovirt-engine/keys/engine_id_rsa type -1\r\ndebug1: key_load_public: No such file or directory\r\ndebug1: identity file /etc/pki/ovirt-engine/keys/engine_id_rsa-cert type -1\r\ndebug1: Enabling compatibility mode for protocol 2.0\r\ndebug1: Local version string SSH-2.0-OpenSSH_7.4\r\ndebug1: ssh_exchange_identification: This account is currently not available.\n\r\nssh_exchange_identification: Connection closed by remote host\r\n')
fatal: [mdskvm-p02.my.dom]: UNREACHABLE!
=> {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: OpenSSH_7.4p1, OpenSSL 1.0.2k-fips 26 Jan 2017\r\ndebug1: Reading configuration data /etc/ssh/ssh_config\r\ndebug1: /etc/ssh/ssh_config line 62: Applying options for *\r\ndebug1: auto-mux: Trying existing master\r\ndebug1: Control socket \"/var/lib/ovirt-engine/.ansible/cp/582d02949b\" does not exist\r\ndebug1: Executing proxy command: exec /usr/bin/sss_ssh_knownhostsproxy -p 22 mdskvm-p02.my.dom\r\ndebug3: timeout: 10000 ms remain after connect\r\ndebug1: permanently_drop_suid: 108\r\ndebug1: key_load_public: No such file or directory\r\ndebug1: identity file /etc/pki/ovirt-engine/keys/engine_id_rsa type -1\r\ndebug1: key_load_public: No such file or directory\r\ndebug1: identity file /etc/pki/ovirt-engine/keys/engine_id_rsa-cert type -1\r\ndebug1: Enabling compatibility mode for protocol 2.0\r\ndebug1: Local version string SSH-2.0-OpenSSH_7.4\r\ndebug1: ssh_exchange_identification: This account is currently not available.\n\r\nssh_exchange_identification: Connection closed by remote host\r\n",
    "unreachable": true
}
[WARNING]: Could not create retry file '/usr/share/ovirt-engine/playbooks/ovirt-host-upgrade.retry'. [Errno 13] Permission denied: u'/usr/share/ovirt-engine/playbooks/ovirt-host-upgrade.retry'

PLAY RECAP *********************************************************************************************************************************************************************
mdskvm-p02.my.dom : ok=0 changed=0 unreachable=1 failed=0
[root@ovirt01 host-deploy]#
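The OVE_IGNORE_PROXY_COMMAND idea proposed earlier (generate a new ssh_config from the old one, minus the ProxyCommand) boils down to a simple line filter. A minimal Python sketch of that step; the function name and behavior are my own illustration, not oVirt code:

```python
def strip_proxy_command(ssh_config_text):
    """Return ssh_config text with any ProxyCommand lines removed.

    Everything else (Host stanzas, other options) passes through
    unchanged, so the generated file behaves like the original
    minus the proxy.
    """
    kept = []
    for line in ssh_config_text.splitlines():
        # ssh_config keywords are case-insensitive and may be indented.
        if line.lstrip().lower().startswith("proxycommand"):
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"


example = (
    "Host *\n"
    "    ProxyCommand /usr/bin/sss_ssh_knownhostsproxy -p %p %h\n"
    "    GSSAPIAuthentication yes\n"
)
print(strip_proxy_command(example))
```

The filtered text could then be written to a private config file used only when the engine adds hosts, leaving the system-wide /etc/ssh/ssh_config untouched.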
+ One additional step, on top of Ab's steps, that I followed and then rebooted:

pwd
/var/lib/ovirt-engine/.ssh
cat known_hosts > /var/lib/sss/pubconf/known_hosts
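The workaround above can be sketched as a small script. This is my own illustration, parameterized over temporary paths so it can be run safely; per the comment, the real source and destination on an engine host are /var/lib/ovirt-engine/.ssh/known_hosts and /var/lib/sss/pubconf/known_hosts:

```python
import os
import tempfile


def publish_known_hosts(src_path, dst_path):
    """Append the engine user's known_hosts entries to the file that
    sss_ssh_knownhostsproxy consults, skipping entries already present
    (unlike the one-shot `cat >` above, this does not clobber the file)."""
    with open(src_path) as f:
        entries = [ln for ln in f.read().splitlines() if ln.strip()]
    existing = set()
    if os.path.exists(dst_path):
        with open(dst_path) as f:
            existing = set(f.read().splitlines())
    os.makedirs(os.path.dirname(dst_path), exist_ok=True)
    with open(dst_path, "a") as f:
        for entry in entries:
            if entry not in existing:
                f.write(entry + "\n")


# Demonstration against a temp dir; on a real engine host the paths
# would be the ones quoted in the comment above.
tmp = tempfile.mkdtemp()
src = os.path.join(tmp, "known_hosts")
dst = os.path.join(tmp, "pubconf", "known_hosts")
with open(src, "w") as f:
    f.write("rhevh1.ab.lab ssh-rsa AAAAB3...example\n")
publish_known_hosts(src, dst)
```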
This bug is included in the oVirt 4.2.2 release, published on March 28th, 2018. Since the problem described in this bug report should be resolved in oVirt 4.2.2, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.
(In reply to Tom K from comment #43)
> ProxyCommand is a valid sshd option and could be set to anything, not just
> what FreeIPA (IDM) or SSSD sets it too. So it's use is correct and could be
> set to user definable commands which in turn could perhaps break again in
> the future.
>
> Instead of providing the comment above, have you considered giving the user
> a choice on one of the 3 options above via configurable startup security
> options to oVirt? Through a separate config file perhaps?

In the end we decided to fix it differently. ssh has the '-F' parameter, which can override the system SSH configuration, so we will use it internally; see bug #1562013.
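To illustrate the '-F' approach in a hedged way (the config path below is hypothetical, and this is not the actual engine implementation from bug #1562013): ssh reads only the file named by -F instead of /etc/ssh/ssh_config, so a system-wide ProxyCommand such as sss_ssh_knownhostsproxy can no longer intercept the connection.

```python
def build_ssh_command(host, config_path, identity):
    """Build an ssh invocation that bypasses /etc/ssh/ssh_config by
    pointing -F at a private config file, so a system-wide
    ProxyCommand cannot interfere with the engine's deploy SSH."""
    return [
        "ssh",
        "-F", config_path,  # use this config instead of the system one
        "-i", identity,     # the engine's deploy key
        "root@" + host,
    ]


cmd = build_ssh_command(
    "mdskvm-p02.my.dom",
    "/var/lib/ovirt-engine/.ssh/config",         # hypothetical path
    "/etc/pki/ovirt-engine/keys/engine_id_rsa",  # key path from the logs above
)
print(" ".join(cmd))
```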