Bug 1960968 - Disable checking of SSH connection when adding a host into the ansible-runner-service inventory
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.4.5
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ovirt-4.4.7
Target Release: ---
Assignee: Dana
QA Contact: Wei Wang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-05-16 22:23 UTC by Allie DeVolder
Modified: 2021-07-22 15:13 UTC (History)

Fixed In Version: ovirt-engine-4.4.7.2
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-07-22 15:12:33 UTC
oVirt Team: Infra
Target Upstream Version:
Embargoed:
pelauter: needinfo-
weiwang: testing_plan_complete+



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2021:2865 | 0 | None | None | None | 2021-07-22 15:13:39 UTC
oVirt gerrit | 115057 | 0 | master | MERGED | setup: Disable connection check when adding host to ARS inventory | 2021-06-08 15:30:28 UTC

Description Allie DeVolder 2021-05-16 22:23:14 UTC
Description of problem:
When attempting to add a second host to a new RHV environment, ansible-runner fails with NOAUTH "SSH auth error - passwordless ssh not configured for '<IP address>'" and the host fails to add. We've tested passwordless ssh from the RHV-M and it works.

Version-Release number of selected component (if applicable):
RHV-M
 rhvm-4.4.5.11-0.1
Hypervisor 
 vdsm-4.40.22-1
 ansible-2.9.11-1

How reproducible:
Unknown

Steps to Reproduce:
1. Build new hosted engine environment.
2. Attempt to add a second host.

Actual results:
Failure to add host. Log entry in /var/log/ovirt-engine/ansible-runner-service.log:
~~~
runner_service.services.hosts - ERROR - SSH - NOAUTH:SSH auth error - passwordless ssh not configured for '<Host IP Address>'
~~~

Expected results:
Successful addition of hypervisor to RHV-M.

Additional info:

Comment 2 Michal Skrivanek 2021-05-17 06:56:16 UTC
(In reply to Allie DeVolder from comment #0)
> Description of problem:
> Attempting to add a second host to a new RHV environment, ansible-runner
> fails with NOAUTH "SSH auth error - passwordless ssh not configured for '<IP
> address>'" and the host fails to add. We've tested passwordless ssh from the
> RHV-M and it works.
> 
> Version-Release number of selected component (if applicable):
> RHV-M
>  rhvm-4.4.5.11-0.1
> Hypervisor 
>  vdsm-4.40.22-1
>  ansible-2.9.11-1

4.4.5 is shipping ansible-2.9.17-1. Why do you have 2.9.11?

Anyway, what do you mean by passwordless? How exactly is the host being added?

Comment 3 Martin Perina 2021-05-17 07:08:40 UTC
Are you sure that you are adding the host using the SSH public key option? Are you using webadmin or the REST API?
Also, according to the customer case, a bunch of hacks were performed on ansible-runner-service internals around ssh_key.pub. I strongly suggest undoing all of those hacks; there is no need for them, because the engine SSH public key is passed to the ansible-runner-service when adding a host with public key authentication selected.

Comment 4 Allie DeVolder 2021-05-17 17:36:58 UTC
The workaround we tried was manually copying ssh_key.pub to /usr/share/ovirt-engine/ansible-runner-service-project/env/ssh_key.pub because the error before that was:

~~~
2021-05-14 15:40:51,766 - runner_service.services.hosts - ERROR - SSH - NOAUTH:SSH auth error - passwordless ssh not configured for '10.104.136.149'
2021-05-14 15:40:51,767 - flask.app - ERROR - Exception on /api/v1/hosts/10.104.136.149/groups/ovirt [POST]
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/usr/lib/python3.6/site-packages/flask_restful/__init__.py", line 480, in wrapper
    resp = resource(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/flask/views.py", line 88, in view
    return self.dispatch_request(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/flask_restful/__init__.py", line 595, in dispatch_request
    resp = meth(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/runner_service/controllers/utils.py", line 29, in wrapper
    return f(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/runner_service/controllers/hosts.py", line 212, in post
    response = add_host(host_name, group, ssh_port)
  File "/usr/lib/python3.6/site-packages/runner_service/services/hosts.py", line 49, in add_host
    r.data = {"pub_key": fread(pub_key_file)}
  File "/usr/lib/python3.6/site-packages/runner_service/utils.py", line 34, in fread
    with open(file_path, 'r') as file_fd:
FileNotFoundError: [Errno 2] No such file or directory: '/usr/share/ovirt-engine/ansible-runner-service-project/env/ssh_key.pub'
~~~

This was an attempt to see whether the ansible script was failing to pull that file from that location. The only difference was that, with the file in place, the request ended without the FileNotFoundError.

So, if I need to file a bugzilla against THAT instead of this one, then let me know and I'll do that. But the only difference in behavior is that missing file traceback.
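
For triaging a longer ansible-runner-service.log, the NOAUTH lines can be pulled out with a small script. This is a minimal sketch, not part of ansible-runner-service: the function name `find_noauth_hosts` is hypothetical, and the regular expression assumes the exact line format shown in the excerpt above.

```python
import re

# Assumed line format, taken from the log excerpt in this comment.
NOAUTH_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - "
    r"runner_service\.services\.hosts - ERROR - SSH - NOAUTH:.*"
    r"not configured for '(?P<host>[^']+)'"
)


def find_noauth_hosts(lines):
    """Return (timestamp, host) pairs for every NOAUTH error line."""
    hits = []
    for line in lines:
        match = NOAUTH_RE.match(line.strip())
        if match:
            hits.append((match.group("ts"), match.group("host")))
    return hits


# The two log lines from the excerpt above, with the terminal wrapping removed.
sample = [
    "2021-05-14 15:40:51,766 - runner_service.services.hosts - ERROR - SSH - "
    "NOAUTH:SSH auth error - passwordless ssh not configured for '10.104.136.149'",
    "2021-05-14 15:40:51,767 - flask.app - ERROR - Exception on "
    "/api/v1/hosts/10.104.136.149/groups/ovirt [POST]",
]
noauth_hits = find_noauth_hosts(sample)
print(noauth_hits)
```

Against a real log, pass an open file object instead of `sample`, for example `find_noauth_hosts(open("/var/log/ovirt-engine/ansible-runner-service.log"))`.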

Comment 5 Martin Perina 2021-05-18 09:41:34 UTC
(In reply to Allie DeVolder from comment #4)
> The workaround we tried was manually copying ssh_key.pub to
> /usr/share/ovirt-engine/ansible-runner-service-project/env/ssh_key.pub
> because the error before that was:
> 
> ~~~
> 2021-05-14 15:40:51,766 - runner_service.services.hosts - ERROR - SSH -
> NOAUTH:SSH auth error - passwordless ssh not configured for '10.104.136.149'
> 2021-05-14 15:40:51,767 - flask.app - ERROR - Exception on
> /api/v1/hosts/10.104.136.149/groups/ovirt [POST]
> Traceback (most recent call last):
>   File "/usr/lib/python3.6/site-packages/flask/app.py", line 1813, in
> full_dispatch_request
>     rv = self.dispatch_request()
>   File "/usr/lib/python3.6/site-packages/flask/app.py", line 1799, in
> dispatch_request
>     return self.view_functions[rule.endpoint](**req.view_args)
>   File "/usr/lib/python3.6/site-packages/flask_restful/__init__.py", line
> 480, in wrapper
>     resp = resource(*args, **kwargs)
>   File "/usr/lib/python3.6/site-packages/flask/views.py", line 88, in view
>     return self.dispatch_request(*args, **kwargs)
>   File "/usr/lib/python3.6/site-packages/flask_restful/__init__.py", line
> 595, in dispatch_request
>     resp = meth(*args, **kwargs)
>   File
> "/usr/lib/python3.6/site-packages/runner_service/controllers/utils.py", line
> 29, in wrapper
>     return f(*args, **kwargs)
>   File
> "/usr/lib/python3.6/site-packages/runner_service/controllers/hosts.py", line
> 212, in post
>     response = add_host(host_name, group, ssh_port)
>   File "/usr/lib/python3.6/site-packages/runner_service/services/hosts.py",
> line 49, in add_host
>     r.data = {"pub_key": fread(pub_key_file)}
>   File "/usr/lib/python3.6/site-packages/runner_service/utils.py", line 34,
> in fread
>     with open(file_path, 'r') as file_fd:
> FileNotFoundError: [Errno 2] No such file or directory:
> '/usr/share/ovirt-engine/ansible-runner-service-project/env/ssh_key.pub'
> ~~~
> 
> This was an attempt to see if the ansible script was failing to pull that
> file from the location, and the only difference was that the file then
> existed and ended without the FileNotFoundError.
> 
> So, if I need to file a bugzilla against THAT instead of this one, then let
> me know and I'll do that. But the only difference in behavior is that
> missing file traceback.

The above error is irrelevant: this file is not used at all by ansible-runner-service when the service is utilized by RHV Manager.

I've just performed a simple test, and adding a new host using the engine public key works as expected:

1. Go to the host and create SSH authorized_keys file:

    ssh root@<YOUR HOST>
    mkdir ~/.ssh
    chmod 700 ~/.ssh
    touch ~/.ssh/authorized_keys
    chmod 600 ~/.ssh/authorized_keys
    curl --insecure 'https://<RHV MANAGER FQDN>/ovirt-engine/services/pki-resource?resource=engine-certificate&format=OPENSSH-PUBKEY' -o ~/.ssh/authorized_keys

2. Verify public key access to the host from RHV Manager:

    ssh -i /etc/pki/ovirt-engine/keys/engine_id_rsa root@<YOUR HOST>


3. Add a host using public key in webadmin:
    a. Go to Compute/Hosts and click on New Host
    b. Fill in Name and Hostname with correct values
    c. Select SSH Public Key in the Authentication section
    d. Fill in other values if needed
    e. Click OK

Above steps are enough to successfully add a host using engine SSH public key.
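
Step 1 above can also be scripted. The following is a minimal Python sketch under stated assumptions, not an official tool: the function names are hypothetical, the URL is the pki-resource URL from step 1, and certificate verification is disabled only to mirror `curl --insecure`. Unlike `curl -o`, which overwrites authorized_keys, this sketch appends the key and skips it if it is already present.

```python
import os
import ssl
import urllib.request


def engine_pubkey_url(engine_fqdn):
    """Build the pki-resource URL from step 1 for the engine's OpenSSH public key."""
    return (
        f"https://{engine_fqdn}/ovirt-engine/services/pki-resource"
        "?resource=engine-certificate&format=OPENSSH-PUBKEY"
    )


def fetch_engine_pubkey(engine_fqdn):
    """Download the key without certificate verification (like curl --insecure)."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(engine_pubkey_url(engine_fqdn), context=ctx) as resp:
        return resp.read().decode("ascii").strip()


def install_key(key, ssh_dir):
    """Append key to ssh_dir/authorized_keys unless present; return True if added."""
    os.makedirs(ssh_dir, exist_ok=True)
    os.chmod(ssh_dir, 0o700)
    path = os.path.join(ssh_dir, "authorized_keys")
    content = ""
    if os.path.exists(path):
        with open(path) as fh:
            content = fh.read()
    already_present = key in (line.strip() for line in content.splitlines())
    if not already_present:
        with open(path, "a") as fh:
            # Keep existing entries intact if the file lacks a trailing newline.
            if content and not content.endswith("\n"):
                fh.write("\n")
            fh.write(key + "\n")
    os.chmod(path, 0o600)
    return not already_present
```

Run on the host as root, the equivalent of step 1 would be `install_key(fetch_engine_pubkey("<RHV MANAGER FQDN>"), os.path.expanduser("~/.ssh"))`.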

I've also checked, and /usr/share/ovirt-engine/ansible-runner-service-project/env/ssh_key.pub doesn't exist. So could you please remove that file (and any other manual modifications that have been made) and try the above steps on the setup?

Comment 6 Allie DeVolder 2021-05-18 14:25:48 UTC
The customer was using the previous version of RHV-H rather than the 4.4.5 version released last week. They're currently wiping and reinstalling from the newest ISO, and I'll report back if the issue persists.

Comment 13 Wei Wang 2021-05-20 08:44:37 UTC
Test with rhvh-4.4.5.4-0.20210330.0

I used the two latest rhvm-appliance builds to deploy the hosted engine (HE). After HE setup, I checked the rhvm version:
a) rhvm-appliance-4.4-20210310.0.el8ev.x86_64
[root@rhevh-hostedengine-vm-05 ~]# rpm -qa|grep rhvm
rhvm-branding-rhv-4.4.7-1.el8ev.noarch
rhvm-dependencies-4.4.1-1.el8ev.noarch
rhvm-4.4.4.7-0.2.el8ev.noarch
rhvm-setup-plugins-4.4.2-1.el8ev.noarch

b) rhvm-appliance-4.4-20210402.1.el8ev.x86_64
[root@rhevh-hostedengine-vm-05 ~]# rpm -qa|grep rhvm
rhvm-dependencies-4.4.1-1.el8ev.noarch
rhvm-4.4.6.1-0.11.el8ev.noarch
rhvm-setup-plugins-4.4.2-1.el8ev.noarch
rhvm-branding-rhv-4.4.7-1.el8ev.noarch

So, first, I cannot get rhvm-4.4.5 from the rhvm-appliance builds that we usually use for testing.

Second, I used rhvm-appliance-4.4-20210402.1.el8ev.x86_64 to retest this issue, using two methods (password authentication and public key authentication) to add a second host to the HE environment. Both methods succeeded.


I guess the problem is related to your rhvm-4.4.5.11-0.1.
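
The version comparison above can be sketched in a few lines of Python. This is only an illustration: `rhvm_version` is a hypothetical helper, and comparing integer tuples is a simplification of real RPM version ordering that happens to be sufficient for the version strings listed in this bug.

```python
import re


def rhvm_version(rpm_names):
    """Return the version of the 'rhvm' package as a tuple of ints, or None."""
    for name in rpm_names:
        # Requires a digit right after "rhvm-", so rhvm-branding-rhv etc. are skipped.
        match = re.match(r"^rhvm-(\d+(?:\.\d+)*)-", name)
        if match:
            return tuple(int(part) for part in match.group(1).split("."))
    return None


# `rpm -qa | grep rhvm` output from the two appliance builds listed above.
build_a = [
    "rhvm-branding-rhv-4.4.7-1.el8ev.noarch",
    "rhvm-dependencies-4.4.1-1.el8ev.noarch",
    "rhvm-4.4.4.7-0.2.el8ev.noarch",
    "rhvm-setup-plugins-4.4.2-1.el8ev.noarch",
]
build_b = [
    "rhvm-dependencies-4.4.1-1.el8ev.noarch",
    "rhvm-4.4.6.1-0.11.el8ev.noarch",
    "rhvm-setup-plugins-4.4.2-1.el8ev.noarch",
    "rhvm-branding-rhv-4.4.7-1.el8ev.noarch",
]
reported = rhvm_version(["rhvm-4.4.5.11-0.1"])  # version from the bug description

# The reported version falls between the two appliance builds.
print(rhvm_version(build_a) < reported < rhvm_version(build_b))
```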

Comment 26 Wei Wang 2021-06-11 08:23:44 UTC
QE doesn't have the customer's environment to reproduce this issue, could you please help to verify it? 

Thanks!

Comment 27 Martin Perina 2021-06-11 09:09:37 UTC
(In reply to Wei Wang from comment #26)
> QE doesn't have the customer's environment to reproduce this issue, could
> you please help to verify it? 
> 
> Thanks!

Verifying the change we implemented is easy: one fewer SSH connection should be observed when adding a new host. Please take a look at comment 14:

1. There should be a new SSH connection observed on the host correlated to ssh-copy-id (message "Executing ssh-copy-id command on host" in engine.log).

2. There should be a new SSH connection observed on the host related to execution of the Ansible playbook performing the host deploy process.

Without the fix there is another new SSH connection from engine to the host. With the fix there shouldn't be any other connection between those described above.
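
One way to count those connections is to scan the host's sshd log for logins from the engine. The sketch below is an assumption-laden illustration: `count_engine_logins` is a hypothetical helper, the sample lines and IP addresses are made up, and the pattern relies on the usual OpenSSH "Accepted <method> for <user> from <ip> port <port>" message. The expected count depends on each step opening exactly one connection, which may not hold in every configuration.

```python
import re

# Usual OpenSSH success message; where it is logged varies by distribution.
ACCEPTED_RE = re.compile(
    r"sshd\[\d+\]: Accepted (?P<method>\S+) for (?P<user>\S+) "
    r"from (?P<ip>\S+) port (?P<port>\d+)"
)


def count_engine_logins(lines, engine_ip):
    """Count accepted SSH logins that originate from engine_ip."""
    count = 0
    for line in lines:
        match = ACCEPTED_RE.search(line)
        if match and match.group("ip") == engine_ip:
            count += 1
    return count


# Made-up sample lines (192.0.2.10 stands in for the engine address).
sample_sshd_log = [
    "Jun 28 11:39:05 host sshd[2101]: Accepted publickey for root from 192.0.2.10 port 50112 ssh2",
    "Jun 28 11:39:12 host sshd[2140]: Accepted publickey for root from 192.0.2.10 port 50130 ssh2",
    "Jun 28 11:40:01 host sshd[2203]: Accepted password for admin from 192.0.2.99 port 40022 ssh2",
]
engine_logins = count_engine_logins(sample_sshd_log, "192.0.2.10")
print(engine_logins)
```

Following the description above, a host added with the fix would show the two connections from items 1 and 2, and one more without the fix.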

Comment 28 Wei Wang 2021-06-18 07:56:07 UTC
The latest rhvm-appliance-4.4-20210527.0.el8ev.x86_64 includes ovirt-engine-4.3.11.3-0.1.el7.noarch, so QE will wait to verify this bug until a build that includes ovirt-engine-4.4.7.2 is available.

Comment 30 Wei Wang 2021-06-28 06:14:59 UTC
Test Version:
RHVH-4.4-20210624.0-RHVH-x86_64-dvd1.iso
rhvm-appliance-4.4-20210625.0.el8ev.x86_64
ovirt-engine-4.4.7.5-0.9.el8ev.noarch


Test Steps:
1. Clean install RHVH with host A and B
2. Deploy hosted engine with host A
3. Add additional host B to hosted engine environment with passwordless ssh authentication according to comment 8 and comment 5
4. Check the SSH connection in engine.log

Test Result:

2021-06-28 11:39:12,302+08 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-1) [7ca1a78a] EVENT_ID: VDS_ANSIBLE_INSTALL_STARTED(560), Ansible host-deploy playbook execution has started on host hp-dl388g9-04.lab.eng.pek2.redhat.com.


There is no "runner_service.services.hosts - ERROR - SSH - NOAUTH:SSH auth error - passwordless ssh not configured" error in ansible-runner-service.log, and no error in the sshd log.


The bug is fixed; moving it to "VERIFIED".
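
The two log checks in this test result can be sketched as a small Python script. This is a rough illustration only: `host_deploy_verified` is a hypothetical helper, it assumes the EVENT_ID format seen in the engine.log line above, and it treats any line containing "NOAUTH" in ansible-runner-service.log as a failure.

```python
import re

# Assumed audit-log format: "EVENT_ID: NAME(code), message".
EVENT_RE = re.compile(r"EVENT_ID: (?P<name>[A-Z_]+)\((?P<code>\d+)\)")


def host_deploy_verified(engine_lines, runner_lines):
    """True if host deploy started and the runner log has no NOAUTH error."""
    started = False
    for line in engine_lines:
        match = EVENT_RE.search(line)
        if match and match.group("name") == "VDS_ANSIBLE_INSTALL_STARTED":
            started = True
    noauth_seen = any("NOAUTH" in line for line in runner_lines)
    return started and not noauth_seen


# engine.log line from the test result above; the runner log is assumed clean.
engine_sample = [
    "2021-06-28 11:39:12,302+08 INFO  "
    "[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] "
    "(EE-ManagedThreadFactory-engine-Thread-1) [7ca1a78a] "
    "EVENT_ID: VDS_ANSIBLE_INSTALL_STARTED(560), Ansible host-deploy playbook "
    "execution has started on host hp-dl388g9-04.lab.eng.pek2.redhat.com.",
]
verified = host_deploy_verified(engine_sample, [])
print(verified)
```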

Comment 35 errata-xmlrpc 2021-07-22 15:12:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: RHV Manager (ovirt-engine) security update [ovirt-4.4.7]), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2865

