Bug 1356192
| Summary: | Add node using ssh public key fails even if the key is authorized | ||
|---|---|---|---|
| Product: | [oVirt] ovirt-node | Reporter: | Federico Fortini <blackfede> |
| Component: | Installation & Update | Assignee: | Ryan Barry <rbarry> |
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | dguo |
| Severity: | medium | Docs Contact: | |
| Priority: | high | ||
| Version: | 4.0 | CC: | blackfede, bugs, cshao, dguo, fdeutsch, huzhao, mgoldboi, rbarry, ycui, yzhao |
| Target Milestone: | ovirt-4.0.5 | Flags: | rule-engine: ovirt-4.0.z+ rule-engine: ovirt-4.1+ mgoldboi: planning_ack+ fdeutsch: devel_ack+ cshao: testing_ack+ |
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2016-09-22 08:12:09 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | Node | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Attachments: | |||
Created attachment 1179333 [details]
engine.log error about failed auth
Created attachment 1179335 [details]
Details about public key and node fingerprint
Can you please check /var/log/secure on the Node to see if there's any information there?

My guess is that we do not copy /root/.ssh, which basically disables ssh pubkey authentication after updates.

This should be covered by the osupdater.

(In reply to Fabian Deutsch from comment #4)
> My guess is that we do not copy /root/.ssh which basically disables ssh
> pubkey authentication after updates.
>
> This should be covered by the osupdater

In investigating this, it appears that /root is already rsynced by osupdater. Additionally, comment#1 indicates that pubkey auth is working.

QE, can you reproduce this? If not, we may need to move it out, since there's no obvious cause, and I can't reproduce.

daijie, see this bug and comment 5. Could you try to reproduce this bug on the QE side?

Rough steps:
1. Install Node and make sure the network is enabled.
2. ssh to the engine server.
3. Retrieve the public key from the SSH private key:
    # cd /etc/pki/ovirt-engine/keys/
    # ssh-keygen -y -f engine_id_rsa > engine_id_rsa.pub
    # ssh-copy-id -i engine_id_rsa.pub root@node_host
4. Verify whether adding the node using SSH PublicKey authentication in engine succeeds.

Hi all,

Thank you for ycui's steps.

#1. I followed the steps in comment 6:
1. Install Node and make sure the network is enabled.
2. ssh to the engine server.
3. Retrieve the public key from the SSH private key:
    # cd /etc/pki/ovirt-engine/keys/
    # ssh-keygen -y -f engine_id_rsa > engine_id_rsa.pub
    # ssh-copy-id -i engine_id_rsa.pub root@node_host

1) After step 3, the content of engine_id_rsa.pub on the engine and of authorized_keys on the node is the same, but ssh from the engine to the node still asks for a password, and adding the node to rhevm failed.
    ##[root@rhevm-40-1 keys]# ssh root.10.37
    root.10.37's password:

#2. I added further steps to those in comment 6:
1. Install Node and make sure the network is enabled.
2. ssh to the engine server.
3. Retrieve the public key from the SSH private key:
    # cd /etc/pki/ovirt-engine/keys/
    # ssh-keygen -y -f engine_id_rsa > engine_id_rsa.pub
    # ssh-copy-id -i engine_id_rsa.pub root@node_host
4. cp engine_id_rsa.pub /root/.ssh/authorized_keys
5. cp engine_id_rsa /root/.ssh/id_rsa

1) After step 5, ssh from the engine to the node does not need a password, but adding the node to the engine failed.
    [root@rhevm-40-1 .ssh]# ls
    authorized_keys id_rsa known_hosts
    [root@rhevm-40-1 .ssh]# ssh root.10.37
    Last login: Wed Sep 7 18:49:10 2016 from 10.66.148.99
    imgbase status: OK
    [root@dhcp-10-37 ~]#

Additional: /var/log/secure from the engine and from the node is in the attachment, as var_log_engine.log and var_log_node.log.

Thanks,
Yihui

Created attachment 1198652 [details]
var_log_secure_engine.log and var_log_secure_node.log
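
The manual comparison described in the QE comment (checking that engine_id_rsa.pub on the engine matches an entry in authorized_keys on the node) can be sketched in Python. This is only an illustration, not part of oVirt: the helper names `key_blob` and `is_authorized` are hypothetical, and the parser assumes simple whitespace-separated entries, so options containing quoted spaces are not handled.

```python
# Hypothetical helpers for comparing a public key with authorized_keys entries.
# Only the key type and the base64 blob are compared; leading options and the
# trailing comment are ignored.
KEY_TYPE_PREFIXES = ("ssh-", "ecdsa-sha2-", "sk-")


def key_blob(line):
    """Return (key_type, base64_blob) for one public-key line, or None."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    fields = line.split()
    # The key type may be preceded by an options field, so scan for it.
    for index, field in enumerate(fields[:-1]):
        if field.startswith(KEY_TYPE_PREFIXES):
            return field, fields[index + 1]
    return None


def is_authorized(pub_line, authorized_keys_text):
    """Return True if the key in pub_line appears in authorized_keys_text."""
    wanted = key_blob(pub_line)
    if wanted is None:
        return False
    return any(
        key_blob(entry) == wanted
        for entry in authorized_keys_text.splitlines()
    )
```

A match here only confirms that the key is installed; it says nothing about whether sshd accepts it, which is what /var/log/secure on the node would show.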
Ryan, could you help check comment 7? It seems QE cannot add the rhvh to engine via SSH Public Key either, and the environment is kept.

It appears that auth actually works (even from engine), but host-deploy logs indicate that the deployment failed because vdsm failed to start. vdsm failed to start because:

    Sep 07 16:30:04 dhcp-10-37.nay.redhat.com systemd[1]: ovirt-imageio-daemon.service holdoff time over, scheduling restart.
    Sep 07 16:30:04 dhcp-10-37.nay.redhat.com systemd[1]: Starting oVirt ImageIO Daemon...
    Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: Traceback (most recent call last):
    Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: File "/usr/bin/ovirt-imageio-daemon", line 14, in <module>
    Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: server.main(sys.argv)
    Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: File "/usr/lib/python2.7/site-packages/ovirt_imageio_daemon/server.py", line 48, in main
    Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: start(config)
    Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: File "/usr/lib/python2.7/site-packages/ovirt_imageio_daemon/server.py", line 68, in start
    Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: secure_server(config, image_server)
    Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: File "/usr/lib/python2.7/site-packages/ovirt_imageio_daemon/server.py", line 90, in secure_server
    Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: keyfile=config.key_file, server_side=True)
    Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: File "/usr/lib64/python2.7/ssl.py", line 913, in wrap_socket
    Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: ciphers=ciphers)
    Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: File "/usr/lib64/python2.7/ssl.py", line 526, in __init__
    Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: self._context.load_cert_chain(certfile, keyfile)
    Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: IOError: [Errno 2] No such file or directory
    Sep 07 16:30:05 dhcp-10-37.nay.redhat.com systemd[1]: ovirt-imageio-daemon.service: main process exited, code=exited, status=1/FAILURE

Investigating host-deploy...

I'm still not able to reproduce this locally, unfortunately. It's clear from the logs on the systems that SSH auth is working, and that it's failing when vdsm checks is_configured (by checking ovirt-imageio-daemon), but it's not clear why.

I need access to RHEVM to do any debugging here... I also don't know the authentication for your rhevm instance to try re-deploying. The clocks on the systems are slightly off, but I'm not sure if that's significant here. Can you please provide auth details for rhevm-40-1.englab tomorrow?

We don't have a reproducer for this. Can we move it out?

Federico, can you still reproduce this issue with a more recent Node image?

Closing this for now because we cannot reproduce it. Please reopen if necessary.
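
On the ovirt-imageio-daemon traceback quoted in this thread: the IOError raised from load_cert_chain(certfile, keyfile) does not say which file is missing. A minimal Python sketch of a pre-check follows. The helper name `missing_files` is hypothetical, and the certificate and key paths must come from the daemon configuration on the node, since this report does not show them.

```python
import os


def missing_files(paths):
    """Return the subset of paths that are not existing regular files."""
    # Directories and dangling symlinks are reported as missing too,
    # because neither can be loaded as a certificate or key file.
    return [path for path in paths if not os.path.isfile(path)]
```

Calling `missing_files([cert_path, key_path])` on the node, with the two paths taken from the daemon configuration, would narrow the Errno 2 down to the certificate, the key, or both.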
Created attachment 1179332 [details]
successfully login to node without password from CLI

Description of problem:
On a fresh "Ovirt Node 4.0.0" installation, adding the node to a cluster using "SSH Public Key" fails, even though I can log in from the engine server to the node without a password.

Version-Release number of selected component (if applicable):

How reproducible:
Add a new host from the engine and select SSH Public Key instead of password.

Actual results:
The add host procedure fails with the error:
"Error while executing action: Cannot add Host. SSH authentication failed, verify authentication parameters are correct (Username/Password, public-key etc.) You may refer to the engine.log file for further details."

Expected results:
The add host wizard completes successfully.

Additional info:
I'm able to issue the command ssh root.corp.vcube.it and log in without a password.
The engine server public key was installed on the host server using the command "ssh-copy-id node5.ovirt.corp.vcube.it".