Bug 1734356

Summary: overcloud upgrade prepare displays error: "Couldn't not import keys to one of"
Product: Red Hat OpenStack Reporter: Jose Luis Franco <jfrancoa>
Component: python-tripleoclient Assignee: Sergii Golovatiuk <sgolovat>
Status: CLOSED ERRATA QA Contact: Ronnie Rasouli <rrasouli>
Severity: medium Docs Contact:
Priority: medium    
Version: 15.0 (Stein) CC: aschultz, hbrock, jschluet, jslagle, lbezdick, mburns, psahoo, sgolovat, slinaber
Target Milestone: --- Keywords: Triaged, ZStream
Target Release: ---   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: python-tripleoclient-11.5.1-0.20190808130445.f83a1ed.el8ost Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-03-05 11:59:15 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jose Luis Franco 2019-07-30 10:44:14 UTC
During the upgrade from OSP14 to OSP15, the overcloud upgrade prepare step appears to pass. However, a closer look at the logs shows the following ERROR entry:

2019-07-30 10:07:04 | Connection reset by 192.168.24.9 port 22^M
2019-07-30 10:07:04 | Enabling ssh admin (tripleo-admin) for hosts:
2019-07-30 10:07:04 | 192.168.24.9 192.168.24.7 192.168.24.8 192.168.24.16
2019-07-30 10:07:04 | Using ssh user heat-admin for initial connection.
2019-07-30 10:07:04 | Using ssh key at /home/stack/.ssh/id_rsa for initial connection.
2019-07-30 10:07:04 | Inserting TripleO short term key for 192.168.24.9
2019-07-30 10:07:04 | Removing short term keys locally
2019-07-30 10:07:04 | 2019-07-30 10:07:04.988 51635 ERROR tripleoclient.v1.overcloud_upgrade.MajorUpgradePrepare [-] Couldn't not import keys to one of ['192.168.24.9', '192.168.24.7', '192.168.24.8', '192.168.24.16']. Check if the user/ip are corrects.
2019-07-30 10:07:04 | : subprocess.CalledProcessError: Command '['ssh', '-o', 'ConnectionAttempts=6', '-o', 'ConnectTimeout=30', '-o', 'StrictHostKeyChecking=no', '-o', 'PasswordAuthentication=no', '-o', 'UserKnownHostsFile=/dev/null', '-o', 'StrictHostKeyChecking=no', '-i', '/home/stack/.ssh/id_rsa', '-l', 'heat-admin', '192.168.24.9', "echo -e '\nssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQC40BQ4jyGYRbC1euZDmkvTjN4U0yuNaTLKHB4cgeH2FZDAY1dksEBoO4ba7LzCi33WRUGk11ojBz+Jwtwr9FUQlcnk/pKGVHOJVLxOTg7Hnz2Zit4y1vd3HtglRI5fWnDCMeQ1p/a1u6ypdMBHMaAf17tMfi8f94qlwXovS0PQkZo1Nn3fvLy9QM0e2gaCd0SZsOnXHiLy4dRtNXbxQA0ODOat92N3F8U3POuQMy1AlRXozBurL2C7nYiSKyAphWXB9b/KoxqA9J+ZWyjoPFa94YA6O0stBxTYMXNHHpgaKg172omU7FlLa707e6m5K3u2/cGdavdty6u2pPTcU7/+5ngGE3ucnPEHuVZwoyc0L31IuAhnpHdp3hGMc37oqfD609bx6upX86zAupMFO3W1NgMZ1Od1Bm5/VW4IfVc8Jnif/LmPBRS3XD94P+/7zI2UsOZPwkt61JaOBRDOe1FbK/WXuDS8vEUtKgnLXiWIm5dLxceWMcRo32egfx6UZ9h04oYftmWVZ1LPgCTgro9KExZhirEXlY3i8TnAaIsFqcgtENLIiOe0CilVjyrt0Y9Q6L06KH/zmvnwpQ8uVQiwG6DuwLEATvC4olAZ0f/02EqMVLDCuhO7W6SM9ttXtGYQCwqDn+fN3MbyGClOX/Bmpx+6KP0+PKU3v1JylnOu9Q== TripleO split stack short term key\n' >> $HOME/.ssh/authorized_keys"]' returned non-zero exit status 255.^[[00m
2019-07-30 10:07:04 | 2019-07-30 10:07:04.989 51635 INFO tripleoclient.v1.overcloud_upgrade.MajorUpgradePrepare [-] Completed Overcloud Upgrade Prepare for stack overcloud^[[00m
2019-07-30 10:07:04 | 2019-07-30 10:07:04.991 51635 INFO osc_lib.shell [-] END return value: None^[[00m

When checking whether those IPs were accessible from the Undercloud, they were pingable. They were also pingable from the mistral_executor container, which is where the Mistral workflow is run.
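
A successful ping only shows ICMP reachability, while the failing ssh command returned exit status 255 (the status ssh uses for its own errors) right after "Connection reset by 192.168.24.9 port 22". A possible next triage step is to check whether each node actually answers on the SSH port. The following is a minimal sketch under that assumption; read_ssh_banner is a hypothetical helper written for this report, not part of tripleoclient, and it assumes the server sends its SSH identification string first, as sshd normally does.

```python
import socket
import time


def read_ssh_banner(host, port=22, attempts=3, timeout=5.0, delay=1.0):
    """Return the SSH identification string sent by host:port, or None.

    Hypothetical triage helper: reading the banner shows that something
    accepts TCP connections on the SSH port and speaks the SSH protocol,
    which a ping cannot show.
    """
    for attempt in range(attempts):
        try:
            # The context manager closes the socket even when recv() fails.
            with socket.create_connection((host, port), timeout=timeout) as conn:
                banner = conn.recv(256).decode("ascii", errors="replace").strip()
                if banner.startswith("SSH-"):
                    return banner
        except OSError:
            # Covers refused, reset, and timed-out connections.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return None
```

Calling read_ssh_banner("192.168.24.9") from the Undercloud and from the mistral_executor container would return the banner if sshd responds and None if the connection is refused, reset, or times out on every attempt.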

Besides this error, some confusing messages appear at the very end of the overcloud upgrade prepare output:

2019-07-30 10:07:04 | 2019-07-30 10:07:04.989 51635 INFO tripleoclient.v1.overcloud_upgrade.MajorUpgradePrepare [-] Completed Overcloud Upgrade Prepare for stack overcloud^[[00m
2019-07-30 10:07:04 | 2019-07-30 10:07:04.991 51635 INFO osc_lib.shell [-] END return value: None^[[00m
2019-07-30 10:07:04 | /usr/lib/python3.6/site-packages/tripleoclient/workflows/deployment.py:225: ResourceWarning: unclosed file <_io.TextIOWrapper name='/tmp/tmptqbtjjum/id_rsa.pub' mode='r' encoding='UTF-8'>
2019-07-30 10:07:04 |   tmp_key_public_contents = open(tmp_key_public).read()
2019-07-30 10:07:04 | /usr/lib/python3.6/site-packages/tripleoclient/workflows/deployment.py:170: ResourceWarning: unclosed <socket.socket fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('192.168.24.1', 34672), raddr=('192.168.24.9', 22)>
2019-07-30 10:07:04 |   socket.socket().connect((host, 22))
2019-07-30 10:07:05 | sys:1: ResourceWarning: unclosed <ssl.SSLSocket fd=7, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('192.168.24.2', 34048), raddr=('192.168.24.2', 13808)>
2019-07-30 10:07:05 | sys:1: ResourceWarning: unclosed <ssl.SSLSocket fd=4, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('192.168.24.2', 35242)>
2019-07-30 10:07:05 | sys:1: ResourceWarning: unclosed <ssl.SSLSocket fd=8, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('192.168.24.2', 53322)>
2019-07-30 10:07:05 | sys:1: ResourceWarning: unclosed <ssl.SSLSocket fd=6, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('192.168.24.2', 51414)>


These might be unrelated to this issue, so I can open a separate bug for them if required. For now, I will use this Bugzilla to track them and at least triage the issue.
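
For reference, the two ResourceWarning entries that point into deployment.py come from `open(tmp_key_public).read()` and `socket.socket().connect((host, 22))`, both of which leave the file or socket open until the garbage collector reclaims it. A minimal sketch of how such calls are usually written so the resources are closed deterministically follows. The function names read_public_key and can_connect are hypothetical and this is not the actual tripleoclient patch.

```python
import socket


def read_public_key(path):
    """Read a public key file and close the handle when the block exits."""
    with open(path) as key_file:
        return key_file.read()


def can_connect(host, port=22, timeout=10.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # The context manager closes the socket as soon as the check is done.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

The `sys:1` warnings about unclosed ssl.SSLSocket objects do not name a source line, so this sketch does not address them.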

Reproduce:

Follow the steps to upgrade from OSP14 to OSP15 from the WIP guideline: https://gitlab.cee.redhat.com/osp15/osp-upgrade-el8/blob/master/README.md

The error will appear when checking the logs of the overcloud upgrade prepare command.

Comment 6 Sergii Golovatiuk 2019-08-14 15:30:53 UTC
https://review.opendev.org/#/c/674631/ will be moved to a separate bug, as it is a security fix and might therefore have a higher severity.

Comment 20 errata-xmlrpc 2020-03-05 11:59:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0643