Bug 1722636 - [DCN][Spine & Leaf] ssh timeout into compute overcloud nodes post overcloud.AllNodesDeploySteps
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 13.0 (Queens)
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: z7
Target Release: 13.0 (Queens)
Assignee: Harald Jensås
QA Contact: Sasha Smolyak
URL:
Whiteboard:
Depends On:
Blocks: 1723975 1724560 1724565
Reported: 2019-06-20 20:17 UTC by bjacot
Modified: 2019-08-02 00:12 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-8.3.1-53.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1723975 1724560
Environment:
Last Closed: 2019-07-10 13:05:56 UTC
Target Upstream Version:
Embargoed:


Attachments
templates being used (11.87 KB, application/gzip)
2019-06-20 20:22 UTC, bjacot


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1834161 0 None None None 2019-06-25 09:20:29 UTC
OpenStack gerrit 667297 0 'None' MERGED Queens only - allow SSH from any source 2020-09-03 15:28:53 UTC
Red Hat Knowledge Base (Solution) 4318871 0 Troubleshoot None Default iptables does not allow ssh service in overcloud nodes 2019-08-02 00:12:19 UTC
Red Hat Product Errata RHBA-2019:1738 0 None None None 2019-07-10 13:06:14 UTC

Description bjacot 2019-06-20 20:17:05 UTC
Description of problem:
During the overcloud deployment, once a node's status has changed from BUILD to ACTIVE, the end user can SSH into the node.

During and after overcloud.AllNodesDeploySteps, it is no longer possible to SSH into the compute nodes on the leaf networks.

The environment uses a spine-and-leaf networking setup.

From the console of the overcloud compute node, the end user can still SSH into the director. Pings are successful both ways.

Version-Release number of selected component (if applicable):
OSP13 
13  -p 2019-06-13.2
openstack-tripleo-heat-templates-8.3.1-43.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
0. Configure the spine and at least 1 leaf network.
1. Enroll and introspect the bare-metal nodes (controllers on leaf0, computes on leaf1).
2. Prepare the overcloud deploy.
3. During or after overcloud.AllNodesDeploySteps, SSH connections to the compute node time out.

Actual results:
Port 22: Connection timed out

Expected results:
The SSH connection succeeds.

Additional info:

(undercloud) [stack@core-undercloud-0 ~]$ openstack server list
+--------------------------------------+-------------------------+--------+-------------------------+----------------+-----------+
| ID                                   | Name                    | Status | Networks                | Image          | Flavor    |
+--------------------------------------+-------------------------+--------+-------------------------+----------------+-----------+
| b8c052a6-3823-49f9-a846-2c910c1d517a | overcloud-controller0-2 | BUILD  |                         | overcloud-full | control0  |
| ad3426e7-f83d-4192-98d4-0f226acb3ce7 | overcloud-controller0-0 | BUILD  |                         | overcloud-full | control0  |
| a09a8f9e-1632-4b19-a5bf-82184289d394 | overcloud-compute1-0    | BUILD  |                         | overcloud-full | compute1  |
| 8dfddf16-bfad-4038-bcdf-5be236f516f5 | overcloud-compute5-0    | BUILD  |                         | overcloud-full | compute5  |
| 972675e4-eaa5-4fe7-ad00-2641fc233f39 | overcloud-compute8-0    | BUILD  |                         | overcloud-full | compute8  |
| c9dfadf5-e44e-4b5e-acbe-c6675c5892ee | overcloud-controller0-1 | BUILD  | ctlplane=192.168.220.26 | overcloud-full | control0  |
| d5be69a1-389a-4f0f-a6e1-1705ab18f9ce | overcloud-compute10-0   | BUILD  | ctlplane=192.168.230.27 | overcloud-full | compute10 |
| 4b9352c9-8d2a-4bfe-880c-767ddb9b397e | overcloud-compute9-0    | BUILD  | ctlplane=192.168.229.14 | overcloud-full | compute9  |
| 5e9e4a0b-d658-4e4a-a7fb-0301d4e7322f | overcloud-compute2-0    | BUILD  | ctlplane=192.168.222.28 | overcloud-full | compute2  |
| dff53067-1084-4825-ab2c-ec00f0f66e5b | overcloud-compute6-0    | ACTIVE | ctlplane=192.168.226.13 | overcloud-full | compute6  |
| e9d6316c-941d-4178-a8de-7dadcb5ec432 | overcloud-compute4-0    | ACTIVE | ctlplane=192.168.224.25 | overcloud-full | compute4  |
| 76ee4a17-f56b-4e1d-88f6-f6637484cd75 | overcloud-compute7-0    | ACTIVE | ctlplane=192.168.227.20 | overcloud-full | compute7  |
| 0775782b-27d7-4361-91eb-2ab135648907 | overcloud-compute11-0   | ACTIVE | ctlplane=192.168.231.18 | overcloud-full | compute11 |
| b3ab9679-aade-47af-bc3e-a59c1f2f708d | overcloud-compute3-0    | ACTIVE | ctlplane=192.168.223.11 | overcloud-full | compute3  |
+--------------------------------------+-------------------------+--------+-------------------------+----------------+-----------+
(undercloud) [stack@core-undercloud-0 ~]$ ssh heat-admin@192.168.223.11
Warning: Permanently added '192.168.223.11' (ECDSA) to the list of known hosts.
[heat-admin@overcloud-compute3-0 ~]$
[heat-admin@overcloud-compute3-0 ~]$

During / POST overcloud.AllNodesDeploySteps[...]

(undercloud) [stack@core-undercloud-0 ~]$ openstack server list
+--------------------------------------+-------------------------+--------+-------------------------+----------------+-----------+
| ID                                   | Name                    | Status | Networks                | Image          | Flavor    |
+--------------------------------------+-------------------------+--------+-------------------------+----------------+-----------+
| b8c052a6-3823-49f9-a846-2c910c1d517a | overcloud-controller0-2 | ACTIVE | ctlplane=192.168.220.17 | overcloud-full | control0  |
| ad3426e7-f83d-4192-98d4-0f226acb3ce7 | overcloud-controller0-0 | ACTIVE | ctlplane=192.168.220.15 | overcloud-full | control0  |
| a09a8f9e-1632-4b19-a5bf-82184289d394 | overcloud-compute1-0    | ACTIVE | ctlplane=192.168.221.35 | overcloud-full | compute1  |
| 8dfddf16-bfad-4038-bcdf-5be236f516f5 | overcloud-compute5-0    | ACTIVE | ctlplane=192.168.225.29 | overcloud-full | compute5  |
| 972675e4-eaa5-4fe7-ad00-2641fc233f39 | overcloud-compute8-0    | ACTIVE | ctlplane=192.168.228.25 | overcloud-full | compute8  |
| c9dfadf5-e44e-4b5e-acbe-c6675c5892ee | overcloud-controller0-1 | ACTIVE | ctlplane=192.168.220.26 | overcloud-full | control0  |
| d5be69a1-389a-4f0f-a6e1-1705ab18f9ce | overcloud-compute10-0   | ACTIVE | ctlplane=192.168.230.27 | overcloud-full | compute10 |
| 4b9352c9-8d2a-4bfe-880c-767ddb9b397e | overcloud-compute9-0    | ACTIVE | ctlplane=192.168.229.14 | overcloud-full | compute9  |
| 5e9e4a0b-d658-4e4a-a7fb-0301d4e7322f | overcloud-compute2-0    | ACTIVE | ctlplane=192.168.222.28 | overcloud-full | compute2  |
| dff53067-1084-4825-ab2c-ec00f0f66e5b | overcloud-compute6-0    | ACTIVE | ctlplane=192.168.226.13 | overcloud-full | compute6  |
| e9d6316c-941d-4178-a8de-7dadcb5ec432 | overcloud-compute4-0    | ACTIVE | ctlplane=192.168.224.25 | overcloud-full | compute4  |
| 76ee4a17-f56b-4e1d-88f6-f6637484cd75 | overcloud-compute7-0    | ACTIVE | ctlplane=192.168.227.20 | overcloud-full | compute7  |
| 0775782b-27d7-4361-91eb-2ab135648907 | overcloud-compute11-0   | ACTIVE | ctlplane=192.168.231.18 | overcloud-full | compute11 |
| b3ab9679-aade-47af-bc3e-a59c1f2f708d | overcloud-compute3-0    | ACTIVE | ctlplane=192.168.223.11 | overcloud-full | compute3  |
+--------------------------------------+-------------------------+--------+-------------------------+----------------+-----------+
(undercloud) [stack@core-undercloud-0 ~]$ ssh root@192.168.223.11 -v
OpenSSH_7.4p1, OpenSSL 1.0.2k-fips  26 Jan 2017
debug1: Reading configuration data /home/stack/.ssh/config
debug1: /home/stack/.ssh/config line 1: Applying options for *
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 58: Applying options for *
debug1: Connecting to 192.168.223.11 [192.168.223.11] port 22.
debug1: connect to address 192.168.223.11 port 22: Connection timed out
ssh: connect to host 192.168.223.11 port 22: Connection timed out



From the console of the overcloud compute node, SSH into the director works:

[root@overcloud-compute3-0 heat-admin]# ssh 192.168.220.1
The authenticity of host '192.168.220.1 (192.168.220.1)' can't be established.
ECDSA key fingerprint is SHA256:UK3YpJLnh6tWFMbwWxVt5mQzeonSXfzhO/MtZuAzs9o.
ECDSA key fingerprint is MD5:6b:09:de:c4:5f:e7:40:56:18:34:05:ba:ac:f7:42:d4.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.220.1' (ECDSA) to the list of known hosts.
root@192.168.220.1's password: 
Last login: Thu Jun 20 15:18:57 2019 from 172.16.220.1
[root@core-undercloud-0 ~]#

Comment 1 bjacot 2019-06-20 20:22:33 UTC
Created attachment 1582850 [details]
templates being used

Comment 2 Yuri Obshansky 2019-06-21 13:34:24 UTC
The issue reproduces in a virtual environment as well.
OSP 13 puddle 2019-06-20.1
(undercloud) [stack@site-undercloud-0 ~]$ ssh heat-admin@192.168.34.10 -v
OpenSSH_7.4p1, OpenSSL 1.0.2k-fips  26 Jan 2017
debug1: Reading configuration data /home/stack/.ssh/config
debug1: /home/stack/.ssh/config line 1: Applying options for *
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 58: Applying options for *
debug1: Connecting to 192.168.34.10 [192.168.34.10] port 22.
debug1: connect to address 192.168.34.10 port 22: Connection timed out
ssh: connect to host 192.168.34.10 port 22: Connection timed out
(undercloud) [stack@site-undercloud-0 ~]$ cat core_puddle_version

Comment 3 Yuri Obshansky 2019-06-21 19:18:18 UTC
This looks like a regression.
The issue did not reproduce on OSP 13 puddle 2019-01-10.1:
(undercloud) [stack@site-undercloud-0 ~]$ ssh heat-admin@192.168.34.25 -v
OpenSSH_7.4p1, OpenSSL 1.0.2k-fips  26 Jan 2017
debug1: Reading configuration data /home/stack/.ssh/config
debug1: /home/stack/.ssh/config line 1: Applying options for *
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 58: Applying options for *
debug1: Connecting to 192.168.34.25 [192.168.34.25] port 22.
debug1: Connection established.
debug1: identity file /home/stack/.ssh/id_rsa type 1
debug1: key_load_public: No such file or directory
debug1: identity file /home/stack/.ssh/id_rsa-cert type -1
debug1: key_load_public: No such file or directory
debug1: identity file /home/stack/.ssh/id_dsa type -1
debug1: key_load_public: No such file or directory
debug1: identity file /home/stack/.ssh/id_dsa-cert type -1
debug1: key_load_public: No such file or directory
debug1: identity file /home/stack/.ssh/id_ecdsa type -1
debug1: key_load_public: No such file or directory
debug1: identity file /home/stack/.ssh/id_ecdsa-cert type -1
debug1: key_load_public: No such file or directory
debug1: identity file /home/stack/.ssh/id_ed25519 type -1
debug1: key_load_public: No such file or directory
debug1: identity file /home/stack/.ssh/id_ed25519-cert type -1
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_7.4
debug1: Remote protocol version 2.0, remote software version OpenSSH_7.4
debug1: match: OpenSSH_7.4 pat OpenSSH* compat 0x04000000
debug1: Authenticating to 192.168.34.25:22 as 'heat-admin'
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: algorithm: curve25519-sha256
debug1: kex: host key algorithm: ecdsa-sha2-nistp256
debug1: kex: server->client cipher: chacha20-poly1305 MAC: <implicit> compression: none
debug1: kex: client->server cipher: chacha20-poly1305 MAC: <implicit> compression: none
debug1: kex: curve25519-sha256 need=64 dh_need=64
debug1: kex: curve25519-sha256 need=64 dh_need=64
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
debug1: Server host key: ecdsa-sha2-nistp256 SHA256:VPhO2Gj2/rEJJRK09TVt7CYQE455m+NWe3vyo1N1y/0
Warning: Permanently added '192.168.34.25' (ECDSA) to the list of known hosts.
debug1: rekey after 134217728 blocks
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug1: SSH2_MSG_NEWKEYS received
debug1: rekey after 134217728 blocks
debug1: SSH2_MSG_EXT_INFO received
debug1: kex_input_ext_info: server-sig-algs=<rsa-sha2-256,rsa-sha2-512>
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic
debug1: Next authentication method: gssapi-keyex
debug1: No valid Key exchange context
debug1: Next authentication method: gssapi-with-mic
debug1: Unspecified GSS failure.  Minor code may provide more information
No Kerberos credentials available (default cache: KEYRING:persistent:1001)

debug1: Unspecified GSS failure.  Minor code may provide more information
No Kerberos credentials available (default cache: KEYRING:persistent:1001)

debug1: Next authentication method: publickey
debug1: Offering RSA public key: /home/stack/.ssh/id_rsa
debug1: Server accepts key: pkalg rsa-sha2-512 blen 279
debug1: Authentication succeeded (publickey).
Authenticated to 192.168.34.25 ([192.168.34.25]:22).
debug1: channel 0: new [client-session]
debug1: Requesting no-more-sessions
debug1: Entering interactive session.
debug1: pledge: network
debug1: client_input_global_request: rtype hostkeys-00 want_reply 0
debug1: Sending environment.
debug1: Sending env XMODIFIERS = @im=none
debug1: Sending env LANG = en_US.UTF-8
Last login: Fri Jun 21 19:15:14 2019 from 192.168.24.1
[heat-admin@overcloud-compute1-0 ~]$

Comment 4 bjacot 2019-06-21 20:21:09 UTC
This is a regression; added the blocker flag.

Comment 5 Bob Fournier 2019-06-21 22:45:28 UTC
Just to confirm the statement "The issue did not reproduce on OSP 13 puddle 2019-01-10.1": should that be the 6-10 puddle, or is the Jan 10 puddle correct?

Also, was ping to this node working fine when SSH failed?  It would be interesting to see what a tcpdump of the traffic looks like from both the undercloud and the compute node, and also whether there are iptables or sshd issues on the node.  If the setup is available, we'd like to take a look.
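
As a rough sketch of the checks suggested above (not a procedure taken from this bug), the script below lists commands one might run on each side. It assumes the director address 192.168.220.1 and the compute node 192.168.223.11 from the description; the address the undercloud actually connects from may differ. The `run` helper and the `DRY_RUN` variable are illustrative names, and by default the commands are only printed.

```shell
#!/usr/bin/env bash
# Dry run by default: commands are printed, not executed. Set DRY_RUN=0 and
# run as root on the relevant host to actually execute them.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

undercloud_ip="192.168.220.1"   # director address from the description (assumed source)
compute_ip="192.168.223.11"     # overcloud-compute3-0

# On the undercloud: capture up to 20 packets of SSH traffic to the compute node.
run tcpdump -nn -i any -c 20 "host ${compute_ip} and tcp port 22"

# On the compute node (via console): check whether the SYN packets arrive at all.
run tcpdump -nn -i any -c 20 "host ${undercloud_ip} and tcp port 22"

# On the compute node: INPUT rules with packet counters, sshd state, listeners.
run iptables -L INPUT -n -v --line-numbers
run systemctl status sshd
run ss -tln
```

If the SYN packets show up on the compute node but no reply leaves it, the packet counters on the INPUT chain should show which rule is catching them.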

Comment 6 bjacot 2019-06-24 13:12:58 UTC
Feel free to ping me; I have a setup.

Comment 7 Gurenko Alex 2019-06-24 14:53:45 UTC
I tested today with puddle 2019-06-20.1 on a virt setup, and the compute nodes on all leaves are unreachable, although the deployment passes.

(undercloud) [stack@site-undercloud-0 ~]$ openstack server list
+--------------------------------------+-------------------------+--------+------------------------+----------------+----------+
| ID                                   | Name                    | Status | Networks               | Image          | Flavor   |
+--------------------------------------+-------------------------+--------+------------------------+----------------+----------+
| f3a12183-0b2d-4ac5-9272-5af78d65fb89 | overcloud-controller0-1 | ACTIVE | ctlplane=192.168.24.27 | overcloud-full | control0 |
| a16872d7-70f0-412e-90fe-bd12717f1d61 | overcloud-controller0-2 | ACTIVE | ctlplane=192.168.24.31 | overcloud-full | control0 |
| 1478859f-2902-45cd-9026-da9213bfa91d | overcloud-compute2-0    | ACTIVE | ctlplane=192.168.44.13 | overcloud-full | compute2 |
| 509513f4-ad7d-4d63-a4f5-7d77cae01870 | overcloud-controller0-0 | ACTIVE | ctlplane=192.168.24.11 | overcloud-full | control0 |
| 6fd9a73f-dbe3-45c1-bae2-e6f5b978cf74 | overcloud-compute0-0    | ACTIVE | ctlplane=192.168.24.12 | overcloud-full | compute0 |
| eed48648-0ae0-410a-be47-d32e1a79c2fb | overcloud-compute1-0    | ACTIVE | ctlplane=192.168.34.28 | overcloud-full | compute1 |
+--------------------------------------+-------------------------+--------+------------------------+----------------+----------+

(undercloud) [stack@site-undercloud-0 ~]$ ssh -v heat-admin@192.168.44.13
OpenSSH_7.4p1, OpenSSL 1.0.2k-fips  26 Jan 2017
debug1: Reading configuration data /home/stack/.ssh/config
debug1: /home/stack/.ssh/config line 1: Applying options for *
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 58: Applying options for *
debug1: Connecting to 192.168.44.13 [192.168.44.13] port 22.
debug1: connect to address 192.168.44.13 port 22: Connection refused
ssh: connect to host 192.168.44.13 port 22: Connection refused

Comment 8 bjacot 2019-06-24 15:08:15 UTC
+++++iptables config on overcloud Leaf Node++++++
[root@overcloud-compute3-0 ~]# iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
ACCEPT     all  --  anywhere             anywhere             state RELATED,ESTABLISHED /* 000 accept related established rules ipv4 */
ACCEPT     icmp --  anywhere             anywhere             state NEW /* 001 accept all icmp ipv4 */
ACCEPT     all  --  anywhere             anywhere             state NEW /* 002 accept all to lo interface ipv4 */
ACCEPT     tcp  --  192.168.223.0/24     anywhere             multiport dports ssh state NEW /* 003 accept ssh from controlplane ipv4 */
ACCEPT     udp  --  anywhere             anywhere             multiport dports ntp state NEW /* 105 ntp ipv4 */
ACCEPT     tcp  --  anywhere             anywhere             multiport dports down state NEW /* 113 nova_migration_target ipv4 */
ACCEPT     udp  --  anywhere             anywhere             multiport dports bootps state NEW /* 115 neutron dhcp input ipv4 */
ACCEPT     udp  --  anywhere             anywhere             multiport dports 4789 state NEW /* 118 neutron vxlan networks ipv4 */
ACCEPT     gre  --  anywhere             anywhere             /* 136 neutron gre networks ipv4 */
ACCEPT     tcp  --  anywhere             anywhere             multiport dports 16514,61152:61215,rfb:6923 state NEW /* 200 nova_libvirt ipv4 */
LOG        all  --  anywhere             anywhere             state NEW /* 998 log all ipv4 */ LOG level warning
DROP       all  --  anywhere             anywhere             state NEW /* 999 drop all ipv4 */

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
ACCEPT     udp  --  anywhere             anywhere             multiport dports bootpc state NEW /* 116 neutron dhcp output ipv4 */

+++++sshd config on overcloud Leaf Node++++++
# File is managed by Puppet
Port 22

AcceptEnv LANG LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES
AcceptEnv LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT
AcceptEnv LC_IDENTIFICATION LC_ALL LANGUAGE
AcceptEnv XMODIFIERS
AuthorizedKeysFile .ssh/authorized_keys
ChallengeResponseAuthentication no
GSSAPIAuthentication yes
GSSAPICleanupCredentials no
HostKey /etc/ssh/ssh_host_rsa_key
HostKey /etc/ssh/ssh_host_ecdsa_key
HostKey /etc/ssh/ssh_host_ed25519_key
PasswordAuthentication no
PrintMotd no
Subsystem sftp  /usr/libexec/openssh/sftp-server
SyslogFacility AUTHPRIV
UseDNS no
UsePAM yes
UsePrivilegeSeparation sandbox
X11Forwarding yes

Comment 9 Harald Jensås 2019-06-24 16:10:12 UTC
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
ACCEPT     tcp  --  192.168.223.0/24     anywhere             multiport dports ssh state NEW /* 003 accept ssh from controlplane ipv4 */
                    ^^ 
                    It only allows SSH from IPs in its own subnet?
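
To illustrate the point, here is a minimal shell sketch (not part of the templates) that checks whether a source address falls inside the CIDR accepted by rule 003. The values come from this report: the rule source 192.168.223.0/24, the director at 192.168.220.1 (assumed here to be the address the undercloud connects from), and 192.168.223.11 as a host on the leaf subnet itself. The helper names `ip_to_int` and `in_cidr` are made up for the example.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<<"$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Return 0 if address $1 lies inside CIDR $2 (for example 192.168.223.0/24).
in_cidr() {
  local addr=$1 net=${2%/*} bits=${2#*/}
  local mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  (( ($(ip_to_int "$addr") & mask) == ($(ip_to_int "$net") & mask) ))
}

rule_source="192.168.223.0/24"   # source of rule 003 on overcloud-compute3-0
for src in 192.168.220.1 192.168.223.11; do
  if in_cidr "$src" "$rule_source"; then
    echo "$src is inside $rule_source: new SSH connections match rule 003"
  else
    echo "$src is outside $rule_source: new SSH connections fall through to rules 998/999 (log, drop)"
  fi
done
```

A source outside the leaf's own /24 never matches the ACCEPT rule, which is consistent with the "Connection timed out" seen from the undercloud.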

This is most likely happening because of this backport: https://review.opendev.org/656442


There was a follow up change to open up access for the undercloud: https://review.opendev.org/656450.
Possible Workaround: 
  It may work to set 'SshFirewallAllowAll: true' in the parameter_defaults: section of an environment file, as is done for the undercloud in https://review.opendev.org/656450.
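
A minimal sketch of that possible workaround, assuming the parameter is honored by the overcloud templates in the installed version (not verified here). The file name is hypothetical; any environment file passed to the deploy command would do.

```shell
#!/usr/bin/env bash
# Hypothetical file name for the workaround environment file.
env_file="./ssh-firewall-allow-all.yaml"

cat > "$env_file" <<'EOF'
parameter_defaults:
  SshFirewallAllowAll: true
EOF

# Add the file to the existing deploy command alongside the other
# environment files, for example:
#   openstack overcloud deploy --templates [existing arguments] -e ./ssh-firewall-allow-all.yaml
echo "wrote $env_file"
```

Note that this opens SSH to any source, so it trades the new restriction for the previous behavior.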


For master and Stein, we may want to replace the use of hiera interpolation and instead create firewall rules for all the CIDRs on the ctlplane network by reading the value from NetCidrMapValue, which was introduced in https://review.opendev.org/613459 and https://review.opendev.org/613442. We can probably backport these to Rocky as well.


_But for Queens_:
 This would require backporting heat change https://review.opendev.org/569053 as well as an instack-undercloud variant of https://review.opendev.org/613442 and https://review.opendev.org/613459.

 We may want to consider reverting the https://review.opendev.org/656442 backport for the queens case.
 
 @cjeanner, wdyt?

Comment 10 Harald Jensås 2019-06-24 18:13:20 UTC
I'm working on a fix upstream here: https://review.opendev.org/667172

Comment 11 Bob Fournier 2019-06-24 20:19:28 UTC
Including DF DFG per Comment 9.

Comment 12 Cédric Jeanneret 2019-06-25 06:12:05 UTC
Hey,

so for Queens, I remember seeing issues opened by customers in order to limit the SSH access to known subnets instead of world wide accesses - so reverting is probably a bad idea if nothing replaces it (i.e. queens-only patch or something like that).

Your patch 667172 looks really nice, and it would be really great to get something like that for Queens..

Comment 13 Harald Jensås 2019-06-25 09:20:29 UTC
(In reply to Cédric Jeanneret from comment #12)
> Hey,
> 
> so for Queens, I remember seeing issues opened by customers in order to
> limit the SSH access to known subnets instead of world wide accesses - so
> reverting is probably a bad idea if nothing replaces it (i.e. queens-only
> patch or something like that).
> 

Yes, it's a useful feature to allow customers to limit SSH access.

So the change that breaks this is mostly about:
  """This allows operators to define more granular ssh firewall rules via tripleo::firewall::firewall_rules."""

I.e., the main thing is that we want to allow customers to customize the rules? Yet we also chose to limit access by default, and it is that default that causes the regression in Queens. A customer who deployed Queens could previously SSH or run Ansible playbooks from a node that isn't on the ctlplane_subnet; now this no longer works. Note that this is true for non-DCN (non-spine-and-leaf) use cases as well. So it potentially breaks:
 - Monitoring that uses SSH to get metrics
 - Ansible playbooks used for automation
 - Operator scripts/workflows using SSH from a workstation


> Your patch 667172 looks really nice, and it would be really great to get
> something like that for Queens..

Backporting that to Queens would require backporting heat change https://review.opendev.org/569053. That is not a backportable change.



I've proposed patches upstream and to stable branches:
https://review.opendev.org/#/q/topic:feature/firewall+(status:open+OR+status:merged)

The Rocky and Queens changes revert to allowing any source to SSH by default. The tripleo::firewall::firewall_rules interface is still available in Rocky and Queens, so operators can still define more granular SSH rules if they choose to.
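
For illustration only, a sketch of what such granular rules might look like when passed as hieradata through ExtraConfig. The file name, the rule names and numbering, and the CIDR list (/24 networks inferred from addresses in this report) are assumptions; the rule keys follow the usual dport/proto/source/action form, which should be checked against the deployed puppet-tripleo version.

```shell
#!/usr/bin/env bash
# Sketch only: file name, rule names, and CIDR list are assumptions.
env_file="./ssh-firewall-ctlplane.yaml"
# /24 networks inferred from this report; replace with the real ctlplane subnets.
ctlplane_cidrs=(192.168.220.0/24 192.168.222.0/24 192.168.223.0/24)

{
  echo "parameter_defaults:"
  echo "  ExtraConfig:"
  echo "    tripleo::firewall::firewall_rules:"
  i=0
  for cidr in "${ctlplane_cidrs[@]}"; do
    # Hypothetical rule names; the numbering must not collide with existing rules.
    echo "      '004 accept ssh from ctlplane subnet ${i}':"
    echo "        dport: 22"
    echo "        proto: tcp"
    echo "        source: ${cidr}"
    echo "        action: accept"
    i=$((i + 1))
  done
} > "$env_file"

cat "$env_file"
```

The generated file would be passed with -e like any other environment file; one rule per ctlplane CIDR keeps SSH limited to known subnets.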

Comment 14 Cédric Jeanneret 2019-06-25 09:46:06 UTC
Sounds good. Maybe a last check from DFG:Security just to ensure things are fine?

Comment 15 Harald Jensås 2019-06-25 09:57:57 UTC
(In reply to Cédric Jeanneret from comment #14)
> Sounds good. Maybe a last check from DFG:Security just to ensure things are
> fine?

Yes, good idea! Adding DFG:Security for their comments.


I think for Queens the only alternative option would be to document manually overriding the rule with the CIDRs of all ctlplane subnets for DCN and spine-and-leaf use cases. Not great, but if we can document it and somehow proactively get the message out to customers via TAMs/Portal, I wouldn't be very opposed.

For Rocky, we could do some more backports to plumb in the requirements to use the approach used on Stein and master.

Comment 19 Gurenko Alex 2019-06-26 12:12:47 UTC
I've just finished deploying latest puddle 2019-06-25.1 with a new version of the openstack-tripleo-heat-templates-8.3.1-53.el7ost, but yet, I'm still getting connection refused for the nodes outside of the main controlplane network:

(undercloud) [stack@site-undercloud-0 ~]$ . stackrc
(undercloud) [stack@site-undercloud-0 ~]$ openstack server list
+--------------------------------------+-------------------------+--------+------------------------+----------------+----------+
| ID                                   | Name                    | Status | Networks               | Image          | Flavor   |
+--------------------------------------+-------------------------+--------+------------------------+----------------+----------+
| a9bd8d51-1460-4a35-809c-6647f38ddbba | overcloud-controller0-2 | ACTIVE | ctlplane=192.168.24.32 | overcloud-full | control0 |
| 64be2d48-b1d5-4fed-ab78-30efb5ade82d | overcloud-controller0-1 | ACTIVE | ctlplane=192.168.24.11 | overcloud-full | control0 |
| 15446384-b1c0-421e-a238-034e5f924811 | overcloud-compute0-0    | ACTIVE | ctlplane=192.168.24.23 | overcloud-full | compute0 |
| 3c64ef4c-699f-4143-a4fa-8cf90b58145e | overcloud-controller0-0 | ACTIVE | ctlplane=192.168.24.18 | overcloud-full | control0 |
| 9b05a7e9-0c08-41d3-b9c5-e2be102f6870 | overcloud-compute2-0    | ACTIVE | ctlplane=192.168.44.10 | overcloud-full | compute2 |
| 68cc92f6-ae20-4587-a3c8-c0050c20dc0c | overcloud-compute1-0    | ACTIVE | ctlplane=192.168.34.28 | overcloud-full | compute1 |
+--------------------------------------+-------------------------+--------+------------------------+----------------+----------+
(undercloud) [stack@site-undercloud-0 ~]$ ssh heat-admin@192.168.44.10
ssh: connect to host 192.168.44.10 port 22: Connection refused
(undercloud) [stack@site-undercloud-0 ~]$ ssh heat-admin@192.168.34.28
ssh: connect to host 192.168.34.28 port 22: Connection refused
(undercloud) [stack@site-undercloud-0 ~]$ ssh heat-admin@192.168.24.23
Warning: Permanently added '192.168.24.23' (ECDSA) to the list of known hosts.
Last login: Wed Jun 26 12:10:24 2019 from 192.168.24.254
[heat-admin@overcloud-compute0-0 ~]$

I see changes are in the puddle, however issue is still present.

Comment 20 Harald Jensås 2019-06-26 12:18:16 UTC
(In reply to Gurenko Alex from comment #19)
> I've just finished deploying latest puddle 2019-06-25.1 with a new version
> of the openstack-tripleo-heat-templates-8.3.1-53.el7ost, but yet, I'm still
> getting connection refused for the nodes outside of the main controlplane
> network:
> 
> (undercloud) [stack@site-undercloud-0 ~]$ . stackrc
> (undercloud) [stack@site-undercloud-0 ~]$ openstack server list
> +--------------------------------------+-------------------------+--------+--
> ----------------------+----------------+----------+
> | ID                                   | Name                    | Status |
> Networks               | Image          | Flavor   |
> +--------------------------------------+-------------------------+--------+--
> ----------------------+----------------+----------+
> | a9bd8d51-1460-4a35-809c-6647f38ddbba | overcloud-controller0-2 | ACTIVE |
> ctlplane=192.168.24.32 | overcloud-full | control0 |
> | 64be2d48-b1d5-4fed-ab78-30efb5ade82d | overcloud-controller0-1 | ACTIVE |
> ctlplane=192.168.24.11 | overcloud-full | control0 |
> | 15446384-b1c0-421e-a238-034e5f924811 | overcloud-compute0-0    | ACTIVE |
> ctlplane=192.168.24.23 | overcloud-full | compute0 |
> | 3c64ef4c-699f-4143-a4fa-8cf90b58145e | overcloud-controller0-0 | ACTIVE |
> ctlplane=192.168.24.18 | overcloud-full | control0 |
> | 9b05a7e9-0c08-41d3-b9c5-e2be102f6870 | overcloud-compute2-0    | ACTIVE |
> ctlplane=192.168.44.10 | overcloud-full | compute2 |
> | 68cc92f6-ae20-4587-a3c8-c0050c20dc0c | overcloud-compute1-0    | ACTIVE |
> ctlplane=192.168.34.28 | overcloud-full | compute1 |
> +--------------------------------------+-------------------------+--------+--
> ----------------------+----------------+----------+
> (undercloud) [stack@site-undercloud-0 ~]$ ssh heat-admin@192.168.44.10
> ssh: connect to host 192.168.44.10 port 22: Connection refused
> (undercloud) [stack@site-undercloud-0 ~]$ ssh heat-admin@192.168.34.28
> ssh: connect to host 192.168.34.28 port 22: Connection refused
> (undercloud) [stack@site-undercloud-0 ~]$ ssh heat-admin@192.168.24.23
> Warning: Permanently added '192.168.24.23' (ECDSA) to the list of known
> hosts.
> Last login: Wed Jun 26 12:10:24 2019 from 192.168.24.254
> [heat-admin@overcloud-compute0-0 ~]$
> 
> I see changes are in the puddle, however issue is still present.

Can you get onto the node by SSH'ing from a node in the same subnet and get the iptables rules?

Comment 21 Harald Jensås 2019-06-26 12:42:11 UTC
The cause is the firewall on the hypervisor used to run the test.

Adding a rule to FORWARD all traffic on the hypervisor fixes the issue:

 [root@titan68 ~]# sudo iptables -I FORWARD 1 -j ACCEPT


(undercloud) [stack@site-undercloud-0 ~]$ ssh  heat-admin@192.168.34.28
Warning: Permanently added '192.168.34.28' (ECDSA) to the list of known hosts.
Last login: Wed Jun 26 12:40:35 2019 from 192.168.24.1
[heat-admin@overcloud-compute1-0 ~]$

Comment 22 Bob Fournier 2019-06-26 12:46:30 UTC
Can this be retested?

Comment 24 bjacot 2019-06-26 17:21:49 UTC
Hello All,

I have validated this on my setup.  I am able to SSH from the director on the core network to a compute node on the leaf2 network.

(undercloud) [stack@core-undercloud-0 virt-all]$ rpm -qa | grep openstack-tripleo-heat-templates
openstack-tripleo-heat-templates-8.3.1-53.el7ost.noarch

Output:

(undercloud) [stack@core-undercloud-0 virt-all]$ ssh heat-admin@192.168.222.14
Warning: Permanently added '192.168.222.14' (ECDSA) to the list of known hosts.
Last login: Wed Jun 26 17:11:46 2019 from 10.35.64.2
[heat-admin@overcloud-compute2-1 ~]$ sudo iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
[..]
ACCEPT     tcp  --  anywhere             anywhere             multiport dports ssh state NEW /* 003 accept ssh from any ipv4 */

Comment 27 errata-xmlrpc 2019-07-10 13:05:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:1738

