Description of problem:

When trying to SSH to an instance launched on an upgraded OSP8->OSP9 overcloud, the SSH connection fails:

[stack@undercloud ~]$ ssh fedora.18.143 -v
OpenSSH_6.6.1, OpenSSL 1.0.1e-fips 11 Feb 2013
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 56: Applying options for *
debug1: Connecting to 172.16.18.143 [172.16.18.143] port 22.
debug1: Connection established.
debug1: identity file /home/stack/.ssh/id_rsa type 1
debug1: identity file /home/stack/.ssh/id_rsa-cert type -1
debug1: identity file /home/stack/.ssh/id_dsa type -1
debug1: identity file /home/stack/.ssh/id_dsa-cert type -1
debug1: identity file /home/stack/.ssh/id_ecdsa type -1
debug1: identity file /home/stack/.ssh/id_ecdsa-cert type -1
debug1: identity file /home/stack/.ssh/id_ed25519 type -1
debug1: identity file /home/stack/.ssh/id_ed25519-cert type -1
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_6.6.1
debug1: Remote protocol version 2.0, remote software version OpenSSH_6.8
debug1: match: OpenSSH_6.8 pat OpenSSH* compat 0x04000000
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: server->client aes128-ctr hmac-sha1-etm none
debug1: kex: client->server aes128-ctr hmac-sha1-etm none
debug1: kex: curve25519-sha256 need=20 dh_need=20
debug1: kex: curve25519-sha256 need=20 dh_need=20
debug1: sending SSH2_MSG_KEX_ECDH_INIT
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
Connection closed by 172.16.18.143

It appears that the MTU of the ports created on the compute node for the new instance is 1350B, while inside the instance the interface gets its MTU set to 1400B:

[root@overcloud-compute-0 heat-admin]# brctl show
bridge name      bridge id           STP enabled   interfaces
qbr1a01fd91-90   8000.c65abb2b3e52   no            qvb1a01fd91-90
                                                   tap1a01fd91-90
qbrfc90dc3f-fe   8000.beb702a152fd   no            qvbfc90dc3f-fe
                                                   tapfc90dc3f-fe

[root@overcloud-compute-0 heat-admin]# ip a s dev qbrfc90dc3f-fe
22: qbrfc90dc3f-fe: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1350 qdisc noqueue state UP
    link/ether be:b7:02:a1:52:fd brd ff:ff:ff:ff:ff:ff

[root@overcloud-compute-0 heat-admin]# ip a s dev tapfc90dc3f-fe
25: tapfc90dc3f-fe: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1350 qdisc pfifo_fast master qbrfc90dc3f-fe state UNKNOWN qlen 500
    link/ether fe:16:3e:95:18:d4 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc16:3eff:fe95:18d4/64 scope link
       valid_lft forever preferred_lft forever

[root@overcloud-compute-0 heat-admin]# ip a s dev qvbfc90dc3f-fe
24: qvbfc90dc3f-fe@qvofc90dc3f-fe: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1350 qdisc pfifo_fast master qbrfc90dc3f-fe state UP qlen 1000
    link/ether be:b7:02:a1:52:fd brd ff:ff:ff:ff:ff:ff
    inet6 fe80::bcb7:2ff:fea1:52fd/64 scope link
       valid_lft forever preferred_lft forever

Inside the instance I can see that the MTU of eth0 is set to 1400.

This is the network that the instance is connected to:

[stack@undercloud ~]$ neutron net-show stack-76-tenant_net_ext_tagged-q3pqucnr3qnn-private_network-bdp7opz6hmht
/usr/lib/python2.7/site-packages/requests/packages/urllib3/connection.py:303: SubjectAltNameWarning: Certificate for 172.16.18.25 has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)
  SubjectAltNameWarning
+---------------------------+--------------------------------------------------------------------------+
| Field                     | Value                                                                    |
+---------------------------+--------------------------------------------------------------------------+
| admin_state_up            | True                                                                     |
| availability_zone_hints   |                                                                          |
| availability_zones        | nova                                                                     |
| created_at                | 2016-08-08T12:40:12                                                      |
| description               |                                                                          |
| id                        | 0161eb0f-8eba-4fff-b7a9-6ac6bc2244fd                                     |
| ipv4_address_scope        |                                                                          |
| ipv6_address_scope        |                                                                          |
| mtu                       | 1350                                                                     |
| name                      | stack-76-tenant_net_ext_tagged-q3pqucnr3qnn-private_network-bdp7opz6hmht |
| port_security_enabled     | True                                                                     |
| provider:network_type     | vxlan                                                                    |
| provider:physical_network |                                                                          |
| provider:segmentation_id  | 96                                                                       |
| qos_policy_id             |                                                                          |
| router:external           | False                                                                    |
| shared                    | False                                                                    |
| status                    | ACTIVE                                                                   |
| subnets                   | e77997c5-1e91-4cf8-a846-53cc8ca88bb3                                     |
| tags                      |                                                                          |
| tenant_id                 | 2c85f77a58f34fdc91cdb3d90ce5b3b0                                         |
| updated_at                | 2016-08-08T12:40:12                                                      |
+---------------------------+--------------------------------------------------------------------------+

[stack@undercloud ~]$ neutron subnet-show e77997c5-1e91-4cf8-a846-53cc8ca88bb3
/usr/lib/python2.7/site-packages/requests/packages/urllib3/connection.py:303: SubjectAltNameWarning: Certificate for 172.16.18.25 has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)
  SubjectAltNameWarning
+-------------------+-------------------------------------------------------------------------+
| Field             | Value                                                                   |
+-------------------+-------------------------------------------------------------------------+
| allocation_pools  | {"start": "10.10.10.2", "end": "10.10.10.254"}                          |
| cidr              | 10.10.10.0/24                                                           |
| created_at        | 2016-08-08T12:40:15                                                     |
| description       |                                                                         |
| dns_nameservers   | 8.8.8.8                                                                 |
|                   | 8.8.4.4                                                                 |
| enable_dhcp       | True                                                                    |
| gateway_ip        | 10.10.10.1                                                              |
| host_routes       |                                                                         |
| id                | e77997c5-1e91-4cf8-a846-53cc8ca88bb3                                    |
| ip_version        | 4                                                                       |
| ipv6_address_mode |                                                                         |
| ipv6_ra_mode      |                                                                         |
| name              | stack-76-tenant_net_ext_tagged-q3pqucnr3qnn-private_subnet-cuzos5vunata |
| network_id        | 0161eb0f-8eba-4fff-b7a9-6ac6bc2244fd                                    |
| subnetpool_id     |                                                                         |
| tenant_id         | 2c85f77a58f34fdc91cdb3d90ce5b3b0                                        |
| updated_at        | 2016-08-08T12:40:15                                                     |
+-------------------+-------------------------------------------------------------------------+

I'm trying to reach the instance via a floating IP.

Note that instances created before the upgrade got their ports created with a 1400B MTU, so it appears that the network MTU was changed from 1400 to 1350 during the upgrade.

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-liberty-2.0.0-30.el7ost.noarch
openstack-tripleo-heat-templates-2.0.0-30.el7ost.noarch
openstack-tripleo-heat-templates-kilo-0.8.14-16.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy OSP8 overcloud
2. Upgrade overcloud to OSP9
3. Create a network, a router, and a floating IP on an external network
4. Launch an instance

Actual results:
Unable to SSH to the instance via the floating IP. It looks like there is a mismatch between the MTU set inside the instance (1400B) and the MTU of the devices on the compute node, which is set to 1350B.

Expected results:
I'm able to SSH to instances created in the same way as on OSP8.

Additional info:
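The MTU comparison described above can be scripted. The following is only a minimal sketch: it parses `ip a s` output of the kind pasted in this report and lists the host devices whose MTU is smaller than the MTU seen inside the guest. The function names `host_mtus` and `mtu_mismatches` are hypothetical helpers, not part of any OpenStack tool, and the guest MTU (1400 here) is assumed to have been read separately from inside the instance.

```python
import re

# Matches the header line that `ip a s` prints for each device, e.g.
# "24: qvbfc90dc3f-fe@qvofc90dc3f-fe: <BROADCAST,...> mtu 1350 ..."
# The optional "@peer" suffix (veth pairs) is skipped.
_IP_HEADER = re.compile(
    r"^\d+:\s+(?P<name>[^:@\s]+)(?:@\S+)?:\s+<[^>]*>\s+mtu\s+(?P<mtu>\d+)",
    re.MULTILINE,
)


def host_mtus(ip_output):
    """Return {device name: MTU} parsed from `ip a s` output (hypothetical helper)."""
    return {m.group("name"): int(m.group("mtu")) for m in _IP_HEADER.finditer(ip_output)}


def mtu_mismatches(ip_output, guest_mtu):
    """Return the host devices whose MTU is smaller than the guest's MTU."""
    return {name: mtu for name, mtu in host_mtus(ip_output).items() if mtu < guest_mtu}


# Sample input copied from the compute node output in this report.
SAMPLE = """\
22: qbrfc90dc3f-fe: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1350 qdisc noqueue state UP
    link/ether be:b7:02:a1:52:fd brd ff:ff:ff:ff:ff:ff
25: tapfc90dc3f-fe: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1350 qdisc pfifo_fast master qbrfc90dc3f-fe state UNKNOWN qlen 500
    link/ether fe:16:3e:95:18:d4 brd ff:ff:ff:ff:ff:ff
24: qvbfc90dc3f-fe@qvofc90dc3f-fe: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1350 qdisc pfifo_fast master qbrfc90dc3f-fe state UP qlen 1000
    link/ether be:b7:02:a1:52:fd brd ff:ff:ff:ff:ff:ff
"""

if __name__ == "__main__":
    # Guest MTU of 1400 is the value reported for eth0 inside the instance.
    for name, mtu in sorted(mtu_mismatches(SAMPLE, guest_mtu=1400).items()):
        print(f"{name}: host MTU {mtu} < guest MTU 1400")
```

With the values from this report, all three devices (qbr, tap, qvb) are flagged, which is consistent with the guest sending frames larger than the host-side path accepts.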
Potentially related to https://review.openstack.org/#/c/333333/ (I'll add it as an actual tracker when I get confirmation)
Nir, can you review whether this is indeed an RC blocker, or whether it can wait for a fix at GA (0day) or in a later maintenance release? Thanks, Scott
Clearing flags now that build is done
We have a dependency on 4 other RHBZs. As of right now, 3 are ready and 1 is pending backports. Once all 4 are ON_QA, this bug can be verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-1766.html