Bug 1387644

Summary: Ironic can't deploy nodes in the overcloud after a tenant network is created
Product: Red Hat OpenStack Reporter: Derek Higgins <derekh>
Component: rhosp-director Assignee: Bob Fournier <bfournie>
Status: CLOSED NOTABUG QA Contact: Raviv Bar-Tal <rbartal>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 10.0 (Newton) CC: dbecker, dsneddon, dtantsur, mburns, morazi, rhel-osp-director-maint, srevivo
Target Milestone: ---   
Target Release: 10.0 (Newton)   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2016-10-21 20:14:58 UTC Type: Bug

Description Derek Higgins 2016-10-21 12:36:57 UTC
I am following the upstream document for deploying Ironic in the overcloud:
http://tripleo.org/advanced_deployment/baremetal_overcloud.html

I can successfully boot an image multiple times with the following command, deleting the instance again each time:
nova boot --image centos-image --flavor baremetal --key default t-$(date '+%s')

The instance boots with an IP on the "external" network (which shares a subnet with the undercloud's ctlplane network but uses a different allocation range).

Then, to try a newly created tenant network, I ran the following commands (from the upstream doc):
openstack network create tenant-net
openstack subnet create --network tenant-net --subnet-range 192.0.3.0/24 --allocation-pool start=192.0.3.10,end=192.0.3.20 tenant-subnet
openstack router create default-router
openstack router add subnet default-router external-subnet
openstack router add subnet default-router tenant-subnet

After this, nova boot no longer works. 2682de22-b3d5-41fc-a0e6-339e0a6ec5b1 is the ID of the external network; it now has to be passed explicitly because there is a second network to select from. This is the same network that was used above; I am not using the newly created network.
nova boot --image centos-image --flavor baremetal --key default --nic net-id=2682de22-b3d5-41fc-a0e6-339e0a6ec5b1  t-$(date '+%s')

The console of the node trying to boot shows:
http://172.21.64.7:8088/boot.ipxe... Connection reset....

tcpdump on em1 on the overcloud controller shows the server sending a TCP RST:
[root@overcloud-controller-0 ironic]# tcpdump -i em1 -A -nn port 8088
12:01:16.878335 IP 172.21.64.42.48823 > 172.21.64.7.8088: Flags [S], seq 345159258, win 65532, options [nop,nop,TS val 786961 ecr 0,nop,nop,sackOK,nop,wscale 9,mss 1460], length 0
E..@....@.....@*..@........Z.........E.....
............... ....
12:01:16.878736 IP 172.21.64.7.8088 > 172.21.64.42.48823: Flags [S.], seq 2933959826, ack 345159259, win 28960, options [mss 1460,sackOK,TS val 74091530 ecr 786961,nop,wscale 7], length 0
E..<..@.@.b`..@...@*...........[..q g1.........
.j.
........
12:01:16.878856 IP 172.21.64.42.48823 > 172.21.64.7.8088: Flags [P.], seq 1:112, ack 1, win 512, options [nop,nop,TS val 786961 ecr 74091530], length 111
E.......@..u..@*..@........[...............
.....j.
GET /boot.ipxe HTTP/1.1
Connection: keep-alive
User-Agent: iPXE/1.0.0+ (6366fa7a)
Host: 172.21.64.7:8088


12:01:16.879096 IP 172.21.64.7.8088 > 172.21.64.42.48823: Flags [R], seq 2933959827, win 0, length 0
E..(..@.@.....@...@*............P.......
12:01:18.079703 IP 172.21.64.7.8088 > 172.21.64.42.48823: Flags [S.], seq 2933959826, ack 345159259, win 28960, options [mss 1460,sackOK,TS val 74092732 ecr 786961,nop,wscale 7], length 0
E..<..@.@.b`..@...@*...........[..q ...........
.j..........
12:01:20.079705 IP 172.21.64.7.8088 > 172.21.64.42.48823: Flags [S.], seq 2933959826, ack 345159259, win 28960, options [mss 1460,sackOK,TS val 74094732 ecr 786961,nop,wscale 7], length 0
E..<..@.@.b`..@...@*...........[..q ...........
.j..........
12:01:24.079708 IP 172.21.64.7.8088 > 172.21.64.42.48823: Flags [S.], seq 2933959826, ack 345159259, win 28960, options [mss 1460,sackOK,TS val 74098732 ecr 786961,nop,wscale 7], length 0
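
To spot resets like the one above without reading the full dump, the capture can be filtered for RST flags. The following is a minimal sketch, not part of the original report: `count_rst` is a hypothetical helper name, and it assumes tcpdump's default text output (the `Flags [R]` / `Flags [R.]` notation seen above).

```shell
#!/bin/sh
# count_rst: hypothetical helper that counts TCP reset lines in
# tcpdump text output read from stdin.
# It matches "Flags [R]" and "Flags [R.]" as printed by tcpdump.
count_rst() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -cE 'Flags \[R\.?\]' || true
}

# Example usage on the controller (interface and port taken from this report;
# the 30-second capture window is an arbitrary assumption):
#   timeout 30 tcpdump -l -i em1 -nn port 8088 2>/dev/null | count_rst
```

Alternatively, tcpdump itself can be restricted to reset packets with a capture filter such as 'port 8088 and tcp[tcpflags] & tcp-rst != 0'.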

The same tcpdump on br-ex shows only a subset of the traffic, which may be relevant:
[root@overcloud-controller-0 ironic]# tcpdump -i br-ex -A -nn port 8088
11:57:41.714331 IP 172.21.64.42.53969 > 172.21.64.7.8088: Flags [S], seq 827453374, win 65532, options [nop,nop,TS val 783059 ecr 0,nop,nop,sackOK,nop,wscale 9,mss 1460], length 0
E..@....@.....@*..@.....1Q..........WF.....
............... ....
11:57:41.714415 IP 172.21.64.7.8088 > 172.21.64.42.53969: Flags [S.], seq 1624820428, ack 827453375, win 28960, options [mss 1460,sackOK,TS val 73876366 ecr 783059,nop,wscale 7], length 0
E..<..@.@.b`..@...@*....`...1Q....q ...........
.gC.........
11:57:43.115679 IP 172.21.64.7.8088 > 172.21.64.42.53969: Flags [S.], seq 1624820428, ack 827453375, win 28960, options [mss 1460,sackOK,TS val 73877768 ecr 783059,nop,wscale 7], length 0
E..<..@.@.b`..@...@*....`...1Q....q ...........
.gI.........
11:57:45.115678 IP 172.21.64.7.8088 > 172.21.64.42.53969: Flags [S.], seq 1624820428, ack 827453375, win 28960, options [mss 1460,sackOK,TS val 73879768 ecr 783059,nop,wscale 7], length 0
E..<..@.@.b`..@...@*....`...1Q....q ...........
.gP.........
11:57:49.115684 IP 172.21.64.7.8088 > 172.21.64.42.53969: Flags [S.], seq 1624820428, ack 827453375, win 28960, options [mss 1460,sackOK,TS val 73883768 ecr 783059,nop,wscale 7], length 0
E..<..@.@.b`..@...@*....`...1Q....q ...........
.g`x........

openstack-neutron-ml2-9.0.0-1.3.el7ost.noarch
openstack-neutron-bigswitch-agent-7.0.5-0.20161003220959.509a93d.el7ost.noarch
openstack-neutron-9.0.0-1.3.el7ost.noarch
openstack-ironic-conductor-6.2.2-0.20161006174219.500a27d.el7ost.noarch
openstack-neutron-common-9.0.0-1.3.el7ost.noarch
openstack-neutron-bigswitch-lldp-7.0.5-0.20161003220959.509a93d.el7ost.noarch
openstack-ironic-common-6.2.2-0.20161006174219.500a27d.el7ost.noarch
openstack-ironic-api-6.2.2-0.20161006174219.500a27d.el7ost.noarch
openstack-neutron-openvswitch-9.0.0-1.3.el7ost.noarch
openstack-neutron-metering-agent-9.0.0-1.3.el7ost.noarch
openstack-neutron-lbaas-9.0.0-2.el7ost.noarch
openstack-neutron-sriov-nic-agent-9.0.0-1.3.el7ost.noarch

Comment 1 Dmitry Tantsur 2016-10-21 12:48:55 UTC
I don't think the fix is in Ironic itself, so I'm moving this to a more generic component for triaging. Bob, mind taking a look please?

Comment 2 Bob Fournier 2016-10-21 20:14:58 UTC
It looks like external-subnet had the same gateway IP as the em1 interface. As a result, DHCP worked OK, but the HTTP requests ended in a TCP RST from overcloud-controller-0 because duplicate packets were seen.

Dan made the following changes:
"neutron subnet-update external-subnet --default-gateway 172.21.64.40"
"neutron router-interface-add default-router external-subnet"

[stack@host07 ~]$ neutron subnet-show external-subnet
+-------------------+---------------------------------------------------+
| Field             | Value                                             |
+-------------------+---------------------------------------------------+
| allocation_pools  | {"start": "172.21.64.41", "end": "172.21.64.100"} |
| cidr              | 172.21.64.0/24                                    |
| created_at        | 2016-10-20T16:21:20Z                              |
| description       |                                                   |
| dns_nameservers   |                                                   |
| enable_dhcp       | True                                              |
| gateway_ip        | 172.21.64.40                                      |
| host_routes       |                                                   |
| id                | ac92ca1f-7e44-4682-b871-db02be8109b8              |
| ip_version        | 4                                                 |
| ipv6_address_mode |                                                   |
| ipv6_ra_mode      |                                                   |
| name              | external-subnet                                   |
| network_id        | c0f4dbd7-126c-4ba9-b02b-3f9c25479936              |
| project_id        | 45951f2e923c4462ab7fd776a573529d                  |
| revision_number   | 3                                                 |
| service_types     |                                                   |
| subnetpool_id     |                                                   |
| tenant_id         | 45951f2e923c4462ab7fd776a573529d                  |
| updated_at        | 2016-10-21T18:19:44Z                              |
+-------------------+---------------------------------------------------+
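
The kind of conflict described in this comment (a subnet gateway IP that is also configured on a controller interface) can be checked before booting. This is a minimal sketch, not taken from the report: `gateway_conflicts` and `local_ipv4_addrs` are hypothetical helper names, and the usage lines assume a configured `openstack` client and the iproute2 `ip` command on the controller.

```shell
#!/bin/sh
# gateway_conflicts: hypothetical helper. Succeeds (exit 0) if the gateway IP
# given as $1 appears in the list of addresses read from stdin, one per line.
gateway_conflicts() {
    # -x matches whole lines only, -F treats the IP as a fixed string.
    grep -qxF -- "$1"
}

# local_ipv4_addrs: hypothetical helper. Prints the IPv4 addresses configured
# on this host, one per line, without prefix lengths.
local_ipv4_addrs() {
    ip -4 -o addr show | awk '{print $4}' | cut -d/ -f1
}

# Example usage on the controller (assumes the subnet is named
# external-subnet, as in this report):
#   gw=$(openstack subnet show external-subnet -f value -c gateway_ip)
#   if local_ipv4_addrs | gateway_conflicts "$gw"; then
#       echo "gateway $gw is also a local address on this host" >&2
#   fi
```

Matching whole lines keeps an address such as 192.0.2.70 from being reported as a conflict for a gateway of 192.0.2.7.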

After rerunning "nova boot" the BM instance came up:

[stack@host07 ~]$ openstack baremetal node list
+-------------------+--------+-------------------+-------------+--------------------+-------------+
| UUID              | Name   | Instance UUID     | Power State | Provisioning State | Maintenance |
+-------------------+--------+-------------------+-------------+--------------------+-------------+
| aede9ccf-d03e-    | host09 | c76a7f10-089a-    | power on    | active             | False       |
| 4a09-ac6e-        |        | 4df3-ae6f-        |             |                    |             |
| 5bcdf3da2619      |        | 4c23db59d644      |             |                    |             |
+-------------------+--------+-------------------+-------------+--------------------+-------------+
[stack@host07 ~]$ nova list
+--------------------------------------+-------+--------+------------+-------------+-----------------------+
| ID                                   | Name  | Status | Task State | Power State | Networks              |
+--------------------------------------+-------+--------+------------+-------------+-----------------------+
| 9318bddd-4a3f-439f-bd78-a472b192450f | test1 | ERROR  | -          | NOSTATE     |                       |
| c76a7f10-089a-4df3-ae6f-4c23db59d644 | test1 | ACTIVE | -          | Running     | external=172.21.64.47 |
+--------------------------------------+-------+--------+------------+-------------+-----------------------+

On the console there is a login prompt.

Comment 3 Dmitry Tantsur 2016-10-22 10:49:31 UTC
Thanks for looking into it, Dan, Bob!