Following the upstream document to deploy Ironic in the overcloud (http://tripleo.org/advanced_deployment/baremetal_overcloud.html), I can successfully boot an image multiple times with this command, and delete it again:

    nova boot --image centos-image --flavor baremetal --key default t-$(date '+%s')

This boots with an IP on the "external" network (shared subnet with the undercloud's ctlplane network, but a different allocation range).

Then, to try a newly created tenant network, I ran the following commands (from the upstream doc):

    openstack network create tenant-net
    openstack subnet create --network tenant-net --subnet-range 192.0.3.0/24 --allocation-pool start=192.0.3.10,end=192.0.3.20 tenant-subnet
    openstack router create default-router
    openstack router add subnet default-router external-subnet
    openstack router add subnet default-router tenant-subnet

After this, nova boot no longer works. 2682de22-b3d5-41fc-a0e6-339e0a6ec5b1 is the ID of the external network; it is needed because there is now a second network to select from. This is the same network that was used above; I'm not using the newly created network.

    nova boot --image centos-image --flavor baremetal --key default --nic net-id=2682de22-b3d5-41fc-a0e6-339e0a6ec5b1 t-$(date '+%s')

The console of the node trying to boot shows:

    http://172.21.64.7:8088/boot.ipxe... Connection reset....

tcpdump on em1 on the overcloud controller shows the server sending a TCP RST:

    [root@overcloud-controller-0 ironic]# tcpdump -i em1 -A -nn port 8088
    12:01:16.878335 IP 172.21.64.42.48823 > 172.21.64.7.8088: Flags [S], seq 345159258, win 65532, options [nop,nop,TS val 786961 ecr 0,nop,nop,sackOK,nop,wscale 9,mss 1460], length 0
    E..@....@.....@*..@........Z.........E..... ............... ....
    12:01:16.878736 IP 172.21.64.7.8088 > 172.21.64.42.48823: Flags [S.], seq 2933959826, ack 345159259, win 28960, options [mss 1460,sackOK,TS val 74091530 ecr 786961,nop,wscale 7], length 0
    E..<..@.@.b`..@...@*...........[..q g1......... .j. ........
    12:01:16.878856 IP 172.21.64.42.48823 > 172.21.64.7.8088: Flags [P.], seq 1:112, ack 1, win 512, options [nop,nop,TS val 786961 ecr 74091530], length 111
    E.......@..u..@*..@........[............... .....j.
    GET /boot.ipxe HTTP/1.1
    Connection: keep-alive
    User-Agent: iPXE/1.0.0+ (6366fa7a)
    Host: 172.21.64.7:8088
    12:01:16.879096 IP 172.21.64.7.8088 > 172.21.64.42.48823: Flags [R], seq 2933959827, win 0, length 0
    E..(..@.@.....@...@*............P.......
    12:01:18.079703 IP 172.21.64.7.8088 > 172.21.64.42.48823: Flags [S.], seq 2933959826, ack 345159259, win 28960, options [mss 1460,sackOK,TS val 74092732 ecr 786961,nop,wscale 7], length 0
    E..<..@.@.b`..@...@*...........[..q ........... .j..........
    12:01:20.079705 IP 172.21.64.7.8088 > 172.21.64.42.48823: Flags [S.], seq 2933959826, ack 345159259, win 28960, options [mss 1460,sackOK,TS val 74094732 ecr 786961,nop,wscale 7], length 0
    E..<..@.@.b`..@...@*...........[..q ........... .j..........
    12:01:24.079708 IP 172.21.64.7.8088 > 172.21.64.42.48823: Flags [S.], seq 2933959826, ack 345159259, win 28960, options [mss 1460,sackOK,TS val 74098732 ecr 786961,nop,wscale 7], length 0

The same tcpdump on br-ex only shows a subset of the traffic, which may be relevant:

    [root@overcloud-controller-0 ironic]# tcpdump -i br-ex -A -nn port 8088
    11:57:41.714331 IP 172.21.64.42.53969 > 172.21.64.7.8088: Flags [S], seq 827453374, win 65532, options [nop,nop,TS val 783059 ecr 0,nop,nop,sackOK,nop,wscale 9,mss 1460], length 0
    E..@....@.....@*..@.....1Q..........WF..... ............... ....
    11:57:41.714415 IP 172.21.64.7.8088 > 172.21.64.42.53969: Flags [S.], seq 1624820428, ack 827453375, win 28960, options [mss 1460,sackOK,TS val 73876366 ecr 783059,nop,wscale 7], length 0
    E..<..@.@.b`..@...@*....`...1Q....q ........... .gC.........
    11:57:43.115679 IP 172.21.64.7.8088 > 172.21.64.42.53969: Flags [S.], seq 1624820428, ack 827453375, win 28960, options [mss 1460,sackOK,TS val 73877768 ecr 783059,nop,wscale 7], length 0
    E..<..@.@.b`..@...@*....`...1Q....q ........... .gI.........
    11:57:45.115678 IP 172.21.64.7.8088 > 172.21.64.42.53969: Flags [S.], seq 1624820428, ack 827453375, win 28960, options [mss 1460,sackOK,TS val 73879768 ecr 783059,nop,wscale 7], length 0
    E..<..@.@.b`..@...@*....`...1Q....q ........... .gP.........
    11:57:49.115684 IP 172.21.64.7.8088 > 172.21.64.42.53969: Flags [S.], seq 1624820428, ack 827453375, win 28960, options [mss 1460,sackOK,TS val 73883768 ecr 783059,nop,wscale 7], length 0
    E..<..@.@.b`..@...@*....`...1Q....q ........... .g`x........

    openstack-neutron-ml2-9.0.0-1.3.el7ost.noarch
    openstack-neutron-bigswitch-agent-7.0.5-0.20161003220959.509a93d.el7ost.noarch
    openstack-neutron-9.0.0-1.3.el7ost.noarch
    openstack-ironic-conductor-6.2.2-0.20161006174219.500a27d.el7ost.noarch
    openstack-neutron-common-9.0.0-1.3.el7ost.noarch
    openstack-neutron-bigswitch-lldp-7.0.5-0.20161003220959.509a93d.el7ost.noarch
    openstack-ironic-common-6.2.2-0.20161006174219.500a27d.el7ost.noarch
    openstack-ironic-api-6.2.2-0.20161006174219.500a27d.el7ost.noarch
    openstack-neutron-openvswitch-9.0.0-1.3.el7ost.noarch
    openstack-neutron-metering-agent-9.0.0-1.3.el7ost.noarch
    openstack-neutron-lbaas-9.0.0-2.el7ost.noarch
    openstack-neutron-sriov-nic-agent-9.0.0-1.3.el7ost.noarch
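To make the em1 capture easier to read at a glance, here is a minimal sketch that tallies TCP flags per direction from tcpdump summary lines. The helper name `summarize_flags` and the regular expression are illustrative assumptions, not part of any OpenStack or tcpdump tooling, and the sample lines are trimmed copies of the em1 output (options and payload dropped).

```python
import re
from collections import Counter

# Matches tcpdump summary lines of the form:
#   <time> IP <src-ip>.<src-port> > <dst-ip>.<dst-port>: Flags [<flags>], ...
LINE_RE = re.compile(
    r"IP (?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+): Flags \[(?P<flags>[^\]]+)\]"
)

def summarize_flags(lines):
    """Count TCP flag combinations per (source, destination) endpoint pair."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if m:  # payload lines from -A do not match and are skipped
            src = f"{m['src']}:{m['sport']}"
            dst = f"{m['dst']}:{m['dport']}"
            counts[(src, dst, m["flags"])] += 1
    return counts

# Trimmed lines from the em1 capture in this report.
sample = [
    "12:01:16.878335 IP 172.21.64.42.48823 > 172.21.64.7.8088: Flags [S], seq 345159258",
    "12:01:16.878736 IP 172.21.64.7.8088 > 172.21.64.42.48823: Flags [S.], seq 2933959826",
    "12:01:16.878856 IP 172.21.64.42.48823 > 172.21.64.7.8088: Flags [P.], seq 1:112",
    "12:01:16.879096 IP 172.21.64.7.8088 > 172.21.64.42.48823: Flags [R], seq 2933959827",
    "12:01:18.079703 IP 172.21.64.7.8088 > 172.21.64.42.48823: Flags [S.], seq 2933959826",
    "12:01:20.079705 IP 172.21.64.7.8088 > 172.21.64.42.48823: Flags [S.], seq 2933959826",
    "12:01:24.079708 IP 172.21.64.7.8088 > 172.21.64.42.48823: Flags [S.], seq 2933959826",
]

counts = summarize_flags(sample)
for (src, dst, flags), n in sorted(counts.items()):
    print(f"{src} -> {dst} [{flags}] x{n}")
```

In this sample the server side both resets the connection and keeps retransmitting its SYN-ACK, which is the odd pattern the capture shows.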
I don't think the fix is in Ironic itself, so moving to a more generic component for triaging. Bob, mind taking a look please?
It looks like the external-subnet had the same gateway IP as the em1 interface, so while DHCP worked OK, the HTTP requests ended in a TCP RST from overcloud-controller-0 because duplicate packets were seen. Dan made the following changes:

    neutron subnet-update external-subnet --default-gateway 172.21.64.40
    neutron router-interface-add default-router external-subnet

    [stack@host07 ~]$ neutron subnet-show external-subnet
    +-------------------+---------------------------------------------------+
    | Field             | Value                                             |
    +-------------------+---------------------------------------------------+
    | allocation_pools  | {"start": "172.21.64.41", "end": "172.21.64.100"} |
    | cidr              | 172.21.64.0/24                                    |
    | created_at        | 2016-10-20T16:21:20Z                              |
    | description       |                                                   |
    | dns_nameservers   |                                                   |
    | enable_dhcp       | True                                              |
    | gateway_ip        | 172.21.64.40                                      |
    | host_routes       |                                                   |
    | id                | ac92ca1f-7e44-4682-b871-db02be8109b8              |
    | ip_version        | 4                                                 |
    | ipv6_address_mode |                                                   |
    | ipv6_ra_mode      |                                                   |
    | name              | external-subnet                                   |
    | network_id        | c0f4dbd7-126c-4ba9-b02b-3f9c25479936              |
    | project_id        | 45951f2e923c4462ab7fd776a573529d                  |
    | revision_number   | 3                                                 |
    | service_types     |                                                   |
    | subnetpool_id     |                                                   |
    | tenant_id         | 45951f2e923c4462ab7fd776a573529d                  |
    | updated_at        | 2016-10-21T18:19:44Z                              |
    +-------------------+---------------------------------------------------+

After rerunning "nova boot" the BM instance came up:

    [stack@host07 ~]$ openstack baremetal node list
    +-------------------+--------+-------------------+-------------+--------------------+-------------+
    | UUID              | Name   | Instance UUID     | Power State | Provisioning State | Maintenance |
    +-------------------+--------+-------------------+-------------+--------------------+-------------+
    | aede9ccf-d03e-    | host09 | c76a7f10-089a-    | power on    | active             | False       |
    | 4a09-ac6e-        |        | 4df3-ae6f-        |             |                    |             |
    | 5bcdf3da2619      |        | 4c23db59d644      |             |                    |             |
    +-------------------+--------+-------------------+-------------+--------------------+-------------+

    [stack@host07 ~]$ nova list
    +--------------------------------------+-------+--------+------------+-------------+-----------------------+
    | ID                                   | Name  | Status | Task State | Power State | Networks              |
    +--------------------------------------+-------+--------+------------+-------------+-----------------------+
    | 9318bddd-4a3f-439f-bd78-a472b192450f | test1 | ERROR  | -          | NOSTATE     |                       |
    | c76a7f10-089a-4df3-ae6f-4c23db59d644 | test1 | ACTIVE | -          | Running     | external=172.21.64.47 |
    +--------------------------------------+-------+--------+------------+-------------+-----------------------+

On the console there is a login prompt.
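For anyone hitting the same symptom, a quick sanity check of the subnet gateway against addresses already in use on the hosts could look like the sketch below. `gateway_conflicts` is a hypothetical helper, not a neutron API. The pool and the new gateway come from the subnet-show output above; 172.21.64.7 is only assumed to be the em1 address (it is the server address seen in the capture), since the original gateway value is not shown in this report.

```python
import ipaddress

def gateway_conflicts(gateway_ip, host_ips, pool_start, pool_end):
    """Return a list of human-readable problems with a subnet gateway address."""
    gw = ipaddress.ip_address(gateway_ip)
    problems = []
    # A gateway that is also configured on a host interface means two
    # devices answer for the same address.
    if gw in {ipaddress.ip_address(ip) for ip in host_ips}:
        problems.append(f"gateway {gw} is already assigned to a host interface")
    # A gateway inside the allocation pool could later be handed to an instance.
    if ipaddress.ip_address(pool_start) <= gw <= ipaddress.ip_address(pool_end):
        problems.append(f"gateway {gw} falls inside the allocation pool")
    return problems

# Assumed em1 address on the controller (taken from the capture, not confirmed).
host_ips = ["172.21.64.7"]

# Gateway colliding with the assumed host address: one problem reported.
before = gateway_conflicts("172.21.64.7", host_ips, "172.21.64.41", "172.21.64.100")

# Gateway after Dan's change: no problems reported.
after = gateway_conflicts("172.21.64.40", host_ips, "172.21.64.41", "172.21.64.100")

print(before)
print(after)
```

Note that 172.21.64.40 sits just below the allocation pool start (172.21.64.41), so it passes both checks.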
Thanks for looking into it, Dan, Bob!