Bug 1330219 - Unable to boot nodes when deploying Overcloud (PXE)
Summary: Unable to boot nodes when deploying Overcloud (PXE)
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-ironic
Version: 8.0 (Liberty)
Hardware: All
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Lucas Alvares Gomes
QA Contact: Raviv Bar-Tal
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-04-25 16:20 UTC by Guillaume Chenuet
Modified: 2016-04-26 10:19 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-04-26 10:19:20 UTC
Target Upstream Version:



Description Guillaume Chenuet 2016-04-25 16:20:29 UTC
Description of problem:
I am currently on-site with a Spanish customer and am facing a problem with PXE/DHCP boot.

When I try to deploy the overcloud, the nodes get stuck on the PXE screen.
I'm able to introspect all nodes without any problems.

I'm using a Huawei E9000 Blade with CH121 V3 nodes.
The network cards are MZ512 mezzanine cards configured as simple NICs.

I have already tried several workarounds:
- Bootif-fix script
- Use PXE instead of iPXE
- Use an up-to-date undionly.kpxe image

When deploying, I can see DHCP DISCOVER and OFFER messages with tcpdump,
but no REQUEST or ACK messages.

            Domain-Name-Server Option 6, length 4: 10.50.228.130
17:34:37.082801 e0:36:76:d3:44:1e > Broadcast, ethertype IPv4 (0x0800), length 590: (tos 0x0, ttl 20, id 0, offset 0, flags [none], proto UDP (17), length 576)
    0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from e0:36:76:d3:44:1e, length 548, xid 0x77d3441e, secs 4, Flags [Broadcast]
          Client-Ethernet-Address e0:36:76:d3:44:1e
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Discover
            Parameter-Request Option 55, length 36:
              Subnet-Mask, Time-Zone, Default-Gateway, Time-Server
              IEN-Name-Server, Domain-Name-Server, RL, Hostname
              BS, Domain-Name, SS, RP
              EP, RSZ, TTL, BR
              YD, YS, NTP, Vendor-Option
              Requested-IP, Lease-Time, Server-ID, RN
              RB, Vendor-Class, TFTP, BF
              Option 128, Option 129, Option 130, Option 131
              Option 132, Option 133, Option 134, Option 135
            MSZ Option 57, length 2: 1260
            GUID Option 97, length 17: 0.158.248.21.40.210.29.178.17.149.60.0.24.35.229.246.139
            ARCH Option 93, length 2: 0
            NDI Option 94, length 3: 1.2.1
            Vendor-Class Option 60, length 32: "PXEClient:Arch:00000:UNDI:002001"
17:34:37.083670 fa:16:3e:1a:ba:d0 > Broadcast, ethertype IPv4 (0x0800), length 382: (tos 0xc0, ttl 64, id 15790, offset 0, flags [none], proto UDP (17), length 368)
    10.184.20.5.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 340, xid 0x77d3441e, secs 4, Flags [Broadcast]
          Your-IP 10.184.20.7
          Server-IP 10.184.20.1
          Client-Ethernet-Address e0:36:76:d3:44:1e
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Offer
            Server-ID Option 54, length 4: 10.184.20.5
            Lease-Time Option 51, length 4: 86400
            RN Option 58, length 4: 43200
            RB Option 59, length 4: 75600
            Subnet-Mask Option 1, length 4: 255.255.255.0
            BR Option 28, length 4: 10.184.20.255
            Hostname Option 12, length 16: "host-10-184-20-7"
            TFTP Option 66, length 12: "10.184.20.1^@"
            BF Option 67, length 14: "undionly.kpxe^@"
            Default-Gateway Option 3, length 4: 10.184.20.1
            Domain-Name-Server Option 6, length 4: 10.50.228.130
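
To check a longer capture for the same symptom, a small script can group the DHCP message types by transaction ID and report which steps of the DISCOVER/OFFER/REQUEST/ACK exchange never appear. This is only a sketch: it assumes the `tcpdump -v` text layout shown above, and the function names and the spelling of the message types ("Request", "ACK") are my assumptions, not something taken from this capture.

```python
import re
from collections import defaultdict

# Patterns matching the verbose tcpdump layout shown in this report (assumption).
XID_RE = re.compile(r"xid (0x[0-9a-fA-F]+)")
MSG_RE = re.compile(r"DHCP-Message Option 53, length 1: (\w+)")


def dhcp_messages_by_xid(dump_text):
    """Map each transaction id (xid) to the DHCP message types seen for it."""
    seen = defaultdict(list)
    current_xid = None
    for line in dump_text.splitlines():
        xid = XID_RE.search(line)
        if xid:
            # A BOOTP/DHCP header line starts a new packet.
            current_xid = xid.group(1)
            continue
        msg = MSG_RE.search(line)
        if msg and current_xid is not None:
            seen[current_xid].append(msg.group(1))
    return dict(seen)


def missing_steps(messages):
    """Return the handshake steps that are absent from a list of message types."""
    expected = ["Discover", "Offer", "Request", "ACK"]
    return [step for step in expected if step not in messages]


# Trimmed copy of the two packets from the capture above.
SAMPLE = """\
    0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from e0:36:76:d3:44:1e, length 548, xid 0x77d3441e, secs 4, Flags [Broadcast]
            DHCP-Message Option 53, length 1: Discover
    10.184.20.5.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 340, xid 0x77d3441e, secs 4, Flags [Broadcast]
            DHCP-Message Option 53, length 1: Offer
"""

if __name__ == "__main__":
    for xid, messages in dhcp_messages_by_xid(SAMPLE).items():
        print(xid, "seen:", messages, "missing:", missing_steps(messages))
```

For the capture above this reports Discover and Offer as seen and Request and ACK as missing, which matches what I observe on the wire.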

Version-Release number of selected component (if applicable):
RHOSP8
openstack-puppet-modules-7.0.17-1.el7ost.noarch
openstack-ceilometer-alarm-5.0.2-2.el7ost.noarch
openstack-utils-2014.2-1.el7ost.noarch
openstack-tripleo-puppet-elements-0.0.5-1.el7ost.noarch
openstack-neutron-openvswitch-7.0.1-15.el7ost.noarch
openstack-heat-api-5.0.1-5.el7ost.noarch
openstack-swift-proxy-2.5.0-2.el7ost.noarch
openstack-tripleo-0.0.7-1.el7ost.noarch
openstack-nova-common-12.0.2-5.el7ost.noarch
openstack-ceilometer-common-5.0.2-2.el7ost.noarch
openstack-ceilometer-collector-5.0.2-2.el7ost.noarch
python-openstackclient-1.7.2-1.el7ost.noarch
openstack-ironic-common-4.2.2-4.el7ost.noarch
openstack-selinux-0.6.58-1.el7ost.noarch
openstack-glance-11.0.1-4.el7ost.noarch
openstack-nova-compute-12.0.2-5.el7ost.noarch
openstack-swift-2.5.0-2.el7ost.noarch
openstack-ironic-inspector-2.2.5-2.el7ost.noarch
openstack-heat-engine-5.0.1-5.el7ost.noarch
openstack-nova-scheduler-12.0.2-5.el7ost.noarch
openstack-swift-container-2.5.0-2.el7ost.noarch
openstack-nova-api-12.0.2-5.el7ost.noarch
openstack-ceilometer-notification-5.0.2-2.el7ost.noarch
openstack-tripleo-heat-templates-0.8.14-5.el7ost.noarch
openstack-ceilometer-polling-5.0.2-2.el7ost.noarch
openstack-heat-templates-0-0.1.20151019.el7ost.noarch
openstack-tripleo-image-elements-0.9.9-1.el7ost.noarch
openstack-neutron-ml2-7.0.1-15.el7ost.noarch
openstack-ironic-api-4.2.2-4.el7ost.noarch
openstack-heat-api-cfn-5.0.1-5.el7ost.noarch
openstack-keystone-8.0.1-1.el7ost.noarch
openstack-swift-object-2.5.0-2.el7ost.noarch
openstack-nova-cert-12.0.2-5.el7ost.noarch
openstack-tripleo-heat-templates-kilo-0.8.14-5.el7ost.noarch
openstack-ceilometer-central-5.0.2-2.el7ost.noarch
openstack-tripleo-common-0.3.1-1.el7ost.noarch
openstack-ceilometer-api-5.0.2-2.el7ost.noarch
openstack-heat-api-cloudwatch-5.0.1-5.el7ost.noarch
openstack-swift-plugin-swift3-1.9-1.el7ost.noarch
openstack-neutron-common-7.0.1-15.el7ost.noarch
openstack-neutron-7.0.1-15.el7ost.noarch
openstack-ironic-conductor-4.2.2-4.el7ost.noarch
openstack-nova-conductor-12.0.2-5.el7ost.noarch
openstack-heat-common-5.0.1-5.el7ost.noarch
openstack-swift-account-2.5.0-2.el7ost.noarch

How reproducible:

Set up an RHOSP 8 environment and try to deploy the overcloud.


Steps to Reproduce:
1. Install RHOSP8
2. Introspect nodes
3. Deploy an Overcloud

Actual results:

The nodes keep waiting for an answer from the undercloud DHCP server.

Expected results:

A fully deployed overcloud.


Additional info:

Comment 2 Guillaume Chenuet 2016-04-26 08:48:30 UTC
Hello,

More information:

- /httpboot/pxelinux.cfg/ has a file for the correct MAC address (not the second NIC).
The configuration is:
[root@lm2puc01 httpboot]# cat pxelinux.cfg/e0-36-76-d3-4f-4c
#!ipxe

dhcp

goto deploy

:deploy
kernel http://10.184.20.1:8088/65f19384-0076-4f5f-b1c4-a45df83ce1af/deploy_kernel selinux=0 disk=cciss/c0d0,sda,hda,vda iscsi_target_iqn=iqn.2008-10.org.openstack:65f19384-0076-4f5f-b1c4-a45df83ce1af deployment_id=65f19384-0076-4f5f-b1c4-a45df83ce1af deployment_key=LAFSRDH4XZYW0QONR2FH5QXJRJ900QJM ironic_api_url=http://10.184.20.1:6385 troubleshoot=0 text nofb nomodeset vga=normal boot_option=local ip=${ip}:${next-server}:${gateway}:${netmask} BOOTIF=${net0/mac}  ipa-api-url=http://10.184.20.1:6385 ipa-driver-name=pxe_ipmitool boot_mode=bios initrd=deploy_ramdisk coreos.configdrive=0

initrd http://10.184.20.1:8088/65f19384-0076-4f5f-b1c4-a45df83ce1af/deploy_ramdisk
boot

:boot_partition
kernel http://10.184.20.1:8088/65f19384-0076-4f5f-b1c4-a45df83ce1af/kernel root={{ ROOT }} ro text nofb nomodeset vga=normal
initrd http://10.184.20.1:8088/65f19384-0076-4f5f-b1c4-a45df83ce1af/ramdisk
boot

:boot_whole_disk
sanboot --no-describe
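
To double-check that the config file belongs to the NIC that is actually sending the DHCP requests, the MAC address from tcpdump can be converted to the file name used above. This is a minimal sketch based only on the naming visible in this report (lowercase MAC with dashes); the helper name is hypothetical, and other setups may differ, for example classic PXELINUX prefixes the hardware type `01-`.

```python
def pxe_config_name(mac):
    """Convert a MAC address such as 'E0:36:76:D3:4F:4C' to the
    per-node config file name seen in this report (assumed convention)."""
    return mac.strip().lower().replace(":", "-")


if __name__ == "__main__":
    # MAC taken from the Client-Ethernet-Address field in the capture.
    print(pxe_config_name("e0:36:76:d3:4f:4c"))
```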

- I can see DHCP frames (only DISCOVER and OFFER) on the OVS interface:
sudo ip netns exec qdhcp-651da49f-57ab-4050-9a64-1b0dcc98d4bc tcpdump port 67 or port 68 -vnes0

10:41:58.519856 e0:36:76:d3:4f:4c > Broadcast, ethertype IPv4 (0x0800), length 590: (tos 0x0, ttl 20, id 1, offset 0, flags [none], proto UDP (17), length 576)
    0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from e0:36:76:d3:4f:4c, length 548, xid 0x78d34f4c, secs 6, Flags [Broadcast]
          Client-Ethernet-Address e0:36:76:d3:4f:4c
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Discover
            Parameter-Request Option 55, length 36:
              Subnet-Mask, Time-Zone, Default-Gateway, Time-Server
              IEN-Name-Server, Domain-Name-Server, RL, Hostname
              BS, Domain-Name, SS, RP
              EP, RSZ, TTL, BR
              YD, YS, NTP, Vendor-Option
              Requested-IP, Lease-Time, Server-ID, RN
              RB, Vendor-Class, TFTP, BF
              Option 128, Option 129, Option 130, Option 131
              Option 132, Option 133, Option 134, Option 135
            MSZ Option 57, length 2: 1260
            GUID Option 97, length 17: 0.200.53.145.39.210.29.178.17.144.217.0.24.35.229.246.139
            ARCH Option 93, length 2: 0
            NDI Option 94, length 3: 1.2.1
            Vendor-Class Option 60, length 32: "PXEClient:Arch:00000:UNDI:002001"
10:41:58.520179 fa:16:3e:1a:ba:d0 > Broadcast, ethertype IPv4 (0x0800), length 383: (tos 0xc0, ttl 64, id 11578, offset 0, flags [none], proto UDP (17), length 369)
    10.184.20.5.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 341, xid 0x78d34f4c, secs 6, Flags [Broadcast]
          Your-IP 10.184.20.12
          Server-IP 10.184.20.1
          Client-Ethernet-Address e0:36:76:d3:4f:4c
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Offer
            Server-ID Option 54, length 4: 10.184.20.5
            Lease-Time Option 51, length 4: 86400
            RN Option 58, length 4: 43200
            RB Option 59, length 4: 75600
            Subnet-Mask Option 1, length 4: 255.255.255.0
            BR Option 28, length 4: 10.184.20.255
            Hostname Option 12, length 17: "host-10-184-20-12"
            BF Option 67, length 14: "undionly.kpxe^@"
            TFTP Option 66, length 12: "10.184.20.1^@"
            Default-Gateway Option 3, length 4: 10.184.20.1
            Domain-Name-Server Option 6, length 4: 8.8.8.8

- The information is correctly filled in the dnsmasq files:
--/var/lib/neutron/dhcp/<UUID>/opts:
[stack@lm2puc01 ~]$ cat /var/lib/neutron/dhcp/651da49f-57ab-4050-9a64-1b0dcc98d4bc/opts
tag:tag0,option:dns-server,8.8.8.8
tag:tag0,option:classless-static-route,169.254.169.254/32,10.184.20.1,0.0.0.0/0,10.184.20.1
tag:tag0,249,169.254.169.254/32,10.184.20.1,0.0.0.0/0,10.184.20.1
tag:tag0,option:router,10.184.20.1
tag:e1f083e9-b512-4f43-b1de-a95c19360b82,option:server-ip-address,10.184.20.1
tag:e1f083e9-b512-4f43-b1de-a95c19360b82,option:tftp-server,10.184.20.1
tag:e1f083e9-b512-4f43-b1de-a95c19360b82,tag:!ipxe,option:bootfile-name,undionly.kpxe
tag:e1f083e9-b512-4f43-b1de-a95c19360b82,tag:ipxe,option:bootfile-name,http://10.184.20.1:8088/boot.ipxe
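
The last two lines of the opts file are what select the boot file: a client without the `ipxe` tag gets undionly.kpxe, and a client that already runs iPXE gets the boot.ipxe URL. The sketch below models that selection in a simplified way. It is not how dnsmasq is implemented, the function names are hypothetical, and how the `ipxe` tag gets set is configured elsewhere in dnsmasq and is not shown in this report.

```python
def bootfile_rules(opts_text):
    """Return (tags, bootfile) pairs from dnsmasq option lines."""
    rules = []
    for line in opts_text.splitlines():
        fields = line.strip().split(",")
        tags = [f[len("tag:"):] for f in fields if f.startswith("tag:")]
        rest = [f for f in fields if not f.startswith("tag:")]
        if len(rest) >= 2 and rest[0] == "option:bootfile-name":
            rules.append((tags, ",".join(rest[1:])))
    return rules


def bootfile_for(rules, client_tags):
    """Pick the first bootfile whose tag conditions all hold for the client.

    A tag written as '!name' matches only when 'name' is NOT set.
    """
    for tags, bootfile in rules:
        matches = all(
            (t[1:] not in client_tags) if t.startswith("!") else (t in client_tags)
            for t in tags
        )
        if matches:
            return bootfile
    return None


# The two bootfile-name lines from the opts file above.
OPTS = """\
tag:e1f083e9-b512-4f43-b1de-a95c19360b82,tag:!ipxe,option:bootfile-name,undionly.kpxe
tag:e1f083e9-b512-4f43-b1de-a95c19360b82,tag:ipxe,option:bootfile-name,http://10.184.20.1:8088/boot.ipxe
"""
PORT_TAG = "e1f083e9-b512-4f43-b1de-a95c19360b82"

if __name__ == "__main__":
    rules = bootfile_rules(OPTS)
    print("firmware PXE client:", bootfile_for(rules, {PORT_TAG}))
    print("iPXE client:", bootfile_for(rules, {PORT_TAG, "ipxe"}))
```

The OFFER captured above carries undionly.kpxe, which is consistent with the first rule: the request comes from the firmware PXE client, before iPXE has been loaded.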

Comment 3 Guillaume Chenuet 2016-04-26 10:19:20 UTC
The problem was due to the RHEV hypervisor where my VM is running.

This article solved my problem: https://access.redhat.com/solutions/2060423

Thanks,
Guillaume

