Bug 1302402 - pxe boot time outs (PXE-E32) on ospd 7 during introspection and during deployment
Status: CLOSED CURRENTRELEASE
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 10.0 (Newton)
Assigned To: Hugh Brock
Shai Revivo
Reported: 2016-01-27 13:33 EST by Alex Krzos
Modified: 2016-10-14 12:20 EDT
Doc Type: Bug Fix
Last Closed: 2016-10-14 12:20:37 EDT
Type: Bug


Description Alex Krzos 2016-01-27 13:33:05 EST
Description of problem:
I am testing the bits from RHN (OSPD 7.2?) for an anticipated deployment on similar hardware. Introspection behaves inconsistently and I cannot get a working deployment: the nodes do not PXE boot reliably.

Version-Release number of selected component (if applicable):
python-rdomanager-oscplugin-0.0.10-22.el7ost.noarch
openstack-ironic-common-2015.1.2-2.el7ost.noarch
python-ironicclient-0.5.1-12.el7ost.noarch
python-ironic-discoverd-1.1.0-8.el7ost.noarch
openstack-ironic-conductor-2015.1.2-2.el7ost.noarch
openstack-ironic-api-2015.1.2-2.el7ost.noarch
openstack-ironic-discoverd-1.1.0-8.el7ost.noarch

How reproducible:
Always

Steps to Reproduce:
1. Follow the deployment instructions on access.redhat.com and attempt introspection while viewing the nodes' consoles.

Actual results:
PXE-E32: TFTP open timeout on each node's console

Expected results:
Nodes boot consistently and run through discovery so that they can then be provisioned.

Additional info:
The network is set up correctly, as previous builds worked on this hardware. The last working version used on this hardware was python-rdomanager-oscplugin-0.0.10-19.el7ost.noarch. I have a lab network, a provisioning network, and a single VLAN-tagged network for all other required networks.

Several attempts at deploying this version on this setup have resulted in an occasionally successful discovery process. However, the subsequent deployment has failed every time. Something is also adding iptables rules that prevent consistent PXE booting:

Chain discovery (1 references)
target     prot opt source               destination
DROP       all  --  anywhere             anywhere             MAC D4:85:64:79:EB:FC
DROP       all  --  anywhere             anywhere             MAC D4:85:64:79:AF:C0
ACCEPT     all  --  anywhere             anywhere

Even if I manually delete them, the rules reappear after 5-10 seconds. This behavior does not occur during introspection; however, I still see timeouts.
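
To correlate these rules with the nodes that time out, the chain listing can be parsed for the MAC addresses that are currently dropped. The following is only a minimal sketch in Python, assuming the listing format shown above; `blocked_macs`, `RULE_RE`, and `SAMPLE` are hypothetical names introduced for illustration and are not part of any OSP director component.

```python
import re

# Matches rule lines from the listing, for example:
# DROP       all  --  anywhere             anywhere             MAC D4:85:64:79:EB:FC
RULE_RE = re.compile(
    r"^(?P<target>[A-Z]+)\s+.*\bMAC\s+"
    r"(?P<mac>(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2})\s*$"
)


def blocked_macs(listing):
    """Return the MAC addresses that have a DROP rule in a chain listing."""
    macs = []
    for line in listing.splitlines():
        match = RULE_RE.match(line.strip())
        if match and match.group("target") == "DROP":
            macs.append(match.group("mac").upper())
    return macs


# Sample input copied from the chain listing in this report.
SAMPLE = """\
Chain discovery (1 references)
target     prot opt source               destination
DROP       all  --  anywhere             anywhere             MAC D4:85:64:79:EB:FC
DROP       all  --  anywhere             anywhere             MAC D4:85:64:79:AF:C0
ACCEPT     all  --  anywhere             anywhere
"""

if __name__ == "__main__":
    for mac in blocked_macs(SAMPLE):
        print(mac)
```

Comparing the returned addresses with the provisioning NIC MACs of the nodes that hit PXE-E32 would show whether the timeouts line up with the dropped hosts.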
Comment 2 Jaromir Coufal 2016-01-27 17:06:45 EST
Hi Lucas, on MikeO's behalf -- would you mind taking a quick look at this one and investigating what the cause might be?
Comment 4 Mike Burns 2016-04-07 17:07:13 EDT
This bug did not make the OSP 8.0 release.  It is being deferred to OSP 10.
Comment 6 Dmitry Tantsur 2016-10-14 11:41:12 EDT
Hi! Could you please check if the issue can be reproduced on OSP 10?
Comment 7 Alex Krzos 2016-10-14 11:51:32 EDT
(In reply to Dmitry Tantsur from comment #6)
> Hi! Could you please check if the issue can be reproduced on OSP 10?

I have not seen this (or reproduced it) on any of the OSP 10 (Newton) deployments I have done via director.
Comment 8 Dmitry Tantsur 2016-10-14 12:20:37 EDT
Thanks! I suspect it was fixed after we updated the shipped firmware.
