Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1302402

Summary:	PXE boot timeouts (PXE-E32) on OSPd 7 during introspection and during deployment
Product:	Red Hat OpenStack	Reporter:	Alex Krzos <akrzos>
Component:	rhosp-director	Assignee:	Hugh Brock <hbrock>
Status:	CLOSED CURRENTRELEASE	QA Contact:	Shai Revivo <srevivo>
Severity:	high	Docs Contact:
Priority:	high
Version:	7.0 (Kilo)	CC:	akrzos, dtantsur, jcoufal, jtaleric, lmartins, mburns, rhel-osp-director-maint
Target Milestone:	---
Target Release:	10.0 (Newton)
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2016-10-14 16:20:37 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Alex Krzos 2016-01-27 18:33:05 UTC

Description of problem:
I am testing the bits from RHN (OSPd 7.2?) for an anticipated deployment on similar hardware. I cannot get consistent behavior from introspection and cannot get a working deployment. I cannot get nodes past PXE boot in a consistent manner.

Version-Release number of selected component (if applicable):
python-rdomanager-oscplugin-0.0.10-22.el7ost.noarch
openstack-ironic-common-2015.1.2-2.el7ost.noarch
python-ironicclient-0.5.1-12.el7ost.noarch
python-ironic-discoverd-1.1.0-8.el7ost.noarch
openstack-ironic-conductor-2015.1.2-2.el7ost.noarch
openstack-ironic-api-2015.1.2-2.el7ost.noarch
openstack-ironic-discoverd-1.1.0-8.el7ost.noarch

How reproducible:
Always

Steps to Reproduce:
1. Follow the deployment instructions on access.redhat.com and attempt introspection while viewing the nodes' consoles.

Actual results:
PXE-E32: TFTP open timeout on each node's console
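
PXE-E32 means the PXE client sent a TFTP open (read) request and got no reply. One way to tell "the TFTP server is not answering at all" from "it answers, but not to the nodes" is to send the same kind of read request from another host on the provisioning network. The following is only a minimal sketch, not something taken from the director tooling: the `probe` helper, the server address, and the file name are hypothetical placeholders, and the packet layout follows the TFTP read-request format (opcode 1, file name, NUL, mode, NUL).

```python
import socket
import struct

# TFTP opcodes used in this sketch.
OP_RRQ, OP_DATA, OP_ERROR = 1, 3, 5


def build_rrq(filename: str, mode: str = "octet") -> bytes:
    """Build a TFTP read request: opcode, file name, NUL, mode, NUL."""
    return (
        struct.pack("!H", OP_RRQ)
        + filename.encode("ascii") + b"\0"
        + mode.encode("ascii") + b"\0"
    )


def describe_reply(packet: bytes) -> str:
    """Summarise the first packet a TFTP server sends back."""
    if len(packet) < 4:
        return "malformed"
    opcode, value = struct.unpack("!HH", packet[:4])
    if opcode == OP_DATA:
        return f"data block {value} ({len(packet) - 4} bytes)"
    if opcode == OP_ERROR:
        message = packet[4:].split(b"\0", 1)[0].decode("ascii", "replace")
        return f"error {value}: {message}"
    return f"unexpected opcode {opcode}"


def probe(server: str, filename: str, timeout: float = 5.0) -> str:
    """Send one read request to UDP port 69 and report what comes back.

    Returns "timeout" when no reply arrives, which is the condition the
    PXE ROM reports as PXE-E32.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_rrq(filename), (server, 69))
        try:
            packet, _ = sock.recvfrom(4 + 512)
        except socket.timeout:
            return "timeout"
        return describe_reply(packet)


if __name__ == "__main__":
    # Placeholder values: substitute the undercloud provisioning IP and
    # the boot file name that the DHCP server actually hands out.
    print(probe("192.0.2.1", "pxelinux.0"))
```

A "timeout" from a host that is not subject to any MAC-based filtering would point at the TFTP service or the network path; a data or error reply would suggest the server is reachable and the problem is specific to the nodes.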

Expected results:
Nodes boot consistently and run through discovery so that they can then be provisioned.

Additional info:
The network is set up correctly, as previous builds worked on this hardware. The last working version used on this hardware was python-rdomanager-oscplugin-0.0.10-19.el7ost.noarch. I have a lab network, a provisioning network, and a single VLAN-ed network for all other required networks.

Several attempts at deploying this version on this setup have resulted in an occasionally successful discovery process. However, the deployment that follows has failed every time. Something is also adding iptables rules that prevent PXE booting from working consistently:

Chain discovery (1 references)
target     prot opt source               destination
DROP       all  --  anywhere             anywhere             MAC D4:85:64:79:EB:FC
DROP       all  --  anywhere             anywhere             MAC D4:85:64:79:AF:C0
ACCEPT     all  --  anywhere             anywhere

Even if I manually delete them, the rules reappear after 5-10 seconds. This behavior does not occur during introspection; however, I still see timeouts.
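
To watch which nodes are being blocked at a given moment, the chain listing above can be reduced to the list of MAC addresses that have a DROP rule. This is a rough sketch only: `blocked_macs` is a hypothetical helper, and it assumes the listing has the same shape as the output pasted in this report (target in the first column, `MAC <address>` at the end of the line).

```python
import re
from typing import List

# Matches a rule line whose last field is "MAC xx:xx:xx:xx:xx:xx".
MAC_RULE = re.compile(
    r"^(?P<target>\S+)\s+.*\bMAC\s+"
    r"(?P<mac>(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2})\s*$"
)


def blocked_macs(chain_listing: str) -> List[str]:
    """Return the MAC addresses that have a DROP rule in the listing."""
    macs = []
    for line in chain_listing.splitlines():
        match = MAC_RULE.match(line.strip())
        if match and match.group("target") == "DROP":
            macs.append(match.group("mac").upper())
    return macs


# The chain contents as pasted in this report.
SAMPLE = """\
Chain discovery (1 references)
target prot opt source destination
DROP all -- anywhere anywhere MAC D4:85:64:79:EB:FC
DROP all -- anywhere anywhere MAC D4:85:64:79:AF:C0
ACCEPT all -- anywhere anywhere
"""

if __name__ == "__main__":
    for mac in blocked_macs(SAMPLE):
        print(mac)
```

Comparing that list against the MAC addresses of the nodes being deployed would show whether the reappearing rules target exactly the nodes that time out.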

Comment 2 Jaromir Coufal 2016-01-27 22:06:45 UTC

Hi Lucas, on MikeO's behalf -- do you mind having a quick look into this one and investigating what the cause might be?

Comment 4 Mike Burns 2016-04-07 21:07:13 UTC

This bug did not make the OSP 8.0 release.  It is being deferred to OSP 10.

Comment 6 Dmitry Tantsur 2016-10-14 15:41:12 UTC

Hi! Could you please check if the issue can be reproduced on OSP 10?

Comment 7 Alex Krzos 2016-10-14 15:51:32 UTC

(In reply to Dmitry Tantsur from comment #6)
> Hi! Could you please check if the issue can be reproduced on OSP 10?

I have not seen this (or reproduced it) on any of the OSP 10 (Newton) deployments I have done via director.

Comment 8 Dmitry Tantsur 2016-10-14 16:20:37 UTC

Thanks! I suspect it was fixed after we updated the shipped firmware.