Bug 1238133

Summary: Proper error output when deployment fails due to network connectivity issues in network isolation setup
Product: Red Hat OpenStack
Component: rhosp-director
Version: 7.0 (Kilo)
Status: CLOSED DUPLICATE
Severity: high
Priority: medium
Target Milestone: ---
Target Release: 12.0 (Pike)
Hardware: Unspecified
OS: Unspecified
Reporter: Marius Cornea <mcornea>
Assignee: Ana Krivokapic <akrivoka>
QA Contact: Shai Revivo <srevivo>
CC: akrivoka, ebarrera, ggillies, hbrock, jslagle, mburns, rhel-osp-director-maint, sbaker
Keywords: Triaged, UserExperience
Doc Type: Bug Fix
Type: Bug
Last Closed: 2016-12-13 20:45:34 UTC

Description Marius Cornea 2015-07-01 09:27:31 UTC
Description of problem:

I was trying an overcloud deployment (3 controller, 1 compute, 1 Ceph) on bare metal with network isolation, and after a long period it failed with the following error message:

ERROR: openstack ERROR: Authentication failed. Please try again with option --include-password or export HEAT_INCLUDE_PASSWORD=1
Authentication required

After further checks I discovered that connectivity between nodes on the storage network was broken. I disabled the storage network, which moved its traffic onto the provisioning network, and the deployment succeeded.

However, the error that I got is misleading. It leads me to think that if one node's networking is not configured correctly, the deployment will fail without a proper message pointing to which node (which IP, on which network) timed out. This will make troubleshooting difficult, especially in large deployments. I expect poor switch port configuration or incorrect cabling to happen, and we should be able to point this out when the deployment fails.
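
To illustrate the kind of output requested here, the following is a minimal sketch, not part of director, that probes each node's address on each isolated network and reports exactly which node/IP/network combination is unreachable. The function names, the node inventory layout, and the choice of TCP port 22 as the probe target are all assumptions made for the example.

```python
import socket
from typing import Callable, Dict, List, Tuple

# Hypothetical inventory layout: node name -> {network name -> IP address}.
Inventory = Dict[str, Dict[str, str]]
Failure = Tuple[str, str, str]  # (node, network, ip)


def tcp_probe(ip: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to ip:port succeeds within timeout.

    Port 22 (SSH) is an assumption; any port known to listen would do.
    """
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def unreachable_endpoints(
    nodes: Inventory, probe: Callable[[str], bool] = tcp_probe
) -> List[Failure]:
    """Probe every node address and collect the ones that do not answer."""
    failures: List[Failure] = []
    for node, networks in nodes.items():
        for network, ip in networks.items():
            if not probe(ip):
                failures.append((node, network, ip))
    return failures


def format_report(failures: List[Failure]) -> str:
    """Render a message that names each failing node, IP and network."""
    if not failures:
        return "All node addresses reachable on every network."
    lines = ["Unreachable node addresses:"]
    lines += [
        f"  {node}: {ip} on {network} network" for node, network, ip in failures
    ]
    return "\n".join(lines)
```

Run against an inventory of the overcloud nodes before or after a failed deployment, a report like this would point directly at the broken storage network instead of surfacing only the authentication error.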

Comment 8 James Slagle 2016-10-14 15:05:06 UTC
this is not a regression, removing from osp10

Further, I believe we may have fixed this by having Heat reauthenticate to Keystone before the token expires. Steve, can you confirm whether that was the root cause of this error?

Comment 9 Steve Baker 2016-12-13 20:45:34 UTC
This looks like a dupe of a bug fixed at about the same time as this one was raised.

The fix was to raise the token expiry, but James could be right about token re-authentication happening now too.
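
For readers unfamiliar with the re-authentication idea mentioned above, here is a generic sketch of refreshing a token shortly before it expires, so that a long-running operation never presents an expired one. This is not Heat's or Keystone's actual code; the class name, the `fetch` callable, and the 300-second safety margin are hypothetical.

```python
import time
from typing import Callable, Optional, Tuple


class RefreshingToken:
    """Cache a token and re-fetch it once it is close to expiring."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],
        margin: float = 300.0,
        clock: Callable[[], float] = time.time,
    ) -> None:
        # fetch() is a hypothetical callable returning (token, expires_at),
        # where expires_at is a timestamp on the same scale as clock().
        self._fetch = fetch
        self._margin = margin
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return a token that is valid for at least `margin` more seconds."""
        if self._token is None or self._clock() >= self._expires_at - self._margin:
            self._token, self._expires_at = self._fetch()
        return self._token
```

Raising the token expiry, by contrast, only lengthens the window; a deployment that runs longer than the new expiry would still hit the same authentication error.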

*** This bug has been marked as a duplicate of bug 1235908 ***