Bug 1308614

Summary: Intermittent 'end of file reached (EOFError)' when re-using the Director for another deployment
Product: Red Hat Quickstart Cloud Installer
Reporter: Antonin Pagac <apagac>
Component: Installation - RHELOSP
Assignee: John Matthews <jmatthew>
Status: CLOSED WONTFIX
QA Contact: Dave Johnson <dajohnso>
Severity: unspecified
Docs Contact: Dan Macpherson <dmacpher>
Priority: unspecified
Version: 1.0
CC: bthurber
Target Milestone: ga
Keywords: Triaged
Target Release: 1.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-06-15 19:05:16 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
first error traceback from production.log none
second error traceback from production.log none

Description Antonin Pagac 2016-02-15 15:46:54 UTC
Description of problem:
I have seen this error a couple of times now in deployments that include RHELOSP and re-use the Director. From the Errors tab of the relevant subtask:

Action:
Actions::Fusor::Deployment::OpenStack::OvercloudConfiguration

Input:
{"deployment_id"=>4, "locale"=>"en"}

Output:
{}

Exception:
Excon::Errors::SocketError: end of file reached (EOFError)

This is not 100% reproducible and happens randomly. After undeploying the Overcloud and running another RHELOSP deployment (again re-using the Director), it usually goes away. When the error occurs, the Director reports 'Stack CREATE completed successfully' and the Overcloud UI is available. I am attaching an excerpt from production.log with the traceback.
After resuming the failing subtask, the error changes to:

Error Expected([200]) <=> Actual(409 Conflict) response => #<Excon::Response:0x0000000f45afd8 @data={:body=>"{\"error\": {\"message\": \"Conflict occurred attempting to store project - Duplicate Entry\", \"code\": 409, \"title\": \"Conflict\"}}", :headers=>{"Vary"=>"X-Auth-Token", "Content-Type"=>"application/json", "Content-Length"=>"123", "X-Openstack-Request-Id"=>"req-2a8d3b9b-2f42-4b90-8733-cde1c029f064", "Date"=>"Mon, 15 Feb 2016 15:30:04 GMT"}, :status=>409, :remote_ip=>"192.0.2.13", :local_port=>42289, :local_address=>"192.168.252.1"}, @body="{\"error\": {\"message\": \"Conflict occurred attempting to store project - Duplicate Entry\", \"code\": 409, \"title\": \"Conflict\"}}", @headers={"Vary"=>"X-Auth-Token", "Content-Type"=>"application/json", "Content-Length"=>"123", "X-Openstack-Request-Id"=>"req-2a8d3b9b-2f42-4b90-8733-cde1c029f064", "Date"=>"Mon, 15 Feb 2016 15:30:04 GMT"}, @status=409, @remote_ip="192.0.2.13", @local_port=42289, @local_address="192.168.252.1"> 

I am attaching a second excerpt from production.log with the traceback.
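
The 409 on resume indicates that the project already existed when the subtask was re-run, possibly because the first attempt created it before the connection dropped. A minimal sketch, in plain Ruby, of how a caller might recognise this specific response and treat it as "already exists" rather than as a failure. The helper name `duplicate_entry_conflict?` is hypothetical and is not part of fusor_server, fog, or Excon; the body is copied from the response above.

```ruby
require 'json'

# Hypothetical helper (not part of fusor_server, fog, or Excon):
# returns true when an HTTP status/body pair looks like the
# "Duplicate Entry" conflict quoted in this report.
def duplicate_entry_conflict?(status, body)
  return false unless status == 409

  parsed = JSON.parse(body)
  return false unless parsed.is_a?(Hash)

  error = parsed['error']
  return false unless error.is_a?(Hash)

  error['title'] == 'Conflict' &&
    error['message'].to_s.include?('Duplicate Entry')
rescue JSON::ParserError
  # A body that is not valid JSON cannot be identified as this conflict.
  false
end

# Body copied from the 409 response in this report.
body = '{"error": {"message": "Conflict occurred attempting to store ' \
       'project - Duplicate Entry", "code": 409, "title": "Conflict"}}'

puts duplicate_entry_conflict?(409, body)
puts duplicate_entry_conflict?(200, body)
```

Whether it is safe to skip the create step after such a conflict depends on what else the step does; this only shows the detection.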

Version-Release number of selected component (if applicable):
TP2 RC9
RHCI-6.0-RHEL-7-20160208.1-RHCI-x86_64-dvd1.iso
RHCIOOO-7-RHEL-7-20160127.0-RHCIOOO-x86_64-dvd1.iso

How reproducible:
Happens randomly; I have seen it at least 3 times

Steps to Reproduce:
1. Do a deployment involving RHELOSP (RHELOSP alone, RHELOSP+CFME or all in one)
2. The error may appear when the RHELOSP deployment reaches 90.7%

Actual results:
Deployment of RHELOSP stops at 90.7% but the Overcloud UI is available and working

Expected results:
RHELOSP deploys successfully and the deployment continues with any other selected components

Additional info:
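
The intermittent `Excon::Errors::SocketError` described above looks like a transient connection drop, so one possible mitigation is to retry the failing call a bounded number of times. The following is a sketch under stated assumptions, not the fusor_server implementation: `TransientSocketError` is a stand-in class so the example runs without the excon gem, and `with_socket_retry` is a hypothetical helper.

```ruby
# Stand-in for Excon::Errors::SocketError so the sketch runs without the
# excon gem; real code would rescue the Excon class instead.
class TransientSocketError < StandardError; end

# Hypothetical helper: run the block, retrying when it raises
# TransientSocketError, up to `attempts` total tries, sleeping `delay`
# seconds between tries. The last failure is re-raised.
def with_socket_retry(attempts: 3, delay: 0)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue TransientSocketError
    raise if tries >= attempts
    sleep(delay)
    retry
  end
end

# Simulated call that fails once with EOF and then succeeds.
result = with_socket_retry(attempts: 3) do |try|
  raise TransientSocketError, 'end of file reached (EOFError)' if try == 1
  "connected on attempt #{try}"
end

puts result
```

A retry like this is only safe when the retried step is idempotent; the 409 Conflict seen after resuming the subtask shows that at least one step here is not.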

Comment 1 Antonin Pagac 2016-02-15 15:47:45 UTC
Created attachment 1127318 [details]
first error traceback from production.log

Comment 2 Antonin Pagac 2016-02-15 15:48:12 UTC
Created attachment 1127319 [details]
second error traceback from production.log

Comment 3 John Matthews 2016-04-11 17:58:27 UTC
I saw this on 4/11, using the TP3 ISO (a pre-release of what will be RC3).

The first OSP deployment succeeded.
I then undeployed and reused the same VMs.

Action:
Actions::Fusor::Deployment::OpenStack::OvercloudConfiguration
Input:
{"deployment_id"=>4, "locale"=>"en"}
Output:
{}
Exception:
Excon::Errors::SocketError: end of file reached (EOFError)
Backtrace:
/opt/rh/ruby193/root/usr/share/gems/gems/excon-0.38.0/lib/excon/socket.rb:99:in `readline'
/opt/rh/ruby193/root/usr/share/gems/gems/excon-0.38.0/lib/excon/response.rb:40:in `parse'
/opt/rh/ruby193/root/usr/share/gems/gems/excon-0.38.0/lib/excon/middlewares/response_parser.rb:6:in `response_call'
/opt/rh/ruby193/root/usr/share/gems/gems/docker-api-1.17.0/lib/excon/middlewares/hijack.rb:45:in `response_call'
/opt/rh/ruby193/root/usr/share/gems/gems/excon-0.38.0/lib/excon/connection.rb:402:in `response'
/opt/rh/ruby193/root/usr/share/gems/gems/excon-0.38.0/lib/excon/connection.rb:272:in `request'
/opt/rh/ruby193/root/usr/share/gems/gems/fog-core-1.24.0/lib/fog/core/connection.rb:64:in `request'
/opt/rh/ruby193/root/usr/share/gems/gems/fog-1.24.1/lib/fog/openstack/core.rb:217:in `get_supported_version'
/opt/rh/ruby193/root/usr/share/gems/gems/fog-1.24.1/lib/fog/openstack/network.rb:342:in `authenticate'
/opt/rh/ruby193/root/usr/share/gems/gems/fog-1.24.1/lib/fog/openstack/network.rb:256:in `initialize'
/opt/rh/ruby193/root/usr/share/gems/gems/fog-core-1.24.0/lib/fog/core/service.rb:115:in `new'
/opt/rh/ruby193/root/usr/share/gems/gems/fog-core-1.24.0/lib/fog/core/service.rb:115:in `new'
/opt/rh/ruby193/root/usr/share/gems/gems/fusor_server-0.0.1/app/lib/actions/fusor/deployment/open_stack/overcloud_configuration.rb:66:in `configure_networks'
/opt/rh/ruby193/root/usr/share/gems/gems/fusor_server-0.0.1/app/lib/actions/fusor/deployment/open_stack/overcloud_configuration.rb:40:in `run'