Bug 1358811

Summary: immediately after upgrade from CFME 4.0 TO cfme 4.1 UI requests to separate VMDB appliance are timing out-
Product: Red Hat CloudForms Management Engine Reporter: Thomas Hennessy <thenness>
Component: Automate Assignee: mkanoor
Status: CLOSED CURRENTRELEASE QA Contact: luke couzens <lcouzens>
Severity: high Docs Contact:
Priority: high    
Version: 5.6.0 CC: abellott, bascar, benglish, brant.evans, cpelland, jdeubel, jfrey, jhardy, jocarter, jrafanie, mfeifer, mkanoor, obarenbo, simaishi, tfitzger
Target Milestone: GA Keywords: TestOnly, ZStream
Target Release: 5.7.0   
Hardware: x86_64   
OS: Linux   
Whiteboard: upgrade:distributed
Fixed In Version: 5.7.0.0 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1359295 (view as bug list) Environment:
Last Closed: 2017-01-11 19:53:57 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1359295    

Comment 2 Thomas Hennessy 2016-07-21 15:09:27 UTC
I was on the phone with the consultant a few moments ago.  He has increased the pool counts for the UI worker and the web services worker and is still encountering the issue.

What he said today was that UI interactions would briefly complete but then would start timing out, suggesting to me that there are either UI threads that are looping or pool instances that are not being freed.

I requested a full netstat from the UI appliance which I am attaching to this case.

Consultant is on the last day of his engagement so some attention would be appreciated.

Comment 14 Joe Rafaniello 2016-07-22 02:43:57 UTC
Update:

We performed some diagnostics and found the following:

1) If we configure puma to use only 1 thread per process (see [a]) and restart
evmserverd, the service catalog order pulldown no longer hangs after making a selection.  Note that prior cfme versions also had a single thread handling requests per web server process.

Note: The customer has at least 4 UI worker processes enabled with apache load balancing, so concurrent requests will still be processed in spite of only 1 thread per process.


2) We recreated the issue with 2 puma threads and enabled
some diagnostic logs, uploaded in comment 13, that we'll review.


3) We were missing the ServerName directive in the apache ssl configuration, which caused some ugly warning log messages but ultimately was not the cause of the problem.  Using the certs shipped with cfme led to the same hang on the
service catalog ordering form.  The ssl configuration does not appear to be
at fault at all.


We will use 1 puma thread per process for the time being and continue to
research the issue.
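
As a rough sanity check on the workaround, the sketch below estimates how many requests can be in flight at once. It assumes each UI worker process serves one request per puma thread and uses the figure of 4 UI workers mentioned above (the customer has at least that many). The helper name max_concurrent_requests is hypothetical and not part of cfme.

```ruby
# Rough upper bound on requests being handled at the same moment,
# assuming each worker process serves one request per puma thread.
# (max_concurrent_requests is a hypothetical helper, not part of cfme.)
def max_concurrent_requests(worker_processes, threads_per_process)
  worker_processes * threads_per_process
end

workaround = max_concurrent_requests(4, 1) # 4 UI workers with threads 1, 1
default    = max_concurrent_requests(4, 5) # same 4 workers with threads 5, 5

puts "workaround: #{workaround} concurrent requests"
puts "default:    #{default} concurrent requests"
```

The workaround trades per-process concurrency for stability: apache still spreads requests across the worker processes, but each process handles only one at a time.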


[a] We can do this by changing config/puma.rb:
threads 1, 1 (from the default of 5, 5)
https://github.com/ManageIQ/manageiq/blob/939f6ebfb0daaf600a7ebad9f5816cfe84eb1835/config/puma.rb#L20

Comment 19 Joe Rafaniello 2016-07-25 19:34:25 UTC
This pull request [1] may resolve the issue reported here, where the custom dialog form hangs.  We'll need to verify whether that PR in fact prevents the deadlock causing the UI "hang".

[1] https://github.com/ManageIQ/manageiq/pull/10038