1358811 – Immediately after upgrade from CFME 4.0 to CFME 4.1, UI requests to separate VMDB appliance are timing out

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Red Hat CloudForms Management Engine
Classification:	Red Hat
Component:	Automate
Sub Component:
Version:	5.6.0
Hardware:	x86_64
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	GA
Target Release:	5.7.0
Assignee:	mkanoor
QA Contact:	luke couzens
Docs Contact:
URL:
Whiteboard:	upgrade:distributed
Depends On:
Blocks:	1359295

Reported:	2016-07-21 14:11 UTC by Thomas Hennessy
Modified:	2019-11-14 08:46 UTC
CC List:	15 users
Fixed In Version:	5.7.0.0
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	1359295
Environment:
Last Closed:	2017-01-11 19:53:57 UTC
Category:	---
Cloudforms Team:	---
Target Upstream Version:
Embargoed:

Comment 2 Thomas Hennessy 2016-07-21 15:09:27 UTC

I was on the phone with the consultant a few moments ago. He has increased the pool counts for the UI worker and webservices worker and is still encountering the issue.

What he said today was that UI interactions would briefly complete but then would start timing out, suggesting to me that either there are UI threads that are looping or there are pool instances that are not being freed.

I requested a full netstat from the UI appliance which I am attaching to this case.

Consultant is on the last day of his engagement so some attention would be appreciated.

Comment 14 Joe Rafaniello 2016-07-22 02:43:57 UTC

Update:

We performed some diagnostics and have found the following:

1) If we configure puma to use only 1 thread per process (see [a]) and restart evmserverd, the service catalog order pulldown no longer hangs after making a selection.  Note that prior CFME versions also had a single thread handling requests per web server process.

Note: The customer has at least 4 UI worker processes enabled with apache load balancing, therefore concurrent requests will still be processed in spite of only 1 thread per process.


2) We recreated the issue with 2 puma threads and enabled some diagnostic logging.  The resulting logs are uploaded in comment 13, and we'll review them.


3) We were missing the ServerName directive in the apache ssl configuration, which caused some ugly warning log messages but ultimately was not the cause of the problem.  Using the CFME-shipped certs led to the same hang on the service catalog ordering form.  The ssl configuration does not appear to be at fault at all.


We will use 1 puma thread per process for the time being and continue to
research the issue.
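
As a rough illustration of the capacity trade-off behind this workaround, the sketch below multiplies worker processes by threads per process. It assumes each puma thread serves one request at a time, and the method name max_concurrent_requests is hypothetical; the 4 workers and the 1 and 5 thread counts are the figures mentioned in this comment.

```ruby
# Upper bound on requests the UI workers can process at the same time,
# assuming one in-flight request per puma thread (a simplified model).
def max_concurrent_requests(worker_processes, threads_per_process)
  worker_processes * threads_per_process
end

workaround = max_concurrent_requests(4, 1) # 4 UI workers with "threads 1, 1"
default    = max_concurrent_requests(4, 5) # 4 UI workers with "threads 5, 5"

puts "workaround: #{workaround} concurrent requests"
puts "default:    #{default} concurrent requests"
```

Under this model the workaround lowers the ceiling on concurrent requests, but requests are still handled in parallel across the worker processes.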


[a] We can do this by changing the threads setting in config/puma.rb:
threads 1, 1 (from the default of 5, 5)
https://github.com/ManageIQ/manageiq/blob/939f6ebfb0daaf600a7ebad9f5816cfe84eb1835/config/puma.rb#L20
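
For reference, a minimal sketch of what the edited fragment of config/puma.rb might look like. Only the threads line is taken from this comment; the surrounding comments are illustrative, and the rest of the file is assumed to stay unchanged.

```ruby
# config/puma.rb (fragment)
# Pin puma to one thread per process: minimum 1, maximum 1.
# The default at the linked revision is "threads 5, 5".
threads 1, 1
```

As noted in item 1, evmserverd has to be restarted for the change to take effect.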

Comment 19 Joe Rafaniello 2016-07-25 19:34:25 UTC

This pull request [1] may resolve the issue reported here, where the custom dialog form hangs.  We'll need to verify whether that PR in fact prevents the deadlock causing the UI "hang".

[1] https://github.com/ManageIQ/manageiq/pull/10038
