I was on the phone with the consultant a few moments ago. He has increased the pool counts for the UI worker and webservices worker and is still encountering the issue. What he said today was that UI interactions would briefly complete but then would start timing out, which suggests to me that there are either UI threads that are looping or pool instances that are not being freed. I requested a full netstat from the UI appliance, which I am attaching to this case. The consultant is on the last day of his engagement, so some attention would be appreciated.
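A quick way to skim a full netstat capture for connections that are not being released is to tally TCP states. The following is only a minimal sketch: it assumes Linux `netstat -an` style output with the state in the sixth column, the helper name `tally_tcp_states` is hypothetical, and the sample rows are made up for illustration (they are not taken from the attached capture). A large CLOSE_WAIT count would mean the peer has closed the connection while the local process has not yet closed its end.

```ruby
# Tally TCP connection states from `netstat -an` style text.
# Assumption: columns are Proto, Recv-Q, Send-Q, Local, Foreign, State.
def tally_tcp_states(netstat_text)
  counts = Hash.new(0)
  netstat_text.each_line do |line|
    fields = line.split
    # Skip headers, blank lines, and non-TCP rows (udp, unix sockets).
    next unless fields.first&.start_with?("tcp")
    next if fields.length < 6
    counts[fields[5]] += 1
  end
  counts
end

# Made-up sample rows, for illustration only.
sample = <<~OUT
  Proto Recv-Q Send-Q Local Address           Foreign Address         State
  tcp        0      0 0.0.0.0:443             0.0.0.0:*               LISTEN
  tcp        0      0 10.0.0.5:443            10.0.0.9:51234          ESTABLISHED
  tcp        1      0 10.0.0.5:3000           10.0.0.5:40112          CLOSE_WAIT
  tcp        1      0 10.0.0.5:3000           10.0.0.5:40113          CLOSE_WAIT
OUT

tally_tcp_states(sample).each { |state, n| puts "#{state}: #{n}" }
```

To run it against a real capture, something like `tally_tcp_states(File.read("netstat.txt"))` would work, where the file name is a placeholder.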
Update: We performed some diagnostics and found the following:

1) If we configure puma to use only 1 thread per process (see [a]) and restart evmserverd, the service catalog order pulldown no longer hangs after making a selection. Note: in prior cfme versions, we also had a single thread handling requests per web server process. Note: the customer has at least 4 UI worker processes enabled with apache load balancing, so concurrent requests will still be processed even with only 1 thread per process.

2) We recreated the issue with 2 puma threads and enabled some diagnostic logs, uploaded in comment 13, which we'll review.

3) We were missing the ServerName directive in the apache ssl configuration, which caused some ugly warning log messages but ultimately was not the cause of the problem. Using the cfme shipped certs led to the same hang on the service catalog ordering form. The ssl configuration does not appear to be at fault at all.

We will use 1 puma thread per process for the time being and continue to research the issue.

[a] We can do this by changing config/puma.rb: threads 1, 1 (from the default of 5, 5)
https://github.com/ManageIQ/manageiq/blob/939f6ebfb0daaf600a7ebad9f5816cfe84eb1835/config/puma.rb#L20
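For reference, the workaround in [a] would look roughly like the fragment below. It is a sketch of the one changed line only, not the full shipped config/puma.rb, and the comments are mine. Puma's `threads` directive takes the minimum and maximum number of threads per process, so `1, 1` pins each process to a single request thread.

```ruby
# config/puma.rb (fragment; the rest of the file is unchanged)

# Workaround: one request thread per puma process.
# Previously: threads 5, 5
threads 1, 1
```

The change only takes effect after evmserverd is restarted, as noted in item 1.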
This pull request [1] may resolve the issue reported here, where the custom dialog form hangs. We'll need to verify whether that PR in fact prevents the deadlock causing the UI "hang". [1] https://github.com/ManageIQ/manageiq/pull/10038