Bug 1422988 - Web UI inaccessible after changing number of UI Workers
Summary: Web UI inaccessible after changing number of UI Workers
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Appliance
Version: 5.7.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: GA
Target Release: 5.8.0
Assignee: Joe Rafaniello
QA Contact: Alex Newman
URL:
Whiteboard: worker
Duplicates: 1441372
Depends On:
Blocks: 1432463
 
Reported: 2017-02-16 18:32 UTC by Brant Evans
Modified: 2020-07-16 09:13 UTC
CC List: 7 users

Fixed In Version: 5.8.0.6
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1432463
Environment:
Last Closed: 2017-06-12 17:12:35 UTC
Category: ---
Cloudforms Team: ---
Target Upstream Version:
Embargoed:



Description Brant Evans 2017-02-16 18:32:19 UTC
Description of problem:
After changing the number of UI Workers on an appliance, the Web UI on that appliance becomes inaccessible

Version-Release number of selected component (if applicable):
5.7.0.17

How reproducible:
Always

Steps to Reproduce:
1. Navigate to Configuration (under username)
2. Select UI appliance
3. Click Workers tab
4. Under UI Worker change Count from 1 to 2
5. Save


Actual results:
Web UI does not load


Expected results:
Web UI loads


Additional info:

Comment 3 CFME Bot 2017-03-10 22:16:23 UTC
New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/da9523ee2168da89511c9260528c7bb243bfb777

commit da9523ee2168da89511c9260528c7bb243bfb777
Author:     Joe Rafaniello <jrafanie>
AuthorDate: Fri Feb 17 12:23:27 2017 -0500
Commit:     Joe Rafaniello <jrafanie>
CommitDate: Thu Feb 23 14:46:19 2017 -0500

    Configure apache balancer with up to 10 members at startup
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1422988
    
    Start UI, Web Service, Web Socket, etc. puma workers bound to ports from
    STARTING_PORT up to the maximum worker count port (3000 to 3009 if the
    max worker count is 10). Configure apache at boot with these ports as
    balancer members.
    
    Fixes a failure after we start new puma workers and try to gracefully
    restart apache. The next request will fail since apache is waiting for
    active connections to close before restarting. The subsequent request
    will then be OK since the failure causes the websocket connections to
    close, allowing apache to restart fully.
    
    Previously, we would add and remove members in the balancer configuration
    when starting or stopping puma workers.  We would then gracefully restart
    apache since the new workers wouldn't be used until apache reloaded the
    configuration.  Note, we didn't do anything after removing members from
    the balancer configuration because apache's mod_proxy_balancer gracefully
    handles dead members by marking them as in Error and not retrying them for
    60 seconds by default. Therefore, it's not necessary to restart apache to
    "remove" members.
    
    The problem is when we would try to add balancer members to the
    configuration and gracefully restart apache.  It turns out, our web
    socket workers maintain active connections to apache so apache wouldn't
    restart until those connections were closed.
    
    Now, we build on the idea mentioned above of mod_proxy_balancer keeping
    track of which members are alive or in error by configuring up to 10
    (maximum_workers_count) members at server startup. We can then start
    and stop workers and let apache route traffic to the members that are
    alive. We no longer have to update the apache configuration and
    restart it when a worker starts or stops.
    
    Note, apache has a graceful reload option that could allow us to
    maintain an accurate list of balancer members as workers start and stop
    and tell apache workers to gracefully reload the configuration. This
    option was buggy until fixed in [1]. It also required us to keep
    touching the balancer configuration, which we probably shouldn't have
    been doing in the first place.
    
    [1] https://bz.apache.org/bugzilla/show_bug.cgi?id=44736

 app/models/miq_server/environment_management.rb  |  2 +-
 app/models/mixins/miq_web_server_worker_mixin.rb | 53 +++++++-----------------
 spec/models/miq_ui_worker_spec.rb                | 21 ++++++++++
 3 files changed, 36 insertions(+), 40 deletions(-)
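
For illustration, a minimal Ruby sketch of the port/member scheme described in this commit; the constant names and the BalancerMember output format below are assumptions based on the commit message, not the actual ManageIQ code:

    # Illustrative sketch only: derive the full set of puma ports from a
    # starting port and the maximum worker count, then render one apache
    # BalancerMember line per port so all members can be declared once at
    # server startup.
    STARTING_PORT    = 3000
    MAX_WORKER_COUNT = 10

    # Ports 3000..3009 when the maximum worker count is 10.
    def worker_ports(start = STARTING_PORT, count = MAX_WORKER_COUNT)
      (start...(start + count)).to_a
    end

    # One BalancerMember directive per possible worker port; members whose
    # puma worker isn't running are marked "in Error" by mod_proxy_balancer
    # and retried later.
    def balancer_member_lines(ports = worker_ports)
      ports.map { |port| "BalancerMember http://0.0.0.0:#{port}" }
    end

    puts balancer_member_lines
    # BalancerMember http://0.0.0.0:3000
    # ...
    # BalancerMember http://0.0.0.0:3009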

Comment 5 CFME Bot 2017-03-14 13:26:27 UTC
New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/a1ad99f7c1354e4d7f5fbe1aa78be5972f314279

commit a1ad99f7c1354e4d7f5fbe1aa78be5972f314279
Author:     Joe Rafaniello <jrafanie>
AuthorDate: Mon Mar 13 17:16:23 2017 -0400
Commit:     Joe Rafaniello <jrafanie>
CommitDate: Mon Mar 13 17:24:12 2017 -0400

    Add balancer members after configs have been written
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1422988
    
    Fixes a regression in #14007 that affected the initial start of the
    appliance and caused a 503 error when trying to access the UI.
    
    Because adding balancer members validates the configuration files, and
    those files try to load the redirect files among others, we need to add
    the balancer members after all configuration files have been written by
    install_apache_proxy_config.

 app/models/miq_server/environment_management.rb  | 9 +++++++++
 app/models/mixins/miq_web_server_worker_mixin.rb | 1 -
 2 files changed, 9 insertions(+), 1 deletion(-)
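
A minimal Ruby sketch of the ordering this commit enforces; only install_apache_proxy_config is named in the commit message, and the method bodies below are illustrative stand-ins, not the real ManageIQ calls:

    # Illustrative sketch only: the proxy and redirect config files must be
    # written to disk before balancer members are added, because adding
    # members triggers an apache configuration validation that loads them.
    def install_apache_proxy_config
      puts "writing proxy and redirect apache config files"
    end

    def add_balancer_members
      puts "validating apache config and registering members on ports 3000-3009"
    end

    install_apache_proxy_config  # must run first
    add_balancer_members         # safe only after the config files exist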

Comment 7 Joe Rafaniello 2017-04-13 14:33:14 UTC
*** Bug 1441372 has been marked as a duplicate of this bug. ***

