Bug 1493046 - Slow start of rails environment when starting passenger or foreman-tasks
Summary: Slow start of rails environment when starting passenger or foreman-tasks
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Performance
Version: 6.3.0
Hardware: Unspecified
OS: Unspecified
unspecified
high
Target Milestone: Unspecified
Assignee: Ivan Necas
QA Contact: Sachin Ghai
URL:
Whiteboard:
: 1495205 1499402 (view as bug list)
Depends On:
Blocks: Sat_6.3_Upgrade_tracker
TreeView+ depends on / blocked
 
Reported: 2017-09-19 08:58 UTC by Ivan Necas
Modified: 2019-04-01 20:26 UTC (History)
7 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-02-21 16:46:09 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
path to workaround the slow start of the rails environment in foreman-tasks (516 bytes, patch)
2017-10-10 13:17 UTC, Ivan Necas
no flags Details | Diff
Patch for properly fixing the slow start of the rails environment (513 bytes, patch)
2017-10-12 06:04 UTC, Ivan Necas
no flags Details | Diff


Links
System ID Private Priority Status Summary Last Updated
Foreman Issue Tracker 21295 0 Normal Closed Slow start of rails environment when starting passenger or foreman-tasks 2020-12-22 11:15:10 UTC

Description Ivan Necas 2017-09-19 08:58:07 UTC
Description of problem:
In satellite 6.3, the time for everything to be ready to respond
to http requests or handle tasks is significantly longer than in 6.2.
This leads to potential initial timeouts when trying to register
a host after the installation finishes or after restarting of services.

Steps to Reproduce (a):
1. systemctl restart httpd
2. wait until the /users/login is available


Actual results:
It takes about 2 minutes to be ready

Expected results:
On similar setup on satellite 6.2, it takes under a minute


Steps to Reproduce (b):
1. systemctl restart foreman-tasks
2. watch /var/log/foreman/dynflow_executor.output for containing 'Everything ready for world'


Actual results:
It takes about 3.5 minutes to be ready

Expected results:
On similar setup on satellite 6.2, it takes under a minute


Additional info:

I've narrowed it down to /opt/rh/rh-ror42/root/usr/share/gems/gems/railties-4.2.6/lib/rails/application/routes_reloader.rb , that runs twice during the boot time and seems to cause the significant slow down at startup. It seems that
something changed between rails 4.1 and 4.2 to cause the delays.

The fix should make sure we don't start reloading the routes twice at the initialization, as that's what causes the long time to start.

Setting the severity on high, as it has impact not only on response times after the installation finishes, but also when the passenger scales up the processes.

Comment 4 Mike McCune 2017-10-05 14:11:54 UTC
We are seeing this bug block upgrades from 6.2 -> 6.3, as outlined here:

https://bugzilla.redhat.com/show_bug.cgi?id=1495205

Comment 5 Sachin Ghai 2017-10-07 07:22:44 UTC
Another issue we saw w/ upgrade automation and I guess its because of this issue:

https://bugzilla.redhat.com/show_bug.cgi?id=1499402

Comment 6 Ivan Necas 2017-10-09 22:15:17 UTC
I was able to narrow down the slow startup issue with the following:

1. the slow startup of the environment is caused by howrails prepare the routes helpers: it iterates though all the defined routes and calls routes define_method on a module, that is being included to all the controllers (and other parts that utilize the helpers) https://github.com/rails/rails/blob/4-2-stable/actionpack/lib/action_dispatch/routing/route_set.rb#L342

2. while usually, defining a method using define_method is relatively cheap operation, it gets dramatically more expensive when there are a lot of classes the module is included into.

3. there was also refactoring between rails 4.1 and 4.2, that removed some caching of used modules, which also contributed into the amount of controllers to be used https://github.com/rails/rails/commit/2bbcca004cc232cef868cd0e301f274ce5638df0#diff-ccdef70ce0e95afc8816f47f89b56c60L400, there has been some fixes in Rails 5 around that in https://github.com/rails/rails/commit/226cd8a094c81edb72143173e4d0c88a2eb01d5f and https://github.com/rails/rails/commit/3b637802b26fd9b4b80e87d9df24e25580d1a85b, but even with this change, the load times were not significantly faster. There might be some other changes, that I was not able to track down

4. there were also changes in foreman and other plugins, that use more helper modules in more classes than before, all of them probably contributing to the slow load times.

One workaround in foreman-tasks was turning off the eager-loading of the classes, so that the majority of the controllers would not get loaded, which reduced the amount of classes that included the module. I was able to get to original numbers, when it comes to startup times for foreman-tasks. However, since foreman-tasks is running multiple threads, this solution would include some risks.

The solution I would like to explore is, instead of turning eager loading completely, disabling eager loading for helpers and controller, which are the classes that should never be used from the foreman-tasks process. This should also give us some memory benefits. However, the slower time for the rails application would still be there: I can't think of any workaround for this issue at the moment. It however should not affect the upgrades issues that we are facing currently.

Additional improvement could be done on the check for the tasks availability to be a bit more resilient on the tasks reporting un-availability, as in this case, the tasks were starting, but since they were busy by loading the additional code, the respose times were slower than katello expected.

Comment 7 Ivan Necas 2017-10-10 13:05:19 UTC
*** Bug 1495205 has been marked as a duplicate of this bug. ***

Comment 8 Ivan Necas 2017-10-10 13:05:39 UTC
*** Bug 1499402 has been marked as a duplicate of this bug. ***

Comment 9 Ivan Necas 2017-10-10 13:17:31 UTC
Created attachment 1336764 [details]
path to workaround the slow start of the rails environment in foreman-tasks

To apply:

cd /opt/theforeman/tfm/root/usr/share/gems/gems/foreman-tasks-*
patch -p1 < ~/eager_load.patch


For the upgrades: this should happen right after the yum update is called and the 6.3 packages are freshly installed on the system

Comment 10 Adam Ruzicka 2017-10-11 09:26:38 UTC
Created redmine issue http://projects.theforeman.org/issues/21295 from this bug

Comment 11 Satellite Program 2017-10-11 10:18:46 UTC
Upstream bug assigned to inecas

Comment 12 Ivan Necas 2017-10-12 06:02:13 UTC
I've proposed a final patch upstream, that should address both slow start of the passenger, as well as foreman-tasks https://github.com/theforeman/foreman/pull/4908.

Steps to verify:

   service httpd restart; time curl -k https://localhost

On my setup, without the patch, it gives me 1m 15s, with
the patch, I get 15s only, which is similar or even a bit faster than on sat 62

Comment 13 Ivan Necas 2017-10-12 06:04:55 UTC
Created attachment 1337521 [details]
Patch for properly fixing the slow start of the rails environment

This patch should properly fix the slow start of the rails environment and was proposed upstream. To apply:

cd /usr/share/foreman
patch -p1 < ~/foreman_routes.patch


As with the previous patch, this should be applied after yum update before running the installer. This previous patch obsoletes the previous one.

Comment 14 Sachin Ghai 2017-10-12 12:37:53 UTC
Thanks Ivan. I tested the Patch w/ regular upgrade automation and didn't see any issue while satellite upgrade. Now trying upgrade w/ some other DB and will update here.

Comment 15 Evgeni Golov 2017-10-16 11:03:27 UTC
with the patch:

# systemctl restart httpd; time curl -k https://localhost
<html><body>You are being <a href="https://localhost/users/login">redirected</a>.</body></html>
real	0m23.379s
user	0m0.034s
sys	0m0.045s


without the patch:

# systemctl restart httpd; time curl -k https://localhost
<html><body>You are being <a href="https://localhost/users/login">redirected</a>.</body></html>
real	2m38.011s
user	0m0.050s
sys	0m0.036s


this is some massive improvement!

Comment 16 Satellite Program 2017-10-18 12:17:41 UTC
Moving this bug to POST for triage into Satellite 6 since the upstream issue http://projects.theforeman.org/issues/21295 has been resolved.

Comment 17 Sachin Ghai 2017-10-23 12:34:31 UTC
Verified w/ sat6.3 snap21.

I used the same DB with which I faced the reported issue and this time w/ snap21, I didn't see any issue w/ upgrade. Just to ensure everything is working, I'm using two DB's for testing. With both DB's upgrade was completed successfully.

Comment 18 Bryan Kearney 2018-02-21 16:42:10 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:0336

Comment 19 Bryan Kearney 2018-02-21 16:46:09 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:0336


Note You need to log in before you can comment on or make changes to this bug.