Bug 2122461 - with high UI & API load, getting about 0.5% of errors: No such file or directory @ rb_sysopen - /usr/share/foreman/tmp/cache/C3B/630/.permissions_check.224680.136744.49172
Keywords:
Status: CLOSED DUPLICATE of bug 2063717
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: API
Version: 6.12.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: Unspecified
Assignee: satellite6-bugs
QA Contact: Gaurav Talreja
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-08-30 05:57 UTC by Jan Hutař
Modified: 2023-05-31 13:35 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-05-31 13:35:32 UTC
Target Upstream Version:
Embargoed:



Description Jan Hutař 2022-08-30 05:57:36 UTC
Description of problem:
I'm testing a new script that simulates high UI and API load, and about 0.5% of requests fail with this error:

No such file or directory @ rb_sysopen - /usr/share/foreman/tmp/cache/C3B/630/.permissions_check.224680.136744.49172


Version-Release number of selected component (if applicable):
satellite-6.12.0-2.el8sat.noarch


How reproducible:
Always


Steps to Reproduce:
1. Install Satellite, sync content, register hosts
2. Run the script (you need to `pip install locust`)


Actual results:
Errors like:
=====
    <div class="alert alert-danger "><span class="pficon pficon-error-circle-o "></span> <strong>Oops, we&#39;re sorry but something went wrong </strong><span class="text">No such file or directory @ rb_sysopen - /usr/share/foreman/tmp/cache/C3B/630/.permissions_check.224680.136744.49172</span><div class="alert-actions"><hr><a class="btn btn-default" href="/">Back</a></div></div>

<p id="message">
                                          If you feel this is an error with Satellite itself, please open a new issue with
                                          <a rel="external" href="https://access.redhat.com/support/cases/#/case/new">Satellite ticketing system</a>,
                                            Please include in your report the full error log that can be acquired by running: 
                                            <strong> foreman-rake errors:fetch_log request_id=5ffb6961</strong>
                                             and it is highly recommended to also attach the sosreport output.
                                        </p>
=====


Expected results:
Requests should not be failing


Additional info:
# python aaa.py --satellite-password changeme --locust-host https://localhost --locust-num-clients 10 --test-duration 300
[...]
request                                     count    fail ratio    med resp time    total RPS
----------------------------------------  -------  ------------  ---------------  -----------
GET users_login_get                            10         0.000         1000.000        0.033
GET locations                                 262         0.004          340.000        0.871
GET smart_proxies                             277         0.004          340.000        0.921
GET hostgroups                                275         0.007          340.000        0.915
GET organizations                             272         0.000          340.000        0.905
GET overview                                  304         0.007          420.000        1.011
GET foreman_tasks_tasks                       293         0.003          290.000        0.975
GET audits_page_per_page_search               258         0.000          790.000        0.858
GET templates_provisioning_templates          280         0.007          510.000        0.931
GET job_invocations                           260         0.008          410.000        0.865
GET hosts                                     275         0.007          830.000        0.915
GET domains                                   296         0.010          310.000        0.985
GET katello_api_v2_content_views_nondef…      291         0.000          320.000        0.968
GET audits                                    255         0.016          290.000        0.848
GET foreman_tasks_api_tasks_include_per…      283         0.000          650.000        0.941
GET katello_api_v2_products_organizatio…      259         0.004         1400.000        0.861
GET katello_api_v2_packages_organizatio…      274         0.000         1800.000        0.911
SUMMARY                                      4424         0.005          581.361       14.715
Errors encountered:
name                                     method    error                                       occurrences
---------------------------------------  --------  ----------------------------------------  -------------
katello_api_v2_products_organization_id  GET       CatchResponseError('Got wrong response')              1
overview                                 GET       CatchResponseError('Got wrong response')              2
hosts                                    GET       CatchResponseError('Got wrong response')              2
hostgroups                               GET       CatchResponseError('Got wrong response')              2
smart_proxies                            GET       CatchResponseError('Got wrong response')              1
job_invocations                          GET       CatchResponseError('Got wrong response')              2
audits                                   GET       CatchResponseError('Got wrong response')              4
locations                                GET       CatchResponseError('Got wrong response')              1
foreman_tasks_tasks                      GET       CatchResponseError('Got wrong response')              1
domains                                  GET       CatchResponseError('Got wrong response')              3
templates_provisioning_templates         GET       CatchResponseError('Got wrong response')              2

The error "Got wrong response" means that a very basic sanity check on the response content failed (usually just a check for the page title or some other unique-enough string).
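
The kind of sanity check described above could look roughly like the sketch below. This is not the actual test script (it is not attached to this bug); the function name `check_response`, the `ERROR_BANNER` constant, and the sample page bodies are hypothetical and chosen only for illustration.

```python
import html

# Hypothetical marker: the error banner text seen in the failing responses above.
ERROR_BANNER = "Oops, we're sorry but something went wrong"


def check_response(status_code, body, expected_text):
    """Return None when the page looks sane, else a failure message."""
    # The banner arrives as "we&#39;re" in the raw HTML, so unescape first.
    text = html.unescape(body)
    if status_code != 200 or ERROR_BANNER in text or expected_text not in text:
        return "Got wrong response"
    return None


# Made-up sample bodies, only to exercise the check.
good = "<html><title>Hosts</title></html>"
bad = "<strong>Oops, we&#39;re sorry but something went wrong </strong>"

print(check_response(200, good, "<title>Hosts</title>"))
print(check_response(200, bad, "<title>Hosts</title>"))
```

In a Locust test, a check like this would typically run inside a `with self.client.get(..., catch_response=True) as response:` block and call `response.failure(message)` when the check fails, which is what shows up as `CatchResponseError` in the report.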

In this run, 21 requests out of 4424 failed. I have quickly checked the output and it looks like they all failed with the same error.
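
As a quick cross-check of these numbers, the occurrence counts from the "Errors encountered" table can be summed and divided by the request count from the SUMMARY row. The values below are copied from the tables above; the variable names are only illustrative.

```python
# Occurrence counts copied from the "Errors encountered" table above.
occurrences = {
    "katello_api_v2_products_organization_id": 1,
    "overview": 2,
    "hosts": 2,
    "hostgroups": 2,
    "smart_proxies": 1,
    "job_invocations": 2,
    "audits": 4,
    "locations": 1,
    "foreman_tasks_tasks": 1,
    "domains": 3,
    "templates_provisioning_templates": 2,
}
total_requests = 4424  # "count" in the SUMMARY row

failed = sum(occurrences.values())
fail_ratio = failed / total_requests

print(failed)               # 21
print(f"{fail_ratio:.3f}")  # 0.005, matching the SUMMARY fail ratio
print(f"{fail_ratio:.2%}")  # 0.47%
```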

Comment 4 Ewoud Kohl van Wijngaarden 2022-09-20 13:41:12 UTC
It is my theory that we're hitting the limits of the file based cache that we use. Quoting https://guides.rubyonrails.org/caching_with_rails.html#activesupport-cache-filestore

> With this cache store, multiple server processes on the same host can share a cache. This cache store is appropriate for low to medium traffic sites that are served off one or two hosts. Server processes running on different hosts could share a cache by using a shared file system, but that setup is not recommended.

> As the cache will grow until the disk is full, it is recommended to periodically clear out old entries.

Rails also has support for Redis caching (https://guides.rubyonrails.org/caching_with_rails.html#activesupport-cache-rediscachestore) and so does our installer (https://github.com/theforeman/puppet-foreman#rails-cache-support). Untested, but I think this should work:

    --foreman-rails-cache-store type:redis

There are considerations we (as the platform team) should make. For example, our current Redis is tuned for persistence (because Dynflow and Pulp need to survive a Redis restart), but a cache doesn't need persistence. Tuning applies to a whole instance, so we may want to run two Redis instances. On the other hand, the amount we cache is rather small, so perhaps it's not really an issue.

The Rails guide also mentions hiredis as a faster connection library, which we could package as well.

Jan: is this something you could test? I'd be happy to work with you offline to see if we can make this happen.

Comment 5 Jan Hutař 2022-09-26 06:28:25 UTC
Sure, happy to test since I have an appropriate setup available. Pinging you on GChat.

Comment 6 Ewoud Kohl van Wijngaarden 2022-09-26 15:05:55 UTC
We found out that in Satellite the foreman-redis package is not in the repositories, so as a user you're unable to use it today.

Comment 13 Eric Helms 2023-05-31 13:35:32 UTC

*** This bug has been marked as a duplicate of bug 2063717 ***

