Note: This bug is displayed in read-only format because
the product is no longer active in Red Hat Bugzilla.
Red Hat Satellite engineering is moving the tracking of its product development work on Satellite to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "Satellite project" in Red Hat Jira and file new tickets there.

Individual Bugzilla bugs will be migrated starting at the end of May. If you cannot log in to RH Jira, please consult article #7032570. Failing that, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry; the email creates a ServiceNow ticket with Red Hat.

Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", will have a little "two-footprint" icon next to it, and will direct you to the "Satellite project" in Red Hat Jira (issue links are of the form "https://issues.redhat.com/browse/SAT-XXXX", where "X" is a digit). The same link will be available in a blue banner at the top of the page informing you that the bug has been migrated.
Description by Brad Buckingham, 2022-06-03 15:58:11 UTC
+++ This bug was initially created as a clone of Bug #2090271 +++
Description of problem:
Manifest refresh randomly fails on a Satellite with multiple dynflow workers with error:
Error: No such file or directory @ rb_sysopen - /tmp/0.7851943882678857.zip
The reason is *tricky*:
- ManifestRefresh task determines filename for the new manifest file as /tmp/#{rand}.zip
- UpstreamExport dynflow step is asked to export the new manifest to that file
- subsequent Import dynflow step is asked to read the file and process the update further
The dynflow steps can be processed by different dynflow workers, which are run as different systemd services. And sadly for us, the services use their own private temp directory like:
/tmp/systemd-private-4f8b157ce7c040f4b27e7ecbba68aa22-dynflow-sidekiq/tmp/
So, when the UpstreamExport step is executed by one dynflow worker, it puts the zip file into its own private temp dir. And if we are unlucky, the Import step is picked up by another worker, which then misses the file in its own private temp /o\.
Which means that with 3 dynflow workers there is just a 1/3 probability that the manifest refresh succeeds.
We need to use a static/shared tmp file instead.
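As a minimal sketch of the shared-tmp-file idea (all names here are illustrative, not the actual Katello fix): generate the manifest path under a directory every dynflow worker can see, e.g. Rails.root/tmp, instead of /tmp, which systemd's PrivateTmp remaps to a per-service private directory.

```ruby
require 'securerandom'
require 'fileutils'

# Stand-in for Rails.root.join('tmp') — a directory visible to ALL
# dynflow workers, unlike the per-service private /tmp.
SHARED_TMP = File.join(Dir.pwd, 'tmp')

# Hypothetical helper: build a unique manifest path under the shared dir,
# replacing the original /tmp/#{rand}.zip pattern.
def shared_manifest_path
  FileUtils.mkdir_p(SHARED_TMP)
  File.join(SHARED_TMP, "manifest-#{SecureRandom.uuid}.zip")
end

# UpstreamExport step (worker A) writes the file ...
path = shared_manifest_path
File.write(path, 'zip bytes')
# ... and the Import step (possibly worker B) can now actually find it,
# because the path does not depend on which service's /tmp is mounted.
data = File.read(path)
```

SecureRandom.uuid also avoids the collision-prone `rand` filename from the original code, though the namespace issue, not collisions, is the bug here.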
Version-Release number of selected component (if applicable):
Sat 6.10.5
How reproducible:
Fails about 2/3 of the time when running 3 dynflow workers
Steps to Reproduce:
1. Set up Satellite with 3 dynflow workers, e.g. per https://access.redhat.com/solutions/5695311
2. Import a manifest
3. Repeatedly refresh it:
hammer subscription refresh-manifest --organization-id=1
Actual results:
Step 3 randomly fails with error:
Error: No such file or directory @ rb_sysopen - /tmp/0.7851943882678857.zip
In such a case, the zip file can be spotted under the private temp dir of a worker's service, e.g.:
/tmp/systemd-private-4f8b157ce7c040f4b27e7ecbba68aa22-dynflow-sidekiq/tmp/0.7851943882678857.zip
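To confirm the diagnosis on an affected box, the stranded zip can be located across the workers' private temp directories; a one-liner like this (pattern based on the paths quoted above, adjust to taste) may help:

```shell
# Look for manifest zips stranded in any dynflow worker's private /tmp
find /tmp -path '*systemd-private-*dynflow*' -name '*.zip' 2>/dev/null
```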
Expected results:
manifest refresh to always succeed
Additional info:
--- Additional comment on 2022-05-25T14:20:07Z
We could either use a different temporary directory (~foreman/tmp maybe?) or make all the workers run in the same mount namespace using JoinsNamespaceOf [1] in the service definition. Depending on the choice, the fix will need to go either to katello or to foreman; either way I'm not sure about the right component.
[1] - https://www.freedesktop.org/software/systemd/man/systemd.unit.html#JoinsNamespaceOf=
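As a sketch of the JoinsNamespaceOf= route (the unit and instance names below are assumptions; the actual service names on a given Satellite may differ), a drop-in for each additional worker could look like:

```ini
# Hypothetical drop-in, e.g.
# /etc/systemd/system/dynflow-sidekiq@worker-1.service.d/join-tmp.conf
# Make this worker join the mount namespace (and therefore the
# private /tmp) of the first worker's service.
[Unit]
JoinsNamespaceOf=dynflow-sidekiq@worker.service
```

Per the systemd documentation, JoinsNamespaceOf= only has an effect when the units involved actually enable namespacing options such as PrivateTmp=, and the drop-in would need a `systemctl daemon-reload` plus a restart of the services to take effect.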
--- Additional comment from on 2022-05-25T14:27:32Z
Created redmine issue https://projects.theforeman.org/issues/34957 from this bug
--- Additional comment on 2022-05-25T14:33:19Z
I guess the "correct" solution depends on which part of this we consider a bug ;-)
If the general answer is "dynflow workers should be able to exchange data via the filesystem", then they need to either be in the same namespace (JoinsNamespaceOf above) or explicitly have a way to say "store this data for sharing" (in Rails.root/tmp, or somewhere else).
If the general answer is "dynflow workers should be as isolated as possible, but this specific katello workflow needs sharing", then this workflow should write to Rails.root/tmp or similar.
--- Additional comment on 2022-05-26T16:04:46Z
Upstream bug assigned to aruzicka
--- Additional comment on 2022-05-27T16:04:44Z
Moving this bug to POST for triage into Satellite since the upstream issue https://projects.theforeman.org/issues/34957 has been resolved.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (Satellite 6.10.7 Async Bug Fix Update), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2022:5516