Description of problem:

Puppet repo sync has the following steps:
1) Unassociate all puppet modules from the repo
2) Sync the repo
3) Remove all orphans

Say we are syncing 'Repo 1' and 'Repo 2' at the same time, and 'Module A' is associated only with 'Repo 1'. Unassociating 'Module A' from 'Repo 1' makes it an orphan, while the 'Repo 1' sync task still assumes 'Module A' is in Pulp and does not try to re-download it. At the same time, the 'Repo 2' sync task reaches the "Remove all orphans" step and removes 'Module A' from Pulp. This causes the 'Repo 1' sync to fail with the following error:

IOError: [Errno 2] No such file or directory: u'/var/lib/pulp/content/units/puppet_module/1f/37902b7b8dd30cd2f0a2a99914f703c1b1459bc905504b8c23225cee4b728b/mweetman-hosts-0.1.0.tar.gz'

How reproducible:
Hard to reproduce, because the issue is only triggered by very specific timing.
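The interleaving described above can be sketched as a toy model. This is not Pulp code; the names, data structures, and the forced ordering via events are all illustrative assumptions, used only to make the race deterministic for demonstration:

```python
import threading

# Toy model (not Pulp code): a shared content store, per-repo unit
# associations, and a global orphan cleanup.
content_store = {"module_a": "/var/lib/pulp/.../module_a.tar.gz"}
repo_units = {"repo1": {"module_a"}, "repo2": set()}
lock = threading.Lock()
errors = []

unassociated = threading.Event()
orphans_removed = threading.Event()

def remove_orphans():
    # Step 3 of the workflow: delete any unit no repo references.
    with lock:
        referenced = set().union(*repo_units.values())
        for unit in list(content_store):
            if unit not in referenced:
                del content_store[unit]

def sync_repo1():
    with lock:
        repo_units["repo1"].clear()   # step 1: unassociate everything
    unassociated.set()
    orphans_removed.wait()            # repo2's cleanup runs in between
    with lock:
        # Step 2: the sync believes module_a is still in Pulp and skips
        # the download, but the file is gone -> IOError in the real system.
        if "module_a" not in content_store:
            errors.append("IOError: No such file or directory: module_a")
        repo_units["repo1"].add("module_a")

def sync_repo2():
    unassociated.wait()               # force the problematic interleaving
    remove_orphans()                  # step 3 of repo2's sync
    orphans_removed.set()

t1 = threading.Thread(target=sync_repo1)
t2 = threading.Thread(target=sync_repo2)
t1.start(); t2.start(); t1.join(); t2.join()
print(errors)  # -> ['IOError: No such file or directory: module_a']
```

In production the two syncs interleave by chance, which is why the failure is rare and lands on a random module each time.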
Do you have the full traceback available?
Thank you for that. Were there any other exceptions in the system log from around that time? I'm hoping to find a Python traceback for an exception besides the "PulpCodedException".
A potential temporary workaround is to get the missing files back in place. One way is described in comment 7 -- copy the files from the Satellite: "Restoring the missing file by copying it from the Satellite temporarily resolves the problem, but the problem comes back again and again, on a random puppet module file each time."

The other way is to remove the repository, clean orphans, recreate the repository, and sync, in that order -- the orphan cleanup must happen after the repository removal. This helps if the missing units were only in this repo; if they are in multiple repositories, all of them need to be recreated in the same way.

# remove repo
pulp-admin --username admin --password <password> puppet repo remove --repo-id <repo_id>

# check if there are puppet_module orphans
pulp-admin --username admin --password <password> orphan list

# remove the orphaned content
pulp-admin --username admin --password <password> orphan remove --type puppet_module

# create repository
pulp-admin --username admin --password <password> puppet repo create --repo-id <repo_id> --feed <feed from which you'd like to sync>

# sync repo
pulp-admin --username admin --password <password> puppet repo sync run --repo-id <repo_id>

If the issue is indeed the race condition described in this BZ, it is better to sync one repository at a time.

If there were no orphans, or you know that the same affected puppet module is present in multiple repositories, there is a way to find out which ones, but only directly in the db, via the mongo shell. Let me know if this approach works for you and I will provide some commands.
The Pulp upstream bug status is at ASSIGNED. Updating the external tracker on this bug.
The Pulp upstream bug priority is at Normal. Updating the external tracker on this bug.
If the same module is missing again then it is likely that it was in multiple repositories. In this case the easiest way is to copy files from the main Satellite, like mentioned in comment 7.
The Pulp upstream bug priority is at High. Updating the external tracker on this bug.
The Pulp upstream bug status is at POST. Updating the external tracker on this bug.
The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.
All upstream Pulp bugs are at MODIFIED+. Moving this bug to POST.
The Pulp upstream issue fixed the remove_missing option for the puppet importer. This fix allows Katello to drop the following steps:
> Puppet repo sync has the following steps:
> 1) Unassociate all puppet modules from the repo
> 2) Sync the Repo
> 3) Remove all Orphans
and instead just sync with the remove_missing option enabled. This eliminates the race condition because there should be no remove_orphans task in that sync workflow.
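Continuing the earlier toy model (again, not Pulp code; all names are illustrative), the remove_missing workflow makes each sync a purely per-repo operation, so two concurrent syncs cannot delete each other's content:

```python
import threading

# Toy model (not Pulp code) of syncing with remove_missing enabled:
# each sync only touches its own repo's associations; there is no
# global "remove all orphans" step inside the sync workflow.
content_store = {"module_a": "path_a", "module_b": "path_b"}
repo_units = {"repo1": {"module_a"}, "repo2": {"module_b"}}
feeds = {"repo1": {"module_a"}, "repo2": set()}  # module_b dropped upstream
lock = threading.Lock()
errors = []

def sync_with_remove_missing(repo):
    with lock:
        # remove_missing: unassociate only the units absent from this
        # repo's feed, then confirm the remaining units are on disk.
        repo_units[repo] = set(feeds[repo])
        for unit in feeds[repo]:
            if unit not in content_store:
                errors.append("missing file for " + unit)

threads = [threading.Thread(target=sync_with_remove_missing, args=(r,))
           for r in ("repo1", "repo2")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# module_b is now an orphan, but deleting it is deferred to a separate
# cleanup task instead of racing inside the sync workflow.
print(errors)  # -> []
```

Orphaned content still gets removed eventually, just by a dedicated cleanup task that is no longer interleaved with other repos' syncs.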
Created redmine issue http://projects.theforeman.org/issues/18920 from this bug
Cloning to katello for the katello change
I'm changing the component since all Pulp work is done. Feel free to change the component if I didn't pick the right one.
The Pulp upstream bug status is at ON_QA. Updating the external tracker on this bug.
The Pulp upstream bug status is at CLOSED - CURRENTRELEASE. Updating the external tracker on this bug.
Moving this bug to POST for triage into Satellite 6 since the upstream issue http://projects.theforeman.org/issues/18920 has been resolved.
I believe I reproduced this (iirc) by putting a breakpoint between steps 1 and 2. This helped greatly with the timing and allowed testing with small repos. Unfortunately, this is probably not an option in production.
> Puppet repo sync has the following steps:
> 1) Unassociate all puppet modules from the repo
> 2) Sync the Repo
> 3) Remove all Orphans
For the Katello-side verification, you can ensure the remove_missing option is passed to Pulp in the Sync action in foreman-tasks or Dynflow.
FailedQA.

@tfm-rubygem-katello-3.0.0.161-1.el7sat.noarch

I tried to at least verify this as a sanity check only @6.2.14, so I wanted to sync Library into an external capsule... but during the task planning phase, Actions::Katello::CapsuleContent::Sync failed with undefined method `content_type' for nil:NilClass (NoMethodError).

Actions::Katello::CapsuleContent::Sync

Input:
---
smart_proxy:
  id: 2
  name: cap.example.com
services_checked:
- pulp
- pulp_auth

undefined method `content_type' for nil:NilClass (NoMethodError)
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.0.0.161/app/lib/actions/katello/capsule_content/sync.rb:65:in `block (3 levels) in sync_repos_to_capsule'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.6/lib/dynflow/execution_plan.rb:316:in `call'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.6/lib/dynflow/execution_plan.rb:316:in `switch_flow'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.6/lib/dynflow/action.rb:369:in `sequence'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.0.0.161/app/lib/actions/katello/capsule_content/sync.rb:54:in `block (2 levels) in sync_repos_to_capsule'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.0.0.161/app/lib/actions/katello/capsule_content/sync.rb:53:in `each'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.0.0.161/app/lib/actions/katello/capsule_content/sync.rb:53:in `block in sync_repos_to_capsule'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.6/lib/dynflow/execution_plan.rb:316:in `call'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.6/lib/dynflow/execution_plan.rb:316:in `switch_flow'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.6/lib/dynflow/action.rb:364:in `concurrence'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.0.0.161/app/lib/actions/katello/capsule_content/sync.rb:52:in `sync_repos_to_capsule'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.0.0.161/app/lib/actions/katello/capsule_content/sync.rb:46:in `block in plan'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.6/lib/dynflow/execution_plan.rb:316:in `call'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.6/lib/dynflow/execution_plan.rb:316:in `switch_flow'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.6/lib/dynflow/action.rb:369:in `sequence'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.0.0.161/app/lib/actions/katello/capsule_content/sync.rb:41:in `plan'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.6/lib/dynflow/action.rb:461:in `block (3 levels) in execute_plan'
Requesting needsinfo from upstream developer ttereshc because the 'FailedQA' flag is set.
The FailedQA regression is fixed in http://projects.theforeman.org/issues/20540
Setting to POST as upstream fix is merged.
VERIFIED.

@satellite-6.2.14-4.0.el7sat.noarch
tfm-rubygem-katello-3.0.0.162-1.el7sat.noarch

Used the steps described in comment#40 and focused on overall sanity.

@UI > Monitor > Tasks

32: Actions::Pulp::Consumer::SyncCapsule (success) [ 34.65s / 2.07s ]
Started at: 2018-01-30 15:33:45 UTC
Ended at: 2018-01-30 15:34:20 UTC
Real time: 34.65s
Execution time (excluding suspended state): 2.07s
Input:
capsule_id: 2
repo_pulp_id: Default_Organization-Capsule-Capsule_6_2_RHEL7
sync_options:
  remove_missing: true
  force_full: true
remote_user: admin
remote_cp_user: admin

>>> Actions::Pulp::Consumer::SyncCapsule succeeded for both the yum and puppet repos while the remove_missing sync option was enabled
>>> the Actions::Pulp::Consumer::UnassociateUnits step is no longer required to avoid the race condition

@SAT
# foreman-rake katello:delete_orphaned_content RAILS_ENV=production
Orphaned content deletion started in background.

@UI > Monitor > Tasks
Remove orphans {"services_checked"=>["pulp", "pulp_auth"], "capsule_id"=>2, ... stopped success 2018-01-31 13:19:44 +0100 2018-01-31 13:19:44 +0100 foreman_admin
Remove orphans {"services_checked"=>["pulp", "pulp_auth"], "capsule_id"=>1, ... stopped success 2018-01-31 13:19:43 +0100 2018-01-31 13:19:44 +0100 foreman_admin

>>> the weekly cleanup cron job now creates separate cleanup tasks for all capsules, and these tasks run successfully
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:0273