Description of problem:
When publishing a content view we generate a lot of tasks in pulp and then wait on them. On occasion, we initiate a task and, as soon as we request the task to poll its status, pulp throws a 404. Requests for the same task later on work fine.

Version-Release number of selected component (if applicable):
2.6.0-1.1.beta

How reproducible:
Randomly

Steps to Reproduce:
Will work on a reproducer script; unclear how to reproduce currently.

Pulp logs:
Apr 1 20:52:14 qe-blade-16 pulp: pulp.server.webservices.middleware.exception:INFO: Missing resource(s): resource_id=6f29bc57-55f0-49eb-97ee-1978407be409
Apr 1 20:52:14 qe-blade-16 pulp: celery.worker.strategy:INFO: Received task: pulp.server.managers.repo.unit_association.associate_from_repo[6f29bc57-55f0-49eb-97ee-1978407be409]
Apr 1 20:52:14 qe-blade-16 pulp: celery.worker.strategy:INFO: Received task: pulp.server.async.tasks._release_resource[7b3ce53b-d3ae-4101-bbd3-15776e975d52]
Apr 1 20:52:14 qe-blade-16 pulp: celery.worker.job:INFO: Task pulp.server.async.tasks._queue_reserved_task[01c536c8-f0cf-444c-a799-c65a5f5413da] succeeded in 0.702265332002s:

Apache logs:
10.16.76.78 - - [01/Apr/2015:20:52:13 -0400] "POST /pulp/api/v2/repositories/Default_Organization-rhel-7-cv-1_0-Satellite_Tools_Packages-RHEL_7/actions/associate/ HTTP/1.1" 202 172 "-" "Ruby"
10.16.76.78 - - [01/Apr/2015:20:52:14 -0400] "GET /pulp/api/v2/tasks/6f29bc57-55f0-49eb-97ee-1978407be409/ HTTP/1.1" 404 533 "-" "Ruby"
Ruby traceback showing the 404 comes from polling a task:
- /opt/rh/ruby193/root/usr/share/gems/gems/rest-client-1.6.7/lib/restclient/abstract_response.rb:48:in `return!'
- /opt/rh/ruby193/root/usr/share/gems/gems/runcible-1.3.0/lib/runcible/base.rb:92:in `block in get_response'
- /opt/rh/ruby193/root/usr/share/gems/gems/rest-client-1.6.7/lib/restclient/request.rb:228:in `call'
- /opt/rh/ruby193/root/usr/share/gems/gems/rest-client-1.6.7/lib/restclient/request.rb:228:in `process_result'
- /opt/rh/ruby193/root/usr/share/gems/gems/rbovirt-0.0.29/lib/restclient_ext/request.rb:50:in `block in transmit'
- /opt/rh/ruby193/root/usr/share/ruby/net/http.rb:746:in `start'
- /opt/rh/ruby193/root/usr/share/gems/gems/rbovirt-0.0.29/lib/restclient_ext/request.rb:44:in `transmit'
- /opt/rh/ruby193/root/usr/share/gems/gems/rest-client-1.6.7/lib/restclient/request.rb:64:in `execute'
- /opt/rh/ruby193/root/usr/share/gems/gems/rest-client-1.6.7/lib/restclient/request.rb:33:in `execute'
- /opt/rh/ruby193/root/usr/share/gems/gems/rest-client-1.6.7/lib/restclient/resource.rb:51:in `get'
- /opt/rh/ruby193/root/usr/share/gems/gems/runcible-1.3.0/lib/runcible/base.rb:91:in `get_response'
- /opt/rh/ruby193/root/usr/share/gems/gems/runcible-1.3.0/lib/runcible/base.rb:82:in `call'
- /opt/rh/ruby193/root/usr/share/gems/gems/runcible-1.3.0/lib/runcible/resources/task.rb:41:in `poll'
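Until the underlying race is fixed, one client-side mitigation is to tolerate a transient 404 on the first few polls of a freshly dispatched task and retry briefly before treating it as a real failure. A minimal sketch in Python (the actual Satellite client is Ruby/runcible; `fetch_task` and `MissingTask` here are hypothetical stand-ins for the real HTTP GET against /pulp/api/v2/tasks/<task_id>/ and its 404 response):

```python
import time


class MissingTask(Exception):
    """Stand-in for a 404 from the task status endpoint."""


def poll_task(fetch_task, task_id, retries=5, delay=1.0):
    """Poll a task, tolerating transient 404s on a freshly created task.

    fetch_task is a stand-in for the real HTTP GET; it should raise
    MissingTask when the server returns 404 for the task.
    """
    for attempt in range(retries):
        try:
            return fetch_task(task_id)
        except MissingTask:
            # The task was just dispatched; its status record may not be
            # visible yet. Back off briefly and retry.
            if attempt == retries - 1:
                raise
            time.sleep(delay)
```

With this approach, a 404 immediately after dispatch is retried a few times instead of aborting the publish, while a task that is genuinely missing still surfaces as an error.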
Workaround: Go to the failed task under "Monitor > Tasks" and click 'resume'
Moving to need_info, as this appears to be a pulp issue (possible race condition)
And to note that RHCI can reproduce it pretty easily.
Does this only happen for the unit association task, or does it ever happen for other tasks? This is a mysterious problem. The code that creates the task record in the database saves it and then reads it back from the DB before returning anything, so the record should be available in mongo before the task ID is even returned to the client. https://github.com/pulp/pulp/blob/pulp-2.6.0-1/server/pulp/server/async/task_status_manager.py#L54 I'm not sure what to make of this. What you're seeing should theoretically be impossible, yet there it is. :)
I am not sure if the needinfo is for someone in particular; however, when I have seen this issue it is typically on a task that is part of either a copy or a remove (i.e. associate or unassociate), and most often when the unit type is rpm or errata. For RHCI, I most frequently see the error during a 're-deploy', which can mean publishing a new version of an existing content view and then promoting it to an environment.
The Pulp upstream bug status is at NEW. Updating the external tracker on this bug.
The Pulp upstream bug priority is at High. Updating the external tracker on this bug.
We have an upstream Katello issue that reflects this same issue (http://projects.theforeman.org/issues/10107). The user who reported seeing it in multiple scenarios tested this change (https://github.com/pulp/pulp/pull/1864) in their local environment and has reported no issues since applying that patch.
The Pulp upstream bug status is at ASSIGNED. Updating the external tracker on this bug.
Created attachment 1034314 [details] patch for satellite 6.1
Hi, please provide verification steps. Thanks.
There really isn't a simple set of verification steps because the failure occurred randomly. The only thing you can really do is attempt to publish a content view with lots of repositories full of content; the more repositories there are, the more likely it is to reproduce. Try publishing that content view a few times, and if this issue does not occur, mark as verified.
VERIFIED:

rpm -qa | grep foreman
ruby193-rubygem-foreman_discovery-2.0.0.15-1.el7sat.noarch
foreman-libvirt-1.7.2.27-1.el7sat.noarch
ruby193-rubygem-foreman_gutterball-0.0.1.9-1.el7sat.noarch
ruby193-rubygem-foreman_docker-1.2.0.14-1.el7sat.noarch
rubygem-hammer_cli_foreman_discovery-0.0.1.10-1.el7sat.noarch
foreman-selinux-1.7.2.13-1.el7sat.noarch
dell-pe1955-02.rhts.eng.bos.redhat.com-foreman-proxy-1.0-2.noarch
foreman-compute-1.7.2.27-1.el7sat.noarch
foreman-gce-1.7.2.27-1.el7sat.noarch
ruby193-rubygem-foreman-redhat_access-0.2.0-8.el7sat.noarch
rubygem-hammer_cli_foreman-0.1.4.14-1.el7sat.noarch
foreman-debug-1.7.2.27-1.el7sat.noarch
foreman-vmware-1.7.2.27-1.el7sat.noarch
ruby193-rubygem-foreman-tasks-0.6.12.8-1.el7sat.noarch
rubygem-hammer_cli_foreman_tasks-0.0.3.4-1.el7sat.noarch
rubygem-hammer_cli_foreman_docker-0.0.3.7-1.el7sat.noarch
foreman-proxy-1.7.2.5-1.el7sat.noarch
dell-pe1955-02.rhts.eng.bos.redhat.com-foreman-client-1.0-1.noarch
ruby193-rubygem-foreman_bootdisk-4.0.2.13-1.el7sat.noarch
dell-pe1955-02.rhts.eng.bos.redhat.com-foreman-proxy-client-1.0-1.noarch
foreman-ovirt-1.7.2.27-1.el7sat.noarch
rubygem-hammer_cli_foreman_bootdisk-0.1.2.7-1.el7sat.noarch
foreman-1.7.2.27-1.el7sat.noarch
ruby193-rubygem-foreman_hooks-0.3.7-2.el7sat.noarch
foreman-postgresql-1.7.2.27-1.el7sat.noarch

steps:
Try publishing that content view a few times.
No issues occur.
This bug is slated to be released with Satellite 6.1.
This bug was fixed in version 6.1.1 of Satellite which was released on 12 August, 2015.
The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.
The Pulp upstream bug status is at ON_QA. Updating the external tracker on this bug.
Originally, Satellite received a custom patch to fix this issue, but the patch was not part of Pulp 2.6.x stream. Pulp 2.7.0 will contain a more generic fix that addresses this issue.
The Pulp upstream bug status is at CLOSED - CURRENTRELEASE. Updating the external tracker on this bug.