Bug 1208615 - Content view publish randomly fails with 404 on fetching task
Summary: Content view publish randomly fails with 404 on fetching task
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Pulp
Version: 6.1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: Unspecified
Assignee: Partha Aji
QA Contact: Tazim Kolhar
URL:
Whiteboard:
Depends On:
Blocks: rhci-common-installer
 
Reported: 2015-04-02 16:56 UTC by Justin Sherrill
Modified: 2021-04-06 18:02 UTC
CC: 19 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-08-12 13:55:30 UTC
Target Upstream Version:
Embargoed:


Attachments
patch for satellite 6.1 (1.58 KB, patch)
2015-06-03 13:06 UTC, Dennis Kliban


Links
System ID: Pulp Redmine 974
Priority: High
Status: CLOSED - CURRENTRELEASE
Summary: GET of a task sometimes 404s on first request
Last Updated: Never

Description Justin Sherrill 2015-04-02 16:56:21 UTC
Description of problem:

When publishing a content view we generate a lot of tasks in pulp and then wait on them.  On occasion, as soon as we initiate a task and make the first request to poll its status, pulp throws a 404.

Requests for the same task later on work fine.
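
To illustrate the client side, here is a rough Python sketch (using the requests library) of a polling loop that tolerates a transient 404 on the first few polls of a freshly spawned task. This is not Katello's actual Runcible code; the host, auth, and retry budget are placeholders. Only the endpoint path is taken from the apache log below.

import time
import requests

PULP = "https://satellite.example.com/pulp/api/v2"  # placeholder host

def poll_task(task_id, auth, tolerated_404s=3, interval=1.0):
    """Poll a Pulp task, treating the first few 404s as 'not visible yet'."""
    misses = 0
    while True:
        resp = requests.get(f"{PULP}/tasks/{task_id}/", auth=auth, verify=False)
        if resp.status_code == 404:
            # The task record may not be readable yet (the race in this bug);
            # only give up after several consecutive misses.
            misses += 1
            if misses > tolerated_404s:
                resp.raise_for_status()
        else:
            resp.raise_for_status()
            misses = 0
            task = resp.json()
            if task.get("state") in ("finished", "error", "canceled"):
                return task
        time.sleep(interval)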

Version-Release number of selected component (if applicable):
2.6.0-1.1.beta

How reproducible:
Randomly

Steps to Reproduce:
Will work on a reproducer script; it is currently unclear how to reproduce.


Pulp logs:

Apr  1 20:52:14 qe-blade-16 pulp: pulp.server.webservices.middleware.exception:INFO: Missing resource(s): resource_id=6f29bc57-55f0-49eb-97ee-1978407be409
Apr  1 20:52:14 qe-blade-16 pulp: celery.worker.strategy:INFO: Received task: pulp.server.managers.repo.unit_association.associate_from_repo[6f29bc57-55f0-49eb-97ee-1978407be409]
Apr  1 20:52:14 qe-blade-16 pulp: celery.worker.strategy:INFO: Received task: pulp.server.async.tasks._release_resource[7b3ce53b-d3ae-4101-bbd3-15776e975d52]
Apr  1 20:52:14 qe-blade-16 pulp: celery.worker.job:INFO: Task pulp.server.async.tasks._queue_reserved_task[01c536c8-f0cf-444c-a799-c65a5f5413da] succeeded in 0.702265332002s: 


apache logs:


10.16.76.78 - - [01/Apr/2015:20:52:13 -0400] "POST /pulp/api/v2/repositories/Default_Organization-rhel-7-cv-1_0-Satellite_Tools_Packages-RHEL_7/actions/associate/ HTTP/1.1" 202 172 "-" "Ruby"
10.16.76.78 - - [01/Apr/2015:20:52:14 -0400] "GET /pulp/api/v2/tasks/6f29bc57-55f0-49eb-97ee-1978407be409/ HTTP/1.1" 404 533 "-" "Ruby"

Comment 2 Justin Sherrill 2015-04-02 17:26:15 UTC
Ruby traceback showing the 404 comes from polling a task:


- /opt/rh/ruby193/root/usr/share/gems/gems/rest-client-1.6.7/lib/restclient/abstract_response.rb:48:in `return!'
- /opt/rh/ruby193/root/usr/share/gems/gems/runcible-1.3.0/lib/runcible/base.rb:92:in `block in get_response'
- /opt/rh/ruby193/root/usr/share/gems/gems/rest-client-1.6.7/lib/restclient/request.rb:228:in `call'
- /opt/rh/ruby193/root/usr/share/gems/gems/rest-client-1.6.7/lib/restclient/request.rb:228:in `process_result'
- /opt/rh/ruby193/root/usr/share/gems/gems/rbovirt-0.0.29/lib/restclient_ext/request.rb:50:in `block in transmit'
- /opt/rh/ruby193/root/usr/share/ruby/net/http.rb:746:in `start'
- /opt/rh/ruby193/root/usr/share/gems/gems/rbovirt-0.0.29/lib/restclient_ext/request.rb:44:in `transmit'
- /opt/rh/ruby193/root/usr/share/gems/gems/rest-client-1.6.7/lib/restclient/request.rb:64:in `execute'
- /opt/rh/ruby193/root/usr/share/gems/gems/rest-client-1.6.7/lib/restclient/request.rb:33:in `execute'
- /opt/rh/ruby193/root/usr/share/gems/gems/rest-client-1.6.7/lib/restclient/resource.rb:51:in `get'
- /opt/rh/ruby193/root/usr/share/gems/gems/runcible-1.3.0/lib/runcible/base.rb:91:in `get_response'
- /opt/rh/ruby193/root/usr/share/gems/gems/runcible-1.3.0/lib/runcible/base.rb:82:in `call'
- /opt/rh/ruby193/root/usr/share/gems/gems/runcible-1.3.0/lib/runcible/resources/task.rb:41:in `poll'

Comment 3 Justin Sherrill 2015-04-02 17:26:36 UTC
Workaround:

Go to the failed task under "Monitor > Tasks" and click 'resume'.

Comment 4 Justin Sherrill 2015-04-23 18:06:15 UTC
Moving to needinfo, as this appears to be a Pulp issue (possibly a race condition).

Comment 5 Justin Sherrill 2015-04-23 18:08:08 UTC
Also noting that RHCI can reproduce it pretty easily.

Comment 6 Michael Hrivnak 2015-04-23 19:23:14 UTC
Does this only happen for the unit association task? Or does it ever happen for other tasks?

This is a mysterious problem. The code that creates the task record saves it to the database and then reads it back before returning anything. So the record should be available in mongo before the task ID is even returned to the client.

https://github.com/pulp/pulp/blob/pulp-2.6.0-1/server/pulp/server/async/task_status_manager.py#L54

I'm not sure what to make of this. What you're seeing should theoretically be impossible, yet there it is. :)
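
For illustration only, here is the kind of race that could produce this; the names are invented and this is not Pulp's actual code. If the task ID were handed back by the API process while the status document were only inserted later in the worker, a fast first GET could land in the gap:

# Hypothetical sketch of the suspected race; not Pulp's actual code.
# 'db' is assumed to be a pymongo database handle, 'celery_app' a Celery app.
import uuid

def associate_action(db, celery_app, repo_id):
    """API process: responds 202 with the task id as soon as the message is queued."""
    task_id = str(uuid.uuid4())
    celery_app.send_task("associate_from_repo", args=[repo_id], task_id=task_id)
    return 202, {"spawned_tasks": [{"task_id": task_id}]}

def on_task_received(db, task_id):
    """Worker process: the status record is only written once the worker picks
    the message up, so a fast client GET can arrive before this runs."""
    db.task_status.insert_one({"task_id": task_id, "state": "waiting"})

def get_task(db, task_id):
    """Handler for GET /pulp/api/v2/tasks/<id>/ -- 404s during the window."""
    doc = db.task_status.find_one({"task_id": task_id})
    return (200, doc) if doc else (404, {"resource_id": task_id})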

Comment 7 Brad Buckingham 2015-04-29 12:25:03 UTC
I am not sure if the needinfo is for someone in particular; however, when I have seen this issue it is typically on a task that is part of either a copy or a remove (i.e. associate or unassociate), and it occurs most often when the unit is an rpm or an erratum.

For RHCI, I most frequently see the error during a 're-deploy', which involves publishing a new version of an existing content view and then promoting it to an environment.

Comment 8 pulp-infra@redhat.com 2015-05-11 17:30:17 UTC
The Pulp upstream bug status is at NEW. Updating the external tracker on this bug.

Comment 9 pulp-infra@redhat.com 2015-05-11 17:30:19 UTC
The Pulp upstream bug priority is at High. Updating the external tracker on this bug.

Comment 10 Eric Helms 2015-05-19 21:23:28 UTC
We have an upstream Katello issue tracking this same problem (http://projects.theforeman.org/issues/10107). The user who reported seeing it in multiple scenarios tested this change (https://github.com/pulp/pulp/pull/1864) in their local environment and has reported no issues since applying the patch.
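
For illustration only, the general shape of a fix like that (simplified, with invented names; see the actual pull request for the real change) is to make the status record visible before the task ID can reach a client:

# Hypothetical sketch of the fix's shape; not the actual patch.
import uuid

def associate_action_fixed(db, celery_app, repo_id):
    task_id = str(uuid.uuid4())
    # Write (or upsert, in case the worker raced us) *before* dispatching,
    # so no client can hold a task id whose record is still missing.
    db.task_status.update_one(
        {"task_id": task_id},
        {"$setOnInsert": {"task_id": task_id, "state": "waiting"}},
        upsert=True,
    )
    celery_app.send_task("associate_from_repo", args=[repo_id], task_id=task_id)
    return 202, {"spawned_tasks": [{"task_id": task_id}]}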

Comment 11 pulp-infra@redhat.com 2015-05-22 16:00:18 UTC
The Pulp upstream bug status is at ASSIGNED. Updating the external tracker on this bug.

Comment 12 Dennis Kliban 2015-06-03 13:06:51 UTC
Created attachment 1034314 [details]
patch for satellite 6.1

Comment 15 Tazim Kolhar 2015-06-11 10:22:16 UTC
Hi,

please provide verification steps

thanks

Comment 16 Justin Sherrill 2015-06-11 13:27:22 UTC
There really isn't a simple set of verification steps, because the failure occurs randomly.

The only thing you can really do is attempt to publish a content view with lots of repositories full of content; the more repositories there are, the more likely it is to reproduce. Try publishing that content view a few times, and if the issue does not occur, mark the bug as verified. A rough stress sketch follows.
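
The sketch below (Python) hammers the associate call and checks whether the first poll 404s. The endpoint paths are taken from the apache log in the description; the host, credentials, and repo IDs are placeholders, and the request body and response shape are assumptions based on Pulp 2's call-report format.

import requests

BASE = "https://satellite.example.com/pulp/api/v2"  # placeholder host
AUTH = ("admin", "changeme")                        # placeholder credentials
SRC = "Default_Organization-source-repo"            # placeholder repo id
DST = "Default_Organization-dest-repo"              # placeholder repo id

for i in range(100):
    resp = requests.post(
        f"{BASE}/repositories/{DST}/actions/associate/",
        json={"source_repo_id": SRC},  # body shape is an assumption
        auth=AUTH,
        verify=False,
    )
    task_id = resp.json()["spawned_tasks"][0]["task_id"]
    # Poll immediately -- the bug is a 404 on this very first GET.
    first = requests.get(f"{BASE}/tasks/{task_id}/", auth=AUTH, verify=False)
    if first.status_code == 404:
        print(f"iteration {i}: hit the race, first poll 404ed for {task_id}")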

Comment 17 Tazim Kolhar 2015-06-12 09:36:00 UTC
VERIFIED:

 rpm -qa | grep foreman
ruby193-rubygem-foreman_discovery-2.0.0.15-1.el7sat.noarch
foreman-libvirt-1.7.2.27-1.el7sat.noarch
ruby193-rubygem-foreman_gutterball-0.0.1.9-1.el7sat.noarch
ruby193-rubygem-foreman_docker-1.2.0.14-1.el7sat.noarch
rubygem-hammer_cli_foreman_discovery-0.0.1.10-1.el7sat.noarch
foreman-selinux-1.7.2.13-1.el7sat.noarch
dell-pe1955-02.rhts.eng.bos.redhat.com-foreman-proxy-1.0-2.noarch
foreman-compute-1.7.2.27-1.el7sat.noarch
foreman-gce-1.7.2.27-1.el7sat.noarch
ruby193-rubygem-foreman-redhat_access-0.2.0-8.el7sat.noarch
rubygem-hammer_cli_foreman-0.1.4.14-1.el7sat.noarch
foreman-debug-1.7.2.27-1.el7sat.noarch
foreman-vmware-1.7.2.27-1.el7sat.noarch
ruby193-rubygem-foreman-tasks-0.6.12.8-1.el7sat.noarch
rubygem-hammer_cli_foreman_tasks-0.0.3.4-1.el7sat.noarch
rubygem-hammer_cli_foreman_docker-0.0.3.7-1.el7sat.noarch
foreman-proxy-1.7.2.5-1.el7sat.noarch
dell-pe1955-02.rhts.eng.bos.redhat.com-foreman-client-1.0-1.noarch
ruby193-rubygem-foreman_bootdisk-4.0.2.13-1.el7sat.noarch
dell-pe1955-02.rhts.eng.bos.redhat.com-foreman-proxy-client-1.0-1.noarch
foreman-ovirt-1.7.2.27-1.el7sat.noarch
rubygem-hammer_cli_foreman_bootdisk-0.1.2.7-1.el7sat.noarch
foreman-1.7.2.27-1.el7sat.noarch
ruby193-rubygem-foreman_hooks-0.3.7-2.el7sat.noarch
foreman-postgresql-1.7.2.27-1.el7sat.noarch

steps:
Tried publishing the content view a few times.
No issues occurred.

Comment 18 Bryan Kearney 2015-08-11 13:18:39 UTC
This bug is slated to be released with Satellite 6.1.

Comment 19 Bryan Kearney 2015-08-12 13:55:30 UTC
This bug was fixed in version 6.1.1 of Satellite, which was released on 12 August 2015.

Comment 20 pulp-infra@redhat.com 2015-10-07 20:12:01 UTC
The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.

Comment 21 pulp-infra@redhat.com 2015-10-12 16:00:38 UTC
The Pulp upstream bug status is at ON_QA. Updating the external tracker on this bug.

Comment 22 Dennis Kliban 2015-10-12 16:38:01 UTC
Originally, Satellite received a custom patch to fix this issue, but the patch was not part of the Pulp 2.6.x stream. Pulp 2.7.0 will contain a more generic fix for this issue.

Comment 23 pulp-infra@redhat.com 2015-11-06 20:00:41 UTC
The Pulp upstream bug status is at CLOSED - CURRENTRELEASE. Updating the external tracker on this bug.

