Recently, apply_async dispatch was changed to use routing_key instead of queue for tasks that require reservation and are sent to a specific worker. We have used the queue field for cancellation when all tasks sent to a given worker need to be cancelled. The routing_key we use is the worker's name, so we no longer need to track the queue, but we do need to track the worker's name. The cancel code should be updated to find the TaskStatus documents to cancel by worker name.
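A minimal sketch of the lookup described above, assuming TaskStatus documents can be treated as plain dicts with worker_name, state and task_id fields (all three appear in the task output later in this thread). The function name find_tasks_to_cancel, the set of states treated as incomplete, and the in-memory list standing in for the collection are hypothetical and are not Pulp's actual code.

```python
# Hypothetical sketch: pick the task statuses to cancel for one worker.
# Which states count as "still cancellable" is an assumption for illustration.
INCOMPLETE_STATES = ('waiting', 'running')


def find_tasks_to_cancel(task_statuses, worker_name):
    """Return the task_ids of incomplete tasks dispatched to worker_name."""
    return [
        t['task_id'] for t in task_statuses
        if t.get('worker_name') == worker_name
        and t.get('state') in INCOMPLETE_STATES
    ]


# In-memory stand-in for the TaskStatus collection.
task_statuses = [
    {'task_id': 'a', 'worker_name': 'worker-1', 'state': 'running'},
    {'task_id': 'b', 'worker_name': 'worker-1', 'state': 'finished'},
    {'task_id': 'c', 'worker_name': 'worker-2', 'state': 'waiting'},
]
print(find_tasks_to_cancel(task_statuses, 'worker-1'))
```

Matching on worker_name rather than queue keeps the lookup independent of how the queue itself is named.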
PR available at: https://github.com/pulp/pulp/pull/1172
This was the actual pull request: https://github.com/pulp/pulp/pull/1177
merged to 2.5-dev -> master
I wonder if we should make queue be the real queue name, and not simply a copy of worker_name. What do you think?
I agree it should be correct. Here is a fix, and it comes with tests! https://github.com/pulp/pulp/pull/1178 Since it's not related to this BZ, I'm going to leave this one at BZ.
Does this BZ require a migration? I get this error after upgrading from 2.5.0 to 2.6.0 alpha:

pulp: celery.worker.strategy:INFO: Received task: pulp.server.async.tasks._queue_reserved_task[3b65119e-5f49-449a-b67a-5f5dec5d50eb]
beavqe-net0 pulp: celery.worker.job:ERROR: (5357-96480) Task pulp.server.async.tasks._queue_reserved_task[3b65119e-5f49-449a-b67a-5f5dec5d50eb] raised unexpected: KeyError('worker_name',)
pulp: celery.worker.job:ERROR: (5357-96480) Traceback (most recent call last):
pulp: celery.worker.job:ERROR: (5357-96480) File "/usr/lib/python2.7/site-packages/celery/app/trace.py", line 240, in trace_task
pulp: celery.worker.job:ERROR: (5357-96480) R = retval = fun(*args, **kwargs)
pulp: celery.worker.job:ERROR: (5357-96480) File "/usr/lib/python2.7/site-packages/celery/app/trace.py", line 437, in __protected_call__
pulp: celery.worker.job:ERROR: (5357-96480) return self.run(*args, **kwargs)
pulp: celery.worker.job:ERROR: (5357-96480) File "/usr/lib/python2.7/site-packages/pulp/server/async/tasks.py", line 64, in _queue_reserved_task
pulp: celery.worker.job:ERROR: (5357-96480) worker = resources.get_unreserved_worker()
pulp: celery.worker.job:ERROR: (5357-96480) File "/usr/lib/python2.7/site-packages/pulp/server/managers/resources.py", line 62, in get_unreserved_worker
pulp: celery.worker.job:ERROR: (5357-96480) reserved_names = [r['worker_name'] for r in resources.ReservedResource.get_collection().find()]
pulp: celery.worker.job:ERROR: (5357-96480) KeyError: 'worker_name'
I looked into this further. Yes, we need a migration, but for a different BZ. It's not related to this BZ (even though it has the same name). It's related to this change [0], which also landed in 2.6.0. I didn't consider the order in which users start their services, but I think a migration would be safer. I've opened a new BZ [1] to add a migration that resolves this issue.

[0]: https://github.com/pulp/pulp/pull/1158/files?diff=unified#diff-5d58b00ed0c231fdf673ede3a6680640R88
[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1167908
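To illustrate why documents written before the upgrade trigger the traceback above: the comprehension in get_unreserved_worker indexes every document with r['worker_name'], so a single document without that key raises KeyError. This is a sketch with plain dicts standing in for ReservedResource documents. The legacy_field key is a made-up placeholder, not the real pre-upgrade schema, and both function names are hypothetical.

```python
def reserved_names_strict(docs):
    # Same shape as the failing line in the traceback.
    return [r['worker_name'] for r in docs]


def reserved_names_tolerant(docs):
    # Skips documents that lack the key instead of raising.
    return [r['worker_name'] for r in docs if 'worker_name' in r]


docs = [
    {'worker_name': 'worker-1'},
    {'legacy_field': 'old-value'},  # placeholder for a pre-upgrade document
]

try:
    reserved_names_strict(docs)
except KeyError as e:
    print('KeyError: %s' % e)

print(reserved_names_tolerant(docs))
```

Tolerating the missing key only hides the stale documents. A migration, as proposed in [1], is the safer fix.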
fixed in pulp 2.6.0-0.2.beta
verified

>>> pic.GET('/v2/tasks/80d2d7f9-58d5-4a16-b812-6d6f17eafc06/')
Response Body
{
  "exception": null,
  "task_type": "pulp.server.tasks.repository.sync_with_auto_publish",
  "_href": "/pulp/api/v2/tasks/80d2d7f9-58d5-4a16-b812-6d6f17eafc06/",
  "task_id": "80d2d7f9-58d5-4a16-b812-6d6f17eafc06",
  "tags": [
    "pulp:repository:rhel7",
    "pulp:action:sync"
  ],
  "finish_time": null,
  "_ns": "task_status",
  "start_time": "2015-02-03T18:14:33Z",
  "traceback": null,
  "spawned_tasks": [],
  "progress_report": {
    "yum_importer": {
      "content": {
        "size_total": 0,
        "items_left": 0,
        "items_total": 0,
        "state": "IN_PROGRESS",
        "size_left": 0,
        "details": {
          "rpm_total": 0,
          "rpm_done": 0,
          "drpm_total": 0,
          "drpm_done": 0
        },
        "error_details": []
      },
      "comps": {
        "state": "NOT_STARTED"
      },
      "distribution": {
        "items_total": 0,
        "state": "NOT_STARTED",
        "error_details": [],
        "items_left": 0
      },
      "errata": {
        "state": "NOT_STARTED"
      },
      "metadata": {
        "state": "FINISHED"
      }
    }
  },
  "queue": "reserved_resource_worker-1.lab.eng.bos.redhat.com.dq",
  "state": "running",
  "worker_name": "reserved_resource_worker-1.lab.eng.bos.redhat.com",
  "result": null,
  "error": null,
  "_id": {
    "$oid": "54d11009c9db986e252fb3b3"
  },
  "id": "54d110099f9b813d80e2342d"
}
(200, {u'exception': None, u'task_type': u'pulp.server.tasks.repository.sync_with_auto_publish', u'_href': u'/pulp/api/v2/tasks/80d2d7f9-58d5-4a16-b812-6d6f17eafc06/', u'task_id': u'80d2d7f9-58d5-4a16-b812-6d6f17eafc06', u'tags': [u'pulp:repository:rhel7', u'pulp:action:sync'], u'finish_time': None, u'_ns': u'task_status', u'start_time': u'2015-02-03T18:14:33Z', u'traceback': None, u'spawned_tasks': [], u'progress_report': {u'yum_importer': {u'content': {u'size_total': 0, u'items_left': 0, u'items_total': 0, u'state': u'IN_PROGRESS', u'size_left': 0, u'details': {u'rpm_total': 0, u'rpm_done': 0, u'drpm_total': 0, u'drpm_done': 0}, u'error_details': []}, u'comps': {u'state': u'NOT_STARTED'}, u'distribution': {u'items_total': 0, u'state': u'NOT_STARTED', u'error_details': [], u'items_left': 0}, u'errata': {u'state': u'NOT_STARTED'}, u'metadata': {u'state': u'FINISHED'}}}, u'queue': u'reserved_resource_worker-1.lab.eng.bos.redhat.com.dq', u'state': u'running', u'worker_name': u'reserved_resource_worker-1.lab.eng.bos.redhat.com', u'result': None, u'error': None, u'_id': {u'$oid': u'54d11009c9db986e252fb3b3'}, u'id': u'54d110099f9b813d80e2342d'})
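The verification above amounts to checking that the task status carries a worker_name and a queue that is no longer just a copy of it. A small sketch of that check, using only the two field values from the response. The '.dq' suffix relationship is inferred from this single output, not from Pulp's documentation, and check_task_status is a hypothetical helper.

```python
def check_task_status(task):
    """Return True if the task records a worker and a distinct queue name."""
    worker = task.get('worker_name')
    queue = task.get('queue')
    if not worker or not queue:
        return False
    # Inferred from the output above: queue is the worker name plus '.dq'.
    return queue == worker + '.dq'


task = {
    'queue': 'reserved_resource_worker-1.lab.eng.bos.redhat.com.dq',
    'worker_name': 'reserved_resource_worker-1.lab.eng.bos.redhat.com',
}
print(check_task_status(task))
```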
Moved to https://pulp.plan.io/issues/533