Bug 2083490

Summary: [RFE] pulp3 tasks history should be easily exportable for support and troubleshooting purposes
Product: Red Hat Satellite Reporter: Stefan Nemeth <snemeth>
Component: Pulp Assignee: satellite6-bugs <satellite6-bugs>
Status: NEW --- QA Contact: Satellite QE Team <sat-qe-bz-list>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 6.11.0 CC: dalley, dkliban, ggainey, peter.vreman, rchan
Target Milestone: Unspecified Keywords: FutureFeature
Target Release: Unused Flags: dalley: needinfo? (snemeth)
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Stefan Nemeth 2022-05-10 08:45:48 UTC
Description of problem:

Pulp does not store its task history in a reasonably readable form. This makes it hard for support to provide an RCA and, at times, to troubleshoot Pulp problems.

Pulp3 uses PostgreSQL now, so storing tasks in a table would be very helpful.
An export command such as

# pulp tasks history export

would be nice.

Version-Release number of selected component (if applicable):
6.10.5

Comment 1 Grant Gainey 2022-08-17 20:17:09 UTC
Pulp3 stores tasks in the core_task table in PostgreSQL:

===
pulp=> \d core_task
                               Table "public.core_task"
          Column           |           Type           | Collation | Nullable | Default 
---------------------------+--------------------------+-----------+----------+---------
 pulp_id                   | uuid                     |           | not null | 
 pulp_created              | timestamp with time zone |           | not null | 
 pulp_last_updated         | timestamp with time zone |           |          | 
 state                     | text                     |           | not null | 
 name                      | text                     |           | not null | 
 started_at                | timestamp with time zone |           |          | 
 finished_at               | timestamp with time zone |           |          | 
 error                     | jsonb                    |           |          | 
 worker_id                 | uuid                     |           |          | 
 parent_task_id            | uuid                     |           |          | 
 task_group_id             | uuid                     |           |          | 
 logging_cid               | text                     |           | not null | 
 args                      | jsonb                    |           |          | 
 kwargs                    | jsonb                    |           |          | 
 reserved_resources_record | text[]                   |           |          | 
Indexes:
    "core_task_pkey" PRIMARY KEY, btree (pulp_id)
    "core_task_logging_cid_0bc78a42" btree (logging_cid)
    "core_task_parent_task_id_07cf4230" btree (parent_task_id)
    "core_task_pulp_cr_10223f_idx" btree (pulp_created)
    "core_task_task_group_id_a45c142c" btree (task_group_id)
    "core_task_worker_id_ca31e694" btree (worker_id)
Foreign-key constraints:
    "core_task_parent_task_id_07cf4230_fk_core_task_pulp_id" FOREIGN KEY (parent_task_id) REFERENCES core_task(pulp_id) DEFERRABLE INITIALLY DEFERRED
    "core_task_task_group_id_a45c142c_fk_core_taskgroup_pulp_id" FOREIGN KEY (task_group_id) REFERENCES core_taskgroup(pulp_id) DEFERRABLE INITIALLY DEFERRED
    "core_task_worker_id_ca31e694_fk_core_worker__id" FOREIGN KEY (worker_id) REFERENCES core_worker(pulp_id) DEFERRABLE INITIALLY DEFERRED
Referenced by:
    TABLE "core_createdresource" CONSTRAINT "core_createdresource_task_id_acf70fb7_fk_core_task__id" FOREIGN KEY (task_id) REFERENCES core_task(pulp_id) DEFERRABLE INITIALLY DEFERRED
    TABLE "core_export" CONSTRAINT "core_export_task_id_4947760b_fk_core_task_pulp_id" FOREIGN KEY (task_id) REFERENCES core_task(pulp_id) DEFERRABLE INITIALLY DEFERRED
    TABLE "core_import" CONSTRAINT "core_import_task_id_b927da56_fk_core_task_pulp_id" FOREIGN KEY (task_id) REFERENCES core_task(pulp_id) DEFERRABLE INITIALLY DEFERRED
    TABLE "core_progressreport" CONSTRAINT "core_progressreport_task_id_0c3fbc3b_fk_core_task_pulp_id" FOREIGN KEY (task_id) REFERENCES core_task(pulp_id) DEFERRABLE INITIALLY DEFERRED
    TABLE "core_task" CONSTRAINT "core_task_parent_task_id_07cf4230_fk_core_task_pulp_id" FOREIGN KEY (parent_task_id) REFERENCES core_task(pulp_id) DEFERRABLE INITIALLY DEFERRED
    TABLE "core_taskschedule" CONSTRAINT "core_taskschedule_last_task_id_5c1ee058_fk_core_task_pulp_id" FOREIGN KEY (last_task_id) REFERENCES core_task(pulp_id) DEFERRABLE INITIALLY DEFERRED
===

You can query this data using the REST API directly - see https://docs.pulpproject.org/pulpcore/restapi.html#tag/Tasks/operation/tasks_list for details.
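
As a rough illustration of querying the REST API, here is a minimal Python sketch that builds a filtered task-list URL. The helper name `task_list_url` and the host name are hypothetical. The `state` and `name__contains` query parameters are assumed to correspond to the `--state` and `--name-contains` CLI options, so check them against the API documentation linked above before relying on them.

```python
from urllib.parse import urlencode

# Path taken from the pulp_href values shown in this report.
TASKS_PATH = "/pulp/api/v3/tasks/"


def task_list_url(base_url, **filters):
    """Build a task-list URL (hypothetical helper); None-valued filters are dropped."""
    # Sort the filters so the resulting URL is deterministic.
    query = {key: value for key, value in sorted(filters.items()) if value is not None}
    url = base_url.rstrip("/") + TASKS_PATH
    return url + ("?" + urlencode(query) if query else "")
```

For example, `task_list_url("https://satellite.example.com", state="completed", name__contains="synchronize")` yields a URL that can be fetched with any HTTP client. Authentication is left out of the sketch on purpose.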


The pulp-cli already supports task "export" as JSON (definitely easier to parse via script than the human-readable pulp-admin output from Pulp2):
===
(pulp) [vagrant@pulp2-nightly-pulp3-source-centos7 /]$ pulp task list --state completed --name-contains synchronize
[
  {
    "pulp_href": "/pulp/api/v3/tasks/db1eec95-92bb-4db4-854a-5637c409ff08/",
    "pulp_created": "2022-08-17T12:41:37.177606Z",
    "state": "completed",
    "name": "pulp_rpm.app.tasks.synchronizing.synchronize",
    "logging_cid": "aae0a7304c304f28b7609b0f9aab9e63",
    "started_at": "2022-08-17T12:41:37.294694Z",
    "finished_at": "2022-08-17T12:41:42.636583Z",
    "error": null,
    "worker": "/pulp/api/v3/workers/5a1a6122-242b-4036-a4f6-048896f77064/",
    "parent_task": null,
    "child_tasks": [],
    "task_group": null,
    "progress_reports": [
      {
        "message": "Downloading Metadata Files",
        "code": "sync.downloading.metadata",
        "state": "completed",
        "total": null,
        "done": 10,
        "suffix": null
      },
      {
        "message": "Downloading Artifacts",
        "code": "sync.downloading.artifacts",
        "state": "completed",
        "total": null,
        "done": 0,
        "suffix": null
      },
      {
        "message": "Associating Content",
        "code": "associating.content",
        "state": "completed",
        "total": null,
        "done": 43,
        "suffix": null
      },
      {
        "message": "Skipping Packages",
        "code": "sync.skipped.packages",
        "state": "completed",
        "total": 0,
        "done": 0,
        "suffix": null
      },
      {
        "message": "Parsed Packages",
        "code": "sync.parsing.packages",
        "state": "completed",
        "total": 35,
        "done": 35,
        "suffix": null
      },
      {
        "message": "Parsed Comps",
        "code": "sync.parsing.comps",
        "state": "completed",
        "total": 3,
        "done": 3,
        "suffix": null
      },
      {
        "message": "Parsed Advisories",
        "code": "sync.parsing.advisories",
        "state": "completed",
        "total": 4,
        "done": 4,
        "suffix": null
      },
      {
        "message": "Un-Associating Content",
        "code": "unassociating.content",
        "state": "completed",
        "total": null,
        "done": 0,
        "suffix": null
      }
    ],
    "created_resources": [
      "/pulp/api/v3/repositories/rpm/rpm/8d307920-3244-4107-8ac9-f40005422d9d/versions/1/",
      "/pulp/api/v3/publications/rpm/rpm/dc06fe7d-e0fe-4d84-981f-4814f501625f/"
    ],
    "reserved_resources_record": [
      "/pulp/api/v3/repositories/rpm/rpm/8d307920-3244-4107-8ac9-f40005422d9d/",
      "shared:/pulp/api/v3/remotes/rpm/rpm/1f0cd15e-592e-4e85-b126-5bc32de98a85/"
    ]
  }
]
===

One thing the CLI is missing is full support for all the filters the Task REST API provides for limiting output. See these two RFEs:
* https://github.com/pulp/pulp-cli/issues/543
* https://github.com/pulp/pulp-cli/issues/542

Between these and the ability to use 'jq' to massage the pulp-cli output into whatever you're looking for, would that answer your RFE?
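
For anyone who prefers a script over 'jq', the same kind of post-processing can be sketched in Python. This is only an illustration: the function names and the output fields (`duration_s` in particular) are made up for the example, and the input is assumed to be the JSON list printed by `pulp task list` as shown above.

```python
from datetime import datetime


def parse_ts(value):
    """Parse a timestamp such as 2022-08-17T12:41:37.294694Z."""
    # Older Python versions do not accept a trailing "Z", so spell out the offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize(tasks):
    """Return one compact row per task: name, state, and run time in seconds."""
    rows = []
    for task in tasks:
        started = task.get("started_at")
        finished = task.get("finished_at")
        duration = None
        # Tasks that never started or never finished have no duration.
        if started and finished:
            duration = (parse_ts(finished) - parse_ts(started)).total_seconds()
        rows.append(
            {"name": task["name"], "state": task["state"], "duration_s": duration}
        )
    return rows
```

Feeding it the result of `json.load()` on the CLI output gives a short list that is easier to attach to a support case than the full task records.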

Comment 3 Daniel Alley 2022-10-31 14:08:51 UTC
Stefan, see comment #1

Comment 4 Daniel Alley 2023-06-13 02:41:39 UTC
@ggainey I don't think it would be enough, because the CLI is paginated, so you'd have to write some messy logic to get all the tasks. It would probably be better if everything were dumped as one big JSON file.
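
To give an idea of the logic involved, here is a minimal Python sketch that walks a paginated list and writes everything to one JSON file. It assumes the REST API list response is an object with a `results` list and a `next` link that is null on the last page. The function names are hypothetical and authentication is omitted.

```python
import json
import urllib.request


def fetch_json(url):
    """Fetch one page as parsed JSON (no authentication in this sketch)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def collect_all(first_url, fetch=fetch_json):
    """Follow the assumed `next` links until the last page and merge the results."""
    results = []
    url = first_url
    while url:
        page = fetch(url)
        results.extend(page["results"])
        url = page.get("next")  # None on the last page ends the loop
    return results


def dump_tasks(first_url, out_path, fetch=fetch_json):
    """Write every task to a single JSON file and return the task count."""
    tasks = collect_all(first_url, fetch)
    with open(out_path, "w") as handle:
        json.dump(tasks, handle, indent=2)
    return len(tasks)
```

The `fetch` parameter exists so the paging loop can be exercised with canned pages instead of a live server.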

Comment 5 Brad Buckingham 2023-07-21 21:06:39 UTC
Upon review of our valid but aging backlog, the Satellite Team has concluded that this Bugzilla does not meet the criteria for a resolution in the near term and is planning to close it in a month. This message may be a repeat of a previous update, and the bug is again being considered for closure. If you have any concerns about this, please contact your Red Hat Account team. Thank you.