Bug 1962462

Summary: pulp3: Sat6.9 with pulpcore tasking system assigns task to a removed worker
Product: Red Hat Satellite Reporter: Pavel Moravec <pmoravec>
Component: Pulp    Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED ERRATA QA Contact: Lai <ltran>
Severity: high Docs Contact:
Priority: high    
Version: 6.9.0    CC: desingh, ggainey, jjeffers, jyejare, osousa, pcreech, peter.vreman, pmendezh, rchan, ttereshc
Target Milestone: 6.9.6    Keywords: AutomationBlocker, PrioBumpQA, Triaged, UpgradeBlocker
Target Release: Unused   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: python-pulpcore-3.7.8 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-09-21 14:37:26 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Pavel Moravec 2021-05-20 06:29:31 UTC
Description of problem:
Playing with pulpcore in Sat6.9, I repeatedly get the task system stuck with all pulp tasks waiting. The reproducer is publishing a specific Content View (see Steps to Reproduce).


Version-Release number of selected component (if applicable):
Sat 6.9 with pulpcore enabled.


How reproducible:
100%


Steps to Reproduce:
1. Sync these 5 repos (for Caps6.9):
2. Create a CV with these 5 repos, depsolving enabled, and CV filters to include everything older than 2021-05-01:

# update repository-ids per the IDs of the 5 repos
hammer -u admin -p redhat  content-view create --name cv_caps69_yes_include_2021-05-01 --organization-id=1 --repository-ids=6,9,10,13,17 --solve-dependencies=yes
hammer -u admin -p redhat  content-view filter create --organization-id=1 --content-view=cv_caps69_yes_include_2021-05-01 --name=include_base --inclusion=true --original-packages=true --type=rpm
hammer -u admin -p redhat  content-view filter create --organization-id=1 --content-view=cv_caps69_yes_include_2021-05-01 --name=include_errata --inclusion=true --type=erratum
hammer -u admin -p redhat  content-view filter rule create --organization-id=1 --content-view=cv_caps69_yes_include_2021-05-01 --content-view-filter=include_errata --date-type='updated' --end-date='2021-05-01'
hammer -u admin -p redhat  content-view filter create --organization-id=1 --content-view=cv_caps69_yes_include_2021-05-01 --name=include_modules --inclusion=true --original-module-streams=true --type=modulemd

3. Publish the CV:
hammer -u admin -p redhat  content-view publish --name=cv_caps69_yes_include_2021-05-01 --organization-id=1 --async
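
(If useful, the state of the resulting pulp tasks can be watched directly through the pulp API — same curl/certificate pattern as used in the later comments; a sketch only, paths/host may differ per install:)

~~~
curl -sk --cert /etc/pki/katello/certs/pulp-client.crt \
     --key /etc/pki/katello/private/pulp-client.key \
     "https://localhost/pulp/api/v3/tasks/" \
  | jq '.results[] | {name, state, worker}'
~~~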


Actual results:
The publish never terminates. It gets stuck in:

 Actions::Pulp3::Repository::MultiCopyContent (waiting for Pulp to start the task)

waiting on pulp:
  pulp_data: !ruby/hash:ActiveSupport::HashWithIndifferentAccess
    pulp_href: "/pulp/api/v3/tasks/40b8dba3-e4fa-4e9a-aaa2-bf24ceb2808c/"
    pulp_created: '2021-05-19T19:19:41.758+00:00'
    state: waiting
    name: pulp_rpm.app.tasks.copy.copy_content
    worker: "/pulp/api/v3/workers/207894a5-0b0d-4138-b7e9-ee9de1906414/"
    child_tasks: []
    progress_reports: []
    created_resources: []
    reserved_resources_record:
    - "/pulp/api/v3/repositories/rpm/rpm/8d4247f2-3b1a-47ff-831d-a5650899b1b0/"
    - "/pulp/api/v3/repositories/rpm/rpm/b91b6053-9852-4869-ade3-832fff6198fe/"
    - "/pulp/api/v3/repositories/rpm/rpm/c6e334a8-1e9e-41f0-8dd2-49f5f5eaf25a/"
    - "/pulp/api/v3/repositories/rpm/rpm/d93ffa7d-765b-4e49-9d9d-412a2469cfef/"
    - "/pulp/api/v3/repositories/rpm/rpm/306fc070-f812-4cab-ace6-dfd52ada593a/"
    - "/pulp/api/v3/repositories/rpm/rpm/b4233bb7-d684-4a4a-9852-64b68ed73fd8/"
    - "/pulp/api/v3/repositories/rpm/rpm/7b9deeee-a150-4b5f-9830-c6f554d84181/"
    - "/pulp/api/v3/repositories/rpm/rpm/79700284-edf4-4744-ab9c-fd35c22481bf/"
    - "/pulp/api/v3/repositories/rpm/rpm/2b9b3119-3338-405a-8b4c-d8c8f18957d3/"
    - "/pulp/api/v3/repositories/rpm/rpm/71c95e46-f3b8-40bc-9deb-cefea3ab2633/"
  href: "/pulp/api/v3/tasks/40b8dba3-e4fa-4e9a-aaa2-bf24ceb2808c/"

There are 9 such copy_content pulp tasks, all in the waiting state. All pulpcore.app.tasks.repository.add_and_remove tasks are completed; all copy_content tasks are waiting for Godot.

No task is running, per the core_task table in the pulpcore Postgres DB. These 9 are waiting; all others are completed.
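
(For reference — a minimal sketch of how to check this in psql, assuming the DB is named pulpcore as above and core_task carries a state column:)

~~~
-- run inside `su - postgres -c 'psql pulpcore'`
SELECT state, count(*) FROM core_task GROUP BY state;
~~~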

Redis data shows no activity:

# rq info -u redis://localhost:6379/8
1154.redhat.com | 0
1090.redhat.com | 0
1070.redhat.com |████████████████ 665
1059.redhat.com | 0
1057.redhat.com |███████ 287
22015.redhat.com | 0
1103.redhat.com | 0
resource-manager | 0
1097.redhat.com | 0
22009.redhat.com | 0
1112.redhat.com | 0
13679.redhat.com | 0
1065.redhat.com | 0
1124.redhat.com | 0
13683.redhat.com | 0
22013.redhat.com | 0
1111.redhat.com | 0
1055.redhat.com | 0
1098.redhat.com | 0
1088.redhat.com | 0
13676.redhat.com | 0
1082.redhat.com | 0
1162.redhat.com | 0
13682.redhat.com | 0
22012.redhat.com | 0
25 queues, 952 jobs total

resource-manager (pmoravec-sat69-pulp3.satotest.redhat.com 13586): idle resource-manager
13676.redhat.com (pmoravec-sat69-pulp3.satotest.redhat.com 13676): idle 13676.redhat.com
13683.redhat.com (pmoravec-sat69-pulp3.satotest.redhat.com 13683): idle 13683.redhat.com
13682.redhat.com (pmoravec-sat69-pulp3.satotest.redhat.com 13682): idle 13682.redhat.com
13679.redhat.com (pmoravec-sat69-pulp3.satotest.redhat.com 13679): idle 13679.redhat.com
5 workers, 25 queues

Updated: 2021-05-20 08:25:00.238811
#


Expected results:
The CV publish completes in a reasonable time.


Additional info:
I will provide a sosreport with pulpcore-plugin-related data from the time it got stuck.

Comment 2 Pavel Moravec 2021-05-20 06:53:39 UTC
I forgot to specify the 5 repos in the CV, here they are:


Red Hat Ansible Engine 2.9 RPMs for Red Hat Enterprise Linux 7 Server x86_64
Red Hat Enterprise Linux 7 Server RPMs x86_64 7Server
Red Hat Satellite Capsule 6.9 for RHEL 7 Server RPMs x86_64
Red Hat Satellite Maintenance 6 for RHEL 7 Server RPMs x86_64
Red Hat Software Collections RPMs for Red Hat Enterprise Linux 7 Server x86_64 7Server

Comment 3 Pavel Moravec 2021-05-20 12:25:16 UTC
Re-reproduced with a CV filter "include everything older than 2021-01-01".

https://hackmd.io/@pulp/reserved_resource_debugging#Gimme-answer-RIGHT-NOW :

Worker 207894a5-0b0d-4138-b7e9-ee9de1906414 owns ReservedResource 71274740-970d-427a-90df-65e8d0f6b4bf and is not in online_workers!!

while core_workers shows:

               pulp_id                |         pulp_created          |       pulp_last_updated       |                      name                      |        last_heartbeat         | gracefully_stopped | cleaned_up 
--------------------------------------+-------------------------------+-------------------------------+------------------------------------------------+-------------------------------+--------------------+------------
 207894a5-0b0d-4138-b7e9-ee9de1906414 | 2021-05-17 09:46:15.450749+02 | 2021-05-17 09:54:57.859257+02 | 1070.redhat.com  | 2021-05-18 08:31:59.962307+02 | f                  | t

last_heartbeat was seen 2 days ago, yet the worker got assigned a new task..?
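
A sketch of a query to list all waiting tasks together with their assigned worker's heartbeat status (table/column names assumed from the core_task/core_worker tables referenced here and the API's worker field; adjust to the actual schema):

~~~
-- waiting tasks and the liveness of the worker they are assigned to
SELECT t.pulp_id AS task, t.name, w.name AS worker, w.last_heartbeat, w.cleaned_up
FROM core_task t
JOIN core_worker w ON w.pulp_id = t.worker_id
WHERE t.state = 'waiting'
ORDER BY w.last_heartbeat;
~~~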




(I smell some ungraceful shutdown behind it; I will rather start from scratch with a clean table.)

Comment 4 Pavel Moravec 2021-05-26 15:27:25 UTC
The problem isn't with a specific CV (I can't re-reproduce it on a fresh install), but with the way I stopped tasks previously. I *think* I rebooted the system as it started to swap, and that abrupt termination of the tasks caused the tasks and reservations to be loaded wrongly after the restart.

I will try to reproduce *this* behaviour in the coming days.

Comment 6 pulp-infra@redhat.com 2021-06-08 13:27:44 UTC
The Pulp upstream bug status is at NEW. Updating the external tracker on this bug.

Comment 7 pulp-infra@redhat.com 2021-06-08 13:27:47 UTC
The Pulp upstream bug priority is at Normal. Updating the external tracker on this bug.

Comment 8 pulp-infra@redhat.com 2021-06-08 16:26:04 UTC
The Pulp upstream bug priority is at High. Updating the external tracker on this bug.

Comment 9 Pavel Moravec 2021-06-24 13:01:15 UTC
I might have a reproducer for this:

- put the system under a big enough load
- so big that workers send their keepalives too late
(I mimicked these two steps by stopping chronyd, moving the time back a bit(*), starting chronyd and letting it move the clocks to the right time)
  (*) 12 times in a row, I moved the time 1s back and slept for 10s
- restart all services

Then the symptoms exactly match.
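
A rough shell sketch of that clock-skew trick (the restart command is the usual foreman-maintain one and is an assumption; the point is only to make workers miss heartbeats):

~~~
systemctl stop chronyd
for i in $(seq 1 12); do
    date --set="1 second ago"     # move the clock 1s back
    sleep 10
done
systemctl start chronyd           # let chrony step the clocks back to the right time
foreman-maintain service restart  # restart all services (assumed command)
~~~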


I will re-run this to see whether it is a more deterministic reproducer.


(and I tested this on Sat6.10 batch4 / python3-pulpcore-3.11.2-2.el7pc.noarch / python3-pulp-rpm-3.11.0-1.el7pc.noarch)

Comment 10 Pavel Moravec 2021-06-28 11:09:45 UTC
Let's put this bug on hold: I can't reproduce it. I know some sequence (with some randomness) among the following:
- modifying clocks (just to force workers to be marked as inactive due to lost heartbeats)
- having some running + waiting tasks
- restarting services or rebooting the system

Some such sequence did the trick for me twice, among a few tens of attempts. So there is some issue with assigning tasks to inactive workers, but with no known reproducer :(.

I will give it some new trials again after some time.

Comment 11 Tanya Tereshchenko 2021-06-29 11:12:37 UTC
This should go away with the new tasking system in pulpcore 3.14.

Comment 12 Tanya Tereshchenko 2021-06-29 11:52:10 UTC
*** Bug 1975858 has been marked as a duplicate of this bug. ***

Comment 13 pulp-infra@redhat.com 2021-07-06 15:08:25 UTC
The Pulp upstream bug status is at CLOSED - WONTFIX. Updating the external tracker on this bug.

Comment 14 Peter Vreman 2021-07-13 10:03:30 UTC
I also have the issue when testing pulp3.
Waiting tasks seen for various task types:
- normal CV publish
- composite CV publish
- redhat repo sync (daily sync of ~60 repos)

The bug is persistent and cannot be solved/fixed/worked around by me as a user. With pulp2, a reboot would normally just restart the work and finish things. But now, even after a reboot, the pending/waiting work is not rescheduled to the new workers.

For me as a beta tester, this bug is a showstopper for doing any real-world testing (e.g. functional/performance, verifying that all my use cases work) with pulp3.

I really hope (also for Red Hat support) that there will be a second HTB test program with pulpcore 3.14 before it goes public.

Comment 15 Peter Vreman 2021-07-13 10:24:27 UTC
Another thing I noticed is that concurrency is also not handled correctly, just like in the RQ redis output shown in the description above, where one worker has 600+ tasks.
For me it is similar: one worker got 80% of the tasks assigned, and therefore the parallel execution of the content handling is almost serialized.

~~~
crash/LI] root@li-lc-2222:~# curl -k --cert /etc/pki/katello/certs/pulp-client.crt --key /etc/pki/katello/private/pulp-client.key "https://localhost/pulp/api/v3/tasks/?state=waiting" | jq '.results[].worker'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  100k  100  100k    0     0  47584      0  0:00:02  0:00:02 --:--:-- 47590
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/fbe0b7a9-ae4c-4c38-98b7-d89a300b926c/"
"/pulp/api/v3/workers/fbe0b7a9-ae4c-4c38-98b7-d89a300b926c/"
"/pulp/api/v3/workers/fbe0b7a9-ae4c-4c38-98b7-d89a300b926c/"
"/pulp/api/v3/workers/fbe0b7a9-ae4c-4c38-98b7-d89a300b926c/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/03c106c6-4eb9-4487-b104-abfaee42b157/"
"/pulp/api/v3/workers/2a224ade-3fc8-497e-a0d3-dc56666ce606/"
"/pulp/api/v3/workers/5703421a-7fcd-4e55-afe7-15499e86a07e/"
null
null
[crash/LI] root@li-lc-2222:~#
~~~

Comment 16 Peter Vreman 2021-07-13 10:35:20 UTC
I did another test after cancelling all tasks in pulp3 and katello.
Now I restarted the publishing of 8x CVs with ~7 Red Hat repos each. The result is that all tasks are waiting on a single worker:

~~~
[crash/LI] root@li-lc-2222:~# curl -k --cert /etc/pki/katello/certs/pulp-client.crt --key /etc/pki/katello/private/pulp-client.key "https://localhost/pulp/api/v3/workers/?online=true" | jq '.results[].pulp_href'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1892  100  1892    0     0   9144      0 --:--:-- --:--:-- --:--:--  9229
"/pulp/api/v3/workers/3d45473d-bd72-4011-bb5a-46b018345965/"
"/pulp/api/v3/workers/b8a96a16-6670-4970-ba55-a7d24503ab3d/"
"/pulp/api/v3/workers/c44a28a9-94fa-4100-9ec8-7e7f2dafad5f/"
"/pulp/api/v3/workers/774326e7-6378-476b-aa49-20b32b5a9bc4/"
"/pulp/api/v3/workers/ca44f583-6168-47a2-a559-ab70ef28f01e/"
"/pulp/api/v3/workers/aece7a92-f3df-493a-a483-9414b60368fb/"
"/pulp/api/v3/workers/45b2dd0c-dfdd-43aa-864d-9871d2e47fa9/"
"/pulp/api/v3/workers/b0e00463-256e-4d8b-ba66-8c87230f9cc1/"
"/pulp/api/v3/workers/5930b6a0-b38b-4b6a-a8c5-94df09f35f24/"

[crash/LI] root@li-lc-2222:~# curl -k --cert /etc/pki/katello/certs/pulp-client.crt --key /etc/pki/katello/private/pulp-client.key "https://localhost/pulp/api/v3/tasks/?state=running" | jq '.results[].worker'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1691  100  1691    0     0   6963      0 --:--:-- --:--:-- --:--:--  6987
"/pulp/api/v3/workers/45b2dd0c-dfdd-43aa-864d-9871d2e47fa9/"

[crash/LI] root@li-lc-2222:~# curl -k --cert /etc/pki/katello/certs/pulp-client.crt --key /etc/pki/katello/private/pulp-client.key "https://localhost/pulp/api/v3/tasks/?state=waiting" | jq '.results[].worker' | sort | uniq -c
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 28987  100 28987    0     0   100k      0 --:--:-- --:--:-- --:--:--  100k
     17 "/pulp/api/v3/workers/45b2dd0c-dfdd-43aa-864d-9871d2e47fa9/"
[crash/LI] root@li-lc-2222:~#
~~~

Comment 17 pulp-infra@redhat.com 2021-07-13 14:07:30 UTC
The Pulp upstream bug status is at NEW. Updating the external tracker on this bug.

Comment 19 pulp-infra@redhat.com 2021-07-14 17:16:16 UTC
The Pulp upstream bug priority is at Normal. Updating the external tracker on this bug.

Comment 20 pulp-infra@redhat.com 2021-07-19 12:12:13 UTC
The Pulp upstream bug status is at POST. Updating the external tracker on this bug.

Comment 22 pulp-infra@redhat.com 2021-07-20 20:08:09 UTC
The Pulp upstream bug status is at NEW. Updating the external tracker on this bug.

Comment 23 pulp-infra@redhat.com 2021-07-20 20:08:11 UTC
The Pulp upstream bug priority is at Normal. Updating the external tracker on this bug.

Comment 24 pulp-infra@redhat.com 2021-07-22 15:08:37 UTC
The Pulp upstream bug status is at POST. Updating the external tracker on this bug.

Comment 25 Pavel Moravec 2021-07-22 20:08:19 UTC
(In reply to Peter Vreman from comment #15)
> Also another thing i noticed is that concurrency is also not handled
> correctly. Just like in the RQ redis output show in the description above
> were 1 worker has 600+ tasks.
> For me it also like this, that 1 worker got 80% of the tasks assigned. And
> therefor the parallel execution for the content handling is almost
> serialized.
> 

Though this should be filed as a separate bug, I got a similar impression as well. A pulp developer told me it could be due to a dependency among the repos (or maybe some other objects?) that requires serialization. And indeed, running pulp tasks on truly independent repos kept all workers busy. But I still feel my original user story (as well as yours) contained some independent tasks that could have been executed concurrently, yet the tasking system queued all of them to a single worker. A more deterministic reproducer would be needed.. (I will try to play with this if I have some time..)
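
(A sketch of how to check whether the piling-up on one worker is explained by overlapping locks: dump each waiting task's worker and reserved resources — same curl pattern as in comments 15/16, jq filter only illustrative:)

~~~
curl -sk --cert /etc/pki/katello/certs/pulp-client.crt \
     --key /etc/pki/katello/private/pulp-client.key \
     "https://localhost/pulp/api/v3/tasks/?state=waiting" \
  | jq '.results[] | {worker, reserved: .reserved_resources_record}'
~~~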

Comment 26 pulp-infra@redhat.com 2021-07-23 18:08:45 UTC
The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.

Comment 27 pulp-infra@redhat.com 2021-07-23 18:08:47 UTC
All upstream Pulp bugs are at MODIFIED+. Moving this bug to POST.

Comment 28 pulp-infra@redhat.com 2021-07-26 20:07:35 UTC
The Pulp upstream bug status is at CLOSED - CURRENTRELEASE. Updating the external tracker on this bug.

Comment 29 Peter Vreman 2021-07-27 10:55:30 UTC
As mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1962462#c25, I have created a dedicated BZ https://bugzilla.redhat.com/show_bug.cgi?id=1986356 regarding the concurrency of pulp3 tasks (which looks to be caused by an exclusive lock always being used instead of reader/writer locks).

Comment 30 Brad Buckingham 2021-07-27 12:55:01 UTC
Removing this bugzilla from Satellite 6.10. This bugzilla only applies to the old tasking system, which will be included only in 6.9.z to support migrations. In Satellite 6.10, a new tasking system is used.

Comment 31 Lai 2021-09-16 17:36:59 UTC
Steps to retest:

1. On a 6.9 sat, sync the following repos:

Red Hat Ansible Engine 2.9 RPMs for Red Hat Enterprise Linux 7 Server x86_64
Red Hat Enterprise Linux 7 Server RPMs x86_64 7Server
Red Hat Satellite Capsule 6.9 for RHEL 7 Server RPMs x86_64
Red Hat Satellite Maintenance 6 for RHEL 7 Server RPMs x86_64
Red Hat Software Collections RPMs for Red Hat Enterprise Linux 7 Server x86_64 7Server

2. create a cv and add the repos from step 1
3. publish the cv
4. perform migration
5. perform switchover but do not migrate to 6.10 (this will enable us to use pulp3 on 6.9)
6. restart services
7. republish the cv created in step 2

Expected:
3) Publish should be successful
4) migration should be successful
5) switchover should be successful
7) republish should be successful

Actual:
3) Publish is successful
4) migration is successful
5) switchover is successful
7) republish is successful

I tried isolating this in two instances: one doing just the publish on 6.9 with pulp2, and another new instance doing the switchover to pulp3 and publishing those same repos. Both instances were successful and without errors.

Verified on sat 6.9.6_02 with python3-pulpcore-3.7.8-1.el7pc.noarch

Comment 36 errata-xmlrpc 2021-09-21 14:37:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Satellite 6.9.6 Async Bug Fix Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3628

Comment 37 pulp-infra@redhat.com 2021-10-12 14:08:42 UTC
The Pulp upstream bug status is at CLOSED - CURRENTRELEASE. Updating the external tracker on this bug.

Comment 38 pulp-infra@redhat.com 2021-10-12 14:08:43 UTC
The Pulp upstream bug priority is at Normal. Updating the external tracker on this bug.