Bug 2082209 - Another deadlock issue when syncing repos with high concurrency [NEEDINFO]
Summary: Another deadlock issue when syncing repos with high concurrency
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Pulp
Version: 6.10.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: 6.11.4
Assignee: satellite6-bugs
QA Contact: Lai
URL:
Whiteboard:
Depends On: 2127154
Blocks:
Reported: 2022-05-05 15:14 UTC by Brad Buckingham
Modified: 2022-09-27 02:00 UTC (History)
CC List: 15 users

Fixed In Version: pulpcore-3.16.14
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 2062526
Environment:
Last Closed:
Target Upstream Version:
ltran: needinfo? (hyu)
pulp-infra: needinfo? (dkliban)
pulp-infra: needinfo? (dkliban)
ggainey: needinfo? (dkliban)
pulp-infra: needinfo? (ggainey)
pulp-infra: needinfo? (ggainey)
ltran: needinfo? (ggainey)


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | pulp pulpcore issues 2420 | 0 | None | closed | bulk_create() deadlock | 2022-05-05 15:14:47 UTC
Github | pulp pulpcore issues 2430 | 0 | None | closed | bulk_update() in content-stages can cause (very rare) deadlock | 2022-05-05 15:14:47 UTC
Github | pulp pulpcore issues 3111 | 0 | None | open | bulk_update() can still deadlock in content_stages | 2022-08-19 17:56:53 UTC
Github | pulp pulpcore issues 3192 | 0 | None | open | content_stages deadlock - Once More Unto The Breach | 2022-09-13 13:09:30 UTC

Comment 3 pulp-infra@redhat.com 2022-05-17 22:24:09 UTC
Requesting needsinfo from upstream developer dkliban, ggainey because the 'FailedQA' flag is set.

Comment 6 pulp-infra@redhat.com 2022-05-19 14:21:25 UTC
Requesting needsinfo from upstream developer dkliban because the 'FailedQA' flag is set.

Comment 12 pulp-infra@redhat.com 2022-05-26 13:40:01 UTC
Requesting needsinfo from upstream developer dkliban because the 'FailedQA' flag is set.

Comment 18 pulp-infra@redhat.com 2022-06-06 18:26:24 UTC
Requesting needsinfo from upstream developer ggainey because the 'FailedQA' flag is set.

Comment 21 Grant Gainey 2022-06-21 19:18:03 UTC
We're failing at https://github.com/pulp/pulpcore/blob/main/pulpcore/plugin/stages/artifact_stages.py#L405, which is exactly the code we'd hoped would fix this deadlock. Apparently it does not: that code is "in play" on the reproducer machine, and the failure is possibly due to the way batches() works. I'm working on forcing a reproducer for this particular line of code. Investigation is in progress.

@ltran if we can keep the reproducers around, it would definitely help the investigation. Let me know if you need to halt/reclaim them any time "soon".
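
For readers unfamiliar with this class of failure, here is a minimal sketch of how overlapping batches can deadlock and how a single global lock order avoids it. It is not the pulpcore code at the line linked above, and it is not the upstream fix. Everything in it is a hypothetical stand-in: ROW_LOCKS, TOUCH_COUNT, touch_batch() and run_workers() are invented names, and plain threading locks play the role of database row locks. The assumption being illustrated is that two workers updating the same rows in different orders can each end up waiting on a row the other already holds.

```python
import threading

# Hypothetical stand-in for database rows: one lock and one counter per row id.
ROW_LOCKS = {row_id: threading.Lock() for row_id in range(5)}
TOUCH_COUNT = {row_id: 0 for row_id in ROW_LOCKS}


def touch_batch(row_ids):
    """Lock every row of the batch in ascending id order, update, release."""
    # Sorting gives all workers the same acquisition order, so no cycle of
    # "A waits for B while B waits for A" can form between overlapping batches.
    ordered = sorted(set(row_ids))
    for row_id in ordered:
        ROW_LOCKS[row_id].acquire()
    try:
        for row_id in ordered:
            TOUCH_COUNT[row_id] += 1
    finally:
        for row_id in reversed(ordered):
            ROW_LOCKS[row_id].release()


def run_workers(batches, rounds=200):
    """Run one thread per batch; return True if every thread finished."""
    def worker(batch):
        for _ in range(rounds):
            touch_batch(batch)

    threads = [
        threading.Thread(target=worker, args=(batch,), daemon=True)
        for batch in batches
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join(timeout=10)
    return all(not thread.is_alive() for thread in threads)


# The batches overlap and arrive in conflicting orders, as concurrent syncs might.
finished = run_workers([[0, 1, 2, 3], [3, 2, 1, 0], [4, 2, 0]])
```

If touch_batch() acquired the locks in the order given instead of sorted order, the first two batches could each hold one row and wait forever on the other's. Whether the remaining deadlock in this bug has that shape is exactly what is still under investigation.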

Comment 22 pulp-infra@redhat.com 2022-06-21 20:23:16 UTC
Requesting needsinfo from upstream developer ggainey because the 'FailedQA' flag is set.

Comment 24 Lai 2022-06-22 00:09:02 UTC
@ggainey I can keep the machines around for as long as you need them. Just let me know when you're done and I will wipe them. If they are wiped by accident, let me know and I'll provide another reproducer.

