Bug 2021406 - syncing tens of repos to capsule can cause deadlock: while updating tuple (...) in relation "core_content"
Summary: syncing tens of repos to capsule can cause deadlock: while updating tuple (.....
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Pulp
Version: 6.10.0
Hardware: Unspecified
OS: Unspecified
unspecified
high with 2 votes vote
Target Milestone: 6.11.0
Assignee: satellite6-bugs
QA Contact: Lai
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2021-11-09 07:46 UTC by Jan Hutař
Modified: 2022-07-05 14:30 UTC (History)
18 users (show)

Fixed In Version: pulpcore-3.16.4
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 2043710 (view as bug list)
Environment:
Last Closed: 2022-07-05 14:30:00 UTC
Target Upstream Version:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github pulp pulpcore issues 2157 0 None closed touch() path still has a potential deadlock window. 2022-03-31 15:03:18 UTC
Red Hat Knowledge Base (Solution) 6718181 0 None None None 2022-02-09 08:47:56 UTC
Red Hat Product Errata RHSA-2022:5498 0 None None None 2022-07-05 14:30:46 UTC

Description Jan Hutař 2021-11-09 07:46:05 UTC
Description of problem:
Syncing tens of repos to capsule can cause deadlock: while updating tuple (...) in relation "core_content"


Version-Release number of selected component (if applicable):
Satellite: satellite-6.10.0-3.el7sat.noarch
Capsule: satellite-capsule-6.10.0-3.el7sat.noarch


How reproducible:
Seen this in about 2 from 4 sync attempts in my case


Steps to Reproduce:
1. Create a setup: https://github.com/redhat-performance/satperf/blob/3c5df3ff6175070602ba5ed656dbee09a38f4024/scripts/create-big-setup.sh
2. Generate more (C)CV versions to sync: https://github.com/redhat-performance/satperf/blob/3c5df3ff6175070602ba5ed656dbee09a38f4024/scripts/create-big-setup-update.sh
3. During this testing I have hit bug 2013320 so I have these patches applied:
   https://github.com/pulp/pulp_rpm/commit/72c313a8a343402ad8419aae7971a947a9ebac36.patch
   https://gitlab.sat.engineering.redhat.com/satellite6/katello/-/merge_requests/897.patch
   and as per Ian Ballou help I have cleaned mine environment using:

> I think, for the sake of time, the best workaround would be to delete the
> repositories in Pulp on the capsule and sync again.  To do this, just
> remove the related lifecycle environments from your capsule and run
> `foreman-rake katello:delete_orphaned_content`.  Your capsule's Pulp
> repositories will be cleaned up because the capsule is no longer associated
> with them from the Satellite's perspective.  The Pulp repositories will be
> recreated when you re-add the lifecycle environments to your capsule and
> sync it.

After this I had a green sync.

After that I have ran create-big-setup-update.sh 3 times to get more CCVVs to sync to capsule.


Actual results:
Capsule sync is still running, but one sub-task failed already with deadlock.


Expected results:
All should be green.

Comment 2 Jan Hutař 2021-11-09 07:48:20 UTC
Traceback and deadlock summary:

    traceback: |2
        File "/usr/lib/python3.6/site-packages/pulpcore/tasking/pulpcore_worker.py", line 317, in _perform_task
          result = func(*args, **kwargs)
        File "/usr/lib/python3.6/site-packages/pulp_rpm/app/tasks/synchronizing.py", line 490, in synchronize
          version = dv.create()
        File "/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/declarative_version.py", line 151, in create
          loop.run_until_complete(pipeline)
        File "/usr/lib64/python3.6/asyncio/base_events.py", line 484, in run_until_complete
          return future.result()
        File "/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/api.py", line 225, in create_pipeline
          await asyncio.gather(*futures)
        File "/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/api.py", line 43, in __call__
          await self.run()
        File "/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/content_stages.py", line 52, in run
          model_type.objects.filter(content_q).touch()
        File "/usr/lib/python3.6/site-packages/pulpcore/app/models/content.py", line 107, in touch
          return self.order_by("pk").update(timestamp_of_interest=now())
        File "/usr/lib/python3.6/site-packages/django/db/models/query.py", line 741, in update
          rows = query.get_compiler(self.db).execute_sql(CURSOR)
        File "/usr/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 1479, in execute_sql
          aux_rows = query.get_compiler(self.using).execute_sql(result_type)
        File "/usr/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 1471, in execute_sql
          cursor = super().execute_sql(result_type)
        File "/usr/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 1142, in execute_sql
          cursor.execute(sql, params)
        File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py", line 67, in execute
          return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
        File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py", line 76, in _execute_with_wrappers
          return executor(sql, params, many, context)
        File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
          return self.cursor.execute(sql, params)
        File "/usr/lib/python3.6/site-packages/django/db/utils.py", line 89, in __exit__
          raise dj_exc_value.with_traceback(traceback) from exc_value
        File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
          return self.cursor.execute(sql, params)
    description: |
      deadlock detected
      DETAIL:  Process 16636 waits for ShareLock on transaction 3908059; blocked by process 16640.
      Process 16640 waits for ShareLock on transaction 3908060; blocked by process 16636.
      HINT:  See server log for query details.
      CONTEXT:  while updating tuple (75,44) in relation "core_content"
  worker: "/pulp/api/v3/workers/42f31558-ac12-4f6e-8e0d-f5c4d0c20aec/"

Comment 7 Dennis Kliban 2021-11-19 14:52:11 UTC
The traceback looks like the code we added to fix deadlocks is not preventing them in all cases. https://github.com/pulp/pulpcore/commit/69b2e1dcd9f2b51887d0faf856834fc820925f0d

Grant, could you please investigate this?

Comment 8 Grant Gainey 2021-12-01 12:52:11 UTC
(In reply to Dennis Kliban from comment #7)
> The traceback looks like the code we added to fix deadlocks is not
> preventing them in all cases.
> https://github.com/pulp/pulpcore/commit/
> 69b2e1dcd9f2b51887d0faf856834fc820925f0d

Bah, we discussed whether we could fix the ordering-problem from 'inside' touch(), it looks like the answer is 'no'.

> Grant, could you please investigate this?

Yup, onnit.

Comment 9 Hao Chang Yu 2021-12-23 07:13:43 UTC
(In reply to Grant Gainey from comment #8)
> (In reply to Dennis Kliban from comment #7)
> > The traceback looks like the code we added to fix deadlocks is not
> > preventing them in all cases.
> > https://github.com/pulp/pulpcore/commit/
> > 69b2e1dcd9f2b51887d0faf856834fc820925f0d
> 
> Bah, we discussed whether we could fix the ordering-problem from 'inside'
> touch(), it looks like the answer is 'no'.
> 
> > Grant, could you please investigate this?
> 
> Yup, onnit.

maybe we can try select_for_update here. or just retry when deadlock is detected?

Comment 10 Hao Chang Yu 2021-12-24 09:11:44 UTC
(In reply to Hao Chang Yu from comment #9)
> (In reply to Grant Gainey from comment #8)
> > (In reply to Dennis Kliban from comment #7)
> > > The traceback looks like the code we added to fix deadlocks is not
> > > preventing them in all cases.
> > > https://github.com/pulp/pulpcore/commit/
> > > 69b2e1dcd9f2b51887d0faf856834fc820925f0d
> > 
> > Bah, we discussed whether we could fix the ordering-problem from 'inside'
> > touch(), it looks like the answer is 'no'.
> > 
> > > Grant, could you please investigate this?
> > 
> > Yup, onnit.
> 
> maybe we can try select_for_update here. or just retry when deadlock is
> detected?


select_for_update() won't help. I ran the following codes in 2 shells concurrently and still getting deadlock.
--------------------------------------------
PULP_SETTINGS=/etc/pulp/settings.py pulpcore-manager shell
import _thread
from django.db import transaction
from pulpcore.plugin.models import Content, RepositoryVersion
from django.utils.timezone import now
def update_timestamp(index):
    Content.objects.select_for_update().filter().update(timestamp_of_interest=now())
    print(">>>done")

for i in range(3):
    _thread.start_new_thread( update_timestamp, (i,) )
--------------------------------------------

Comment 11 Hao Chang Yu 2022-01-14 06:59:56 UTC
(In reply to Grant Gainey from comment #8)
> (In reply to Dennis Kliban from comment #7)
> > The traceback looks like the code we added to fix deadlocks is not
> > preventing them in all cases.
> > https://github.com/pulp/pulpcore/commit/
> > 69b2e1dcd9f2b51887d0faf856834fc820925f0d
> 
> Bah, we discussed whether we could fix the ordering-problem from 'inside'
> touch(), it looks like the answer is 'no'.
> 

After some reading and testing, I think postgres doesn't support update in order.

I tested the following:

content_q = Content.objects.filter().values_list("pk", flat=True)[0:30000]
Content.objects.filter(pk__in=content_q).order_by("pk").update(timestamp_of_interest=current)


and this is the query postgres run:

2022-01-14 16:42:06 AEST LOG:  duration: 1864.303 ms  statement: UPDATE "core_content" SET "timestamp_of_interest" = '2022-01-14T06:40:32.367074+00:00'::timestamptz WHERE "core_content"."pulp_id" IN (SELECT U0."pulp_id" FROM "core_content" U0  LIMIT 30000)

Comment 12 Hao Chang Yu 2022-01-14 07:12:50 UTC
(In reply to Hao Chang Yu from comment #10)
> (In reply to Hao Chang Yu from comment #9)
> > (In reply to Grant Gainey from comment #8)
> > > (In reply to Dennis Kliban from comment #7)
> > > > The traceback looks like the code we added to fix deadlocks is not
> > > > preventing them in all cases.
> > > > https://github.com/pulp/pulpcore/commit/
> > > > 69b2e1dcd9f2b51887d0faf856834fc820925f0d
> > > 
> > > Bah, we discussed whether we could fix the ordering-problem from 'inside'
> > > touch(), it looks like the answer is 'no'.
> > > 
> > > > Grant, could you please investigate this?
> > > 
> > > Yup, onnit.
> > 
> > maybe we can try select_for_update here. or just retry when deadlock is
> > detected?
> 
> 
> select_for_update() won't help. I ran the following codes in 2 shells
> concurrently and still getting deadlock.
> --------------------------------------------
> PULP_SETTINGS=/etc/pulp/settings.py pulpcore-manager shell
> import _thread
> from django.db import transaction
> from pulpcore.plugin.models import Content, RepositoryVersion
> from django.utils.timezone import now
> def update_timestamp(index):
>    
> Content.objects.select_for_update().filter().
> update(timestamp_of_interest=now())
>     print(">>>done")
> 

Hmm...OK. This doesn't work because it will ignore the "select_for_update" and just run the update statement. The only way to use "select_for_update()" is to iterate the queryset returned by it.

Comment 14 Hao Chang Yu 2022-01-15 02:38:17 UTC
(In reply to Sayan Das from comment #13) 
> For now, the Customer needs to have some cleanup of data performed in
> Satellite and then retry syncing on Capsule but they will eventually hit
> this again later.
> 
> 
> So wanted to check if we know,  what sort of situation could be a reason
> behind the trigger of this issue and how I should proceed with the
> Investigation here when it appears again?

You can try to reduce the number of pulpcore workers in the Capsule to reduce the chance of deadlock and number of deadlocks. So far I saw Capsules hitting this issue are having 8 workers. If it failed with deadlock again, keep re-run the complete sync until it finished successfully. This is because, when deadlock occur, postgres will cancel one transaction and let one transaction to proceed.


Command to stop 3 workers:
systemctl stop pulpcore-worker@8.service pulpcore-worker@7.service pulpcore-worker@6.service

Comment 15 Grant Gainey 2022-01-17 16:36:07 UTC
Hao - I have not been able to reproduce a deadlock on 3.14, even using your reproducer in #c10.  How much/what kind of content is sync'd to your Sat that you can force a deadlock?

RE select_for_update() - assuming we can get pgres to use it, select_for_update(skip_locked=True) might help us here. What we're doing in touch() is making sure async-orphan-cleanup doesn't inadvertently pick up Artifacts that are only temporarily not-assigned-to-a-repo-version; if a row is currently-locked, Someone Else is clearly doing *something* with it, so we can (probably?) safely ignore updating it's last-touched timestamp.

If you can give me a script that you use to force/see deadlocks in this codepath, I would be eternally grateful. Not sure why I'm having no luck recreating - might be a data-shape issue, I am going to retry with RHEL7, assuming I have enough disk on my box to get it all sync'd.

Comment 16 Hao Chang Yu 2022-01-18 03:34:01 UTC
(In reply to Grant Gainey from comment #15)
> Hao - I have not been able to reproduce a deadlock on 3.14, even using your
> reproducer in #c10.  How much/what kind of content is sync'd to your Sat
> that you can force a deadlock?
> 

Hi Grant

Thanks for looking into this bug. I am able to reproduce the deadlock every time using the script in comment 10 by running it in 2 shells concurrently. I think you need to have some large repos synced in your Satellite so that it will take some time to update the rows, such as rhel 7 server.

Try to increase the number threads in the script to increase the chance to reproduce it. I found that 8 threads is a good number.

# Run this script at the same time in 2 or more shells.
------------------------------------------------------------
PULP_SETTINGS=/etc/pulp/settings.py pulpcore-manager shell
import _thread
from django.db import transaction
from pulpcore.plugin.models import Content, RepositoryVersion
from django.utils.timezone import now
def update_timestamp(index):
    Content.objects.select_for_update().filter().update(timestamp_of_interest=now())
    print(">>>done")

for i in range(8):
    _thread.start_new_thread( update_timestamp, (i,) )
------------------------------------------------------------


> RE select_for_update() - assuming we can get pgres to use it,
> select_for_update(skip_locked=True) might help us here. What we're doing in
> touch() is making sure async-orphan-cleanup doesn't inadvertently pick up
> Artifacts that are only temporarily not-assigned-to-a-repo-version; if a row
> is currently-locked, Someone Else is clearly doing *something* with it, so
> we can (probably?) safely ignore updating it's last-touched timestamp.
> 

I think one issue of select_for_update() is in comment 12. It needs to iterate the results and update 1 row at a time so there is a performance impact doing this. Based on my test. it took about 43 seconds to update 30k records VS a few seconds.

Maybe we need to execute the "select for update" manually as raw sql (below). I amended my script with this but I can still get deadlocks so maybe I did something wrong.
---------------------------------
def update_timestamp(index):
    cursor = connection.cursor()
    with transaction.atomic():
        cursor.execute("select * from core_content where pulp_id in %s for update", (tuple(content_q), ))
        Content.objects.filter(pk__in=tuple(content_q)).update(timestamp_of_interest=current)
---------------------------------


> If you can give me a script that you use to force/see deadlocks in this
> codepath, I would be eternally grateful. Not sure why I'm having no luck
> recreating - might be a data-shape issue, I am going to retry with RHEL7,
> assuming I have enough disk on my box to get it all sync'd.


I think you can try the script provided by Jan Hutař in comment 0 to setup the reproducer.

Comment 17 Grant Gainey 2022-01-18 17:04:14 UTC
The models we want to affect with touch() can be quite large, I don't think an order of magnitude+ performance impact is acceptable. 

Given the limitations we're running into between Postgresql, Django, and select-for-update, we may need to resort to The Big Hammer of table-locking. I don't like it much - but it addresses the touch() issue, and is straightforward.

This script shows that a table-lock resolves the deadlock:

-----------------------------
import _thread
from django.db import transaction
from pulpcore.plugin.models import Content, RepositoryVersion
from django.utils.timezone import now
from django.db import connection
def update_timestamp(index):
   print(">>>in update_timedstamp index {}".format(index))
   cursor = connection.cursor()
   with transaction.atomic():
     cursor.execute("LOCK TABLE core_content IN EXCLUSIVE MODE")
     Content.objects.filter().order_by("pulp_id").update(timestamp_of_interest=now())
   print(">>>done {}".format(index))

for i in range(8):
   _thread.start_new_thread(update_timestamp, (i,))
------------------------------

(EXCLUSIVE allows select-only queries to access the table, but rejects any other type of lock on the table. See postgres lock-mode options here : https://www.postgresql.org/docs/9.4/explicit-locking.html)

(It can be tweaked to be more general, defining the affected-table-name from a passed-in model name - but specifying core_content explicitly is fine as an example).

The/an other option would be to surround the update() with a try/except that retries in the event of a deadlock-exception being thrown. This would leave deadlock-msgs in the logs, which scares users - but it's a perfectly-valid approach to concurrency if/when you can't enforce lock-ordering (which is the prob w/ postgres and update). It would have the least performance/concurrency impact, in the majority "there isn't a deadlock" case. The code would be a little baroque, and we'd want to think about whether it should be a retry-once-then-fail pattern or have a number-of-retries allowed.

What do you think, Hao?

Comment 18 Hao Chang Yu 2022-01-19 04:51:13 UTC
(In reply to Grant Gainey from comment #17)
> The models we want to affect with touch() can be quite large, I don't think
> an order of magnitude+ performance impact is acceptable. 
> 
> Given the limitations we're running into between Postgresql, Django, and
> select-for-update, we may need to resort to The Big Hammer of table-locking.
> I don't like it much - but it addresses the touch() issue, and is
> straightforward.
> 
> This script shows that a table-lock resolves the deadlock:
> 
> -----------------------------
> import _thread
> from django.db import transaction
> from pulpcore.plugin.models import Content, RepositoryVersion
> from django.utils.timezone import now
> from django.db import connection
> def update_timestamp(index):
>    print(">>>in update_timedstamp index {}".format(index))
>    cursor = connection.cursor()
>    with transaction.atomic():
>      cursor.execute("LOCK TABLE core_content IN EXCLUSIVE MODE")
>     
> Content.objects.filter().order_by("pulp_id").
> update(timestamp_of_interest=now())
>    print(">>>done {}".format(index))
> 
> for i in range(8):
>    _thread.start_new_thread(update_timestamp, (i,))
> ------------------------------
> 
> (EXCLUSIVE allows select-only queries to access the table, but rejects any
> other type of lock on the table. See postgres lock-mode options here :
> https://www.postgresql.org/docs/9.4/explicit-locking.html)
> 
> (It can be tweaked to be more general, defining the affected-table-name from
> a passed-in model name - but specifying core_content explicitly is fine as
> an example).
> 
> The/an other option would be to surround the update() with a try/except that
> retries in the event of a deadlock-exception being thrown. This would leave
> deadlock-msgs in the logs, which scares users - but it's a perfectly-valid
> approach to concurrency if/when you can't enforce lock-ordering (which is
> the prob w/ postgres and update). It would have the least
> performance/concurrency impact, in the majority "there isn't a deadlock"
> case. The code would be a little baroque, and we'd want to think about
> whether it should be a retry-once-then-fail pattern or have a
> number-of-retries allowed.
> 
> What do you think, Hao?

In environment with high concurrency, retry seems to have worse performance impact than locking the table. Running the script below with retry ability, it will still fail with many deadlock error after 10 tries and also spend longer time to complete. It is hard to determine how many retry is needed and how long to delay after each failure.

However, the script might not represent the real scenario. Retry will probably work better in the real scenario. In one of my case, I asked customer to reduce the worker to 5 and it stopped the deadlock error so maybe a few retries after deadlock is good enough for 8 workers.

One thing need to keep in mind is when syncing the Capsule, repos are syncing in order from small to large so I think it is very likely that it will sync multiple same repos at the time, such as cloned RHEL 7 repos.

Another crazy idea, what if implement both but only lock the whole table if the touch() rows are over 5000 (or maybe make it adjustable)?

----------------------------------------------
import _thread
import time
import random
from django.db import transaction
from pulpcore.plugin.models import Content
from django.utils.timezone import now
from django.db.utils import OperationalError
current = now()
def update_timestamp(index):
    tried = 0
    tries = 10
    while True:
        tried += 1
        try:
            Content.objects.filter().order_by("pk").update(timestamp_of_interest=current)
            print("Thread %d done" % index)
            break
        except OperationalError as error:
            delay = random.uniform(1, 3)
            if tried >= tries:
                 print("Thread %d: deadlock detected give up. Tried %d" % (index, tried))
                 break
            print("Thread %d: deadlock detected. trying %d in %f" % (index, tried, delay))
            time.sleep(delay)
----------------------------------------------

Comment 20 Pavel Moravec 2022-01-20 13:59:19 UTC
Also a different table (core_artifact) can be affected the same way:

  error:
    traceback: |2
        File "/usr/lib/python3.6/site-packages/pulpcore/tasking/pulpcore_worker.py", line 317, in _perform_task
          result = func(*args, **kwargs)
        File "/usr/lib/python3.6/site-packages/pulp_rpm/app/tasks/synchronizing.py", line 490, in synchronize
          version = dv.create()
        File "/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/declarative_version.py", line 151, in create
          loop.run_until_complete(pipeline)
        File "/usr/lib64/python3.6/asyncio/base_events.py", line 484, in run_until_complete
          return future.result()
        File "/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/api.py", line 225, in create_pipeline
          await asyncio.gather(*futures)
        File "/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/api.py", line 43, in __call__
          await self.run()
        File "/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/artifact_stages.py", line 89, in run
          existing_artifacts.touch()
        File "/usr/lib/python3.6/site-packages/pulpcore/app/models/content.py", line 107, in touch
          return self.order_by("pk").update(timestamp_of_interest=now())
        File "/usr/lib/python3.6/site-packages/django/db/models/query.py", line 741, in update
          rows = query.get_compiler(self.db).execute_sql(CURSOR)
        File "/usr/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 1471, in execute_sql
          cursor = super().execute_sql(result_type)
        File "/usr/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 1142, in execute_sql
          cursor.execute(sql, params)
        File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py", line 67, in execute
          return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
        File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py", line 76, in _execute_with_wrappers
          return executor(sql, params, many, context)
        File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
          return self.cursor.execute(sql, params)
        File "/usr/lib/python3.6/site-packages/django/db/utils.py", line 89, in __exit__
          raise dj_exc_value.with_traceback(traceback) from exc_value
        File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
          return self.cursor.execute(sql, params)
    description: |
      deadlock detected
      DETAIL:  Process 31792 waits for ShareLock on transaction 2571598; blocked by process 31188.
      Process 31188 waits for ShareLock on transaction 2571599; blocked by process 31792.
      HINT:  See server log for query details.
      CONTEXT:  while updating tuple (4992,3) in relation "core_artifact"


I expect the root cause on background is the same (touch method somehow not deadlock-safe), am I right?

Comment 21 Grant Gainey 2022-01-20 16:38:08 UTC
@pmoravec: the problem is that update() in postgres isn't deadlock-safe, if two threads are updating overlapping rows. touch() is the primary victim, because it's updating the last-used timestamp for content in a repo that's being sync'd, and multiple threads can be syncing same/similar content at the same time during capsule-sync because of how capsule-sync orders its workload, and the existence of (potentially many) cloned copies of repos.

@hao I haven't forgotten you, just juggling a bunch the last few days. Reducing worker-count or "only lock if the affected rows is (some-number)" both only lower the window for this - which means someone out there will still hit it Every Single Time, because that's always the way with timing windows :)  I am tending towards "table lock inside of touch", here's my thoughts so far:

* Actually fixes the hole
* Avoiding "DEADLOCK DETECTED" in the logs makes monitoring tooling happy (and hence avoids false-alerts, unhappy users, and unnecessary case-reports)
* Happens in sync, where "time to touch all the records" is measured in 100s-of-milliseconds where everything else is measured in 10s-of-minutes - overhead becomes lost in the noise

The disadvantage is (obviously) that this serializes anything that wants to write-to the table(s) affected, when touch() is being called. This is happening at the end of sync, when we have the content and artifacts "in hand", and only once - so I don't think that's a terrible price to be paid, in anything like a 'real world' scenario.

As a hypothetical/made-up example: I sync 10 repos, with 10 workers, that are all independent content. The syncs take 5 minutes apiece (small repos). With no table-lock, the time to sync will be 5 minutes - each sync goes to one worker, and never collides/waits for anyone else. *With* the table-lock on/around/inside touch(), finishing the entire set of syncs will now take (at worst) 5 minutes + (9 x millseconds-to-update()). Even if the update() we're protecting is hitting tens of thousands of rows, it takes less than a second - so we're causing the system to go from 300s to sync, to 309s to sync, which feels acceptable even when syncing many tiny repos.

@pmoravec @hao - does the above example make sense to you? What am I missing?

Comment 22 Pavel Moravec 2022-01-20 21:10:59 UTC
That solution seems sounds to me and the overhead is still small and probably inevitable.

Comment 23 Hao Chang Yu 2022-01-21 05:24:38 UTC
(In reply to Grant Gainey from comment #21)
> @hao I haven't forgotten you, just juggling a bunch the last few days.
> Reducing worker-count or "only lock if the affected rows is (some-number)"
> both only lower the window for this - which means someone out there will
> still hit it Every Single Time, because that's always the way with timing
> windows :)  I am tending towards "table lock inside of touch", here's my
> thoughts so far:

Agree.

Just to clarify, what I suggested above is update the rows with retries if the affected rows count is below the specified number. Otherwise, lock the table and perform rows update.

Based on my tests using the script above, updating 5000 records with 30 threads and 2 windows concurrently seems to work well with retry. Basically, 1 to 2 retries will get through deadlocks.


> 
> * Actually fixes the hole
> * Avoiding "DEADLOCK DETECTED" in the logs makes monitoring tooling happy
> (and hence avoids false-alerts, unhappy users, and unnecessary case-reports)
> * Happens in sync, where "time to touch all the records" is measured in
> 100s-of-milliseconds where everything else is measured in 10s-of-minutes -
> overhead becomes lost in the noise
> 
> The disadvantage is (obviously) that this serializes anything that wants to
> write-to the table(s) affected, when touch() is being called. This is
> happening at the end of sync, when we have the content and artifacts "in
> hand", and only once - so I don't think that's a terrible price to be paid,
> in anything like a 'real world' scenario.
> 
> As a hypothetical/made-up example: I sync 10 repos, with 10 workers, that
> are all independent content. The syncs take 5 minutes apiece (small repos).
> With no table-lock, the time to sync will be 5 minutes - each sync goes to
> one worker, and never collides/waits for anyone else. *With* the table-lock
> on/around/inside touch(), finishing the entire set of syncs will now take
> (at worst) 5 minutes + (9 x millseconds-to-update()). Even if the update()
> we're protecting is hitting tens of thousands of rows, it takes less than a
> second - so we're causing the system to go from 300s to sync, to 309s to
> sync, which feels acceptable even when syncing many tiny repos.
> 

With my suggestion above, it probably won't impact the performance of syncing small to medium repos but yeah, some users won't be happy to see the error log.

I think syncing many large repos might at least take a few minutes more but it sounds reasonable to me so no objection.

Comment 24 Grant Gainey 2022-01-25 22:43:42 UTC
(In reply to Pavel Moravec from comment #20)
> I expect the root cause on background is the same (touch method somehow not
> deadlock-safe), am I right?

Just a note that I'm able to fix this "inside" of touch(), so we don't have to chase down every caller. See the attached github issue and its associated PR for details.

Comment 28 Grant Gainey 2022-02-09 18:43:21 UTC
@pmoravec@redhat.com @hyu@redhat.com After discussion, we're working at quantifying the impact/performance of the various options discussed in this BZ. "Locking an entire table" is....Rude, in a strongly-concurrent system like Pulp3, and we'd like to have a better answer. I'm updating the investigation and current results in a document here https://hackmd.io/@pulp/touch_deadlock - feel free to comment/contribute, as you're able.

Comment 29 Hao Chang Yu 2022-02-10 11:49:39 UTC
(In reply to Grant Gainey from comment #28)
> @pmoravec@redhat.com @hyu@redhat.com After discussion, we're working at
> quantifying the impact/performance of the various options discussed in this
> BZ. "Locking an entire table" is....Rude, in a strongly-concurrent system
> like Pulp3, and we'd like to have a better answer. I'm updating the
> investigation and current results in a document here
> https://hackmd.io/@pulp/touch_deadlock - feel free to comment/contribute, as
> you're able.

If I understand correctly the "Locking entire table" approach is still perform better than the "select-for-update" script v3, right?

Comment 30 Grant Gainey 2022-02-10 13:56:42 UTC
(In reply to Hao Chang Yu from comment #29)
> (In reply to Grant Gainey from comment #28)
> > @pmoravec@redhat.com @hyu@redhat.com After discussion, we're working at
> > quantifying the impact/performance of the various options discussed in this
> > BZ. "Locking an entire table" is....Rude, in a strongly-concurrent system
> > like Pulp3, and we'd like to have a better answer. I'm updating the
> > investigation and current results in a document here
> > https://hackmd.io/@pulp/touch_deadlock - feel free to comment/contribute, as
> > you're able.
> 
> If I understand correctly the "Locking entire table" approach is still
> perform better than the "select-for-update" script v3, right?

The table-lock performs about 3x better in the pathological-deadlock-case. In the one-thread case, **select-for-update** performs about 5x better. (See the "Running in one shell, 10 runs, 10 threads@" results from the hackmd)

Where the table-lock will really cause problems, is when it forces touch() to serialize **even when it doesn't need to** - which is the actual "main" path. If I'm syncing repos that do **not** have overlapping content, the select-for-update will let them happen in parallel, where the table-lock will force everything to serialize. So if I have 20 workers, and am syncing 20 different repos, the one at the end of the line is going to wait 20 times as long to through the gate, when it doesn't need to.

I have a few more experiments to run, but I think select-for-update is a clear winner here. Stay tuned.

Comment 31 Grant Gainey 2022-02-10 20:33:27 UTC
@hyu@redhat.com @pmoravec@redhat.com OK, I think we have a real winner. See the script/performance-results here :

https://hackmd.io/eBAa4F3lRzaV65sQ7zu_yw?view#Script-v4

I've updated https://github.com/pulp/pulpcore/pull/2159 with this approachj, and we're passing all our tests. Waiting for re-reviews before it's accepted.

Comment 32 Hao Chang Yu 2022-02-11 01:24:14 UTC
Great. I guess the key to prevent the deadlock is the "order by" in "select for update" which explain why it didn't work when I tried in comment 16?

Comment 33 Grant Gainey 2022-02-11 12:36:01 UTC
(In reply to Hao Chang Yu from comment #32)
> Great. I guess the key to prevent the deadlock is the "order by" in "select
> for update" which explain why it didn't work when I tried in comment 16?

Zacly - deadlocks always come down to lock-ordering problems :)  

You can see in my Script-v3 (https://hackmd.io/eBAa4F3lRzaV65sQ7zu_yw?view#Script-v3) that we do the SELECT FOR and the UPDATE separately, and the ORDER BY in the SELECT saves us. However, it also impacts our concurrency, as you can see in the results-sections.

In Script-v4, we do the SELECT FOR/ORDER BY with SKIP LOCKS. This locks (in order!) only the rows that aren't already locked, and returns that list. We call this inside the UPDATE, so we're only upddating the rows that "we" (as in, This Thread) were able to acquire, leaving the rest to whichever other thread(s) got to them first. So we get to have our lock-ordering, and our concurrency-performance, at the same time.

What a Fun problem!

Comment 34 Pavel Moravec 2022-02-11 14:30:25 UTC
The script-v4 / described and proposed solution sound great!

Hao++ for narrowing down the reproducer to a minimalistic one.

Hao++ for bringing some ideas of possible resolution.

Grant++ for finding the best solution and comparing the alternatives.

(++ for anybody else involved, collaboration++ )

Comment 36 Lai 2022-04-19 18:58:42 UTC
This was difficult to reproduce as a deadlock case for verification.  Instead, I have verified that the code is in 6.11 snap 16

I also performed a capsule sync test with large repos to ensure that capsule is synced successfully and content is accessible in capsule.

Verified on 6.11 snap 16 with python38-pulpcore-3.16.6-1.el8pc.noarch.

Comment 37 paul.konecny 2022-04-27 08:35:23 UTC
(In reply to Lai from comment #36)
> This was difficult to reproduce as a deadlock case for verification. 
> Instead, I have verified that the code is in 6.11 snap 16
> 
> I also performed a capsule sync test with large repos to ensure that capsule
> is synced successfully and content is accessible in capsule.
> 
> Verified on 6.11 snap 16 with python38-pulpcore-3.16.6-1.el8pc.noarch.

Hello, 
will this fix be backported to the 6.10 branch or do we have to skip from 6.9.9 directly to 6.11?
We are now experimenting with a single pulpcore worker to avoid the deadlocks on the initial sync but will this problem persist also during automatic and smaller syncs?
Thanks in advance and regards, 
Paul
Case is: 
https://access.redhat.com/support/cases/#/case/03138801

Comment 38 Sayan Das 2022-04-27 08:43:31 UTC
(In reply to Paul Konecny from comment #37)
> (In reply to Lai from comment #36)
> > This was difficult to reproduce as a deadlock case for verification. 
> > Instead, I have verified that the code is in 6.11 snap 16
> > 
> > I also performed a capsule sync test with large repos to ensure that capsule
> > is synced successfully and content is accessible in capsule.
> > 
> > Verified on 6.11 snap 16 with python38-pulpcore-3.16.6-1.el8pc.noarch.
> 
> Hello, 
> will this fix be backported to the 6.10 branch or do we have to skip from
> 6.9.9 directly to 6.11?
> We are now experimenting with a single pulpcore worker to avoid the
> deadlocks on the initial sync but will this problem persist also during
> automatic and smaller syncs?
> Thanks in advance and regards, 
> Paul
> Case is: 
> https://access.redhat.com/support/cases/#/case/03138801

Hello Paul,

The same fix for this BZ has been delivered in Satellite 6.10.3 via child BZ https://bugzilla.redhat.com/show_bug.cgi?id=2043710.


Now, there is a different type of deadlock scenario present which was reported via this new BZ :

2062526 – Another deadlock issue when syncing repos with high concurrency
https://bugzilla.redhat.com/show_bug.cgi?id=2062526


Let's continue on the support case as I have asked you some questions there and that will help to identify if you are hitting the traceback from the new BZ or not. 


-- Sayan

Comment 41 errata-xmlrpc 2022-07-05 14:30:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Satellite 6.11 Release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5498


Note You need to log in before you can comment on or make changes to this bug.