Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
Red Hat Satellite engineering is moving the tracking of its product development work on Satellite to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "Satellite project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs will be migrated starting at the end of May. If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "Satellite project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/SAT-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 1369463

Summary: multiple manifest uploads are deadlocking
Product: Red Hat Satellite Reporter: Lukas Pramuk <lpramuk>
Component: Subscription ManagementAssignee: Justin Sherrill <jsherril>
Status: CLOSED ERRATA QA Contact: Lukas Pramuk <lpramuk>
Severity: medium Docs Contact:
Priority: medium    
Version: 6.2.13CC: andrew.schofield, bbuckingham, bcourt, jsherril, rplevka, wpinheir
Target Milestone: UnspecifiedKeywords: Triaged
Target Release: Unused   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: tfm-rubygem-katello-3.4.4 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-02-21 16:54:17 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Lukas Pramuk 2016-08-23 13:24:31 UTC
Description of problem:
multiple manifest uploads are deadlocking, we have race condition here.

Version-Release number of selected component (if applicable):
6.2.1

How reproducible:
reliably

Steps to Reproduce:
1. Prepare multiple organizations for manifest uploads
# for i in {1..9} ; do hammer organization create --name "Organization #$i" ; done

2. Simultaneously upload multiple manifests 
# for i in {1..9} ; do hammer subscription upload --file `clone_manifest manifest.zip fake_manifest.key` --organization "Organization #$i" & done
[10] 31917
[11] 31918
[12] 31919
[13] 31921
[14] 31923
[15] 31926
[16] 31928
[17] 31931
[18] 31936
[..............................................................................................................] [100%]
Error: Runtime Error ERROR: deadlock detected
  Detail: Process 31377 waits for ShareLock on transaction 709166; blocked by process 31889.
Process 31889 waits for ExclusiveLock on tuple (0,19) of relation 16580 of database 16386; blocked by process 31377.
  Hint: See server log for query details. at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse:2,102
[..............................................................................................................] [100%]
Error: Runtime Error Batch update returned unexpected row count from update [0]; actual row count: 0; expected: 1 at org.hibernate.jdbc.Expectations$BasicExpectation.checkBatched:81
[..............................................................................................................] [100%]
Error: Runtime Error ERROR: deadlock detected
  Detail: Process 29937 waits for ExclusiveLock on tuple (0,19) of relation 16580 of database 16386; blocked by process 31889.
Process 31889 waits for ShareLock on transaction 709188; blocked by process 29937.
  Hint: See server log for query details. at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse:2,102
[..............................................................................................................] [100%]
Error: Runtime Error ERROR: deadlock detected
  Detail: Process 32379 waits for ExclusiveLock on tuple (0,19) of relation 16580 of database 16386; blocked by process 31889.
Process 31889 waits for ShareLock on transaction 709188; blocked by process 29937.
Process 29937 waits for ExclusiveLock on tuple (0,19) of relation 16580 of database 16386; blocked by process 32379.
  Hint: See server log for query details. at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse:2,102
[..............................................................................................................] [100%]
Error: Runtime Error ERROR: deadlock detected
  Detail: Process 32432 waits for ShareLock on transaction 709329; blocked by process 31889.
Process 31889 waits for ExclusiveLock on tuple (0,50) of relation 16580 of database 16386; blocked by process 32432.
  Hint: See server log for query details. at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse:2,102
[..............................................................................................................] [100%]
Error: Runtime Error ERROR: deadlock detected
  Detail: Process 32433 waits for ShareLock on transaction 709329; blocked by process 31889.
Process 31889 waits for ExclusiveLock on tuple (0,50) of relation 16580 of database 16386; blocked by process 32433.
  Hint: See server log for query details. at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse:2,102
[..............................................................................................................] [100%]
Error: Runtime Error Batch update returned unexpected row count from update [0]; actual row count: 0; expected: 1 at org.hibernate.jdbc.Expectations$BasicExpectation.checkBatched:81
[                                                                                                                ] [0%][[                                                                                                                ] [0%][[                                                                                                                ] [0%][[                                                                                                                ] [0%][[                                                                                                                ] [0%][[                                                                                                                ] [0%][[                                                                                                                ] [0%][[..............................................................................................................] [100%][..............................................................................................................] [100%]
[..............................................................................................................] [100%]
Error: PG::Error: ERROR:  duplicate key value violates unique constraint "index_katello_subscriptions_on_cp_id"
DETAIL:  Key (cp_id)=(ff80808156a722610156b7811da51b17) already exists.
: INSERT INTO "katello_subscriptions" ("cp_id", "created_at", "updated_at") VALUES ($1, $2, $3) RETURNING "id"


Actual results:
deadlocking tasks (successful 1/9)
(With good timing you can hit it with just 2 manifests)


Expected results:
no deadlocks

Additional info:
I dont expect CUs uploading 10 manifest at the same time, but we aim to improve backend, right?
Consider 10 manifests upload as means of exposing the backend flaw...not a real scenario.

Comment 1 Lukas Pramuk 2016-08-23 13:34:53 UTC
3 different errors occurred (3 beasts captured):

1)
Runtime Error ERROR: deadlock detected
  Detail: Process 31377 waits for ShareLock on transaction 709166; blocked by process 31889.
Process 31889 waits for ExclusiveLock on tuple (0,19) of relation 16580 of database 16386; blocked by process 31377.
  Hint: See server log for query details. at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse:2,102

2)
Runtime Error Batch update returned unexpected row count from update [0]; actual row count: 0; expected: 1 at org.hibernate.jdbc.Expectations$BasicExpectation.checkBatched:81

3)
PG::Error: ERROR:  duplicate key value violates unique constraint "index_katello_subscriptions_on_cp_id"
DETAIL:  Key (cp_id)=(ff80808156a722610156b7811da51b17) already exists.
: INSERT INTO "katello_subscriptions" ("cp_id", "created_at", "updated_at") VALUES ($1, $2, $3) RETURNING "id"

Comment 5 Barnaby Court 2016-12-16 19:47:53 UTC
This has been fixed in Candlepin 2.0.

Comment 6 Lukas Pramuk 2017-06-19 12:34:29 UTC
FailedQA.

@satellite-6.3.0-15.0.beta.el7sat.noarch
foreman-1.15.0-1.el7sat.noarch
katello-3.4.1-1.el7sat.noarch
candlepin-2.0.35-1.el7.noarch

2.
# for i in {1..9} ; do hammer subscription upload --file `clone_manifest manifest.zip fake_manifest.key` --organization "Organization #$i" & done
[1] 29123
[2] 29124
[3] 29125
[4] 29126
[5] 29127
[6] 29128
[7] 29129
[8] 29130
[9] 29131
[............................                                                                                    ] [25%]
Failed to import archive
[........................................................                                                        ] [50%]
Failed to import archive
PG::Error: ERROR:  duplicate key value violates unique constraint "index_katello_subscriptions_products_on_subs_id_prod_id"
DETAIL:  Key (subscription_id, product_id)=(4, 1) already exists.
: INSERT INTO "katello_subscription_products" ("subscription_id", "product_id", "created_at", "updated_at") VALUES ($1, $2, $3, $4) RETURNING "id"
[...............................................................................................................] [100%]
PG::Error: ERROR:  duplicate key value violates unique constraint "index_katello_subscriptions_products_on_subs_id_prod_id"
DETAIL:  Key (subscription_id, product_id)=(7, 140) already exists.
: INSERT INTO "katello_subscription_products" ("subscription_id", "product_id", "created_at", "updated_at") VALUES ($1, $2, $3, $4) RETURNING "id"
[........................................................                                                        ] [50%]
Failed to import archive
PG::Error: ERROR:  duplicate key value violates unique constraint "index_katello_subscriptions_products_on_subs_id_prod_id"
DETAIL:  Key (subscription_id, product_id)=(12, 21) already exists.
: INSERT INTO "katello_subscription_products" ("subscription_id", "product_id", "created_at", "updated_at") VALUES ($1, $2, $3, $4) RETURNING "id"
[...............................................................................................................] [100%]
PG::Error: ERROR:  duplicate key value violates unique constraint "index_katello_subscriptions_products_on_subs_id_prod_id"
DETAIL:  Key (subscription_id, product_id)=(10, 139) already exists.
: INSERT INTO "katello_subscription_products" ("subscription_id", "product_id", "created_at", "updated_at") VALUES ($1, $2, $3, $4) RETURNING "id"
[...............................................................................................................] [100%]
PG::Error: ERROR:  duplicate key value violates unique constraint "index_katello_subscriptions_products_on_subs_id_prod_id"
DETAIL:  Key (subscription_id, product_id)=(10, 139) already exists.
: INSERT INTO "katello_subscription_products" ("subscription_id", "product_id", "created_at", "updated_at") VALUES ($1, $2, $3, $4) RETURNING "id"
[........................................................                                                        ] [50%]
Failed to import archive
no implicit conversion of String into Integer
[...............................................................................................................] [100%]
PG::Error: ERROR:  duplicate key value violates unique constraint "index_katello_subscriptions_products_on_subs_id_prod_id"
DETAIL:  Key (subscription_id, product_id)=(16, 85) already exists.
: INSERT INTO "katello_subscription_products" ("subscription_id", "product_id", "created_at", "updated_at") VALUES ($1, $2, $3, $4) RETURNING "id"
[...............................................................................................................] [100%]


4 paused error tasks / 4 stopped with warning / 1 success only

Comment 7 Barnaby Court 2017-06-19 14:56:30 UTC
re-routing to subscription management as the QA failure is in the katello_subscription_products table which is not controlled by candlepin.

Comment 8 Lukas Pramuk 2017-08-30 19:44:59 UTC
VERIFIED.

@satellite-6.3.0-16.0.beta.el7sat.noarch
foreman-1.15.3-2.el7sat.noarch
katello-3.4.4-2.el7sat.noarch
candlepin-2.0.40-1.el7.noarch

2.
# for i in {1..9} ; do hammer subscription upload --file `clone_manifest manifest.zip fake_manifest.key` --organization "Organization #$i" & done
[1] 18374
[2] 18375
[3] 18377
[4] 18379
[5] 18381
[6] 18384
[7] 18386
[8] 18390
[9] 18395
[...................................................................................................................................] [100%]

[...................................................................................................................................] [100%]

[...................................................................................................................................] [100%]

[...................................................................................................................................] [100%]

[...................................................................................................................................] [100%]

[...................................................................................................................................] [100%]

[...................................................................................................................................] [100%]

[...................................................................................................................................] [100%]

[...................................................................................................................................] [100%]


>>> all concurrent tasks finished successfully (9 of 9)

Comment 9 Satellite Program 2018-02-21 16:54:17 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
> 
> For information on the advisory, and where to find the updated files, follow the link below.
> 
> If the solution does not work for you, open a new bug report.
> 
> https://access.redhat.com/errata/RHSA-2018:0336