| Summary: | multiple manifest uploads are deadlocking | ||
|---|---|---|---|
| Product: | Red Hat Satellite | Reporter: | Lukas Pramuk <lpramuk> |
| Component: | Subscription Management | Assignee: | Justin Sherrill <jsherril> |
| Status: | CLOSED ERRATA | QA Contact: | Lukas Pramuk <lpramuk> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 6.2.13 | CC: | andrew.schofield, bbuckingham, bcourt, jsherril, rplevka, wpinheir |
| Target Milestone: | Unspecified | Keywords: | Triaged |
| Target Release: | Unused | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | tfm-rubygem-katello-3.4.4 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2018-02-21 16:54:17 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
3 different errors occurred (3 beasts captured):
1)
Runtime Error ERROR: deadlock detected
Detail: Process 31377 waits for ShareLock on transaction 709166; blocked by process 31889.
Process 31889 waits for ExclusiveLock on tuple (0,19) of relation 16580 of database 16386; blocked by process 31377.
Hint: See server log for query details. at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse:2,102
2)
Runtime Error Batch update returned unexpected row count from update [0]; actual row count: 0; expected: 1 at org.hibernate.jdbc.Expectations$BasicExpectation.checkBatched:81
3)
PG::Error: ERROR: duplicate key value violates unique constraint "index_katello_subscriptions_on_cp_id"
DETAIL: Key (cp_id)=(ff80808156a722610156b7811da51b17) already exists.
: INSERT INTO "katello_subscriptions" ("cp_id", "created_at", "updated_at") VALUES ($1, $2, $3) RETURNING "id"
This has been fixed in Candlepin 2.0. FailedQA.
@satellite-6.3.0-15.0.beta.el7sat.noarch
foreman-1.15.0-1.el7sat.noarch
katello-3.4.1-1.el7sat.noarch
candlepin-2.0.35-1.el7.noarch
2.
# for i in {1..9} ; do hammer subscription upload --file `clone_manifest manifest.zip fake_manifest.key` --organization "Organization #$i" & done
[1] 29123
[2] 29124
[3] 29125
[4] 29126
[5] 29127
[6] 29128
[7] 29129
[8] 29130
[9] 29131
[............................ ] [25%]
Failed to import archive
[........................................................ ] [50%]
Failed to import archive
PG::Error: ERROR: duplicate key value violates unique constraint "index_katello_subscriptions_products_on_subs_id_prod_id"
DETAIL: Key (subscription_id, product_id)=(4, 1) already exists.
: INSERT INTO "katello_subscription_products" ("subscription_id", "product_id", "created_at", "updated_at") VALUES ($1, $2, $3, $4) RETURNING "id"
[...............................................................................................................] [100%]
PG::Error: ERROR: duplicate key value violates unique constraint "index_katello_subscriptions_products_on_subs_id_prod_id"
DETAIL: Key (subscription_id, product_id)=(7, 140) already exists.
: INSERT INTO "katello_subscription_products" ("subscription_id", "product_id", "created_at", "updated_at") VALUES ($1, $2, $3, $4) RETURNING "id"
[........................................................ ] [50%]
Failed to import archive
PG::Error: ERROR: duplicate key value violates unique constraint "index_katello_subscriptions_products_on_subs_id_prod_id"
DETAIL: Key (subscription_id, product_id)=(12, 21) already exists.
: INSERT INTO "katello_subscription_products" ("subscription_id", "product_id", "created_at", "updated_at") VALUES ($1, $2, $3, $4) RETURNING "id"
[...............................................................................................................] [100%]
PG::Error: ERROR: duplicate key value violates unique constraint "index_katello_subscriptions_products_on_subs_id_prod_id"
DETAIL: Key (subscription_id, product_id)=(10, 139) already exists.
: INSERT INTO "katello_subscription_products" ("subscription_id", "product_id", "created_at", "updated_at") VALUES ($1, $2, $3, $4) RETURNING "id"
[...............................................................................................................] [100%]
PG::Error: ERROR: duplicate key value violates unique constraint "index_katello_subscriptions_products_on_subs_id_prod_id"
DETAIL: Key (subscription_id, product_id)=(10, 139) already exists.
: INSERT INTO "katello_subscription_products" ("subscription_id", "product_id", "created_at", "updated_at") VALUES ($1, $2, $3, $4) RETURNING "id"
[........................................................ ] [50%]
Failed to import archive
no implicit conversion of String into Integer
[...............................................................................................................] [100%]
PG::Error: ERROR: duplicate key value violates unique constraint "index_katello_subscriptions_products_on_subs_id_prod_id"
DETAIL: Key (subscription_id, product_id)=(16, 85) already exists.
: INSERT INTO "katello_subscription_products" ("subscription_id", "product_id", "created_at", "updated_at") VALUES ($1, $2, $3, $4) RETURNING "id"
[...............................................................................................................] [100%]
4 paused error tasks / 4 stopped with warning / 1 success only
re-routing to subscription management as the QA failure is in the katello_subscription_products table which is not controlled by candlepin. VERIFIED.
@satellite-6.3.0-16.0.beta.el7sat.noarch
foreman-1.15.3-2.el7sat.noarch
katello-3.4.4-2.el7sat.noarch
candlepin-2.0.40-1.el7.noarch
2.
# for i in {1..9} ; do hammer subscription upload --file `clone_manifest manifest.zip fake_manifest.key` --organization "Organization #$i" & done
[1] 18374
[2] 18375
[3] 18377
[4] 18379
[5] 18381
[6] 18384
[7] 18386
[8] 18390
[9] 18395
[...................................................................................................................................] [100%]
[...................................................................................................................................] [100%]
[...................................................................................................................................] [100%]
[...................................................................................................................................] [100%]
[...................................................................................................................................] [100%]
[...................................................................................................................................] [100%]
[...................................................................................................................................] [100%]
[...................................................................................................................................] [100%]
[...................................................................................................................................] [100%]
>>> all concurrent tasks finished successfully (9 of 9)
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
>
> For information on the advisory, and where to find the updated files, follow the link below.
>
> If the solution does not work for you, open a new bug report.
>
> https://access.redhat.com/errata/RHSA-2018:0336
|
Description of problem: multiple manifest uploads are deadlocking, we have race condition here. Version-Release number of selected component (if applicable): 6.2.1 How reproducible: reliably Steps to Reproduce: 1. Prepare multiple organizations for manifest uploads # for i in {1..9} ; do hammer organization create --name "Organization #$i" ; done 2. Simultaneously upload multiple manifests # for i in {1..9} ; do hammer subscription upload --file `clone_manifest manifest.zip fake_manifest.key` --organization "Organization #$i" & done [10] 31917 [11] 31918 [12] 31919 [13] 31921 [14] 31923 [15] 31926 [16] 31928 [17] 31931 [18] 31936 [..............................................................................................................] [100%] Error: Runtime Error ERROR: deadlock detected Detail: Process 31377 waits for ShareLock on transaction 709166; blocked by process 31889. Process 31889 waits for ExclusiveLock on tuple (0,19) of relation 16580 of database 16386; blocked by process 31377. Hint: See server log for query details. at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse:2,102 [..............................................................................................................] [100%] Error: Runtime Error Batch update returned unexpected row count from update [0]; actual row count: 0; expected: 1 at org.hibernate.jdbc.Expectations$BasicExpectation.checkBatched:81 [..............................................................................................................] [100%] Error: Runtime Error ERROR: deadlock detected Detail: Process 29937 waits for ExclusiveLock on tuple (0,19) of relation 16580 of database 16386; blocked by process 31889. Process 31889 waits for ShareLock on transaction 709188; blocked by process 29937. Hint: See server log for query details. at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse:2,102 [..............................................................................................................] [100%] Error: Runtime Error ERROR: deadlock detected Detail: Process 32379 waits for ExclusiveLock on tuple (0,19) of relation 16580 of database 16386; blocked by process 31889. Process 31889 waits for ShareLock on transaction 709188; blocked by process 29937. Process 29937 waits for ExclusiveLock on tuple (0,19) of relation 16580 of database 16386; blocked by process 32379. Hint: See server log for query details. at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse:2,102 [..............................................................................................................] [100%] Error: Runtime Error ERROR: deadlock detected Detail: Process 32432 waits for ShareLock on transaction 709329; blocked by process 31889. Process 31889 waits for ExclusiveLock on tuple (0,50) of relation 16580 of database 16386; blocked by process 32432. Hint: See server log for query details. at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse:2,102 [..............................................................................................................] [100%] Error: Runtime Error ERROR: deadlock detected Detail: Process 32433 waits for ShareLock on transaction 709329; blocked by process 31889. Process 31889 waits for ExclusiveLock on tuple (0,50) of relation 16580 of database 16386; blocked by process 32433. Hint: See server log for query details. at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse:2,102 [..............................................................................................................] [100%] Error: Runtime Error Batch update returned unexpected row count from update [0]; actual row count: 0; expected: 1 at org.hibernate.jdbc.Expectations$BasicExpectation.checkBatched:81 [ ] [0%][[ ] [0%][[ ] [0%][[ ] [0%][[ ] [0%][[ ] [0%][[ ] [0%][[..............................................................................................................] [100%][..............................................................................................................] [100%] [..............................................................................................................] [100%] Error: PG::Error: ERROR: duplicate key value violates unique constraint "index_katello_subscriptions_on_cp_id" DETAIL: Key (cp_id)=(ff80808156a722610156b7811da51b17) already exists. : INSERT INTO "katello_subscriptions" ("cp_id", "created_at", "updated_at") VALUES ($1, $2, $3) RETURNING "id" Actual results: deadlocking tasks (successful 1/9) (With good timing you can hit it with just 2 manifests) Expected results: no deadlocks Additional info: I dont expect CUs uploading 10 manifest at the same time, but we aim to improve backend, right? Consider 10 manifests upload as means of exposing the backend flaw...not a real scenario.