Description of problem:
Refreshing the manifest on Satellite 6.2.10 results in the following error:

2017-07-21 16:05:26 EDT ERROR: deadlock detected
2017-07-21 16:05:26 EDT DETAIL: Process 2848 waits for ShareLock on transaction 231777414; blocked by process 1382.
    Process 1382 waits for ShareLock on transaction 231777394; blocked by process 2848.
    Process 2848: insert into cp_pool_products (created, updated, pool_id, product_id, product_name, dtype, id) values ($1, $2, $3, $4, $5, 'provided', $6)
    Process 1382: insert into cp_entitlement (created, updated, consumer_id, dirty, endDateOverride, owner_id, pool_id, quantity, updatedOnStart, id) values ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10)

Version-Release number of selected component (if applicable):
satellite-6.2.10-4.0.el7sat.noarch
candlepin-0.9.54.21-1.el7.noarch
foreman-1.11.0.76-1.el7sat.noarch
katello-3.0.0-20.el7sat.noarch

How reproducible:
So far only on customer Satellite
Hey Vritant, That's right - in this case subsequent manifest refreshes continue to fail.
The following workaround should enable those blocked by this issue to move forward with manifest import.

Approach:
Temporarily disable conflicting, external traffic to the Satellite for the duration of manifest import.

Important notes before getting started:
This will cause all attempts from client systems to register, unregister, or update entitlements to fail for the duration of manifest import. After this workaround is completed, operations should return to normal.

Steps to accomplish the above:
1) Add the following to /etc/httpd/conf.d/05-foreman-ssl.d/katello.conf (less the triple quotes):
"""
<Location /rhsm>
  PassengerEnabled off
</Location>
"""
2) Restart httpd: `systemctl restart httpd`
3) Navigate to the manifest import page (Content -> Red Hat Subscriptions -> click the "Manage Manifest" button)
4) Select the file to import using the dialog displayed after clicking "Browse".
5) Click "Upload"
6) Wait a while (in my case this took somewhere between 30 minutes and 1 hour depending on hardware; it may take longer depending on the size and contents of the manifest)
7) Remove the passage added to /etc/httpd/conf.d/05-foreman-ssl.d/katello.conf in step 1.
8) Restart httpd: `systemctl restart httpd`

After completing the above, the manifest import should succeed. The above worked for me on my reproducer of the deadlock issue. I believe it will work for others as well.
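Steps 1 and 7 above can be scripted so that the temporary block is added and removed without hand-editing. The following is only a minimal sketch: it assumes the block can simply be appended to the end of katello.conf, and the marker comments and the function names `add_block` / `remove_block` are illustrative, not part of Satellite.

```python
# Sketch: add/remove the temporary <Location /rhsm> block as plain text edits.
# The BEGIN/END marker comments are an assumption made for this example so the
# block can be found and removed again; Apache ignores lines starting with '#'.

BEGIN = "# BEGIN temporary manifest-import workaround"
END = "# END temporary manifest-import workaround"
BLOCK = "\n".join([
    BEGIN,
    "<Location /rhsm>",
    "  PassengerEnabled off",
    "</Location>",
    END,
]) + "\n"


def add_block(conf: str) -> str:
    """Return conf with the workaround block appended (no-op if already present)."""
    if BEGIN in conf:
        return conf
    separator = "" if conf.endswith("\n") or not conf else "\n"
    return conf + separator + BLOCK


def remove_block(conf: str) -> str:
    """Return conf with the workaround block removed (no-op if not present)."""
    start = conf.find(BEGIN)
    if start == -1:
        return conf
    end = conf.find(END, start)
    if end == -1:
        raise ValueError("found BEGIN marker without matching END marker")
    end += len(END)
    if conf[end:end + 1] == "\n":
        end += 1  # also drop the newline that terminated the END marker
    return conf[:start] + conf[end:]
```

To use it, read /etc/httpd/conf.d/05-foreman-ssl.d/katello.conf, write back the result of `add_block`, and run `systemctl restart httpd`; after the import finishes, write back the result of `remove_block` and restart httpd again. Keeping a backup copy of the file first is advisable.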
FYI, I was able to reproduce this simply by:
- calling manifest refresh
- invoking many "get me consumer serials" requests in parallel (these requests are normally triggered by rhsmd activity on the client systems)
- _no_ virt-who calls involved at all
Pavel, Thanks for the update. To confirm, was that reproducer on 6.4?
(In reply to Brad Buckingham from comment #36)
> Pavel,
>
> Thanks for the update. To confirm, was that reproducer on 6.4?

Trying this on 6.4, I got (on approx. the 10th attempt) a deadlock, but in candlepin, not postgres:

2018-11-04 17:32:30,830 [thread=http-bio-8443-exec-521] [req=7e4a9f4e-71d3-46b2-9a64-5f0ab4dd621d, org=, csid=] INFO org.candlepin.common.filter.LoggingFilter - Request: verb=GET, uri=/candlepin/consumers/627e59b8-3f03-4c2f-9f67-db5d8a13e207/certificates/serials
2018-11-04 17:32:30,833 [thread=http-bio-8443-exec-647] [req=a31ce16a-d317-4204-b484-b41f77fbec7d, org=, csid=] INFO org.candlepin.common.filter.LoggingFilter - Request: verb=GET, uri=/candlepin/consumers/5c6bbc4d-c9a9-436a-a84a-4a27a63330ca/certificates/serials
2018-11-04 17:33:40,482 [thread=C3P0PooledConnectionPoolManager[identityToken->1hgemtd9y1mo7gay1kun9qu|8211f0a]-AdminTaskTimer] [=, org=, csid=] WARN com.mchange.v2.async.ThreadPoolAsynchronousRunner - com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector@e4cebf5 -- APPARENT DEADLOCK!!!
Complete Status:
    Managed Threads: 3
    Active Threads: 3
    Active Tasks:
        com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@6b1485df
            on thread: C3P0PooledConnectionPoolManager[identityToken->1hgemtd9y1mo7gay1kun9qu|8211f0a]-HelperThread-#0
        com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@3f938880
            on thread: C3P0PooledConnectionPoolManager[identityToken->1hgemtd9y1mo7gay1kun9qu|8211f0a]-HelperThread-#1
        com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@2b608ded
            on thread: C3P0PooledConnectionPoolManager[identityToken->1hgemtd9y1mo7gay1kun9qu|8211f0a]-HelperThread-#2
    Pending Tasks:
        com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@7159f200
        com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@3eacbb89
        com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@1d58f354
        com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@32f80af7
        com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@c8f375a
        com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@5724a5b1
        com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@dbc3e52
        com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@7fe122a0
        com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@a21efe1
        com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@785aca39
        com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@14f825ce
        com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@62fae1d3
        com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@67a1548a
        com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@49045145
        com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@1c87cbfe
        com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@29c655c4
        com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@44ae5172
        com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@2fcbaedb
        com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@db158e7
        com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@16c244e0
        com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@20d1a28b
        com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@529e6d3c
        com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@556f61e8
        com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@518ecae1
        com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@7ce6ffa1
        com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@5d309a8c
        com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@737ec0e3
        com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@5480c5af
        com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@1f503369
        com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@46ed2ef5
        com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@43918713
        com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@11ee9f73
        com.mchange.v2.resourcepool.BasicResourcePool$1RefurbishCheckinResourceTask@fbb3d5c
Pool thread stack traces:
    Thread[C3P0PooledConnectionPoolManager[identityToken->1hgemtd9y1mo7gay1kun9qu|8211f0a]-HelperThread-#0,5,main]
        com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:720)
    Thread[C3P0PooledConnectionPoolManager[identityToken->1hgemtd9y1mo7gay1kun9qu|8211f0a]-HelperThread-#1,5,main]
        com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:720)
    Thread[C3P0PooledConnectionPoolManager[identityToken->1hgemtd9y1mo7gay1kun9qu|8211f0a]-HelperThread-#2,5,main]
        com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:720)

(and tomcat/candlepin was stopped :-o )

postgres logs show nothing, candlepin's error and candlepin.log contained just that info, tomcat logs nothing.
note to myself: reproducer was on provisioning.usersys.redhat.com, fetching serials of all cp_consumers in a loop during the manifest refresh
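For anyone trying to reproduce this, the "fetch serials of all consumers in parallel" load described in this thread could be generated roughly as below. This is a hedged sketch, not the script actually used: the URL pattern is taken from the candlepin log in this bug, but `BASE_URL`, the helper names, the worker count, and the way authentication is supplied (an `ssl.SSLContext` you build yourself, for example with a client certificate) are all assumptions.

```python
# Sketch of a parallel "get consumer serials" load generator.
# Assumptions: BASE_URL, the worker/round counts, and the authentication
# method are placeholders; only the URI pattern comes from the candlepin log.
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor

BASE_URL = "https://localhost:8443/candlepin"  # assumed host and port


def serials_url(base_url, consumer_uuid):
    """Build the serials URI seen in the candlepin request log."""
    return f"{base_url}/consumers/{consumer_uuid}/certificates/serials"


def make_fetcher(context=None, timeout=30):
    """Return a function that GETs a URL and returns the HTTP status code."""
    def fetch(url):
        try:
            with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
                return resp.status
        except urllib.error.HTTPError as err:
            return err.code  # e.g. 401 if the credentials are not accepted
    return fetch


def hammer_serials(consumer_uuids, fetch, base_url=BASE_URL, workers=20, rounds=1):
    """Request serials for every consumer, `rounds` times, using a thread pool."""
    urls = [serials_url(base_url, uuid) for uuid in consumer_uuids] * rounds
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input URLs
        return list(pool.map(fetch, urls))
```

The consumer UUIDs would have to come from the cp_consumer table or the API, and `hammer_serials(uuids, make_fetcher(context))` would be started while a manifest refresh is running. Passing `fetch` in as a parameter keeps the load loop independent of how the requests are authenticated.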
Other developers on the team, support engineers, and I have all been unable to reproduce this issue. I'm going to close this bug, but if someone encounters it again and has a reproducer case, please reopen it. If you do have a reproducer, we will need a copy of the manifest you are importing as well as a dump of the current Candlepin database. It would be inappropriate to attach these to a public bug, so you will either need to open a case with support (referencing this bug) and attach the necessary manifest and DB dump to the case or provide those files to a member of the development team out-of-band.