Bug 1540952

Summary: deadlock during removed RH child channel
Product: Red Hat Satellite 5 Reporter: Pavel Studeník <pstudeni>
Component: Server Assignee: Tomáš Kašpárek <tkasparek>
Status: CLOSED WONTFIX QA Contact: Red Hat Satellite QA List <satqe-list>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 580 CC: rdrazny, tlestach
Last Closed: 2018-07-13 15:01:08 UTC Type: Bug
Bug Blocks: 1482563    

Description Pavel Studeník 2018-02-01 12:04:38 UTC
Description of problem:
I found a deadlock in one of our tests, which tried to remove a synced channel. The child channel was first fully synced through a proxy and then removed.


>> spacewalk-remove-channel -c rhn-tools-rhel-x86_64-server-7
Deleting package metadata (131):
                  ________________________________________
Removing:         
ERROR: unhandled exception occurred: (deadlock detected
DETAIL:  Process 27573 waits for ShareLock on transaction 70018; blocked by process 23187.
Process 23187 waits for ShareLock on transaction 70847; blocked by process 27573.
HINT:  See server log for query details.
CONTEXT:  while deleting tuple (101,7) in relation "rhnpackage"
).
Traceback (most recent call last):
  File "/usr/bin/spacewalk-remove-channel", line 182, in <module>
    sys.exit(main() or 0)
  File "/usr/bin/spacewalk-remove-channel", line 165, in main
    just_kickstart_trees=options.just_kickstart_trees)
  File "/usr/lib/python2.6/site-packages/spacewalk/satellite_tools/contentRemove.py", line 177, in delete_channels
    _delete_rpms(rpms_ids)
  File "/usr/lib/python2.6/site-packages/spacewalk/satellite_tools/contentRemove.py", line 400, in _delete_rpms
    _delete_rpm_group(toDel[:group])
  File "/usr/lib/python2.6/site-packages/spacewalk/satellite_tools/contentRemove.py", line 434, in _delete_rpm_group
    count = h.executemany(package_id=packageIds)
  File "/usr/lib/python2.6/site-packages/spacewalk/server/rhnSQL/sql_base.py", line 160, in executemany
    return self._execute_wrapper(self._executemany, *p, **kw)
  File "/usr/lib/python2.6/site-packages/spacewalk/server/rhnSQL/driver_postgresql.py", line 296, in _execute_wrapper
    retval = function(*p, **kw)
  File "/usr/lib/python2.6/site-packages/spacewalk/server/rhnSQL/driver_postgresql.py", line 346, in _executemany
    self._real_cursor.executemany(self.sql, all_kwargs)
TransactionRollbackError: deadlock detected
DETAIL:  Process 27573 waits for ShareLock on transaction 70018; blocked by process 23187.
Process 23187 waits for ShareLock on transaction 70847; blocked by process 27573.
HINT:  See server log for query details.
CONTEXT:  while deleting tuple (101,7) in relation "rhnpackage"
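
The traceback shows the deadlock surfacing as an unhandled TransactionRollbackError from executemany. One common way for a client to cope with this is to rerun the whole transaction, since PostgreSQL aborts the transaction it picks as the deadlock victim. The following is only a minimal sketch of that idea, not what spacewalk-remove-channel does: DeadlockDetected and run_with_deadlock_retry are hypothetical names, and a real caller would pass the driver's own exception class as retry_on.

```python
import time


class DeadlockDetected(Exception):
    """Hypothetical stand-in for the database driver's deadlock error."""


def run_with_deadlock_retry(transaction, retry_on=DeadlockDetected,
                            attempts=3, delay=0.0, sleep=time.sleep):
    """Call transaction() until it succeeds or the attempts run out.

    transaction must perform the complete unit of work (including the
    rollback of a failed attempt), because the aborted transaction
    cannot be continued after a deadlock error.
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(1, attempts + 1):
        try:
            return transaction()
        except retry_on:
            if attempt == attempts:
                raise  # give up and surface the last error
            sleep(delay * attempt)  # simple linear backoff between tries
```

A caller would wrap the delete step in a function and pass it in, for example run_with_deadlock_retry(delete_batch, attempts=5, delay=1.0), where delete_batch is an assumed helper that rolls back and reissues the statements.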

Version-Release number of selected component (if applicable):
spacewalk-backend-tools-2.5.3-160.el6sat.noarch

How reproducible:
sometimes

Steps to Reproduce:
1. cdn-sync -v -c rhn-tools-rhel-x86_64-server-7 --http-proxy squid.com:3128
2. spacewalk-remove-channel -c rhn-tools-rhel-x86_64-server-7 
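
Because the failure shows up only sometimes, the two steps above could be driven in a loop until one of them fails. This is a rough sketch under assumptions: repro_commands and run_until_failure are hypothetical helpers, the command lines are copied from the steps, and a failing command is assumed to exit with a non-zero status.

```python
import subprocess


def repro_commands(channel, proxy):
    """Build the two commands from the reproduction steps as argv lists."""
    return [
        ["cdn-sync", "-v", "-c", channel, "--http-proxy", proxy],
        ["spacewalk-remove-channel", "-c", channel],
    ]


def run_until_failure(commands, max_rounds, run=subprocess.call):
    """Run the command sequence up to max_rounds times.

    Returns (round_number, argv) for the first command that exits
    non-zero, or None if every round completed successfully.
    """
    for round_no in range(1, max_rounds + 1):
        for argv in commands:
            if run(argv) != 0:
                return round_no, argv
    return None


if __name__ == "__main__":
    cmds = repro_commands("rhn-tools-rhel-x86_64-server-7", "squid.com:3128")
    print(run_until_failure(cmds, max_rounds=20))
```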


Additional info:
I can't reproduce it.

Comment 1 Pavel Studeník 2018-02-07 10:56:32 UTC
Today I hit a deadlock while removing a channel again.

>> spacewalk-remove-channel -v -c rhn-tools-rhel-x86_64-server-6
ERROR: unhandled exception occurred: (deadlock detected
DETAIL:  Process 14883 waits for ShareLock on transaction 108195; blocked by process 11509.
Process 11509 waits for ShareLock on transaction 108199; blocked by process 14883.
HINT:  See server log for query details.
CONTEXT:  while deleting tuple (4,66) in relation "rhnerratanotificationqueue"
).
Traceback (most recent call last):
  File "/usr/bin/spacewalk-remove-channel", line 182, in <module>
    sys.exit(main() or 0)
  File "/usr/bin/spacewalk-remove-channel", line 165, in main
    just_kickstart_trees=options.just_kickstart_trees)
  File "/usr/lib/python2.6/site-packages/spacewalk/satellite_tools/contentRemove.py", line 252, in delete_channels
    h.executemany(channel_id=channel_ids)
  File "/usr/lib/python2.6/site-packages/spacewalk/server/rhnSQL/sql_base.py", line 160, in executemany
    return self._execute_wrapper(self._executemany, *p, **kw)
  File "/usr/lib/python2.6/site-packages/spacewalk/server/rhnSQL/driver_postgresql.py", line 296, in _execute_wrapper
    retval = function(*p, **kw)
  File "/usr/lib/python2.6/site-packages/spacewalk/server/rhnSQL/driver_postgresql.py", line 346, in _executemany
    self._real_cursor.executemany(self.sql, all_kwargs)
TransactionRollbackError: deadlock detected
DETAIL:  Process 14883 waits for ShareLock on transaction 108195; blocked by process 11509.
Process 11509 waits for ShareLock on transaction 108199; blocked by process 14883.
HINT:  See server log for query details.
CONTEXT:  while deleting tuple (4,66) in relation "rhnerratanotificationqueue"


>> /var/opt/rh/rh-postgresql95/lib/pgsql/data/pg_log/postgresql-*
2018-02-06 19:06:04.210 EST ERROR:  deadlock detected
2018-02-06 19:06:04.210 EST DETAIL:  Process 14883 waits for ShareLock on transaction 108195; blocked by process 11509.
        Process 11509 waits for ShareLock on transaction 108199; blocked by process 14883.
        Process 14883: delete from rhnErrataNotificationQueue where channel_id = 217
        Process 11509: UPDATE rhnErrataNotificationQueue
              SET next_action = NULL
              WHERE errata_id = $1 AND org_id = $2 and channel_id = $3
2018-02-06 19:06:04.210 EST HINT:  See server log for query details.
2018-02-06 19:06:04.210 EST CONTEXT:  while deleting tuple (4,66) in relation "rhnerratanotificationqueue"
2018-02-06 19:06:04.210 EST STATEMENT:  delete from rhnErrataNotificationQueue where channel_id = 217
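
The DETAIL lines describe a wait-for cycle: each process waits for a transaction held by the other. As an illustration, the sketch below parses such lines into "blocked by" edges and follows them to show the cycle. parse_wait_edges and find_cycle are hypothetical helpers, and the dictionary assumes each process waits on only one other process, as in this log.

```python
import re

# Matches the DETAIL lines PostgreSQL printed in the log above.
WAIT_RE = re.compile(
    r"Process (\d+) waits for (\w+) on transaction (\d+); "
    r"blocked by process (\d+)\."
)


def parse_wait_edges(log_text):
    """Return {waiting_pid: blocking_pid} for every DETAIL line found."""
    edges = {}
    for match in WAIT_RE.finditer(log_text):
        waiter, _lock_mode, _xid, blocker = match.groups()
        edges[int(waiter)] = int(blocker)
    return edges


def find_cycle(edges, start):
    """Follow blocked-by edges from start; return the cycle or None."""
    path = []
    seen = set()
    pid = start
    while pid in edges and pid not in seen:
        seen.add(pid)
        path.append(pid)
        pid = edges[pid]
    if pid in seen:
        # Close the loop by repeating the first process of the cycle.
        return path[path.index(pid):] + [pid]
    return None


detail = (
    "Process 14883 waits for ShareLock on transaction 108195; "
    "blocked by process 11509.\n"
    "Process 11509 waits for ShareLock on transaction 108199; "
    "blocked by process 14883."
)
edges = parse_wait_edges(detail)
print(find_cycle(edges, 14883))  # [14883, 11509, 14883]
```

Here process 14883 is the DELETE issued by the channel removal and process 11509 is the UPDATE on the same table, so each statement holds row locks the other one needs.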

Comment 2 Tomas Lestach 2018-02-07 12:39:18 UTC
Pavel, can you confirm the sync is finished before you run spacewalk-remove-channel?

Comment 3 Pavel Studeník 2018-02-07 13:13:57 UTC
The CDN sync finished correctly before the removal started.

Steps:

..syncing..
16:18:29 Total time: 0:16:42 (syncing finished)
...
16:18:33 start removing the child channel
...
16:18:33 start syncing again
...
16:18:59 Total time: 0:00:24 (syncing finished)
...
16:19:00 Start removing the child channel
<-- deadlock