Bug 862203

Summary: spacewalk-remove-channel caused deadlock while "delete from rhnPackage where id = 1554"
Product: [Community] Spacewalk
Component: Server
Version: 1.8
Status: CLOSED WONTFIX
Reporter: Jan Hutař <jhutar>
Assignee: Jan Pazdziora (Red Hat) <jpazdziora>
QA Contact: Red Hat Satellite QA List <satqe-list>
CC: jpazdziora
Type: Bug
Doc Type: Bug Fix
Last Closed: 2012-10-13 11:14:37 UTC
Bug Blocks: 871344

Description Jan Hutař 2012-10-02 09:27:12 UTC
Description of problem:
Deleting a channel while its repodata is being generated can cause a database deadlock.
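The server log below suggests the two sessions (the repodata generator updating rhnPackageRepodata, and spacewalk-remove-channel deleting from rhnPackage) end up waiting on each other's row locks, most likely because they touch the same rows in opposite orders. The classic remedy is a single global acquisition order. The sketch below illustrates that principle with plain Python locks; the names and helper are illustrative only, not Spacewalk code.

```python
import threading

# Stand-ins for the row locks on rhnPackage and rhnPackageRepodata
# (illustrative; not the actual Spacewalk/PostgreSQL locks).
lock_package = threading.Lock()
lock_repodata = threading.Lock()

def acquire_ordered(*locks):
    """Acquire locks in one global order (here: by object id) so two
    workers can never hold them in opposite orders -- the precondition
    for a deadlock like the one reported above."""
    ordered = sorted(locks, key=id)
    for lock in ordered:
        lock.acquire()
    return ordered

def release(held):
    for lock in reversed(held):
        lock.release()

results = []

def delete_package():
    held = acquire_ordered(lock_package, lock_repodata)
    results.append("deleted")
    release(held)

def update_repodata():
    # Requests the same locks in the opposite textual order, but
    # acquire_ordered normalizes the actual acquisition order.
    held = acquire_ordered(lock_repodata, lock_package)
    results.append("updated")
    release(held)

t1 = threading.Thread(target=delete_package)
t2 = threading.Thread(target=update_repodata)
t1.start(); t2.start()
t1.join(); t2.join()
print(sorted(results))  # both threads finish: ['deleted', 'updated']
```

Without the normalization step, the two workers could each grab one lock and then block forever on the other, which is exactly what PostgreSQL's deadlock detector aborts in the logs below.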


Version-Release number of selected component (if applicable):
SWnightly@PostgreSQL as of 2012-09-30


How reproducible:
rarely


Steps to Reproduce:
1. Create channel with some content
2. Immediately attempt to delete it with spacewalk-remove-channel, while its repodata is still being generated


Actual results:
Traceback of spacewalk-remove-channel:

# spacewalk-remove-channel -c <some_channel_I_have_just_created> -u
[...]
Deleting package metadata (7):
                  ________________________________________
Removing:         
ERROR: unhandled exception occurred: (deadlock detected
DETAIL:  Process 19046 waits for ShareLock on transaction 62566; blocked by process 18918.
Process 18918 waits for ShareLock on transaction 62574; blocked by process 19046.
HINT:  See server log for query details.
).
Traceback (most recent call last):
  File "/usr/bin/spacewalk-remove-channel", line 560, in <module>
    sys.exit(main() or 0)
  File "/usr/bin/spacewalk-remove-channel", line 147, in main
    skip_channels=options.skip_channels)
  File "/usr/bin/spacewalk-remove-channel", line 305, in delete_channels
    _delete_rpms(rpms_ids)
  File "/usr/bin/spacewalk-remove-channel", line 477, in _delete_rpms
    _delete_rpm_group(toDel[:group])
  File "/usr/bin/spacewalk-remove-channel", line 507, in _delete_rpm_group
    count = h.executemany(package_id=packageIds)
  File "/usr/lib/python2.6/site-packages/spacewalk/server/rhnSQL/sql_base.py", line 172, in executemany
    return apply(self._execute_wrapper, (self._executemany, ) + p, kw)
  File "/usr/lib/python2.6/site-packages/spacewalk/server/rhnSQL/driver_postgresql.py", line 282, in _execute_wrapper
    retval = apply(function, p, kw)
  File "/usr/lib/python2.6/site-packages/spacewalk/server/rhnSQL/driver_postgresql.py", line 318, in _executemany
    self._real_cursor.executemany(self.sql, all_kwargs)
TransactionRollbackError: deadlock detected
DETAIL:  Process 19046 waits for ShareLock on transaction 62566; blocked by process 18918.
Process 18918 waits for ShareLock on transaction 62574; blocked by process 19046.
HINT:  See server log for query details.


PostgreSQL DB log:

[...]
LOG:  unexpected EOF on client connection
LOG:  unexpected EOF on client connection
ERROR:  deadlock detected
DETAIL:  Process 19046 waits for ShareLock on transaction 62566; blocked by process 18918.
        Process 18918 waits for ShareLock on transaction 62574; blocked by process 19046.
        Process 19046: delete from rhnPackage where id = 1554
        Process 18918: update rhnPackageRepodata
                        set primary_xml = $1
                where package_id = $2
HINT:  See server log for query details.
STATEMENT:  delete from rhnPackage where id = 1554
LOG:  unexpected EOF on client connection
LOG:  unexpected EOF on client connection
[...]


Expected results:
Should either print a nice error with some advice on how to proceed, or manage to do its job somehow.


Additional info:
Thanks to Tomas L. for identifying the cause of the issue.

Comment 3 Jan Pazdziora (Red Hat) 2012-10-13 11:14:37 UTC
(In reply to comment #0)
> 
> Expected results:
> Should either print a nice error with some advice on how to proceed, or
> manage to do its job somehow.

If we find ourselves refactoring the repo generation code again, we can take this situation into account. However, I don't see us changing code for this specific sequence of operations.