Bug 862203

Summary: spacewalk-remove-channel caused deadlock while "delete from rhnPackage where id = 1554"
Product: [Community] Spacewalk
Component: Server
Version: 1.8
Status: CLOSED WONTFIX
Reporter: Jan Hutař <jhutar>
Assignee: Jan Pazdziora (Red Hat) <jpazdziora>
QA Contact: Red Hat Satellite QA List <satqe-list>
CC: jpazdziora
Type: Bug
Doc Type: Bug Fix
Last Closed: 2012-10-13 11:14:37 UTC
Bug Blocks: 871344

Description Jan Hutař 2012-10-02 09:27:12 UTC
Description of problem:
Deleting a channel while its repodata is being generated can cause a database deadlock.
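The server log below suggests the two sessions (the repodata generator updating rhnPackageRepodata, and spacewalk-remove-channel deleting from rhnPackage) end up waiting on each other's row locks, most likely because they touch the same rows in opposite orders. The classic remedy is a single global acquisition order. The sketch below illustrates that principle with plain Python locks; the names and helper are illustrative only, not Spacewalk code.

```python
import threading

# Stand-ins for the row locks on rhnPackage and rhnPackageRepodata
# (illustrative; not the actual Spacewalk/PostgreSQL locks).
lock_package = threading.Lock()
lock_repodata = threading.Lock()

def acquire_ordered(*locks):
    """Acquire locks in one global order (here: by object id) so two
    workers can never hold them in opposite orders -- the precondition
    for a deadlock like the one reported above."""
    ordered = sorted(locks, key=id)
    for lock in ordered:
        lock.acquire()
    return ordered

def release(held):
    for lock in reversed(held):
        lock.release()

results = []

def delete_package():
    held = acquire_ordered(lock_package, lock_repodata)
    results.append("deleted")
    release(held)

def update_repodata():
    # Requests the same locks in the opposite textual order, but
    # acquire_ordered normalizes the actual acquisition order.
    held = acquire_ordered(lock_repodata, lock_package)
    results.append("updated")
    release(held)

t1 = threading.Thread(target=delete_package)
t2 = threading.Thread(target=update_repodata)
t1.start(); t2.start()
t1.join(); t2.join()
print(sorted(results))  # both threads finish: ['deleted', 'updated']
```

Without the normalization step, the two workers could each grab one lock and then block forever on the other, which is exactly what PostgreSQL's deadlock detector aborts in the logs below.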


Version-Release number of selected component (if applicable):
SWnightly@PostgreSQL as of 2012-09-30


How reproducible:
rarely


Steps to Reproduce:
1. Create channel with some content
2. Immediately attempt to delete it with spacewalk-remove-channel, while its repodata is still being generated


Actual results:
Traceback of spacewalk-remove-channel:

# spacewalk-remove-channel -c <some_channel_I_have_just_created> -u
[...]
Deleting package metadata (7):
                  ________________________________________
Removing:         
ERROR: unhandled exception occurred: (deadlock detected
DETAIL:  Process 19046 waits for ShareLock on transaction 62566; blocked by process 18918.
Process 18918 waits for ShareLock on transaction 62574; blocked by process 19046.
HINT:  See server log for query details.
).
Traceback (most recent call last):
  File "/usr/bin/spacewalk-remove-channel", line 560, in <module>
    sys.exit(main() or 0)
  File "/usr/bin/spacewalk-remove-channel", line 147, in main
    skip_channels=options.skip_channels)
  File "/usr/bin/spacewalk-remove-channel", line 305, in delete_channels
    _delete_rpms(rpms_ids)
  File "/usr/bin/spacewalk-remove-channel", line 477, in _delete_rpms
    _delete_rpm_group(toDel[:group])
  File "/usr/bin/spacewalk-remove-channel", line 507, in _delete_rpm_group
    count = h.executemany(package_id=packageIds)
  File "/usr/lib/python2.6/site-packages/spacewalk/server/rhnSQL/sql_base.py", line 172, in executemany
    return apply(self._execute_wrapper, (self._executemany, ) + p, kw)
  File "/usr/lib/python2.6/site-packages/spacewalk/server/rhnSQL/driver_postgresql.py", line 282, in _execute_wrapper
    retval = apply(function, p, kw)
  File "/usr/lib/python2.6/site-packages/spacewalk/server/rhnSQL/driver_postgresql.py", line 318, in _executemany
    self._real_cursor.executemany(self.sql, all_kwargs)
TransactionRollbackError: deadlock detected
DETAIL:  Process 19046 waits for ShareLock on transaction 62566; blocked by process 18918.
Process 18918 waits for ShareLock on transaction 62574; blocked by process 19046.
HINT:  See server log for query details.


PostgreSQL DB log:

[...]
LOG:  unexpected EOF on client connection
LOG:  unexpected EOF on client connection
ERROR:  deadlock detected
DETAIL:  Process 19046 waits for ShareLock on transaction 62566; blocked by process 18918.
        Process 18918 waits for ShareLock on transaction 62574; blocked by process 19046.
        Process 19046: delete from rhnPackage where id = 1554
        Process 18918: update rhnPackageRepodata
                        set primary_xml = $1
                where package_id = $2
HINT:  See server log for query details.
STATEMENT:  delete from rhnPackage where id = 1554
LOG:  unexpected EOF on client connection
LOG:  unexpected EOF on client connection
[...]


Expected results:
Should either print a nice error with some advice on how to proceed, or manage to do its job somehow.


Additional info:
Thanks to Tomas L. for identifying the cause of the issue.

Comment 3 Jan Pazdziora (Red Hat) 2012-10-13 11:14:37 UTC
(In reply to comment #0)
> 
> Expected results:
> Should either print a nice error with some advice on how to proceed, or
> manage to do its job somehow.

If we find ourselves refactoring the repo generation code again, we can take this situation into account. However, I don't see us changing code for this specific sequence of operations.