Bug 108811

Summary: Oracle Deadlock caused by ContentItem.beforeSave()
Product: [Retired] Red Hat Enterprise CMS
Reporter: Carsten Clasohm <clasohm>
Component: content types
Assignee: Archit Shah <archit.shah>
Status: CLOSED RAWHIDE
QA Contact: Jon Orris <jorris>
Severity: medium
Priority: medium    
Version: 6.0   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2003-12-15 19:28:32 UTC
Bug Blocks: 100952    
Attachments:
Deadlock and Data Corruption Scenarios (flags: none)
Log output for single setParent call (flags: none)

Description Carsten Clasohm 2003-11-02 14:03:20 UTC
Description of problem:

Let's assume we have an Article object and want to change the CMS
folder it is located in. When ContentItem.setParent() is called and
the article is saved, the following happens:

1. beforeSave() calls setAncestors(getParent())

2. Although the parent is not being modified, its own beforeSave
method is called. (I'm not sure if this is a bug or intended behaviour.)

3. Since the parent is a Folder, we are again in
ContentItem.beforeSave, and setAncestors is called recursively for all
folders up to the root folder of the content section.

4. Persistence executes the SQL queries in this order: First,
cms_items.parent_id is set for the Article, and then
cms_items.ancestors is set starting with the root folder and ending
with the new parent folder.
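
The four steps above can be sketched with a toy model. This is only an
illustration of the reported call pattern, not the CMS code: the class
ToyItem, the method move, and the slash-separated format of the
ancestors value are all assumptions made for the example.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical stand-in for a content item; not the real ContentItem class. */
class ToyItem {
    final String name;
    ToyItem parent;
    String ancestors = "/";   // assumed path-like format, for illustration only

    ToyItem(String name, ToyItem parent) {
        this.name = name;
        this.parent = parent;
    }

    /** Mimics the reported behaviour: saving an item also "saves" every ancestor. */
    void beforeSave(Set<String> touchedRows) {
        if (parent != null) {
            parent.beforeSave(touchedRows);              // steps 2 and 3: recurse to the root
            ancestors = parent.ancestors + parent.name + "/";
        }
        touchedRows.add(name);                           // this row's ancestors value is rewritten
    }

    /** Returns the rows written by one move, in the order they are first touched. */
    static Set<String> move(ToyItem item, ToyItem newParent) {
        Set<String> touchedRows = new LinkedHashSet<>();
        item.parent = newParent;
        touchedRows.add(item.name);                      // step 4: parent_id is written first
        item.beforeSave(touchedRows);                    // then ancestors, root first
        return touchedRows;
    }
}
```

In this model, moving an article between two folders directly below the
root touches the article's row, then the root's row, then the new
parent's row, so every move ends up writing the root row.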

This means that every thread which moves a content item or creates a
new one has to lock the root folder's row in cms_items. This hurts
scalability and can result in an Oracle deadlock (ORA-00060), as
happened when a customer ran performance tests under heavy load.

I have attached a scenario description where this leads to a deadlock.
Besides the deadlock, the failure to properly lock the rows in
cms_items can result in the ancestors column getting corrupted - I
have also included a scenario for this.

Version-Release number of selected component (if applicable):
6.0

How reproducible:

Should be fairly easy to reproduce under heavy load, although I have
only a bug report from a customer. But a look at the executed SQL
queries shows that there is indeed a problem.

Steps to Reproduce:
See attached file.

Expected results:
setAncestors must not be called for the item's new parent. Either
beforeSave has to be changed so it only calls setAncestors when the
parent has changed, or persistence has to be changed so it doesn't
call beforeSave for the parent.
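
The first option could look roughly like the sketch below. It is a
minimal illustration under stated assumptions, not the content of any
actual changelist: GuardedItem, its parentChanged flag, and the
slash-separated ancestors format are hypothetical names and conventions.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical stand-in for a content item; not the real ContentItem class. */
class GuardedItem {
    final String name;
    private GuardedItem parent;
    private String ancestors;
    private boolean parentChanged;

    GuardedItem(String name, GuardedItem parent) {
        this.name = name;
        this.parent = parent;
        this.ancestors = pathBelow(parent);
    }

    /** Builds the ancestors value for an item located directly below the given parent. */
    private static String pathBelow(GuardedItem parent) {
        return parent == null ? "/" : parent.ancestors + parent.name + "/";
    }

    void setParent(GuardedItem newParent) {
        if (newParent != parent) {
            parent = newParent;
            parentChanged = true;     // remember that ancestors must be recomputed
        }
    }

    String getAncestors() {
        return ancestors;
    }

    /** Returns the rows this save would write; the parent is only read, never written. */
    Set<String> save() {
        Set<String> writtenRows = new LinkedHashSet<>();
        if (parentChanged) {
            ancestors = pathBelow(parent);
            parentChanged = false;
            writtenRows.add(name);
        }
        return writtenRows;
    }
}
```

With this guard, a move writes only the moved item's row, and a save
without a parent change writes nothing. The sketch does not cover moving
a folder, where the descendants' ancestors values would also need
updating.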

Additional info:

See changelist 37597 for a fix to the deadlock problem. Corruption of
the ancestors column could be prevented by locking all affected rows
before reading and modifying the ancestors column (select for update).
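
As a rough sketch of the locking idea, the helper below builds a
SELECT ... FOR UPDATE statement for a given number of rows. The table
name cms_items and the ancestors column come from this report; the key
column name item_id, the class AncestorLockSql, and the method
lockStatement are assumptions for illustration.

```java
import java.util.Collections;

/** Hypothetical helper; item_id is an assumed name for the key column of cms_items. */
class AncestorLockSql {

    /** Builds a statement that locks rowCount rows before their ancestors values are read. */
    static String lockStatement(int rowCount) {
        if (rowCount < 1) {
            throw new IllegalArgumentException("rowCount must be at least 1");
        }
        // One bind placeholder per row to lock, e.g. "?, ?, ?" for three rows.
        String placeholders = String.join(", ", Collections.nCopies(rowCount, "?"));
        return "SELECT item_id, ancestors FROM cms_items"
             + " WHERE item_id IN (" + placeholders + ") FOR UPDATE";
    }
}
```

The statement would be executed in the same transaction that later
updates the ancestors column, for example through a JDBC
PreparedStatement with one bound ID per placeholder.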

Comment 1 Carsten Clasohm 2003-11-02 14:04:06 UTC
Created attachment 95664 [details]
Deadlock and Data Corruption Scenarios

Comment 2 Carsten Clasohm 2003-11-02 14:04:53 UTC
Created attachment 95665 [details]
Log output for single setParent call

Comment 3 Richard Li 2003-11-02 20:58:32 UTC
Thanks. Carsten, I believe you are correct, and we're currently
testing a patch for this under load. -> ashah, who wrote the patch.

Comment 4 Vadim Nasardinov 2003-11-03 15:42:17 UTC
Carsten submits some of the best bug reports I've seen.
Well-researched, well-explained, and well-patched.


Comment 5 Archit Shah 2003-11-10 22:10:52 UTC
patch applied to 6.0.x (37880) and dev (37881)

Comment 6 Daniel Berrangé 2003-11-11 11:14:30 UTC
This patch has broken the hierarchy denormalization for live items
again - see bug 109718