Bug 977456

Summary: Timeout during cartridge migrations breaks reentrancy
Product: OpenShift Online
Reporter: Paul Morie <pmorie>
Component: Containers
Assignee: Paul Morie <pmorie>
Status: CLOSED CURRENTRELEASE
QA Contact: libra bugs <libra-bugs>
Severity: medium
Priority: medium
Version: 2.x
CC: dmcphers, jhou
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2013-08-07 22:54:43 UTC
Type: Bug

Description Paul Morie 2013-06-24 15:18:56 UTC
Currently, if the cartridge migration times out after the version of a cartridge is updated (or anything else goes wrong that fails the migration), re-entrancy is broken. This means that although the migration hasn't finished, the need to continue migrating will not be detected.
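
The failure mode described above can be illustrated with a minimal Python sketch. This is not the OpenShift implementation: the `Gear` class, the step names, the `migration_complete` marker, and the version strings are all hypothetical, chosen only to show why a check based on the cartridge version alone misses an interrupted migration, and how a separate completion marker would keep the migration re-entrant.

```python
from dataclasses import dataclass, field

TARGET_VERSION = "0.2"  # hypothetical target cartridge version
STEPS = ("update_version", "run_hooks")  # hypothetical migration steps


class MigrationTimeout(Exception):
    """Raised when a simulated migration step times out."""


@dataclass
class Gear:
    cartridge_version: str = "0.1"
    completed_steps: set = field(default_factory=set)
    migration_complete: bool = False  # hypothetical completion marker


def needs_migration_by_version(gear):
    # Fragile check: treats an updated version as proof the migration finished.
    return gear.cartridge_version != TARGET_VERSION


def needs_migration_by_marker(gear):
    # Re-entrant check: only an explicit completion marker ends the migration.
    return not gear.migration_complete


def migrate(gear, fail_at=None):
    # Run each step at most once; steps recorded as completed are skipped on rerun.
    for step in STEPS:
        if step in gear.completed_steps:
            continue
        if step == fail_at:
            raise MigrationTimeout(step)
        if step == "update_version":
            gear.cartridge_version = TARGET_VERSION
        gear.completed_steps.add(step)
    gear.migration_complete = True


gear = Gear()
try:
    migrate(gear, fail_at="run_hooks")  # times out after the version was updated
except MigrationTimeout:
    pass

print(needs_migration_by_version(gear))  # False: unfinished work goes undetected
print(needs_migration_by_marker(gear))   # True: unfinished work is detected

migrate(gear)  # the rerun resumes with the remaining step
print(needs_migration_by_marker(gear))   # False
```

In this sketch the version is updated before the remaining steps run, so a version comparison reports the gear as up to date even though the later step never ran.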

Comment 1 Paul Morie 2013-07-26 16:50:21 UTC
Fix available in master now.

Comment 2 Jianwei Hou 2013-07-30 05:28:45 UTC
Verified on devenv_3580

Steps:
1. Create a diy application, edit its 'start' action hook, and add 'sleep 600'.
2. Migrate all gears on the node:
oo-admin-upgrade --version 2.0.31 --ignore-cartridge-version

During migration, the target app failed to migrate because of a timeout.
3. Fix the app and migrate again with the --rerun option:
oo-admin-upgrade --version 2.0.31 --ignore-cartridge-version --rerun

The migrator picked up only the failed gear and migrated it successfully.
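
The rerun behaviour observed in step 3 can be sketched as a selection over recorded results. This is only an illustration under stated assumptions: how oo-admin-upgrade actually records per-gear results is not described in this report, so the `previous_results` mapping, the "ok"/"failed" status strings, and the gear names below are hypothetical.

```python
def gears_to_process(all_gears, previous_results, rerun):
    """Return the gears a run should migrate.

    previous_results maps a gear name to a status string from an earlier run
    (hypothetical format). On a rerun, only gears without an "ok" result are
    selected; otherwise every gear is selected.
    """
    if not rerun:
        return list(all_gears)
    return [g for g in all_gears if previous_results.get(g) != "ok"]


all_gears = ["gear-a", "gear-b", "gear-c"]  # hypothetical gear names
previous_results = {"gear-a": "ok", "gear-b": "failed", "gear-c": "ok"}

print(gears_to_process(all_gears, previous_results, rerun=True))   # ['gear-b']
print(gears_to_process(all_gears, previous_results, rerun=False))  # all three gears
```

A gear with no recorded result is treated as still needing migration in this sketch, which errs on the side of retrying.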

Mark as verified.