
Bug 975043

Summary: ose-upgrade tool does not detect migration failures.
Product: OpenShift Container Platform Reporter: Johnny Liu <jialiu>
Component: Cluster Version Operator Assignee: John W. Lamb <jolamb>
Status: CLOSED DEFERRED QA Contact: libra bugs <libra-bugs>
Severity: low Docs Contact:
Priority: low    
Version: 1.2.0 CC: bleanhar, libra-onpremise-devel, lmeyer
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-12-05 15:10:00 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Johnny Liu 2013-06-17 13:40:06 UTC
Description of problem:
When a gear fails during migration, the summary report does not highlight the failure and tells the user there were 0 failures.

Version-Release number of selected component (if applicable):
1.2/2013-06-13.1

How reproducible:
Always

Steps to Reproduce:
1. Set up a 1.1 environment using the latest ose-1.1.z puddle.
2. Create two jbosseap apps.
3. Follow http://etherpad.corp.redhat.com/OSE-1-2-upgrade-notes to do upgrade testing.
4. Run "ose-upgrade gears".

Actual results:
The jbosseap gear fails to start due to BZ#972311.
<--snip-->
Starting gear with uuid 'bc9bb8d519a74eeeb0781c2f851bcd69' on node 'node.ose11test.com
'
Start gear failed with an exception: Failed to execute: 'control start' for /var/lib/openshift/bc9bb8d519a74eeeb0781c2f851bcd69/jbosseap
Marking step start_gear complete
Validating gear bc9bb8d519a74eeeb0781c2f851bcd69 post-migration
Pre-migration state: started
Post-migration response code: 503
<--snip-->

But in the end, the summary tells the user there were 0 failures.

#####################################################
Summary:
# of users: 1
# of gears: 8
# of failures: 0
Gear counts per thread: [8]
Additional timings:
    migrate_on_node_measured_from_broker=309.769s
    redeploy_httpd_proxy=0.0s
    restart=0.0s
    total_migrate_gear_measured_from_broker=309.845s
Time gathering users: 0.042s
Time gathering active gears: 20.29s
Total execution time: 334.055s
#####################################################

It would also be better to highlight the migration error in coloured text.


Expected results:
The ose-upgrade tool should detect failures and highlight them in coloured text.
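
To illustrate the expected behaviour, here is a minimal Python sketch (not the actual ose-upgrade code) that scans migration output, highlights suspect lines in red, and counts gears that were started before migration but did not respond afterwards. The patterns are copied from the log excerpt under "Actual results"; the function name scan_log and the rule that a response code of 500 or above means "failed to start" are assumptions made for illustration only.

```python
import re

RED = "\033[31m"   # ANSI escape sequence: red foreground
RESET = "\033[0m"  # ANSI escape sequence: reset attributes

# Patterns taken from the log excerpt in this report; the real tool's
# wording may differ between versions.
START_FAILED = re.compile(r"Start gear failed with an exception")
VALIDATING = re.compile(r"Validating gear (\w+) post-migration")
PRE_STATE = re.compile(r"Pre-migration state: (\w+)")
RESPONSE = re.compile(r"Post-migration response code: (\d+)")


def scan_log(lines, color=True):
    """Return (failed_gear_uuids, annotated_lines) for migration output."""
    failed = []
    annotated = []
    gear = None
    was_started = False
    for line in lines:
        # A failed start is highlighted; the gear is counted during validation.
        bad = bool(START_FAILED.match(line))
        m = VALIDATING.match(line)
        if m:
            gear, was_started = m.group(1), False
        m = PRE_STATE.match(line)
        if m:
            was_started = m.group(1) == "started"
        m = RESPONSE.match(line)
        # Assumption: a 5xx response from a previously started gear is a failure.
        if m and was_started and int(m.group(1)) >= 500:
            bad = True
            if gear is not None and gear not in failed:
                failed.append(gear)
        annotated.append(f"{RED}{line}{RESET}" if bad and color else line)
    return failed, annotated


excerpt = [
    "Start gear failed with an exception: Failed to execute: 'control start' "
    "for /var/lib/openshift/bc9bb8d519a74eeeb0781c2f851bcd69/jbosseap",
    "Marking step start_gear complete",
    "Validating gear bc9bb8d519a74eeeb0781c2f851bcd69 post-migration",
    "Pre-migration state: started",
    "Post-migration response code: 503",
]
failed, annotated = scan_log(excerpt)
print("\n".join(annotated))
print(f"# of gears that failed to start: {len(failed)}")
```

Passing color=False returns the lines unchanged, which would suit output redirected to a log file.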

Additional info:

Comment 2 Luke Meyer 2013-06-25 20:48:52 UTC
This is technically not considered a migration failure, since the migration has gone well as far as we can tell, but the gear restart failed. That can happen for a variety of reasons and we don't want to assume it's due to the migration.

However, we agree it would be a good idea to let the user know that the gear failed to start after the migration. They would have to use an outside tool to detect that now (e.g. "service openshift-gears status" or check the HTTP return code on all apps) or just wait until users yell about something; not a good experience. So we would like to introduce a return code to the migration that would be reserved just for this problem, and notify the user from the migration script which migrations had this problem and where to look for the log for each.

That's a feature request, though, and it's not likely to make it into this release. We can consider it for 1.2.1 or a later release. It's also not clear whether we will ever have a migration quite like this again.
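
For reference, the reserved return code proposed in Comment 2 could be sketched as follows. This is a hypothetical Python illustration, not the design that was or will be implemented: the names MigrationResult, GearOutcome, and summarize, the numeric values, and the placeholder log path are all assumptions.

```python
from dataclasses import dataclass
from enum import IntEnum


class MigrationResult(IntEnum):
    # Numeric values are illustrative only, not the tool's real codes.
    OK = 0
    MIGRATION_FAILED = 1
    START_FAILED_AFTER_MIGRATION = 2  # hypothetical code reserved for this case


@dataclass
class GearOutcome:
    uuid: str
    result: MigrationResult
    log_path: str  # where the user should look for this gear's log


def summarize(outcomes):
    """Build summary lines that report restart problems separately."""
    failures = [o for o in outcomes
                if o.result == MigrationResult.MIGRATION_FAILED]
    not_started = [o for o in outcomes
                   if o.result == MigrationResult.START_FAILED_AFTER_MIGRATION]
    lines = [
        f"# of gears: {len(outcomes)}",
        f"# of failures: {len(failures)}",
        f"# of gears that failed to start after migration: {len(not_started)}",
    ]
    # Tell the user which gears had the problem and where to look.
    for o in not_started:
        lines.append(f"    {o.uuid}: see {o.log_path}")
    return "\n".join(lines)


outcomes = [
    GearOutcome("gear-a", MigrationResult.OK, "/path/to/gear-a.log"),
    GearOutcome("gear-b", MigrationResult.START_FAILED_AFTER_MIGRATION,
                "/path/to/gear-b.log"),
]
print(summarize(outcomes))
```

Keeping the restart problem out of the "# of failures" count matches the distinction drawn in Comment 2: the migration itself is not assumed to have failed, but the user still learns which gears need attention.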

Comment 4 manoj 2013-12-05 15:10:00 UTC
Jason to create a story for this in Trello