Bug 975043 - ose-upgrade tool does not detect migration failures.
Summary: ose-upgrade tool does not detect migration failures.
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cluster Version Operator
Version: 1.2.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: John W. Lamb
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-06-17 13:40 UTC by Johnny Liu
Modified: 2017-03-08 17:35 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-12-05 15:10:00 UTC
Target Upstream Version:
Embargoed:



Description Johnny Liu 2013-06-17 13:40:06 UTC
Description of problem:
When a migration failure is encountered, the test report does not highlight it, and the summary tells the user there were 0 failures.

Version-Release number of selected component (if applicable):
1.2/2013-06-13.1

How reproducible:
Always

Steps to Reproduce:
1. Set up a 1.1 environment using the latest ose-1.1.z puddle.
2. Create two jbosseap apps.
3. Follow http://etherpad.corp.redhat.com/OSE-1-2-upgrade-notes to do upgrade testing.
4. Run "ose-upgrade gears"

Actual results:
The jbosseap gear fails to start due to BZ#972311.
<--snip-->
Starting gear with uuid 'bc9bb8d519a74eeeb0781c2f851bcd69' on node 'node.ose11test.com
'
Start gear failed with an exception: Failed to execute: 'control start' for /var/lib/openshift/bc9bb8d519a74eeeb0781c2f851bcd69/jbosseap
Marking step start_gear complete
Validating gear bc9bb8d519a74eeeb0781c2f851bcd69 post-migration
Pre-migration state: started
Post-migration response code: 503
<--snip-->

But in the end, the test summary tells the user there were 0 failures.

#####################################################
Summary:
# of users: 1
# of gears: 8
# of failures: 0
Gear counts per thread: [8]
Additional timings:
    migrate_on_node_measured_from_broker=309.769s
    redeploy_httpd_proxy=0.0s
    restart=0.0s
    total_migrate_gear_measured_from_broker=309.845s
Time gathering users: 0.042s
Time gathering active gears: 20.29s
Total execution time: 334.055s
#####################################################

Also, it would be better to highlight the migration error with coloured text.


Expected results:
The ose-upgrade tool should detect failures and highlight them with coloured text.
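
As a rough illustration of the expected behaviour, the sketch below scans migration output for the two failure signs shown in the log above (a step that "failed with an exception" and a post-migration response code of 400 or higher), wraps those lines in red using ANSI escape codes, and counts them. It is not taken from ose-upgrade; the function names, the regular expressions, and the 400 threshold are assumptions made for this example.

```python
import re

# ANSI escape codes for red text; most terminals honour these.
RED = "\033[31m"
RESET = "\033[0m"

# Patterns are assumptions based on the log excerpt in this report.
FAILED_STEP = re.compile(r"failed with an exception")
RESPONSE_CODE = re.compile(r"Post-migration response code: (\d+)")


def is_failure(line):
    """Return True if the line looks like one of the failure signs above."""
    if FAILED_STEP.search(line):
        return True
    match = RESPONSE_CODE.search(line)
    # Assumed threshold: treat HTTP 4xx/5xx after migration as a problem.
    return match is not None and int(match.group(1)) >= 400


def highlight_failures(lines, use_colour=True):
    """Return (output_lines, flagged_count), colouring flagged lines red."""
    output = []
    flagged = 0
    for line in lines:
        if is_failure(line):
            flagged += 1
            output.append(f"{RED}{line}{RESET}" if use_colour else line)
        else:
            output.append(line)
    return output, flagged


if __name__ == "__main__":
    sample = [
        "Start gear failed with an exception: Failed to execute: 'control start'",
        "Marking step start_gear complete",
        "Pre-migration state: started",
        "Post-migration response code: 503",
    ]
    shown, flagged = highlight_failures(sample)
    print("\n".join(shown))
    print(f"# of flagged lines: {flagged}")
```

Note that this counts flagged lines, not gears; a single gear can produce more than one flagged line, as in the excerpt above.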

Additional info:

Comment 2 Luke Meyer 2013-06-25 20:48:52 UTC
This is technically not considered a migration failure, since the migration has gone well as far as we can tell, but the gear restart failed. That can happen for a variety of reasons and we don't want to assume it's due to the migration.

However, we agree it would be a good idea to let the user know that the gear failed to start after the migration. Right now they would have to use an outside tool to detect that (e.g. "service openshift-gears status" or checking the HTTP return code on all apps) or just wait until users yelled about something; not a good experience. So we would like to introduce a return code for the migration that would be reserved just for this problem, and have the migration script notify the user which migrations had this problem and where to look for the log for each.
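
A minimal sketch of that idea, for discussion only: each per-gear migration returns an exit code, one value is reserved for "migrated but failed to restart", and the summary reports those gears separately along with their log locations. The code value 2, the GearResult structure, the extra summary line, and the log path are all hypothetical; none of them come from the existing migration scripts.

```python
from dataclasses import dataclass

EXIT_OK = 0
# Hypothetical reserved code: migration succeeded, gear restart failed.
EXIT_RESTART_FAILED = 2


@dataclass
class GearResult:
    uuid: str
    exit_code: int
    log_path: str  # hypothetical location of this gear's migration log


def summarize(results):
    """Build summary lines that report restart failures separately."""
    restart_failures = [r for r in results if r.exit_code == EXIT_RESTART_FAILED]
    # Any other non-zero code is treated as a real migration failure.
    migration_failures = [
        r for r in results if r.exit_code not in (EXIT_OK, EXIT_RESTART_FAILED)
    ]
    lines = [
        f"# of gears: {len(results)}",
        f"# of failures: {len(migration_failures)}",
        f"# of gears that failed to start after migration: {len(restart_failures)}",
    ]
    for r in restart_failures:
        lines.append(f"    {r.uuid}: see {r.log_path}")
    return lines


if __name__ == "__main__":
    results = [
        GearResult("bc9bb8d519a74eeeb0781c2f851bcd69", EXIT_RESTART_FAILED,
                   "/tmp/example-migration.log"),  # placeholder path
        GearResult("example-gear-2", EXIT_OK, "/tmp/example-migration-2.log"),
    ]
    print("\n".join(summarize(results)))
```

Keeping the restart count apart from "# of failures" matches the distinction drawn above: the migration itself is not assumed to be at fault, but the user still sees which gears need attention.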

That's a feature request, though, and it's not likely to make it into this release. We can consider it for 1.2.1 or a later release. It's also not clear whether we will ever have a migration quite like this again.

Comment 4 manoj 2013-12-05 15:10:00 UTC
Jason to create a story for this in Trello

