Bug 1040646

Summary: Improve error message: Failed to correctly execute all parallel operations
Product: OpenShift Container Platform Reporter: Jason DeTiberus <jdetiber>
Component: Node Assignee: Jason DeTiberus <jdetiber>
Status: CLOSED CURRENTRELEASE QA Contact: libra bugs <libra-bugs>
Severity: low Docs Contact:
Priority: unspecified    
Version: 2.0.0 CC: jforrest, jhou, libra-onpremise-devel, lmeyer, xiama
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: rubygem-openshift-origin-controller-1.17.8-1.el6op rubygem-openshift-origin-console-1.17.6.2-1.el6op openshift-origin-msg-node-mcollective-1.17.4-1.el6op openshift-origin-console-1.15.1-3.el6op Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1031821 Environment:
Last Closed: 2014-01-17 16:20:40 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On: 1031821, 1038680    
Bug Blocks:    

Description Jason DeTiberus 2013-12-11 19:12:56 UTC
+++ This bug was initially created as a clone of Bug #1031821 +++

The exception message "Failed to correctly execute all parallel operations" should be improved to provide additional context about which operations actually failed. This will improve both the end-user experience and our exception reporting in New Relic.

--- Additional comment from Rajat Chopra on 2013-11-22 19:54:11 EST ---

Fixed with https://github.com/openshift/origin-server/pull/4241

The message now contains which ops failed.

Note: The rhc client tool does not show the 'debug' messages contained in the REST output; perhaps the client tool needs to be enhanced?

E.g., if a node times out while processing a parallel request, the REST response will contain a debug message like: (Gear Id: 52136834) Execution expired.
But the 'rhc' tool does not show it.

--- Additional comment from Hou Jianwei on 2013-11-25 01:42:08 EST ---

Verified on devenv_4067

Modify pending_app_op_group.rb as follows:
failed_ops << tag["op_id"] # if status != 0

Then restart services and clear the broker cache.
Create an application via the REST API:
curl -s -k -H 'Content-Type: Application/json' --user jhou:x https://ec2-23-22-167-139.compute-1.amazonaws.com/broker/rest/domains/jhou/applications/ -X POST -d '{"name":"d1", "cartridge":"diy-0.1"}'
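
The one-line change to pending_app_op_group.rb comments out the `if status != 0` guard, so every op is recorded as failed regardless of its exit status. A minimal sketch of that effect follows. The data layout, the `collect_failed_ops` helper, and the `force_failure` flag are hypothetical and used only for illustration; this is not the actual origin-server code, and the message format simply mirrors the REST output shown in this comment.

```ruby
# Hypothetical per-op results; the real broker data structures may differ.
results = [
  { tag: { "op_id" => "UpdateAppConfigOp" },  status: 0 },
  { tag: { "op_id" => "AddBrokerAuthKeyOp" }, status: 0 },
]

# Collect the ids of ops that count as failed.
# force_failure: true imitates commenting out the `if status != 0` guard.
def collect_failed_ops(results, force_failure: false)
  failed_ops = []
  results.each do |r|
    failed_ops << r[:tag]["op_id"] if force_failure || r[:status] != 0
  end
  failed_ops
end

normal = collect_failed_ops(results)                      # guard in place
forced = collect_failed_ops(results, force_failure: true) # guard disabled

# Assumed message format, modeled on the REST response text.
message = "Failed to correctly execute all parallel operations - #{forced.inspect}"
puts message
```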

Result:
The class names of the failed ops are now shown to the user:
{
    "api_version": 1.6,
    "data": null,
    "messages": [
        {
            "exit_code": 0,
            "field": null,
            "index": null,
            "severity": "result",
            "text": "Disclaimer: This is an experimental cartridge that provides a way to try unsupported languages, frameworks, and middleware on OpenShift.\n"
        },
        {
            "exit_code": 0,
            "field": null,
            "index": null,
            "severity": "error",
            "text": "Unable to complete the requested operation due to: Failed to correctly execute all parallel operations - [\"UpdateAppConfigOp\", \"AddBrokerAuthKeyOp\"].\nReference ID: 2a187f2946964ce241e5468576ad695d"
        }
    ],
    "status": "internal_server_error",
    "supported_api_versions": [
        1.0,
        1.1,
        1.2,
        1.3,
        1.4,
        1.5,
        1.6
    ],
    "type": null,
    "version": "1.6"
}
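
A client that wants to surface these errors could filter the `messages` array by severity and split off the reference ID. The sketch below does this in Ruby against a trimmed copy of the response above (the disclaimer text is shortened and unused fields are dropped). It assumes the "Reference ID:" suffix always follows a newline, as it does in this response.

```ruby
require "json"

# Trimmed copy of the REST response above; a non-interpolating heredoc
# keeps the JSON escape sequences intact.
body = <<~'JSON'
  {
    "messages": [
      {"exit_code": 0, "severity": "result",
       "text": "Disclaimer: This is an experimental cartridge.\n"},
      {"exit_code": 0, "severity": "error",
       "text": "Unable to complete the requested operation due to: Failed to correctly execute all parallel operations - [\"UpdateAppConfigOp\", \"AddBrokerAuthKeyOp\"].\nReference ID: 2a187f2946964ce241e5468576ad695d"}
    ],
    "status": "internal_server_error"
  }
JSON

response = JSON.parse(body)

# Keep only error-severity messages.
errors = response["messages"].select { |m| m["severity"] == "error" }

# Split each error into its text and its reference ID.
parsed = errors.map { |m| m["text"].split("\nReference ID: ", 2) }

parsed.each do |text, reference|
  puts "error:     #{text}"
  puts "reference: #{reference}"
end
```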

Comment 1 Jason DeTiberus 2013-12-11 19:17:45 UTC
https://github.com/openshift/enterprise-server/pull/168

Comment 2 Luke Meyer 2013-12-11 19:52:38 UTC
For completeness, here's the origin-server commit:

commit 6dbbed5812682277295e8592c60ff013881949c2
Author: Rajat Chopra <rchopra>
Date:   Fri Nov 22 16:46:45 2013 -0800

    fix bz1031821 - node exceptions are now propagated. Failed ops are also mentioned in the error message

Comment 4 Jason DeTiberus 2013-12-12 21:38:27 UTC
Also: https://github.com/openshift/enterprise-server/pull/169

origin-server commit:
commit 8a6aa20988aea89e2b3ce393b3055f6ca77ec8fa
Author: Clayton Coleman <ccoleman>
Date:   Wed Dec 11 17:55:40 2013 -0500

    Provide a much clearer message on simple operation failures

Comment 6 Ma xiaoqiang 2013-12-17 06:58:07 UTC
Checked on puddle [2.0.z/2013-12-16.2]:

# Modify pending_app_op_group.rb as follows:
failed_ops << tag["op_id"] # if status != 0
# Restart the broker service
#curl -s -k -H 'Content-Type: Application/json' --user xiaom1:zcxc123 https://broker.ose20-1216-com.cn/broker/rest/domains/xiaom/applications/ -X POST -d '{"name":"d3", "cartridge":"diy-0.1"}'

Output:
{
    "api_version": 1.6,
    "data": null,
    "messages": [
        {
            "exit_code": 1,
            "field": null,
            "index": null,
            "severity": "error",
            "text": "Application config change did not complete on 4 gears. AddBrokerAuthKeyOp failed on 4 gears. Please try again and contact support if the issue persists.\nReference ID: f8eed311b97476b4e4fe6ea94081fcbb"
        }
    ],
    "status": "internal_server_error",
    "supported_api_versions": [
        1.0,
        1.1,
        1.2,
        1.3,
        1.4,
        1.5,
        1.6
    ],
    "type": null,
    "version": "1.6"
}
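
The newer message reports, per op, how many gears were affected. One way such a summary could be assembled is sketched below. The failure records, the gear ids, the `FRIENDLY` table, and the `failure_summary` helper are all hypothetical; only the wording of the resulting sentence is taken from the output above, and the real origin-server implementation may differ.

```ruby
# Hypothetical table of user-friendly prefixes; ops without an entry
# fall back to "<OpName> failed".
FRIENDLY = {
  "UpdateAppConfigOp" => "Application config change did not complete",
}.freeze

# Build one sentence per failed op, counting the gears it failed on.
def failure_summary(failures)
  parts = failures.group_by { |f| f[:op] }.map do |op, group|
    count  = group.size
    prefix = FRIENDLY.fetch(op, "#{op} failed")
    "#{prefix} on #{count} #{count == 1 ? 'gear' : 'gears'}."
  end
  parts << "Please try again and contact support if the issue persists."
  parts.join(" ")
end

# Four made-up gears, each failing both ops (as in the forced test above).
failures = %w[gear-1 gear-2 gear-3 gear-4].flat_map do |gear|
  [{ gear: gear, op: "UpdateAppConfigOp" },
   { gear: gear, op: "AddBrokerAuthKeyOp" }]
end

summary = failure_summary(failures)
puts summary
```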

# rhc app create diyapp diy
Using diy-0.1 (Do-It-Yourself 0.1) for 'diy'

Application Options
-------------------
  Domain:     xiaom
  Cartridges: diy-0.1
  Gear Size:  default
  Scaling:    no

Creating application 'diyapp' ... 
Application config change did not complete on 4 gears. AddBrokerAuthKeyOp failed on 4 gears. Please try again and contact support if the issue persists.
Reference ID: d6a0289735123438bf2f0688087166ed

Comment 7 Jason DeTiberus 2013-12-17 15:21:48 UTC
Upstream found a bug with this: https://github.com/openshift/enterprise-server/pull/170

Comment 8 Jason DeTiberus 2013-12-18 14:29:37 UTC
Additional origin-server commits:
commit b63a147713923bdf89b2ab1fabf78fd3dac2770d
Author: Jessica Forrester <jforrest>
Date:   Thu Dec 5 10:46:10 2013 -0500

    Bug 1038680 - improve console's reporting of 500 errors from rest api

commit 6c6158c1dcb060b46a804a97e38ae24e102c1859
Author: Jessica Forrester <jforrest>
Date:   Tue Dec 17 16:59:05 2013 -0500

    Report warnings and errors from broker when console gets rest api server error

Comment 9 Jason DeTiberus 2013-12-18 14:32:23 UTC
https://github.com/openshift/enterprise-server/pull/171

Comment 10 Ma xiaoqiang 2013-12-20 01:24:28 UTC
Tested on puddle [2.0.1/2013-12-18]: tried to access the application details via the REST API and a browser after stopping mongodb.

1. Output via browser:
Unable to complete the requested operation due to: Could not connect to a primary node for replica set <Moped::Cluster nodes=[<Moped::Node resolved_address="127.0.0.1:27017">]>. Reference ID: 637fb1a50b479b22ef7807eb60332255

2. Output via REST API:
<--snip-->
<text>Unable to complete the requested operation due to: Could not connect to a primary node for replica set &lt;Moped::Cluster nodes=[&lt;Moped::Node resolved_address="127.0.0.1:27017"&gt;]&gt;.
Reference ID: d51c8e9b24c4a15b7effbad3ae79020a</text>
      <exit-code>1</exit-code>
<--snip-->