Bug 1084292
Summary: | The app should be rolled back when it fails to be created with an unknown node name | |||
---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Anping Li <anli> | |
Component: | Node | Assignee: | Luke Meyer <lmeyer> | |
Status: | CLOSED ERRATA | QA Contact: | libra bugs <libra-bugs> | |
Severity: | medium | Docs Contact: | ||
Priority: | high | |||
Version: | 2.1.0 | CC: | abhgupta, adellape, bleanhar, jdetiber, libra-onpremise-devel | |
Target Milestone: | --- | Keywords: | Upstream | |
Target Release: | --- | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | rubygem-openshift-origin-controller-1.23.10.2-1.el6op | Doc Type: | Bug Fix | |
Doc Text: |
If a customized gear placement plug-in was incorrectly configured and returned an invalid node host name, creating a new application reported a communication error when it could not find the node on which to place gears. However, a record for the failed application was created in the MongoDB datastore, even though related gears did not exist on any nodes. This bug fix adds logic to validate the node host name returned by the gear placement plug-in. If the validation fails, the application creation is rolled back completely and datastore records for failed applications are no longer created.
|
Story Points: | --- | |
Clone Of: | ||||
: | 1093804 (view as bug list) | Environment: | ||
Last Closed: | 2014-06-23 07:37:36 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
Bug Depends On: | 1093804 | |||
Bug Blocks: |
Description
Anping Li
2014-04-04 05:38:10 UTC
OpenShift failed to create an application because the node was unavailable (the plugin returned a nonexistent node name). The app record stored in MongoDB should be cleared by the rollback process.

Abhishek, any idea what needs to be fixed? I agree we don't want applications in Mongo if they weren't successfully deployed.

Ideally, the plugin implementation by any customer should not have this bug. However, if this issue does happen, one option is to use the oo-admin-repair --removed-node command to detect any missing nodes and get rid of gears on those missing/removed nodes.

The correct flag is "removed-nodes":
oo-admin-repair --removed-nodes

I guess it isn't a bug in gear placement. Should the rollback feature cover it? By the way, oo-admin-repair --removed-nodes can't remove this type of app. It fails with the error below:

[root@br215 openshift]# oo-admin-repair --removed-nodes
Started at: 2014-04-08 02:13:09 UTC
Total gears found in mongo: 22
Servers that are unresponsive:
Server: nd216.ose-201403281.com (district: NONE), Confirm [yes/no]: yes
Some servers are unresponsive: nd216.ose-201403281.com
Found 1 unresponsive unscalable apps:
lessnode (id: 534359df307b9b0d13000001)
These apps can not be recovered. Do you want to delete all of them [yes/no]: yes
Finished at: 2014-04-08 02:14:35 UTC
Total time: 86.694s
Unable to delete application with id: 534359df307b9b0d13000001, error: Unable to perform action on app object. Another operation is already running.
FAILED

(In reply to Anping Li from comment #6)
> Unable to delete application with id: 534359df307b9b0d13000001, error:
> Unable to perform action on app object. Another operation is already running.

That indicates there is a lock on the application or domain, which is in Mongo. It expires after (I think) half an hour. It would be nice if oo-admin-repair could knock that out too.

This may be a reasonable fix to prevent this issue.
https://github.com/openshift/origin-server/pull/5366/files

Cherry-picked from origin-server:

commit c2264b5c4adad5a8ac91102492484674efba6000
Author: Abhishek Gupta <abhgupta>
Date: Thu May 1 14:17:51 2014 -0700

Bug 1093804: Validating the node returned by the gear-placement plugin

Verified and passed on OSE-2.1.z-2014-06-12.2.

1) Customize the gear placement algorithm and create one app.
[hanli1@broker ~]$ rhc apps|grep '@ h'
php @ http://php-hanli1dom.example.com/ (uuid: 539a8e19be1f289f88000009)

2) Modify gear_placement_plugin.rb so that it returns an invalid node name.
cat /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-gear-placement-0.1/lib/openshift/gear_placement_plugin.rb|grep return
# return server_infos.first
return NodeProperties.new("Hostname")

3) Run service openshift-broker restart and oo-admin-broker-cache -c.

4) Create a new app. Creation fails because of the invalid node.
[hanli1@broker ~]$ rhc app create php54 php-5.4
Application Options
-------------------
Domain: hanli1dom
Cartridges: php-5.4
Gear Size: default
Scaling: no

Creating application 'php54' ... Unable to complete the requested operation due to: Invalid node selected
Reference ID: 05a2c8b1bc4d359ce3f393d5da98a6ad

5) No residual data is left in MongoDB or on the DNS server.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHBA-2014-0781.html
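For readers who want to see the shape of the fix, here is a minimal Ruby sketch of the behavior the Doc Text describes: validate the node returned by the gear placement plug-in, and roll back the application record if validation fails. It is not the code from the cherry-picked commit. The names FakeDatastore, InvalidNodeError, validate_node!, create_app, BROKEN_PLUGIN and WORKING_PLUGIN are hypothetical, NodeProperties is redefined here as a bare Struct, and an in-memory hash stands in for MongoDB. The sketch also assumes a record is written before placement, which is what allowed the orphaned record in the first place.

```ruby
# Illustrative sketch only. Every name below is hypothetical and does not
# come from origin-server; a Hash stands in for the MongoDB datastore.
NodeProperties = Struct.new(:name)

class InvalidNodeError < StandardError; end

class FakeDatastore
  attr_reader :apps

  def initialize
    @apps = {}
  end

  def save(app_name, node_name)
    @apps[app_name] = node_name
  end

  def delete(app_name)
    @apps.delete(app_name)
  end
end

# A misconfigured plug-in like the one in verification step 2: it ignores
# the candidate nodes and returns a host name that does not exist.
BROKEN_PLUGIN = ->(_candidates) { NodeProperties.new("Hostname") }

# A well-behaved plug-in that picks one of the offered candidates.
WORKING_PLUGIN = ->(candidates) { candidates.first }

# Accept the plug-in's choice only if it names one of the candidate nodes.
def validate_node!(selected, candidates)
  known = candidates.map(&:name)
  unless selected.respond_to?(:name) && known.include?(selected.name)
    raise InvalidNodeError, "Invalid node selected"
  end
  selected
end

# Write a pending record, place the gear, and remove the record again if
# the plug-in's choice fails validation, so nothing is left behind.
def create_app(app_name, candidates, plugin, datastore)
  datastore.save(app_name, nil) # pending record (assumed ordering)
  begin
    node = validate_node!(plugin.call(candidates), candidates)
    datastore.save(app_name, node.name)
    node
  rescue InvalidNodeError
    datastore.delete(app_name) # roll back the pending record
    raise
  end
end

nodes = [NodeProperties.new("node1.example.com"),
         NodeProperties.new("node2.example.com")]
store = FakeDatastore.new

create_app("php", nodes, WORKING_PLUGIN, store)
begin
  create_app("php54", nodes, BROKEN_PLUGIN, store)
rescue InvalidNodeError => e
  puts e.message          # prints "Invalid node selected"
end
puts store.apps.keys.inspect # prints ["php"]
```

The check compares host names against the candidate list rather than trusting the returned object, so a plug-in that fabricates a NodeProperties value is rejected before any gear is placed.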