Bug 1171815
| Summary: | Cannot create Jenkins cartridge | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Miheer Salunke <misalunk> |
| Component: | Node | Assignee: | Timothy Williams <tiwillia> |
| Status: | CLOSED ERRATA | QA Contact: | libra bugs <libra-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 2.2.0 | CC: | abhgupta, adellape, agoldste, agrimm, bleanhar, erich, jhou, jialiu, jokerman, lars.moeberg, libra-bugs, libra-onpremise-devel, lxia, mfojtik, mmccomas, nicholas_schuetz, ssantos, tiwillia, xiama |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | openshift-origin-broker-util-1.36.2.2-1.el6op | Doc Type: | Bug Fix |
| Doc Text: | It is possible that when a Jenkins application fails to create and is rolled back, its domain environment variables still exist. These map to a non-existing gear component, and any new Jenkins applications cannot be created since the environment variables already exist. This bug fix updates the `oo-admin-repair` command to add the ability to clean up domains that have Jenkins environment variables with missing components. An administrator can use the `--orphaned-envs` switch with `oo-admin-repair` to clean environment variables from domains that do not have a related component. An administrator can also use `--domain <domain>` to specify a specific domain to repair. | Story Points: | --- |
| Clone Of: | 1126826 | Environment: | |
| Last Closed: | 2015-09-30 16:36:21 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1126826 | ||
| Bug Blocks: | 1131011 | ||
|
Comment 5
Luke Meyer
2014-12-12 19:31:34 UTC
We are also seeing odd behavior from `oo-admin-ctl-domain`. Unsure if this is related or should be another bug. Specifically, `oo-admin-ctl-domain` will not remove the Jenkins environment variables:

    $ sudo oo-admin-ctl-domain -c env_del -e JENKINS_PASSWORD -n exdomain -l exuser
    Environment variable JENKINS_PASSWORD not found withing domain exdomain
    $ sudo oo-admin-ctl-domain -c env_add -e JENKINS_PASSWORD -v secret -n exdomain -l exuser
    Environment variable with name JENKINS_PASSWORD already exists in the domain. Please remove it first

We can check the database and see that the environment variables are still present. The only way we were able to remove them was manually through the rails console:

    $ oo-exec-ruby irb
    > require '/var/www/openshift/broker/config/environment'
    > my_domain = Domain.find_by(:canonical_namespace => "exdomain")
    > my_domain.update_attributes(:env_vars => [])

Removing the env variables successfully works around the issue.

Comment 9
Ma xiaoqiang

Check on puddle[2.2.4/2015-02-02.1]

1. create a jenkins app
2. stop the ruby193-mcollective service as soon as possible, if found env_vars is written in domain
3. clear the pending op

    #oo-admin-clear-pending-ops
    0 applications were cleaned up. 0 users were cleaned up. 1 domains were cleaned up. 0 teams were cleaned up.

4. create jenkins app again

    fail to create app: Unexpected error: Cartridge attempted to override the following gear environment variables:
    JENKINS_USERNAME, JENKINS_PASSWORD

5. check the domain data in mongo

    > db.domains.findOne()
    "env_vars" : [
        {
            "key" : "JENKINS_URL",
            "value" : "https://jenkins-xiaom.ose22-manual.com.cn/",
            "component_id" : ObjectId("54d18db8e5fed51569000088"),
            "unique" : false
        },
        {
            "key" : "JENKINS_USERNAME",
            "value" : "system_builder",
            "component_id" : ObjectId("54d18db8e5fed51569000088"),
            "unique" : false
        },
        {
            "key" : "JENKINS_PASSWORD",
            "value" : "p79_rXfWBluw",
            "component_id" : ObjectId("54d18db8e5fed51569000088"),
            "unique" : false
        }
    ],

The env_vars is not cleared.
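The Rails console workaround in comment 5 empties the whole `env_vars` array, which would also discard any unrelated domain variables. A narrower filter might look like the sketch below. It works on plain Ruby hashes standing in for the Mongo document, not on the broker's `Domain` model; `strip_jenkins_vars` and the `MY_SETTING` entry are hypothetical names used only for illustration, and the three Jenkins key names are taken from the output above.

```ruby
# Hypothetical helper: drop only the Jenkins variables from a domain's
# env_vars list. Plain hashes stand in for the Mongo document here.
JENKINS_KEYS = %w[JENKINS_URL JENKINS_USERNAME JENKINS_PASSWORD].freeze

def strip_jenkins_vars(env_vars)
  # Keep every entry whose key is not one of the Jenkins keys.
  env_vars.reject { |var| JENKINS_KEYS.include?(var["key"]) }
end

# Illustrative data only; values are placeholders.
env_vars = [
  { "key" => "JENKINS_URL",      "value" => "https://jenkins.example.com/", "component_id" => "54d18db8e5fed51569000088" },
  { "key" => "JENKINS_PASSWORD", "value" => "secret",                       "component_id" => "54d18db8e5fed51569000088" },
  { "key" => "MY_SETTING",       "value" => "keep-me" }
]

remaining = strip_jenkins_vars(env_vars)
puts remaining.map { |var| var["key"] }.inspect
```

In the console session from comment 5, the filtered list would presumably be passed to the same `update_attributes` call in place of `[]`; that has not been verified against the broker code.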
(In reply to Ma xiaoqiang from comment #9)
> Check on puddle[2.2.4/2015-02-02.1]
>
> 1. create a jenkins app
> 2. stop the ruby193-mcollective service as soon as possible, if found
> env_vars is written in domain
> 3. clear the pending op
> #oo-admin-clear-pending-ops
> 0 applications were cleaned up. 0 users were cleaned up. 1 domains were
> cleaned up. 0 teams were cleaned up.
>
> 4. create jenkins app again
> fail to create app: Unexpected error: Cartridge attempted to override the
> following gear environment variables:
> JENKINS_USERNAME, JENKINS_PASSWORD
>
> 5. check the domain data in mongo
> > db.domains.findOne()
> "env_vars" : [
> {
> "key" : "JENKINS_URL",
> "value" : "https://jenkins-xiaom.ose22-manual.com.cn/",
> "component_id" : ObjectId("54d18db8e5fed51569000088"),
> "unique" : false
> },
> {
> "key" : "JENKINS_USERNAME",
> "value" : "system_builder",
> "component_id" : ObjectId("54d18db8e5fed51569000088"),
> "unique" : false
> },
> {
> "key" : "JENKINS_PASSWORD",
> "value" : "p79_rXfWBluw",
> "component_id" : ObjectId("54d18db8e5fed51569000088"),
> "unique" : false
> }
> ],
>
> The env_vars is not cleared.

After running 'oo-admin-ctl-domain -c env_del -e JENKINS_USERNAME -n xiaom -l xiaom', creating the jenkins app again still fails. The jenkins envs in mongo are not deleted.

In order to delete Jenkins env vars from the domain, I found that I had to patch the script to delete by component ID:

    --- /usr/sbin/oo-admin-ctl-domain 2015-02-19 14:33:38.000000000 -0500
    +++ oo-admin-ctl-domain.bz1171815 2015-04-10 13:47:09.962283575 -0400
    @@ -136,7 +136,11 @@
           exit 1
         end
     
    -    domain.remove_env_variables([{"key" => env["key"], "value" => env["value"]}])
    +    if env['component_id'].nil?
    +      domain.remove_env_variables([{"key" => env["key"], "value" => env["value"]}])
    +    else
    +      domain.remove_env_variables(env['component_id'])
    +    end
         domain.reload
         domain.run_jobs
       when "create"

I should mention that we may want some sort of command-line switch for the patch above, to avoid having admins delete several related variables at once when that was not their intent.

Andy, I made a PR with your patch and I added --remove_all_env option to confirm this action:
https://github.com/openshift/origin-server/pull/6134

Commit pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/584a456d0135cbc0d43995b0407e6f1d16531a57
Add fixing orphaned domain environment variables

Bug 1171815
Bugzilla link https://bugzilla.redhat.com/show_bug.cgi?id=1171815
It is possible, in some cases, for a domain environment variable to remain part of the domain after the related component has been removed. oo-admin-repair should be able to detect and clear these orphaned environment variables

Check on openshift-origin-broker-util-1.35.2.4-1.el6op.noarch

    # oo-admin-repair --orphaned-envs --domain xiaom
    /usr/sbin/oo-admin-repair: unrecognized option `--orphaned-envs

Check on openshift-origin-broker-util-1.36.2-1.el6oso.noarch.rpm
After repairing the envs, QE can create a jenkins app again. If dev adds the package to the OSE puddle, QE will move this bug to VERIFIED as soon as possible.

As said in comment 25, according to our workflow, if you want QE to verify this bug fix, please create an OSE puddle for QE's verification. So I will move this bug into "MODIFIED" status. BTW, the fix package in comment 24 has "oso" in its name; I guess it is targeted at origin, not OSE. I think a new "ose" rpm package needs to be built and included in the puddle.

Check on puddle [2.2.7/2015-09-21.1]
After orphaning envs in the domain, the jenkins app is created successfully.
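To illustrate the orphaned-variable check that the commit message above describes, here is a minimal sketch. It is not the `oo-admin-repair` implementation: `split_orphaned_env_vars` is a hypothetical helper, the data are plain hashes rather than broker models, and it assumes that entries without a `component_id` were set by a user and should be kept (in line with the `nil?` branch of the patch above).

```ruby
require "set"

# Hypothetical helper (not the oo-admin-repair implementation): split a
# domain's env_vars into entries to keep and orphaned entries whose
# component_id does not match any existing component.
def split_orphaned_env_vars(env_vars, live_component_ids)
  live = live_component_ids.to_set
  env_vars.partition do |var|
    id = var["component_id"]
    # Assumption: entries without a component_id were set by a user; keep them.
    id.nil? || live.include?(id)
  end
end

# Illustrative data only; "live-component" and DB_HOST are made-up names.
env_vars = [
  { "key" => "JENKINS_URL",      "value" => "https://jenkins.example.com/", "component_id" => "54d18db8e5fed51569000088" },
  { "key" => "JENKINS_USERNAME", "value" => "system_builder",               "component_id" => "54d18db8e5fed51569000088" },
  { "key" => "DB_HOST",          "value" => "db.example.com",               "component_id" => "live-component" },
  { "key" => "MY_SETTING",       "value" => "keep-me" }
]

kept, orphaned = split_orphaned_env_vars(env_vars, ["live-component"])
puts "kept:     #{kept.map { |v| v['key'] }.join(', ')}"
puts "orphaned: #{orphaned.map { |v| v['key'] }.join(', ')}"
```

With this sample data the two Jenkins entries end up in `orphaned`, because their `component_id` is not in the list of existing components, which mirrors the state seen in the Mongo output earlier in the thread.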
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-1844.html |