This particular bug is about domain env vars being left around in Mongo when a Jenkins creation failed to complete and was rolled back, which blocks the creation of another Jenkins app. That problem should be fixed upstream and in OSE 2.2 (though the last few comments on the upstream bug leave room for some doubt).

(In reply to Miheer Salunke from comment #2)
> I gave the customer 2 workarounds -
>
> 1.
> Try
> oo-admin-ctl-domain -c env_del -e JENKINS_USER -l <username> -n <domain> ;
> oo-admin-ctl-domain -c env_del -e JENKINS_PASSWORD -l <username> -n <domain>
>
> After trying option 1, the results were as follows -
> Still get same error when adding Jenkins:
>
> Unexpected error: Cartridge attempted to override the following gear
> environment variables: JENKINS_USERNAME, JENKINS_PASSWORD

I might add JENKINS_URL to the list of vars to delete, although the error isn't complaining about it.

So, that should have worked; given that it didn't, I would be interested to see the domain record for the faulty domain after deleting with oo-admin-ctl-domain, similar to https://bugzilla.redhat.com/show_bug.cgi?id=1126826#c12

> 2.
> Try deleting and recreating the domain using the rhc tools which should
> clear things up.
>
> Trying option 2 gives the following result and error:
>
> Jenkins created successfully. Please make note of these credentials:
> [...]
> Unable to complete the requested operation.
>
> Shell command '/sbin/runuser -s /bin/sh 5486e2883431a02b0e000001 -c
> "exec /usr/bin/runcon 'unconfined_u:system_r:openshift_t:s0:c5,c591' /bin/sh
> -c \"/var/lib/openshift/5486e2883431a02b0e000001/jenkins-client/bin/install
> --version 1\""' returned an error. rc=1
> Could not disable job 'appname-build' in Jenkins server: Continuing
> anyway.
> Could not add job 'appname-build' in Jenkins server: create_job
> status: 000 You'll need to correct this error before attempting to embed the
> Jenkins client again.
> application is not scalable

This seems like an unrelated bug. The Jenkins app was actually successfully created given a clean domain. However it didn't handle a client correctly. Could you file that as a separate bug?

> The above customer has following jenkins installed-
>
> openshift-origin-cartridge-jenkins-1.20.3.6-1.el6op.noarch Wed Nov 12
> 16:20:10 2014
> openshift-origin-cartridge-jenkins-client-1.19.3.5-1.el6op.noarch Wed Nov 12
> 16:23:16 2014

These indicate the use of OSE 2.1 which does not have the fix for this particular bug. They ought to be able to work around it as you indicated but it will potentially recur whenever a Jenkins app creation fails partway through.
We are also seeing odd behavior from `oo-admin-ctl-domain`. Unsure if this is related or should be another bug.

Specifically, oo-admin-ctl-domain will not remove the Jenkins environment variables:

$ sudo oo-admin-ctl-domain -c env_del -e JENKINS_PASSWORD -n exdomain -l exuser
Environment variable JENKINS_PASSWORD not found withing domain exdomain

$ sudo oo-admin-ctl-domain -c env_add -e JENKINS_PASSWORD -v secret -n exdomain -l exuser
Environment variable with name JENKINS_PASSWORD already exists in the domain. Please remove it first

We can check the database and see that the environment variables are still present. The only way we were able to remove them was manually through the rails console:

$ oo-exec-ruby irb
> require '/var/www/openshift/broker/config/environment'
> my_domain = Domain.find_by(:canonical_namespace => "exdomain")
> my_domain.update_attributes(:env_vars => [])

Removing the env variables this way successfully works around the issue.
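Note that the console workaround above clears every env var on the domain, not only the Jenkins ones. A narrower variant might filter the array first. The following is only a sketch in plain Ruby on an array of hashes shaped like domain env_vars entries; `drop_vars_with_prefix` is a hypothetical helper, the sample values are made up, and whether the filtered array can be handed to update_attributes in the same way as the empty array above is an assumption that has not been verified against a broker.

```ruby
# Hypothetical helper (not part of the broker): return a copy of the
# env var list without the entries whose key starts with the given prefix.
def drop_vars_with_prefix(env_vars, prefix)
  env_vars.reject { |ev| ev["key"].to_s.start_with?(prefix) }
end

# Made-up sample data; only the key names come from this bug.
env_vars = [
  { "key" => "JENKINS_URL",      "value" => "https://jenkins.example.com/" },
  { "key" => "JENKINS_USERNAME", "value" => "system_builder" },
  { "key" => "JENKINS_PASSWORD", "value" => "secret" },
  { "key" => "MY_SETTING",       "value" => "keep-me" }
]

remaining = drop_vars_with_prefix(env_vars, "JENKINS_")
puts remaining.map { |ev| ev["key"] }.inspect
```

The point of filtering rather than assigning an empty array is that unrelated domain variables survive the cleanup.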
Checked on puddle [2.2.4/2015-02-02.1]

1. Create a Jenkins app.
2. Stop the ruby193-mcollective service as soon as possible once env_vars is found to be written in the domain.
3. Clear the pending ops:
   # oo-admin-clear-pending-ops
   0 applications were cleaned up. 0 users were cleaned up. 1 domains were cleaned up. 0 teams were cleaned up.
4. Create the Jenkins app again. The app fails to create:
   Unexpected error: Cartridge attempted to override the following gear environment variables: JENKINS_USERNAME, JENKINS_PASSWORD
5. Check the domain data in Mongo:
   > db.domains.findOne()
   "env_vars" : [
     {
       "key" : "JENKINS_URL",
       "value" : "https://jenkins-xiaom.ose22-manual.com.cn/",
       "component_id" : ObjectId("54d18db8e5fed51569000088"),
       "unique" : false
     },
     {
       "key" : "JENKINS_USERNAME",
       "value" : "system_builder",
       "component_id" : ObjectId("54d18db8e5fed51569000088"),
       "unique" : false
     },
     {
       "key" : "JENKINS_PASSWORD",
       "value" : "p79_rXfWBluw",
       "component_id" : ObjectId("54d18db8e5fed51569000088"),
       "unique" : false
     }
   ],

The env_vars are not cleared.
(In reply to Ma xiaoqiang from comment #9)
> Check on puddle[2.2.4/2015-02-02.1]
>
> 1. create a jenkins app
> 2. stop the ruby193-mcollective service as soon as possible, if found
> env_vars is written in domain
> 3. clear the pending op
> #oo-admin-clear-pending-ops
> 0 applications were cleaned up. 0 users were cleaned up. 1 domains were
> cleaned up. 0 teams were cleaned up.
>
> 4. create jenkins app again
> fail to create app: Unexpected error: Cartridge attempted to override the
> following gear environment variables:
> JENKINS_USERNAME, JENKINS_PASSWORD
>
> 5. check the domain date in mongo
> > db.domains.findOne()
> "env_vars" : [
> {
> "key" : "JENKINS_URL",
> "value" : "https://jenkins-xiaom.ose22-manual.com.cn/",
> "component_id" : ObjectId("54d18db8e5fed51569000088"),
> "unique" : false
> },
> {
> "key" : "JENKINS_USERNAME",
> "value" : "system_builder",
> "component_id" : ObjectId("54d18db8e5fed51569000088"),
> "unique" : false
> },
> {
> "key" : "JENKINS_PASSWORD",
> "value" : "p79_rXfWBluw",
> "component_id" : ObjectId("54d18db8e5fed51569000088"),
> "unique" : false
> }
> ],
>
> The env_vars is not cleared.

I ran 'oo-admin-ctl-domain -c env_del -e JENKINS_USERNAME -n xiaom -l xiaom' and then tried to create the Jenkins app again; it still fails to create. The Jenkins env vars in Mongo are not deleted.
In order to delete Jenkins env vars from the domain, I found that I had to patch the script to delete by component ID:

--- /usr/sbin/oo-admin-ctl-domain 2015-02-19 14:33:38.000000000 -0500
+++ oo-admin-ctl-domain.bz1171815 2015-04-10 13:47:09.962283575 -0400
@@ -136,7 +136,11 @@
       exit 1
     end
 
-    domain.remove_env_variables([{"key" => env["key"], "value" => env["value"]}])
+    if env['component_id'].nil?
+      domain.remove_env_variables([{"key" => env["key"], "value" => env["value"]}])
+    else
+      domain.remove_env_variables(env['component_id'])
+    end
     domain.reload
     domain.run_jobs
   when "create"
I should mention that we may want some sort of command-line switch for the patch above, to avoid having admins delete several related variables at once when that was not their intent.
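To make the concern about deleting several related variables at once concrete, here is a toy model of the patched branch in plain Ruby. It is not the broker's code: `vars_removed_by` is a hypothetical function that only mimics the if/else in the patch, the component ID is held as a plain string rather than an ObjectId, and the idea that removal by component ID takes out every variable sharing that ID is my reading of the patch rather than confirmed behavior of remove_env_variables.

```ruby
# Toy model (hypothetical): which entries would go away when an admin asks
# to delete a single variable, following the branch in the patch.
def vars_removed_by(env_vars, target)
  if target["component_id"].nil?
    # No component attached: match on key and value only.
    env_vars.select { |ev| ev["key"] == target["key"] && ev["value"] == target["value"] }
  else
    # Component attached: everything with the same component ID is selected.
    env_vars.select { |ev| ev["component_id"] == target["component_id"] }
  end
end

jenkins_component = "54d18db8e5fed51569000088"
env_vars = [
  { "key" => "JENKINS_URL",      "value" => "https://jenkins.example.com/", "component_id" => jenkins_component },
  { "key" => "JENKINS_USERNAME", "value" => "system_builder",               "component_id" => jenkins_component },
  { "key" => "JENKINS_PASSWORD", "value" => "secret",                       "component_id" => jenkins_component },
  { "key" => "MY_SETTING",       "value" => "keep-me",                      "component_id" => nil }
]

by_component = vars_removed_by(env_vars, env_vars[1]).map { |ev| ev["key"] }
by_key_value = vars_removed_by(env_vars, env_vars[3]).map { |ev| ev["key"] }
puts "asked for JENKINS_USERNAME, selected: #{by_component.inspect}"
puts "asked for MY_SETTING, selected: #{by_key_value.inspect}"
```

In this model, asking for one Jenkins variable selects all three, which is why an explicit switch seems worthwhile.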
Andy, I made a PR with your patch and added a --remove_all_env option to confirm this action:
https://github.com/openshift/origin-server/pull/6134
Commit pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/584a456d0135cbc0d43995b0407e6f1d16531a57
Add fixing orphaned domain environment variables

Bug 1171815
Bugzilla link https://bugzilla.redhat.com/show_bug.cgi?id=1171815

It is possible, in some cases, for a domain environment variable to remain part of the domain after the related component has been removed. oo-admin-repair should be able to detect and clear these orphaned environment variables.
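As a rough illustration of what "orphaned" means here, the sketch below flags env vars whose component ID no longer matches any existing component. It is plain Ruby over in-memory data and is not taken from oo-admin-repair: `orphaned_env_vars` and the list of live component IDs are hypothetical, and the IDs are plain strings rather than ObjectIds.

```ruby
require "set"

# Hypothetical check: an env var is treated as orphaned when it carries a
# component ID that is not among the components that still exist.
def orphaned_env_vars(env_vars, live_component_ids)
  live = live_component_ids.to_set
  env_vars.select do |ev|
    cid = ev["component_id"]
    !cid.nil? && !live.include?(cid)
  end
end

env_vars = [
  { "key" => "JENKINS_URL",      "component_id" => "54d18db8e5fed51569000088" },
  { "key" => "JENKINS_PASSWORD", "component_id" => "54d18db8e5fed51569000088" },
  { "key" => "OTHER_VAR",        "component_id" => "aaaaaaaaaaaaaaaaaaaaaaaa" },
  { "key" => "MY_SETTING",       "component_id" => nil }
]

# Assumed state: only the second component still exists.
live_ids = ["aaaaaaaaaaaaaaaaaaaaaaaa"]

orphans = orphaned_env_vars(env_vars, live_ids)
puts orphans.map { |ev| ev["key"] }.inspect
```

Variables without a component ID are deliberately left alone in this sketch, on the assumption that they were set by hand rather than by a cartridge.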
Check on openshift-origin-broker-util-1.35.2.4-1.el6op.noarch

# oo-admin-repair --orphaned-envs --domain xiaom
/usr/sbin/oo-admin-repair: unrecognized option `--orphaned-envs
Checked on openshift-origin-broker-util-1.36.2-1.el6oso.noarch.rpm

After repairing the env vars, QE can create a Jenkins app again. If dev adds the package to an OSE puddle, QE will move this bug to VERIFIED as soon as possible.
As stated in comment 25, according to our workflow, QE needs an OSE puddle in order to verify this bug fix, so please create one. I will move this bug to "MODIFIED" status.

BTW, the fix package in comment 24 has "oso" in its name, so I guess it is targeted at origin, not OSE. I think a new "ose" rpm package needs to be built and included in the puddle.
Checked on puddle [2.2.7/2015-09-21.1]

After orphaning env vars in the domain, the Jenkins app can be created successfully.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-1844.html