Description of problem:
Starting a stopped aerogear app, or running git push against the app, produces the error "Could not connect to WildFly management interface, skipping deployment verification".

[root@Daphne test]# rhc app stop push1s
RESULT:
push1s stopped
[root@Daphne test]# rhc app start push1s
Could not connect to WildFly management interface, skipping deployment verification
RESULT:
push1s started

git push output:
remote: Stopping aerogear-push cart
remote: Sending SIGTERM to wildfly:32435 ...
remote: Stopping MySQL 5.5 cartridge
remote: /usr/bin/oo-exec-ruby: line 8: /bin/rpm: Permission denied
remote: Building git ref 'master', commit d66a4b6
remote: Preparing build for deployment
remote: Deployment id is e3425bda
remote: Activating deployment
remote: Starting MySQL 5.5 cartridge
remote: /usr/bin/oo-exec-ruby: line 8: /bin/rpm: Permission denied
remote: Deploying WildFly
remote: ls: cannot access /var/lib/openshift/5440be1dd20b7de1e2000003/app-root/runtime/repo//deployments: No such file or directory
remote: Starting aerogear-push cart
remote: Found 127.1.249.1:8080 listening port
remote: Found 127.1.249.1:9990 listening port
remote: CLIENT_MESSAGE: Could not connect to WildFly management interface, skipping deployment verification
remote: -------------------------
remote: Git Post-Receive Result: success
remote: Activation status: success
remote: Deployment completed with status: success
To ssh://5440be1dd20b7de1e2000003.rhcloud.com/~/git/push1.git/
   22f9be2..d66a4b6  master -> master

Version-Release number of selected component (if applicable):
devenv_5242

How reproducible:
always

Steps to Reproduce:
1. Create aerogear app from website
2. rhc app stop $app
3. rhc app start $app
4. Make some change and git push

Actual results:
Same as description

Expected results:
The app starts normally after being stopped, without the error.

Additional info:
remote: /usr/bin/oo-exec-ruby: line 8: /bin/rpm: Permission denied

I think this is due to this line:
https://github.com/openshift/origin-server/blob/master/util-scl/oo-exec-ruby#L8

We also call oo-exec-ruby when we run oo-erb, which is used to process a lot of gear ERB files.

Adam, can we use something other than RPM to get the system Ruby version (for example, /usr/bin/ruby -v)?
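To illustrate the suggestion of reading the version from the interpreter rather than from RPM, here is a minimal sketch. It is not the code in oo-exec-ruby; the function name ruby_version_from_banner is hypothetical, and it only assumes the usual "ruby X.Y.Z..." banner that ruby -v prints.

```shell
# Hypothetical helper (not part of origin-server): extract the X.Y.Z
# version from the banner printed by `ruby -v`, e.g.
#   ruby 1.9.3p484 (2013-11-22 revision 43786) [x86_64-linux]
# Prints nothing if the banner does not start with "ruby X.Y.Z".
ruby_version_from_banner() {
    printf '%s\n' "$1" |
        sed -n 's/^ruby \([0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\).*$/\1/p'
}

# Possible usage, assuming /usr/bin/ruby exists on the node:
#   ruby_version_from_banner "$(/usr/bin/ruby -v)"
```

Parsing the banner avoids calling /bin/rpm, which is the call the gear user is denied in the log above.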
This might be related to the bug above. QA, can you please re-test?
Retested on devenv_5248: the issue can still be reproduced after the "remote: /usr/bin/oo-exec-ruby: line 8: /bin/rpm: Permission denied" bug (https://bugzilla.redhat.com/show_bug.cgi?id=1153889) was fixed.

remote: Stopping aerogear-push cart
remote: Sending SIGTERM to wildfly:701 ...
remote: Stopping MySQL 5.5 cartridge
remote: Building git ref 'master', commit fd897c9
remote: Preparing build for deployment
remote: Deployment id is 3a212416
remote: Activating deployment
remote: Starting MySQL 5.5 cartridge
remote: Deploying WildFly
remote: ls: cannot access /var/lib/openshift/5444b91f6f9958d456000003/app-root/runtime/repo//deployments: No such file or directory
remote: Starting aerogear-push cart
remote: Found 127.1.245.1:8080 listening port
remote: Found 127.1.245.1:9990 listening port
remote: CLIENT_MESSAGE: Could not connect to WildFly management interface, skipping deployment verification
remote: -------------------------
remote: Git Post-Receive Result: success
remote: Activation status: success
remote: Deployment completed with status: success
To ssh://5444b91f6f9958d456000003.rhcloud.com/~/git/push1.git/
It seems the commit was not there yet. Can you please re-test? It seems to be merged now:
https://github.com/openshift/origin-server/pull/5883
Sorry, wrong BZ ;-)
PR to fix the 'ls' error message:
https://github.com/aerogear/openshift-origin-cartridge-aerogear-push/pull/9

I'm not sure about the WildFly error. Farah?
The error message [1] just indicates that the deployment scanner hasn't finished running yet. The getscanconfig method [2] only attempts to get the deployment scanner configuration a certain number of times; if the scanner hasn't finished running by then, the deployment verification step gets skipped. Note that ag-push.war and auth-server.war still get deployed successfully.

Increasing the number of attempts made in the getscanconfig method should improve things, but it might take some testing to figure out what number of attempts would be good to use.

[1] https://github.com/aerogear/openshift-origin-cartridge-aerogear-push/blob/master/bin/control#L40
[2] https://github.com/aerogear/openshift-origin-cartridge-aerogear-push/blob/master/bin/control#L18
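The bounded-retry behaviour described above can be sketched roughly as follows. This is not the cartridge's actual getscanconfig code; wait_for_probe and its arguments are hypothetical names, and the probe command stands in for whatever call queries the deployment scanner configuration.

```shell
# Hypothetical sketch of a bounded retry loop (not the real getscanconfig).
# Usage: wait_for_probe MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt fails.
wait_for_probe() {
    local max_attempts=$1
    local delay=$2
    shift 2
    local attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # The probe's own output is discarded; only its exit status matters.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # No point sleeping after the final failed attempt.
        if [ "$attempt" -lt "$max_attempts" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}
```

With this shape, raising MAX_ATTEMPTS (or DELAY_SECONDS) simply gives the scanner longer to respond before verification is skipped; it does not change whether the wars end up deployed.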
I'm fine with increasing the number of scan attempts for now (to fix this issue). Do you want me to do a PR for this? We could also make it configurable at the top of the control file:

DEPLOYMENT_SCAN_TIMEOUT=N

Also, if this is not an error and the war files get deployed anyway, perhaps we should not show an error to users, and instead just warn them that the scanner was not able to deploy the wars in time.
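A minimal sketch of what the configurable setting and the softer message might look like. The default of 20 is only a placeholder, the function name scan_timeout_warning is hypothetical, and the message wording is an assumption rather than what the merged PR prints.

```shell
# Hypothetical: keep a value already set in the environment, otherwise
# fall back to a placeholder default of 20.
DEPLOYMENT_SCAN_TIMEOUT="${DEPLOYMENT_SCAN_TIMEOUT:-20}"

# Hypothetical helper: report the skipped verification as a warning,
# since the wars may still be deployed by the scanner afterwards.
scan_timeout_warning() {
    echo "WARNING: deployment scanner did not respond within ${DEPLOYMENT_SCAN_TIMEOUT} attempts, skipping deployment verification"
}
```

Using the ${VAR:-default} expansion means the setting could be overridden per gear without editing the control file.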
Thanks, Michal - a PR would be great. A warning message instead of an error message is a good idea as well.
Merged Michal's PR: https://github.com/aerogear/openshift-origin-cartridge-aerogear-push/pull/9
Tested on devenv_5256. The PR seems to have fixed the issue only for git push:

remote: Stopping aerogear-push cart
remote: Sending SIGTERM to wildfly:20854 ...
remote: Syncing git content to other proxy gears
remote: Building git ref 'master', commit 21a6dc8
remote: Preparing build for deployment
remote: Deployment id is 87dda726
remote: Activating deployment
remote: HAProxy already running
remote: HAProxy instance is started
remote: Deploying WildFly
remote: WARNING: The ./deployments directory not found, skipping sync.
remote: Starting aerogear-push cart
remote: Found 127.1.245.129:8080 listening port
remote: Found 127.1.245.129:9990 listening port
remote: /var/lib/openshift/54475ea051153f0b85000058/aerogear-push/standalone/deployments /var/lib/openshift/54475ea051153f0b85000058/aerogear-push
remote: /var/lib/openshift/54475ea051153f0b85000058/aerogear-push
remote: CLIENT_MESSAGE: Artifacts deployed: ./auth-server.war ./ag-push.war
remote: -------------------------
remote: Git Post-Receive Result: success
remote: Activation status: success
remote: Deployment completed with status: success
To ssh://54475ea051153f0b85000058.rhcloud.com/~/git/push1s.git/

When starting a stopped app, the error is still shown:

[root@openshift test]# rhc app stop push1a
RESULT:
push1a stopped
[root@openshift test]# rhc app start push1a
Could not connect to WildFly management interface, skipping deployment verification
RESULT:
push1a started
Farah, I think we can consider the above as not an error, right? I don't get why the message is printed, since it seems to use the same logic as for git push. Maybe increase the timeout beyond 20 seconds?
Yes, I agree that we can consider the above as not an error. The logic being used for git push does seem the same as stopping and starting an app. However, if the deployment verification step consistently gets skipped when stopping and starting an app, increasing the timeout seems reasonable.
Yan Du: Given the above, I think we can move this bug to VERIFIED as the stop/start is not a bug (and might be fixed with increasing the timeout).
Moving the bug to VERIFIED according to the above comments.