| Summary: | oo-admin-move uses wrong UUID when doing a move | ||
|---|---|---|---|
| Product: | OpenShift Online | Reporter: | Troy Dawson <tdawson> |
| Component: | Containers | Assignee: | Rob Millner <rmillner> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | libra bugs <libra-bugs> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | ||
| Version: | 2.x | CC: | bmeng, mfisher, rmillner |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2013-09-19 16:48:46 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Description
Troy Dawson
2013-08-30 14:01:33 UTC
Evidently the <WRONG APP UUID> is a red herring. I just retried this on some of the failed moves. One of them successfully moved the correct app, even with the <WRONG APP UUID>. One of them continued to fail with the same failures, even though it had the <CORRECT APP UUID>.

Further Investigation: The node is not returning the right information. If we don't try the move, but just do a status, we get the following.

Normal App:
# oo-admin-ctl-app -l <LOGIN ID> -a <APP NAME> -c status
Application is either stopped or inaccessible
# echo $?
0

Failing App:
# oo-admin-ctl-app -l <LOGIN ID> -a <APP NAME> -c status
DEBUG OUTPUT:
Failed to execute: 'control status' for /var/lib/openshift/<UUID>/python
Command return code: 7
Success
# echo $?
0

Side Note: The vast majority of these are python, but not all.

All of these are python apps. I have managed to move all of the non-python apps. Only a handful of python apps (10 out of 2000+) were moved successfully, while all the rest give this "Command return code: 7" error.

The following two commits (master, stage) fix the "Command return code: 7" error. It was due to the exit code of the curl commands interacting poorly with the control script running with "-e".
https://github.com/openshift/origin-server/pull/3529
https://github.com/openshift/origin-server/pull/3530

The above commits should have resolved the underlying issue in the ticket. Moving to Q/E.

From the IRC discussion, it appears as though the problem is not resolved.

Release ticket updated to request an mcollective reload.

Tested on devenv-stage_461 with multi-node.
Moved about 50 gears across all python versions combined. No errors found.
Checking status for a python app also returns the correct result.
[root@ip-10-152-133-8 ~]# oo-admin-ctl-app -l bmeng -a py271 -c status
Application is running
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
178 178 178 178 0 0 12241 0 --:--:-- --:--:-- --:--:-- 13692
Total Accesses: 0
Total kBytes: 0
Uptime: 5199
ReqPerSec: 0
BytesPerSec: 0
BusyWorkers: 1
IdleWorkers: 0
Scoreboard: W...........................................................
[root@ip-10-152-133-8 ~]# echo $?
0
Move bug to verified.
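
For readers unfamiliar with the "-e" interaction named in the fix comment, here is a minimal sketch of the failure mode. It is not the code from the linked pull requests. `probe` is a hypothetical stand-in for the control script's curl call and simply returns 7, the exit code curl uses when it cannot connect to the host. The variable names and the "status reported" message are likewise illustrative assumptions.

```shell
#!/bin/bash
# Hypothetical stand-in for the status probe: returns 7, the code curl
# uses when it cannot connect to the host. Not taken from the real script.
probe='probe() { return 7; }'

# Unguarded call: under -e the child script aborts at the failing probe,
# so the final echo never runs and the script exits with 7.
unguarded_rc=0
unguarded_out=$(bash -e -c "$probe; probe; echo 'status reported'") || unguarded_rc=$?

# Guarded call: the failure is caught in an || list, which -e ignores,
# so the script continues and can report the condition itself.
guarded_rc=0
guarded_out=$(bash -e -c "$probe; rc=0; probe || rc=\$?; echo \"probe rc=\$rc, status reported\"") || guarded_rc=$?

echo "unguarded: rc=$unguarded_rc out='$unguarded_out'"
echo "guarded:   rc=$guarded_rc out='$guarded_out'"
```

The unguarded variant mirrors the symptom in this ticket: the script dies with return code 7 before it can print a status. Capturing the probe's exit code explicitly is one common way to keep a "-e" script alive. Whether the actual fix took this approach is not stated in the thread.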