Description of problem:
A gear failed to move due to a time out in mongo. This left the gear in a bad state as DNS was still pointed at the new server.

Fri May 3 15:30:41 EDT 2013
URL: http://project-liveC91f6895f9ef.rhcloud.com
Login: liveC91f6895f9ef
App UUID: 515792ec4382ec6f120000f3
Gear UUID: 515792ec4382ec6f120000f3
DEBUG: Source district uuid: cc37a161477b4ca2a68b331ac138c4ba
DEBUG: Destination district uuid: cc37a161477b4ca2a68b331ac138c4ba
DEBUG: Getting existing app 'project' status before moving
DEBUG: Gear component 'diy-0.1' was stopped
DEBUG: Creating new account for gear 'project' on ex-c9-node24.prod.rhcloud.com
DEBUG: Moving content for app 'project', gear 'project' to ex-c9-node24.prod.rhcloud.com
Identity added: /var/www/openshift/broker/config/keys/rsync_id_rsa (/var/www/openshift/broker/config/keys/rsync_id_rsa)
Write failed: Broken pipe
Agent pid 14357
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 14357 killed;
DEBUG: Moving system components for app 'project', gear 'project' to ex-c9-node24.prod.rhcloud.com
Identity added: /var/www/openshift/broker/config/keys/rsync_id_rsa (/var/www/openshift/broker/config/keys/rsync_id_rsa)
Agent pid 30361
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 30361 killed;
DEBUG: Fixing DNS and mongo for gear 'project' after move
DEBUG: Changing server identity of 'project' from 'ex-c9-node23.prod.rhcloud.com' to 'ex-c9-node24.prod.rhcloud.com'
DEBUG: Moving failed. Rolling back gear 'project' 'project' with remove-httpd-proxy on 'ex-c9-node24.prod.rhcloud.com'
DEBUG: Moving failed. Rolling back gear 'project' in 'project' with destroy on 'ex-c9-node24.prod.rhcloud.com'
/opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/sockets/connectable.rb:45:in `read': Connection timed out (Errno::ETIMEDOUT)
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/sockets/connectable.rb:45:in `block in read'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/sockets/connectable.rb:78:in `handle_socket_errors'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/sockets/connectable.rb:45:in `read'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/connection.rb:177:in `read_data'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/connection.rb:99:in `block in read'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/connection.rb:202:in `with_connection'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/connection.rb:97:in `read'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/protocol/query.rb:148:in `receive_replies'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/connection.rb:135:in `block in receive_replies'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/connection.rb:134:in `map'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/connection.rb:134:in `receive_replies'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/node.rb:561:in `block (2 levels) in flush'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/node.rb:129:in `ensure_connected'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/node.rb:559:in `block in flush'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/node.rb:574:in `logging'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/node.rb:558:in `flush'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/node.rb:547:in `process'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/node.rb:71:in `command'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/node.rb:400:in `refresh'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/cluster.rb:168:in `block in refresh'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/cluster.rb:181:in `each'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/cluster.rb:181:in `refresh'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/cluster.rb:134:in `nodes'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/cluster.rb:202:in `with_primary'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/session/context.rb:108:in `with_node'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/session/context.rb:50:in `command'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/database.rb:76:in `command'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/session.rb:78:in `command'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/query.rb:239:in `block in modify'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/session.rb:312:in `with'
    from /opt/rh/ruby193/root/usr/share/gems/gems/moped-1.3.2/lib/moped/query.rb:238:in `modify'
    from /opt/rh/ruby193/root/usr/share/gems/gems/mongoid-3.0.21/lib/mongoid/contextual/find_and_modify.rb:44:in `result'
    from /opt/rh/ruby193/root/usr/share/gems/gems/mongoid-3.0.21/lib/mongoid/contextual/mongo.rb:185:in `find_and_modify'
    from /opt/rh/ruby193/root/usr/share/gems/gems/mongoid-3.0.21/lib/mongoid/contextual.rb:18:in `find_and_modify'
    from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.7.11/app/models/lock.rb:127:in `unlock_application'
    from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.7.11/app/models/application.rb:1148:in `ensure in run_in_application_lock'
    from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.7.11/app/models/application.rb:1148:in `run_in_application_lock'
    from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-msg-broker-mcollective-1.7.5/lib/openshift/mcollective_application_container_proxy.rb:1696:in `move_gear_secure'
    from /usr/sbin/oo-admin-move:112:in `<main>'

Version-Release number of selected component (if applicable):
Current.

How reproducible:
This would be difficult to get the exact timing correct but I believe it is reproducible.

Steps to Reproduce:
1. Create an application.
2. Move the application.
3. During the move, stop the mongo database.
4. Verify the application cleans itself up properly or flags itself as in a bad state.

Actual results:
The application is in a bad state without any ability to recover without manual intervention.

Expected results:
Should attempt a retry, a sleep, or some sort of a back off algorithm in order to leave the application in a working state.

Additional info:
Rerunning the move fixed this application immediately.
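To illustrate the retry and back off idea from "Expected results": the trace shows the timeout surfacing in `unlock_application`, called from the ensure block of `run_in_application_lock`. Below is a minimal Ruby sketch of a retry wrapper with exponential back off that could, in principle, be placed around such a call. The names `with_retries`, `RETRYABLE_ERRORS`, and `sleeper` are hypothetical and are not part of openshift-origin-controller, Mongoid, or Moped; the choice of retryable errors is also an assumption.

```ruby
# Hypothetical helper, not taken from the broker code base.
# Assumption: these errors are transient and safe to retry.
RETRYABLE_ERRORS = [Errno::ETIMEDOUT, Errno::ECONNRESET, Errno::ECONNREFUSED].freeze

# Runs the block, retrying on transient connection errors.
# attempts:   total number of tries before giving up
# base_delay: seconds to wait after the first failure; doubles on each retry
# sleeper:    callable used to wait (injectable so tests need not sleep)
def with_retries(attempts = 3, base_delay = 0.5, sleeper = Kernel.method(:sleep))
  tries = 0
  begin
    tries += 1
    yield tries
  rescue *RETRYABLE_ERRORS
    # Out of attempts: re-raise so the caller can roll back or flag the gear.
    raise if tries >= attempts
    sleeper.call(base_delay * (2**(tries - 1)))
    retry
  end
end

# Usage example: the first two calls time out, the third succeeds.
calls = 0
status = with_retries(3, 0.5, lambda { |_seconds| }) do
  calls += 1
  raise Errno::ETIMEDOUT if calls < 3
  :unlocked
end
```

If every attempt fails, the last exception still propagates, so the move would still report an error and could leave the gear needing attention; the sketch only narrows the window in which a brief mongo outage leaves the application locked or pointing at the wrong node.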
Lowering severity since re-running oo-admin-move fixed the issue.
This happened because the connection to mongo failed. Not sure what can be done in such cases. Marking this as not-reproducible for now. If this happens more often, we will dig deeper at that time.
If MongoDB is shut down during the move, this issue can be reproduced. This situation is very unlikely to occur. Could a developer please check whether this issue needs to be fixed or whether it can be closed? Thanks.
I don't think there is anything that can be done about this outside of manual intervention. The gear move does throw an error and Admin/Ops will be required to investigate and fix.