Description of problem:
"git push origin" dies while pushing changes to the leaf gears of a scaled Rails app.

Version-Release number of selected component (if applicable):
www.openshift.com as of May 13, 2013

How reproducible:
Very often, especially when new gems or new versions of gems are introduced to the Gemfile

Steps to Reproduce:
1. Create a scaled Rails app on openshift.com (preferably with small gears)
2. Change Gemfile & Gemfile.lock in your local copy to the ones in the attachments
3. git push origin

Actual results:
Deployment dies with the following snippet:
--------------------snip ------------------------
remote: Your bundle is complete! It was installed into ./vendor/bundle
remote: Precompiling with 'bundle exec rake assets:precompile'
remote: Running .openshift/action_hooks/build
remote: Running .openshift/action_hooks/build
remote: MySQL already running
remote: Running .openshift/action_hooks/deploy
remote: Database server found at 518ad7bb4382ec74f8000005-ag47.rhcloud.com. initializing...
remote: SSH_CMD: ssh 518c6b5f5973caa19000003d.70.78
remote: SSH_CMD: ssh 518c6b5f5973caa19000003e.147.104
Read from remote host rails-ag47.rhcloud.com: Connection reset by peer
fatal: The remote end hung up unexpectedly
error: error in sideband demultiplexer
To ssh://518ad7bb4382ec74f8000002.com/~/git/rails.git/
   fdf3694..cab3939  master -> master
error: failed to push some refs to 'ssh://518ad7bb4382ec74f8000002.com/~/git/rails.git/'

My ~/.ssh/config:
Host rails-ag47.rhcloud.com
  User 518ad7bb4382ec74f8000002
  IdentityFile ~/.ssh/id_rsa.OpenShift_AG47
  ServerAliveInterval 180
  ServerAliveCountMax 10

Expected results:
Successful deployment of the application

Additional info:
I believe the issue is the SSH configuration on the head gear. I do not have access to ~/.ssh/config there, so I cannot be sure. When I log in to the head gear after a failure and execute "rsync ruby-1.9/repo/ <my leaf gear>:ruby-1.9/repo/", there are still some files that are not synchronized between gears.
Since my vendor/bundle is larger than 100 MB, it takes a significant amount of time to compress its contents, ship them, and decompress them on the leaf gear. In this case rsync to both leaf gears is started in parallel, so 2 gzip sessions are each compressing 100 MB at the same time on the head gear.
Created attachment 747694: Gemfile before change
Created attachment 747695: Gemfile.lock before change
Created attachment 747696: Gemfile after change
Created attachment 747697: Gemfile.lock after change
Did you modify the post-receive / pre-receive hooks on the gear? Another possibility is a corrupted repository: https://help.github.com/articles/fixing-egit-corruption
Hi Mrunal,

1) No, I did not modify those hooks.
2) My repository seems to be OK (not corrupted) because I can use "push" and "pull" from my desktop.

I believe the issue is a delay between gears when there is no SSH traffic between them for longer than the default interval. Hence, the solution could be as simple as adding 2 lines to ~/.ssh/config on the head gear for communication with all leaf gears:

ServerAliveInterval 180
ServerAliveCountMax 10

Of course, the exact values should be agreed with the OS admins. Unfortunately, I do not have access to this file and cannot prove or disprove the theory.

Thanks,
Boris
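A minimal sketch of the client-side stanza proposed above, written as a shell function that prints the stanza so it can be reviewed before anyone appends it to a config file. The function name `print_keepalive_stanza` and the `Host *` pattern are assumptions for illustration; the host pattern that would match leaf gears on OpenShift is not known from this report. `ServerAliveInterval` and `ServerAliveCountMax` are standard OpenSSH client options.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenShift): prints an ssh_config stanza
# that enables client-side keepalives.
#   $1 = host pattern (assumed; "*" matches every host)
#   $2 = seconds of silence before a keepalive probe is sent
#   $3 = unanswered probes tolerated before ssh disconnects
print_keepalive_stanza() {
    host_pattern=${1:-*}
    interval=${2:-180}
    count_max=${3:-10}
    printf 'Host %s\n' "$host_pattern"
    printf '    ServerAliveInterval %s\n' "$interval"
    printf '    ServerAliveCountMax %s\n' "$count_max"
}

# Print the stanza with the values suggested in this comment.
print_keepalive_stanza '*' 180 10
```

The output could be redirected into a scratch file and inspected first; whether a gear user may write to ~/.ssh/config on the head gear is exactly the open question in this comment.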
I modified the delayed_job calls to be nohup'ed and git push worked fine for the app. Could you git clone from the app and try a push again?
I did it without success. See my report in bug #962807.

Thanks,
Boris
Boris,

It looks like git push itself is failing because of a timeout. Could you take out ServerAliveInterval and ServerAliveCountMax or set them to a much higher value? I don't have those settings and git push worked fine for me. It did take ~7 min to complete, much longer than 180 seconds.

remote: https://www.openshift.com/legal
remote:
remote: *********************************************************************
remote:
remote: Welcome to OpenShift shell
remote:
remote: This shell will assist you in managing OpenShift applications.
remote:
remote: !!! IMPORTANT !!! IMPORTANT !!! IMPORTANT !!!
remote: Shell access is quite powerful and it is possible for you to
remote: accidentally damage your application. Proceed with care!
remote: If worse comes to worst, destroy your application with 'rhc app delete'
remote: and recreate it
remote: !!! IMPORTANT !!! IMPORTANT !!! IMPORTANT !!!
remote:
remote: Type "help" for more info.
remote:
remote: Starting services
remote: WARNING: This ssh terminal was started without a tty.
remote: It is highly recommended to login with: ssh -t
remote: Done
remote: + for rpccall in '"${OPENSHIFT_SYNC_GEARS_POST[@]}"'
remote: + ssh 518c6b5f5973caa19000003e.147.104 post_deploy.sh
remote: Running .openshift/action_hooks/post_deploy
remote: Exit code: 0
remote: hot_deploy_added=false
remote: Done
remote: Running .openshift/action_hooks/post_deploy
To ssh://518ad7bb4382ec74f8000002.com/~/git/rails.git/
   dc83e85..a75ad63  master -> master

real 7m8.676s
user 0m0.033s
sys 0m0.039s
[mrunal@localhost rails]$

Thanks,
Mrunal
Hi Mrunal,

Thanks for the update. I have updated my SSH configuration (~/.ssh/config) to:

ServerAliveInterval 60
ServerAliveCountMax 15

and tested it twice. Both "git push origin" runs were successful. Let me check it a few more times from different locations (network configurations, e.g. wired, wireless, ...).

I also found that the yui-compressor gem requires the Java JDK. This could cause a slowdown on the OpenShift side when it starts on a small gear that is short on memory.

Thanks again,
Boris
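For reference, a small worked calculation of what the two settings mean together. In OpenSSH, the client sends a probe after `ServerAliveInterval` seconds without data from the server and disconnects after `ServerAliveCountMax` consecutive unanswered probes, so an unresponsive peer is tolerated for roughly the product of the two. The helper name `keepalive_budget` is made up for this sketch.

```shell
#!/bin/sh
# Hypothetical helper: approximate number of seconds an unresponsive server
# is tolerated before the ssh client gives up.
#   $1 = ServerAliveInterval (seconds), $2 = ServerAliveCountMax
keepalive_budget() {
    echo $(( $1 * $2 ))
}

keepalive_budget 180 10   # original settings -> 1800
keepalive_budget 60 15    # updated settings  -> 900
```

With a responsive server the probes are answered and the connection stays open regardless of how long the push runs. The shorter interval also means an otherwise idle connection carries traffic every 60 seconds instead of every 180, which may matter if something on the network path drops connections that are idle for a few minutes; that is an assumption, not something confirmed in this report.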
Hi Mrunal,

I tested "git push" 2 more times (even over a cellular network by pairing with an iPhone). Both times it was successful. So now I have tried it 4 times with a 100% success rate. Please do not close the bug for another week, so I can try it a few more times.

Thanks,
Boris
Hi Mrunal,

Could you please check the same deployment (as in Comment #9) against a scaled Rails application deployed on 2 small gears? rails-ag47.rhcloud.com uses 3 medium gears.

Very often "git push" was dying during the "rake assets:precompile" phase. This is when the Rails application gets all static assets (JS, CSS) minimized, compiled and zipped. In the background, the 'yui-compressor' gem starts the Java JDK to carry out some tasks. On a small gear with 512 MB RAM this could take a significant amount of time. During this phase there is no communication between my desktop and the OpenShift gear. This is why the SSH channel would time out after 3 minutes (default configuration). By setting those ServerAlive* parameters I can rectify it.

Another Achilles' heel of the deployment is the syncing phase between head and leaf gears. Here application code gets pushed by "rsync" between gears. If it is a new deployment (or a significant change in the Gemfile) with a number of gems, it could take a significant amount of time. For example, in my application there are more than 100 MB of gem files in the "vendor/bundle/ruby/1.9.1/gems/" directory. The current "rsync" is configured to compress all data between gears, hence it has to compress 100 MB of files, ship them over to another gear and decompress them there. All of this is done via SSH, hence add encryption & decryption of network data. If I have a 3-gear application, then this task is done in 2 parallel streams, one for each leaf gear. Needless to say, it can take more than 3 minutes, and here the SSH channel between gears will time out.

To make a long story short: would you consider adding keepalive pings to the default SSH gear configuration? It can be done as SSH server configuration or as SSH client (~/.ssh/config) configuration. I would prefer the second one.

Thanks,
Boris
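A rough sketch of how keepalives could be attached to the gear-to-gear sync without touching any config file, by passing the options through rsync's `-e` (remote shell) flag. The function only prints the command rather than running it. The name `build_sync_cmd`, the `LEAF_GEAR` placeholder and the `-az` flags are assumptions; the rsync invocation OpenShift actually uses is not shown in this report. The path is the one from the manual rsync test in the description.

```shell
#!/bin/sh
# Hypothetical sketch: build (and print, not run) an rsync command whose ssh
# transport sends keepalive probes.
#   $1 = leaf gear ssh address (placeholder)
#   $2 = ServerAliveInterval, $3 = ServerAliveCountMax
build_sync_cmd() {
    leaf=$1
    interval=$2
    count_max=$3
    # -a: archive mode, -z: compress in transit (as the report says is configured)
    printf 'rsync -az -e "ssh -o ServerAliveInterval=%s -o ServerAliveCountMax=%s" %s %s:%s\n' \
        "$interval" "$count_max" 'ruby-1.9/repo/' "$leaf" 'ruby-1.9/repo/'
}

build_sync_cmd 'LEAF_GEAR' 60 15
```

The server-side alternative mentioned above would correspond to the `ClientAliveInterval` and `ClientAliveCountMax` options in sshd_config, which make the server probe an idle client instead.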
Some of the enhancements that we are making to the platform will handle these cases better. Lowering severity as this isn't considered a blocker for the next release.
Moving it back to assigned status since there are still some enhancements pending.
Boris,

Are you seeing any issues with your application?

Thanks,
Mrunal
I am going to close the bug for now. Please re-open or open a new bug if you see the issue again.