Description of problem:
After creating 1755 scalable php apps on a node and rebooting the node, not all of the apps start correctly. Some of them are unavailable.

Version-Release number of selected component (if applicable):
2.0/2013-11-26.1
(Before rebooting the node, the package rubygem-openshift-origin-node was updated to the latest version: rubygem-openshift-origin-node-1.17.5-3.el6op.noarch.rpm)

How reproducible:
always

Steps to Reproduce:
1. Create as many scalable php apps as possible on a node.
2. Reboot the node.

Actual results:
1600 apps were available after the node restarted, which means 155 apps failed to start.

The log shows 68 errors during app start. The error log in /var/log/openshift/node/platform.log:

...
December 18 18:41:19 INFO Shell command '/sbin/runuser -s /bin/sh 5295dacf78e213a7d600247e -c "exec /usr/bin/runcon 'unconfined_u:system_r:openshift_t:s0:c1,c187' /bin/sh -c \"set -e; /var/lib/openshift/5295dacf78e213a7d600247e/haproxy/bin/control start \""' ran. rc=0 out=/opt/rh/ruby193/root/usr/share/gems/gems/daemons-1.0.10/lib/daemons/application.rb:330:in `kill': Operation not permitted (Errno::EPERM)
    from /opt/rh/ruby193/root/usr/share/gems/gems/daemons-1.0.10/lib/daemons/application.rb:330:in `stop'
    from /opt/rh/ruby193/root/usr/share/gems/gems/daemons-1.0.10/lib/daemons/application_group.rb:135:in `block in stop_all'
    from /opt/rh/ruby193/root/usr/share/gems/gems/daemons-1.0.10/lib/daemons/application_group.rb:131:in `each'
    from /opt/rh/ruby193/root/usr/share/gems/gems/daemons-1.0.10/lib/daemons/application_group.rb:131:in `stop_all'
    from /opt/rh/ruby193/root/usr/share/gems/gems/daemons-1.0.10/lib/daemons/controller.rb:74:in `run'
    from /opt/rh/ruby193/root/usr/share/gems/gems/daemons-1.0.10/lib/daemons.rb:139:in `block in run'
    from /opt/rh/ruby193/root/usr/share/gems/gems/daemons-1.0.10/lib/daemons/cmdline.rb:105:in `call'
    from /opt/rh/ruby193/root/usr/share/gems/gems/daemons-1.0.10/lib/daemons/cmdline.rb:105:in `catch_exceptions'
    from /opt/rh/ruby193/root/usr/share/gems/gems/daemons-1.0.10/lib/daemons.rb:138:in `run'
    from /var/lib/openshift/5295dacf78e213a7d600247e/haproxy/usr/bin/haproxy_ctld_daemon.rb:21:in `<main>'
ERROR: there is already one or more instance(s) of the program running
HAProxy instance is started
...

Logging into some of the unavailable apps and checking their processes shows that some of them have no httpd process:

[app1722-name1722.scalability.com 52af1c0d78e213ff39000b04]\> ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
2721     21972     1  0 10:03 ?        00:00:06 haproxy_ctld.rb
2721     24602     1  0 Dec18 ?        00:00:07 /usr/sbin/haproxy -f /var/lib/openshift/52af1c0d78e213ff39000b04/haproxy//conf/haproxy.cfg
2721     29035 29015  1 12:54 ?        00:00:00 sshd: 52af1c0d78e213ff39000b04@pts/5
2721     29038 29035  5 12:54 pts/5    00:00:00 /bin/bash --init-file /usr/bin/rhcsh -i
2721     29352 29038  0 12:54 pts/5    00:00:01 ps -ef

while others seem to be stuck in the "start" process:

[app1709-name1709.scalability.com 52aeead378e213ff39000837]\> ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
2708      8247     1  0 Dec18 ?        00:00:00 /bin/bash -e /var/lib/openshift/52aeead378e213ff39000837/haproxy/bin/control start
2708     27363 27326  0 12:53 ?        00:00:00 sshd: 52aeead378e213ff39000837@pts/5
2708     27366 27363  6 12:53 pts/5    00:00:00 /bin/bash --init-file /usr/bin/rhcsh -i
2708     27686 27366  0 12:53 pts/5    00:00:01 ps -ef
2708     27687  8247  0 12:53 ?        00:00:00 /bin/bash -e /var/lib/openshift/52aeead378e213ff39000837/haproxy/bin/control start
2708     27688 27687  0 12:53 ?        00:00:00 /bin/bash -e /var/lib/openshift/52aeead378e213ff39000837/haproxy/bin/control start
2708     27689 27687  0 12:53 ?        00:00:00 cut -f 3 -d ,
2708     27690 27688  0 12:53 ?        00:00:00 scl enable ruby193 ruby /usr/bin/oo-gear-registry web
2708     27692 27690  0 12:53 ?        00:00:00 /bin/bash /var/tmp/sclstWYWd
2708     27703 27692  0 12:53 ?        00:00:00 ruby /usr/bin/oo-gear-registry web

After the unavailable apps are restarted manually, they return to normal and become available again.

Expected results:
All the apps should start after the node reboots.

Additional info:
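For triaging a large number of gears, the two failure patterns in the `ps -ef` listings above could be told apart with a small script. The following is only a sketch: the function name `classify_gear` and the labels it returns are made up for illustration, and it assumes that a healthy php gear shows an `httpd` process and that a leftover `haproxy/bin/control start` process means the start never finished.

```python
def classify_gear(ps_output: str) -> str:
    """Classify a gear from the text of `ps -ef` run inside it.

    The labels returned here are hypothetical and not part of OpenShift.
    """
    commands = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        # Columns: UID PID PPID C STIME TTY TIME CMD (CMD may contain spaces)
        fields = line.split(None, 7)
        if len(fields) == 8:
            commands.append(fields[7])

    # A lingering control script suggests the start is still hanging.
    if any("haproxy/bin/control start" in cmd for cmd in commands):
        return "stuck-in-start"
    # Assumption: an available php gear always has an httpd process.
    if not any("httpd" in cmd for cmd in commands):
        return "missing-httpd"
    return "ok"


# Two rows taken from the first listing in this report.
SAMPLE = "\n".join([
    "UID PID PPID C STIME TTY TIME CMD",
    "2721 21972 1 0 10:03 ? 00:00:06 haproxy_ctld.rb",
    "2721 24602 1 0 Dec18 ? 00:00:07 /usr/sbin/haproxy -f "
    "/var/lib/openshift/52af1c0d78e213ff39000b04/haproxy//conf/haproxy.cfg",
])
print(classify_gear(SAMPLE))
```

The check for a pending start runs first because a gear stuck in `control start` also lacks httpd, and the hang is the more specific symptom.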
OpenShift Enterprise v2 has officially reached EoL. This product is no longer supported and bugs will be closed. Please look into the replacement enterprise-grade container option, OpenShift Container Platform v3. https://www.openshift.com/container-platform/ More information can be found here: https://access.redhat.com/support/policy/updates/openshift/